06 August 2018

How to Initialize a String Member

How do you initialise a string member in the constructor? With const string&, with a string passed by value and moved, with string_view, or maybe with something else?

Let’s have a look at possible options.

Intro

Below there’s a simple class with one string member. We’d like to initialise it.

For example:

class UserName {
    std::string mName;

public:
    UserName(const std::string& str) : mName(str) { }
};

As you can see, the constructor takes const std::string& str.

You could potentially replace a constant reference with string_view:

UserName(std::string_view sv) : mName(sv) { }

And also you can pass a string by value and move from it:

UserName(std::string s) : mName(std::move(s)) { }

Which alternative is better?

Analysing the Cases

Let’s now compare those alternative string passing methods in three cases: creating from a string literal, creating from an l-value and creating from an r-value reference:

// creation from a string literal
UserName u1{"John With Very Long Name"};

// creation from l-value:
std::string s1 { "Marc With Very Long Name"};
UserName u2 { s1 };

// from r-value reference
std::string s2 { "Marc With Very Long Name"};
UserName u3 { std::move(s2) };

And now we can analyse each version: with a string reference, a string_view or a value. Please note that the allocations/creation of s1 and s2 are not taken into account; we only look at what happens for the constructor call.

For const std::string&:

  • u1 - two allocations: the first one creates a temp string and binds it to the input parameter, and then there’s a copy into mName.
  • u2 - one allocation: we have a no-cost binding to the reference, and then there’s a copy into the member variable.
  • u3 - one allocation: we have a no-cost binding to the reference, and then there’s a copy into the member variable.
  • You’d have to write a constructor taking an r-value reference to skip one allocation in the u1 case; the same overload could also skip one copy in the u3 case (since we could move from the r-value reference).
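
A pair of overloads along those lines could look like the sketch below. The name() accessor and the helper functions are hypothetical additions made only so the result can be inspected. The pointer comparison assumes the text is long enough to live on the heap, which holds for mainstream implementations but is not something the Standard promises.

```cpp
#include <cassert>
#include <string>
#include <utility>

class UserName {
    std::string mName;

public:
    // l-values bind here: one copy (one allocation for a long string)
    UserName(const std::string& str) : mName(str) { }

    // r-values (temporaries, results of std::move) bind here: the buffer is moved
    UserName(std::string&& str) : mName(std::move(str)) { }

    // hypothetical accessor, added only for inspection
    const std::string& name() const { return mName; }
};

// hypothetical helper: did the member take over the source's heap buffer?
bool movedFromRvalue() {
    std::string s2{"Marc With Very Long Name"};
    const char* buffer = s2.data();
    UserName u3{std::move(s2)};
    return u3.name().data() == buffer;
}

// hypothetical helper: did the member get its own copy of the text?
bool copiedFromLvalue() {
    std::string s1{"Marc With Very Long Name"};
    UserName u2{s1};
    return u2.name() == s1 && u2.name().data() != s1.data();
}

void checkOverloads() {
    assert(movedFromRvalue());
    assert(copiedFromLvalue());
}
```

The price is two constructors to maintain for every string parameter, and the number of overloads doubles with each additional parameter handled this way.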

For std::string_view:

  • u1 - one allocation - no copy/allocation for the input parameter, there’s only one allocation when mName is created.
  • u2 - one allocation - there’s cheap creation of a string_view for the argument, and then there’s a copy into the member variable.
  • u3 - one allocation - there’s cheap creation of a string_view for the argument, and then there’s a copy into the member variable.
  • You’d also have to write a constructor taking r-value reference if you want to save one allocation in the u3 case, as you could move from r-value reference.
  • You also have to pay attention to dangling string_views: if the passed string_view points to a string object that has already been destroyed, reading through it is undefined behaviour.
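
To make the dangling risk more concrete, here is a small sketch. The name() accessor and the makeFromBuffer() function are hypothetical, and the problematic pattern is left in comments on purpose, since running it would be undefined behaviour.

```cpp
#include <cassert>
#include <string>
#include <string_view>

class UserName {
    std::string mName;

public:
    UserName(std::string_view sv) : mName(sv) { }

    // hypothetical accessor, added only for inspection
    const std::string& name() const { return mName; }
};

// Safe: the view is read while 'buffer' is still alive, and mName owns its own copy
UserName makeFromBuffer() {
    std::string buffer{"Marc With Very Long Name"};
    std::string_view sv = buffer;
    return UserName{sv};
}

// Risky pattern, shown only as comments:
//   std::string_view sv = std::string{"temporary text"}; // temporary dies at the ';'
//   UserName u{sv};                                       // reads a destroyed string

void checkSafeUse() {
    UserName u = makeFromBuffer();
    assert(u.name() == "Marc With Very Long Name");
}
```

The constructor itself is fine because it copies the characters immediately. The danger lies on the caller's side, or in any class that stores the string_view instead of a std::string.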

For std::string:

  • u1 - one allocation for the input argument, and then one move into mName. That’s better than with const std::string&, where we got two memory allocations in this case, and similar to the string_view approach.
  • u2 - one allocation - we have to copy the value into the argument, and then we can move from it.
  • u3 - no allocations, only two move operations - that’s better than with string_view and const string&!
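
One way to check those numbers yourself is to count allocations. The sketch below swaps std::string for a std::basic_string with a counting allocator. CountingAllocator, CountedString, gAllocations and the helper functions are all hypothetical names introduced for this illustration. The counts assume the text does not fit into the small buffer, which is the case for these 24-character names on mainstream implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Assumption for this sketch: every heap request made by the string is counted here
static std::size_t gAllocations = 0;

// Hypothetical minimal allocator: counts, then defers to std::allocator
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept { }

    T* allocate(std::size_t n) {
        ++gAllocations;
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>{}.deallocate(p, n);
    }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

// Stand-in for std::string with the counting allocator plugged in
using CountedString = std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

class UserName {
    CountedString mName;

public:
    UserName(CountedString s) : mName(std::move(s)) { }
};

// Each helper returns the allocations made by the constructor call only
std::size_t fromLiteral() {
    const std::size_t before = gAllocations;
    UserName u1{"John With Very Long Name"};  // literal -> parameter, then a move
    return gAllocations - before;
}

std::size_t fromLvalue() {
    CountedString s1{"Marc With Very Long Name"};
    const std::size_t before = gAllocations;
    UserName u2{s1};                          // copy into the parameter, then a move
    return gAllocations - before;
}

std::size_t fromRvalue() {
    CountedString s2{"Marc With Very Long Name"};
    const std::size_t before = gAllocations;
    UserName u3{std::move(s2)};               // two moves, no copy
    return gAllocations - before;
}

void checkCounts() {
    assert(fromLiteral() == 1);
    assert(fromLvalue() == 1);
    assert(fromRvalue() == 0);
}
```

A custom allocator is used here instead of replacing the global operator new, so that only the string's own requests are counted.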

When you pass std::string by value, not only is the code simpler, but there’s also no need to write separate overloads for r-value references.

The approach of passing by value is consistent with Item 41 - “Consider pass by value for copyable parameters that are cheap to move and always copied” from Effective Modern C++ by Scott Meyers.

However, is std::string cheap to move?

When String is Short

Although the C++ Standard doesn’t require it, strings are usually implemented with Small String Optimization (SSO): the string object contains some extra space (the whole object might be 24 or 32 bytes), so it can typically hold 15 or 22 characters without any additional memory allocation. That means that moving such a string costs the same as copying it. And since the string is short, the copy is also fast.
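
If you’d like to see whether a particular string sits in the small buffer, a rough check is to test whether its data pointer falls inside the string object itself. The storedInline() helper below is a hypothetical name, and the whole approach is an implementation-level probe rather than anything the Standard describes.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: true when the characters live inside the string object itself
bool storedInline(const std::string& s) {
    const char* objectBegin = reinterpret_cast<const char*>(&s);
    const char* objectEnd = objectBegin + sizeof(std::string);
    // std::less yields a well-defined order even for pointers into different objects
    std::less<const char*> before;
    return !before(s.data(), objectBegin) && before(s.data(), objectEnd);
}

void checkLongString() {
    // 100 characters are far more than a 24- or 32-byte string object can hold
    std::string longName(100, 'x');
    assert(!storedInline(longName));
}
```

For a short string such as "Marc" the helper typically returns true on current mainstream implementations, but the exact threshold depends on the library you use.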

Let’s reconsider our example of passing by value when the string is short:

UserName u1{"John"}; // fits in SSO buffer

std::string s1 { "Marc"}; // fits in SSO buffer
UserName u2 { s1 };

std::string s2 { "Marc"}; // fits in SSO buffer
UserName u3 { std::move(s2) };

Remember that each move is now effectively a copy.

For const std::string&:

  • u1 - two copies: one copy from the input string literal into a temporary string argument, then another copy into the member variable.
  • u2 - one copy: existing string is bound to the reference argument, and then we have one copy into the member variable.
  • u3 - one copy: rvalue reference is bound to the input parameter at no cost, later we have a copy into the member field.

For std::string_view:

  • u1 - one copy: no copy for the input parameter, there’s only one copy when mName is initialised.
  • u2 - one copy: no copy for the input parameter, as string_view creation is fast, and then one copy into the member variable.
  • u3 - one copy: string_view is cheaply created, there’s one copy of the argument into mName.

For std::string:

  • u1 - two copies: the input argument is created from a string literal, and then there’s a copy into mName.
  • u2 - two copies: one copy into the argument and then the second copy into the member.
  • u3 - two copies: one copy into the argument (move means copy) and then the second copy into the member.

As you can see, for short strings passing by value might be “slower” when you pass an existing string, because you have two copies rather than one. On the other hand, the compiler might optimise the code better when it sees a value. What’s more, short strings are cheap to copy, so the potential “slowdown” might not even be visible.

A Note on Universal (Forwarding) References

There’s also another alternative:

class UserName {
    std::string mName;

public:
    template<typename T>
    UserName(T&& str) : mName(std::forward<T>(str)) { }
};

In this case we ask the compiler to do the hard work and figure out all the proper overloads for our initialisation case. It works not only for string arguments, but also for other types that are convertible to the member object.

For now, I’d like to stop here and not go into details. You may experiment with that idea and figure out whether it is the best option for string passing, and what the pros and cons of that approach are.
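
One thing you may run into when experimenting: an unconstrained T&& constructor is also a better match than the copy constructor for a non-const UserName l-value, and then fails to compile inside the initialiser. A possible starting point, as a sketch and not the only way to do it, is to constrain the template with std::is_constructible. The name() accessor, the Widget type and checkForwarding() are hypothetical additions.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

class UserName {
    std::string mName;

public:
    // Takes part in overload resolution only when std::string can be built from T
    template <typename T,
              typename = std::enable_if_t<std::is_constructible_v<std::string, T>>>
    UserName(T&& str) : mName(std::forward<T>(str)) { }

    // hypothetical accessor, added only for inspection
    const std::string& name() const { return mName; }
};

struct Widget { };  // hypothetical unrelated type

static_assert(std::is_constructible_v<UserName, const char*>);
static_assert(!std::is_constructible_v<UserName, Widget>);

void checkForwarding() {
    UserName u1{"John With Very Long Name"};  // forwarded as a string literal
    std::string s1{"Marc With Very Long Name"};
    UserName u2{s1};                          // forwarded as an l-value: copy
    UserName u3{std::move(s1)};               // forwarded as an r-value: move
    UserName u4{u2};                          // non-const l-value: the copy constructor is used

    assert(u1.name() == "John With Very Long Name");
    assert(u2.name() == "Marc With Very Long Name");
    assert(u3.name() == u2.name());
    assert(u4.name() == u2.name());
}
```

Compared with passing by value, this avoids the extra move, but the constructor now has to live in a header, and error messages for wrong argument types become harder to read.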

Summary

All in all, passing by value and then moving from the string argument is the preferred solution in Modern C++. You get simple code and better performance for larger strings. There’s also no risk of dangling references, unlike in the string_view case.

What do you think? Which one do you use in your code? Maybe there’s some other option?

© 2017, Bartlomiej Filipek, Blogger platform
Any opinions expressed herein are in no way representative of those of my employers.
This site contains ads or referral links, which provide me with a commission. Thank you for your understanding.