C++11 Language Extensions – General Features

auto

Consider

    auto x = 7;

Here x will have the type int because that’s the type of its initializer. In general, we can write

    auto x = expression;

and the type of x will be the type of the value computed from “expression”.
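
As a minimal sketch of that rule, the deduced types can be checked at compile time with std::is_same and decltype (decltype is described in the next section). The function name deduced_types_ok is made up for this illustration.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical helper: returns true if every deduced type is what we expect.
bool deduced_types_ok()
{
    auto i = 7;                       // int
    auto d = 7.5;                     // double
    auto s = "seven";                 // const char*
    std::vector<int> v = {1, 2, 3};
    auto p = v.begin();               // std::vector<int>::iterator

    return std::is_same<decltype(i), int>::value
        && std::is_same<decltype(d), double>::value
        && std::is_same<decltype(s), const char*>::value
        && std::is_same<decltype(p), std::vector<int>::iterator>::value;
}
```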

The use of auto to deduce the type of a variable from its initializer is obviously most useful when that type is either hard to know exactly or hard to write. Consider:

    template<class T> void printall(const vector<T>& v)
    {
        for (auto p = v.begin(); p!=v.end(); ++p)
            cout << *p << "\n";
    }

In C++98, we’d have to write

    template<class T> void printall(const vector<T>& v)
    {
        for (typename vector<T>::const_iterator p = v.begin(); p!=v.end(); ++p)
            cout << *p << "\n";
    }

When the type of a variable depends critically on a template argument, it can be really hard to write code without auto. For example:

    template<class T, class U> void multiply(const vector<T>& vt, const vector<U>& vu)
    {
        // ...
        auto tmp = vt[i]*vu[i];
        // ...
    }

The type of tmp should be what you get from multiplying a T by a U, but exactly what that is can be hard for the human reader to figure out, though of course the compiler knows once it has figured out what particular T and U it is dealing with.

The auto feature has the distinction of being the earliest to be suggested and implemented: Stroustrup had it working in his Cfront implementation in early 1984, but was forced to take it out because of C compatibility problems. Those compatibility problems disappeared when C++98 and C99 accepted the removal of “implicit int”; that is, both languages now require every variable and function to be defined with an explicit type. The old meaning of auto (namely “this is a local variable”) is now illegal. Several committee members trawled through millions of lines of code finding only a handful of uses – and most of those were in test suites or appeared to be bugs.

Being primarily a facility to simplify notation in code, auto does not affect the standard library specification.

decltype

decltype(E) is the type (“declared type”) of the name or expression E and can be used in declarations. For example:

    void f(const vector<int>& a, vector<float>& b)
    {
        typedef decltype(a[0]*b[0]) Tmp;
        for (int i=0; i<b.size(); ++i) {
            Tmp* p = new Tmp(a[i]*b[i]);
            // ...
        }
        // ...
    }

This notion has been popular in generic programming under the label “typeof” for a long time, but the “typeof” implementations in actual use were incomplete and incompatible, so the standard version is named decltype.

Note: Prefer just using auto when you just need the type for a variable that you are about to initialize. You really need decltype if you need a type for something that is not a variable, such as a return type.
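
A return type is the typical case where decltype is needed. The following is only a sketch: the function name product is hypothetical, and it uses the suffix return type syntax so that the parameter names are in scope inside decltype.

```cpp
#include <type_traits>

// Hypothetical function: the return type is whatever multiplying a T by a U yields.
template<class T, class U>
auto product(const T& t, const U& u) -> decltype(t * u)
{
    return t * u;
}

// int * double is double, and the declared return type follows that.
static_assert(std::is_same<decltype(product(2, 1.5)), double>::value,
              "int * double yields double");
```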

See also:

  • the C++ draft 7.1.6.2 Simple type specifiers
  • [Str02] Bjarne Stroustrup. Draft proposal for “typeof”. C++ reflector message c++std-ext-5364, October 2002. (original suggestion).
  • [N1478=03-0061] Jaakko Jarvi, Bjarne Stroustrup, Douglas Gregor, and Jeremy Siek: Decltype and auto (original proposal).
  • [N2343=07-0203] Jaakko Jarvi, Bjarne Stroustrup, and Gabriel Dos Reis: Decltype (revision 7): proposed wording.

Range-for statement

A range for statement allows you to iterate through a “range”, which is anything you can iterate through like an STL-sequence defined by a begin() and end(). All standard containers can be used as a range, as can a std::string, an initializer list, an array, and anything for which you define begin() and end(), e.g. an istream. For example:

    void f(vector<double>& v)
    {
        for (auto x : v) cout << x << '\n';
        for (auto& x : v) ++x;  // using a reference to allow us to change the value
    }

You can read that as “for all x in v” going through starting with v.begin() and iterating to v.end(). Another example:

    for (const auto x : { 1,2,3,5,8,13,21,34 }) cout << x << '\n';

The begin() (and end()) can be a member to be called as x.begin() or a free-standing function to be called as begin(x). The member version takes precedence.
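
To illustrate the member version, here is a sketch of a user-defined type that can be used as a range. The names IntRange, Iter, and collect are assumptions made for this example; the iterator provides only the three operations the range-for statement uses (*, ++, and !=).

```cpp
#include <vector>

// Hypothetical half-open integer range [first, last).
struct IntRange {
    int first;
    int last;

    struct Iter {
        int value;
        int operator*() const { return value; }
        Iter& operator++() { ++value; return *this; }
        bool operator!=(const Iter& other) const { return value != other.value; }
    };

    Iter begin() const { return Iter{first}; }   // called as r.begin()
    Iter end() const { return Iter{last}; }      // called as r.end()
};

// Collects the values visited by a range-for over r.
std::vector<int> collect(const IntRange& r)
{
    std::vector<int> out;
    for (auto x : r) out.push_back(x);
    return out;
}
```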

Initializer lists

Consider

    vector<double> v = { 1, 2, 3.456, 99.99 };
    list<pair<string,string>> languages = {
        {"Nygaard","Simula"}, {"Richards","BCPL"}, {"Ritchie","C"}
    }; 
    map<vector<string>,vector<int>> years = {
        { {"Maurice","Vincent", "Wilkes"},{1913, 1945, 1951, 1967, 2000} },
        { {"Martin", "Richards"}, {1982, 2003, 2007} }, 
        { {"David", "John", "Wheeler"}, {1927, 1947, 1951, 2004} }
    }; 

Initializer lists are not just for arrays any more. The mechanism for accepting a {}-list is a function (often a constructor) accepting an argument of type std::initializer_list<T>. For example:

    void f(initializer_list<int>);
    f({1,2});
    f({23,345,4567,56789});
    f({});  // the empty list
    f{1,2}; // error: function call ( ) missing

    years.insert({{"Bjarne","Stroustrup"},{1950, 1975, 1985}});

The initializer list can be of arbitrary length, but must be homogeneous (all elements must be of the template argument type, T, or convertible to T).

A container might implement an initializer-list constructor like this:

    template<class E> class vector {
    public:
        vector (std::initializer_list<E> s) // initializer-list constructor
        {
            reserve(s.size());  // get the right amount of space
            uninitialized_copy(s.begin(), s.end(), elem);   // initialize elements (in elem[0:s.size()))
            sz = s.size();  // set vector size
        }

        // ... as before ...
    };
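
The fragment above relies on members (reserve, elem, sz) that are not shown. A self-contained sketch of the same idea follows; the class name IntBox is hypothetical, and it uses std::copy on an array allocated with new rather than uninitialized_copy, to keep the example small.

```cpp
#include <algorithm>
#include <cstddef>
#include <initializer_list>

// Hypothetical fixed-size container, just enough to show the constructor.
class IntBox {
public:
    IntBox(std::initializer_list<int> s)        // initializer-list constructor
        : elem(new int[s.size()]), sz(s.size())
    {
        std::copy(s.begin(), s.end(), elem);    // copy the list's elements
    }
    ~IntBox() { delete[] elem; }

    IntBox(const IntBox&) = delete;             // copying left out of the sketch
    IntBox& operator=(const IntBox&) = delete;

    std::size_t size() const { return sz; }
    int operator[](std::size_t i) const { return elem[i]; }

private:
    int* elem;
    std::size_t sz;
};
```

With that, IntBox b = {4, 5, 6}; gives a container of three elements.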

The distinction between direct initialization and copy initialization is maintained for {}-initialization, but becomes relevant less frequently because of {}-initialization. For example, std::vector has an explicit constructor from int and an initializer_list constructor:

    vector<double> v1(7);   // ok: v1 has 7 elements
    v1 = 9;                 // error: no conversion from int to vector
    vector<double> v2 = 9;  // error: no conversion from int to vector

    void f(const vector<double>&);
    f(9);                           // error: no conversion from int to vector

    vector<double> v1{7};           // ok: v1 has 1 element (with its value 7.0)
    v1 = {9};                       // ok: v1 now has 1 element (with its value 9.0)
    vector<double> v2 = {9};        // ok: v2 has 1 element (with its value 9.0)
    f({9});                         // ok: f is called with the list { 9 }

    vector<vector<double>> vs = {
        vector<double>(10),         // ok: explicit construction (10 elements)
        vector<double>{10},         // ok: explicit construction (1 element with the value 10.0)
        10                          // error: vector's constructor is explicit
    };  
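
The difference between (7) and {7} can be observed directly. This is a small sketch with made-up function names; it only reports the number of elements each form of initialization produces.

```cpp
#include <cstddef>
#include <vector>

// Number of elements produced by ()-initialization with the value n.
std::size_t size_with_parens(int n)
{
    std::vector<double> v(n);   // explicit constructor: n elements, each 0.0
    return v.size();
}

// Number of elements produced by {}-initialization with the value x.
std::size_t size_with_braces(double x)
{
    std::vector<double> v{x};   // initializer-list constructor: one element, x
    return v.size();
}
```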

The function can access the initializer_list as an immutable sequence. For example:

    void f(initializer_list<int> args)
    {
        for (auto p=args.begin(); p!=args.end(); ++p) cout << *p << "\n";
    }

A constructor that takes a single argument of type std::initializer_list is called an initializer-list constructor.

The standard library containers, string, and regex have initializer-list constructors, assignment, etc. An initializer-list can be used as a range, e.g. in a range for statement.

The initializer lists are part of the scheme for uniform and general initialization. They also prevent narrowing. In general, prefer initializing with {} rather than () unless you want to share code with a C++98 compiler or (rarely) need to use () to call a non-initializer_list overloaded constructor.

Uniform initialization syntax and semantics

C++98 offers several ways of initializing an object depending on its type and the initialization context. When misused, the error can be surprising and the error messages obscure. Consider:

    string a[] = { "foo", " bar" };          // ok: initialize array variable
    vector<string> v = { "foo", " bar" };    // error: initializer list for non-aggregate vector
    void f(string a[]);
    f( { "foo", " bar" } );                  // syntax error: block as argument

and

    int a = 2;              // "assignment style"
    int aa[] = { 2, 3 };    // assignment style with list
    complex z(1,2);         // "functional style" initialization
    x = Ptr(y);             // "functional style" for conversion/cast/construction

and

    int a(1);   // variable definition
    int b();    // function declaration
    int b(foo); // variable definition or function declaration

It can be hard to remember the rules for initialization and to choose the best way.

The C++11 solution is to allow {}-initializer lists for all initialization:

    X x1 = X{1,2}; 
    X x2 = {1,2};   // the = is optional
    X x3{1,2}; 
    X* p = new X{1,2}; 

    struct D : X {
        D(int x, int y) :X{x,y} { /* ... */ };
    };

    struct S {
        int a[3];
        S(int x, int y, int z) :a{x,y,z} { /* ... */ }; // solution to old problem
    };

Importantly, X{a} constructs the same value in every context, so that {}-initialization gives the same result in all places where it is legal. For example:

    X x{a}; 
    X* p = new X{a};
    z = X{a};         // use as cast
    f({a});           // function argument (of type X)
    return {a};       // function return value (function returning X)

C++11 uniform initialization is not perfectly uniform, but it’s very nearly so. C++11’s {} initialization syntax and semantics provide a much simpler and more consistent way to perform initialization that is also more powerful (e.g., vector<int> v = { 1, 2, 3, 4 }) and safer (e.g., {} does not allow narrowing conversions). Prefer initializing using {}.
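
A short sketch of the safety point and of {}-lists in different contexts. The type Pair and the function names are hypothetical; the line that would not compile is left as a comment.

```cpp
// Hypothetical aggregate used only for this illustration.
struct Pair {
    int a;
    int b;
};

Pair make_pair_value() { return {1, 2}; }   // {}-list as a return value

int first_of(Pair p) { return p.a; }

int narrowing_demo()
{
    double d = 7.9;
    int truncated(d);         // ok with (): the fraction is silently dropped
    // int rejected{d};       // error with {}: narrowing from double to int
    int exact{7};             // ok: no narrowing
    return truncated + exact + first_of({1, 2});   // {}-list as a function argument
}
```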

Rvalue references and move semantics

The distinction between lvalues (what can be used on the left-hand side of an assignment) and rvalues (what can be used on the right-hand side of an assignment) goes back to Christopher Strachey (the father of C++’s distant ancestor CPL and of denotational semantics). In C++98, references to non-const can bind to lvalues and references to const can bind to lvalues or rvalues, but there is nothing that can bind to a non-const rvalue. That’s to protect people from changing the values of temporaries that are destroyed before their new value can be used. For example:

    void incr(int& a) { ++a; }
    int i = 0;
    incr(i);    // i becomes 1
    incr(0);    // error: 0 is not an lvalue

If that incr(0) were allowed, either some temporary that nobody ever saw would be incremented or – far worse – the value of 0 would become 1. The latter sounds silly, but there was actually a bug like that in early Fortran compilers that set aside a memory location to hold the value 0.

So far, so good, but consider

    template<class T> void swap(T& a, T& b)     // "old style swap"
    {
        T tmp(a);   // now we have two copies of a
        a = b;      // now we have two copies of b
        b = tmp;    // now we have two copies of tmp (aka a)
    } 

If T is a type for which it can be expensive to copy elements, such as string and vector, swap becomes an expensive operation (for the standard library, we have specializations of string and vector swap() to deal with that). Note something curious: We didn’t want any copies at all. We just wanted to move the values of a, b, and tmp around a bit.

In C++11, we can define “move constructors” and “move assignments” to move rather than copy their argument:

    template<class T> class vector {
        // ...
        vector(const vector&);          // copy constructor
        vector(vector&&);           // move constructor
        vector& operator=(const vector&);   // copy assignment
        vector& operator=(vector&&);        // move assignment
    };  // note: move constructor and move assignment take non-const &&
        // they can, and usually do, write to their argument

The && indicates an “rvalue reference”. An rvalue reference can bind to an rvalue (but not to an lvalue):

    X a;
    X f();
    X& r1 = a;      // bind r1 to a (an lvalue)
    X& r2 = f();        // error: f() is an rvalue; can't bind

    X&& rr1 = f();  // fine: bind rr1 to temporary
    X&& rr2 = a;    // error: a is an lvalue; can't bind

The idea behind a move assignment is that instead of making a copy, it simply takes the representation from its source and replaces it with a cheap default. For example, for strings s1=s2 using the move assignment would not make a copy of s2’s characters; instead, it would just let s1 treat those characters as its own and somehow delete s1’s old characters (maybe by leaving them in s2, which presumably is just about to be destroyed).
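
A sketch of what “take the representation and leave a cheap default” can look like for a class that owns a heap array. The class Buffer is hypothetical and is not how any standard container is specified; copying is deleted to keep the example short.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical owner of a heap array, used only to illustrate moving.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data(new int[n]()), sz(n) { }
    ~Buffer() { delete[] data; }

    Buffer(const Buffer&) = delete;             // copying left out of the sketch
    Buffer& operator=(const Buffer&) = delete;

    Buffer(Buffer&& other) noexcept             // move constructor
        : data(other.data), sz(other.sz)
    {
        other.data = nullptr;                   // leave the source empty
        other.sz = 0;
    }

    Buffer& operator=(Buffer&& other) noexcept  // move assignment
    {
        if (this != &other) {
            delete[] data;                      // release our old representation
            data = other.data;                  // take the source's representation
            sz = other.sz;
            other.data = nullptr;               // give the source a cheap default
            other.sz = 0;
        }
        return *this;
    }

    std::size_t size() const { return sz; }

private:
    int* data;
    std::size_t sz;
};
```

No elements are copied by either move operation; only a pointer and a size change hands.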

How do we know whether it’s ok to simply move from a source? We tell the compiler:

    template<class T> 
    void swap(T& a, T& b)   // "perfect swap" (almost)
    {
        T tmp = move(a);    // could invalidate a
        a = move(b);        // could invalidate b
        b = move(tmp);      // could invalidate tmp
    }

move(x) is just a cast that means “you can treat x as an rvalue”. Maybe it would have been better if move() had been called rval(), but by now move() has been used for years. The move() template function can be written in C++11 (see the “brief introduction”) and uses rvalue references.
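
To make the “just a cast” point concrete, here is one way such a function could be written. The names as_rvalue and swap_by_move are made up for this sketch; the standard library's own move() may be written differently.

```cpp
#include <string>
#include <type_traits>

// Sketch of a move()-like cast: turns its argument into an rvalue reference.
template<class T>
typename std::remove_reference<T>::type&& as_rvalue(T&& x) noexcept
{
    return static_cast<typename std::remove_reference<T>::type&&>(x);
}

// The swap from the text, written with the cast above instead of move().
template<class T>
void swap_by_move(T& a, T& b)
{
    T tmp = as_rvalue(a);   // move construct tmp from a
    a = as_rvalue(b);       // move assign b to a
    b = as_rvalue(tmp);     // move assign tmp to b
}
```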

Rvalue references can also be used to provide perfect forwarding.
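
As a brief sketch of what perfect forwarding means: a function template passes its argument on to another function so that lvalues stay lvalues and rvalues stay rvalues. The type Tracker and the function make_tracker are hypothetical; std::forward is the standard library facility from <utility>.

```cpp
#include <string>
#include <utility>

// Hypothetical type that records which constructor was chosen.
struct Tracker {
    std::string how;
    explicit Tracker(const std::string&) : how("copied from lvalue") { }
    explicit Tracker(std::string&&) : how("moved from rvalue") { }
};

// Forwards its argument to Tracker's constructor, preserving lvalue/rvalue-ness.
template<class Arg>
Tracker make_tracker(Arg&& arg)
{
    return Tracker(std::forward<Arg>(arg));
}
```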

In the C++11 standard library, all containers are provided with move constructors and move assignments, and operations that insert new elements, such as insert() and push_back(), have versions that take rvalue references. The net result is that the standard containers and algorithms quietly – without user intervention – improve in performance because they copy less.

Lambdas

A lambda expression is a mechanism for specifying a function object. The primary use for a lambda is to specify a simple action to be performed by some function. For example:

    vector<int> v = {50, -10, 20, -30};

    std::sort(v.begin(), v.end());  // the default sort
    // now v should be { -30, -10, 20, 50 }

    // sort by absolute value:
    std::sort(v.begin(), v.end(), [](int a, int b) { return abs(a)<abs(b); });
    // now v should be { -10, 20, -30, 50 }

The argument [](int a, int b) { return abs(a)<abs(b); } is a “lambda” (or “lambda function” or “lambda expression”), which specifies an operation that given two integer arguments a and b returns the result of comparing their absolute values.

A lambda expression can access local variables in the scope in which it is used. For example:

    void f(vector<Record>& v)
    {
        vector<int> indices(v.size());
        int count = 0;
        generate(indices.begin(),indices.end(),[&count](){ return count++; });

        // sort indices in the order determined by the name field of the records:
        std::sort(indices.begin(), indices.end(), [&](int a, int b) { return v[a].name<v[b].name; });
        // ...
    }

Some consider this “really neat!”; others see it as a way to write dangerously obscure code. Both are right.

The [&] is a “capture list” specifying that local names used will be passed by reference. Had we wanted to “capture” only v, we could have said so: [&v]. Had we wanted to pass v by value, we could have said so: [v]. Capture nothing is [], capture all by reference is [&], and capture all by value is [=].
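
The difference between capturing by value and by reference shows up when the captured variable changes after the lambda has been created. A minimal sketch, with a made-up function name:

```cpp
// Returns true if the two capture modes behave as described.
bool captures_behave_as_described()
{
    int n = 1;
    auto by_value = [n]() { return n; };       // copies n when the lambda is created
    auto by_reference = [&n]() { return n; };  // refers to the original n

    n = 2;                                     // change n after creating both lambdas
    return by_value() == 1 && by_reference() == 2;
}
```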

If an action is neither common nor simple, consider using a named function object or function. For example, the example above could have been written:

    void f(vector<Record>& v)
    {
        vector<int> indices(v.size());
        int count = 0;
        generate(indices.begin(),indices.end(),[&](){ return count++; });

        struct Cmp_names {
            const vector<Record>& vr;
            Cmp_names(const vector<Record>& r) :vr(r) { }
            bool operator()(int a, int b) const { return vr[a].name<vr[b].name; }
        };

        // sort indices in the order determined by the name field of the records:
        std::sort(indices.begin(), indices.end(), Cmp_names(v));
        // ...
    }

For a tiny function, such as this Record name field comparison, the function object notation is verbose, though the generated code is likely to be identical. In C++98, such function objects had to be non-local to be used as a template argument; in C++11 this is no longer necessary.

To specify a lambda you must provide

  • Its capture list: the list of variables it can use (in addition to its arguments), if any ([&] meaning “all local variables passed by reference” in the Record comparison example). If no names need to be captured, a lambda starts with plain [].
  • (optionally) Its arguments and their types (e.g., (int a, int b))
  • The action to be performed as a block (e.g., { return v[a].name<v[b].name; }).
  • (optionally) The return type using the new suffix return type syntax; but typically we just deduce the return type from the return statement. If no value is returned then void is deduced.
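
The four parts can be seen together in a lambda that spells out its return type. This is only a sketch; the function name halves is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: halves each element, using an explicit return type.
std::vector<double> halves(const std::vector<int>& v)
{
    std::vector<double> out(v.size());
    std::transform(v.begin(), v.end(), out.begin(),
                   // capture list, arguments, suffix return type, body:
                   [](int x) -> double { return x / 2.0; });
    return out;
}
```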

noexcept to prevent exception propagation

If a function cannot throw an exception or if the program isn’t written to handle exceptions thrown by a function, that function can be declared noexcept. For example:

    extern "C" double sqrt(double) noexcept;    // will never throw

    vector<double> my_computation(const vector<double>& v) noexcept // I'm not prepared to handle memory exhaustion
    {
        vector<double> res(v.size());   // might throw
        for(int i = 0; i<v.size(); ++i) res[i] = sqrt(v[i]);
        return res;
    }

If a function declared noexcept throws (so that the exception tries to escape the noexcept function) the program is terminated by a call to std::terminate(). The call of terminate() cannot rely on objects being in well-defined states; that is, there is no guarantee that destructors have been invoked, no guaranteed stack unwinding, and no possibility for resuming the program as if no problem had been encountered. This is deliberate and makes noexcept a simple, crude, and very efficient mechanism – much more efficient than the old dynamic throw() exception specification mechanism.

It is possible to make a function conditionally noexcept. For example, an algorithm can be specified to be noexcept if (and only if) the operations it uses on a template argument are noexcept:

    template<class T>
    void do_f(vector<T>& v) noexcept(noexcept(f(v.at(0)))) // can throw if f(v.at(0)) can
    {
        for(int i = 0; i<v.size(); ++i)
            v.at(i) = f(v.at(i));
    }

Here, we first use noexcept as an operator: noexcept(f(v.at(0))) is true if f(v.at(0)) can’t throw, that is if the f() and at() used are noexcept.

The noexcept() operator is a constant expression and does not evaluate its operand.

The general form of a noexcept declaration is noexcept(expression) and “plain noexcept” is simply a shorthand for noexcept(true). All declarations of a function must have compatible noexcept specifications.
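
The noexcept operator and a conditional noexcept specification can be checked at compile time. In this sketch the names may_throw, never_throws, Quiet, Loud, and call are all made up; the two plain functions are only declared, which is enough because the operator does not evaluate its operand.

```cpp
void may_throw();               // no specification: assumed to be able to throw
void never_throws() noexcept;   // promises not to throw

struct Quiet { void operator()() const noexcept { } };
struct Loud  { void operator()() const { } };

// Conditionally noexcept: call() is noexcept exactly when f() is.
template<class F>
void call(F f) noexcept(noexcept(f()))
{
    f();
}

static_assert(!noexcept(may_throw()), "may_throw() is not noexcept");
static_assert(noexcept(never_throws()), "never_throws() is noexcept");
static_assert(noexcept(call(Quiet{})), "call() is noexcept for Quiet");
static_assert(!noexcept(call(Loud{})), "call() is not noexcept for Loud");
```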

A destructor shouldn’t throw; a generated destructor is implicitly noexcept (independently of what code is in its body) if all of the members of its class have noexcept destructors (which, ahem, they too will have by default).

It is typically a bad idea to have a move operation throw, so declare those noexcept wherever possible. A generated copy or move operation is implicitly noexcept if all of the copy or move operations it uses on members of its class are noexcept.
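
Whether a move operation is noexcept can be queried with the type traits in <type_traits>. A sketch with two hypothetical classes, Name and Careless:

```cpp
#include <string>
#include <type_traits>

// Hypothetical class whose move operations are generated by the compiler.
struct Name {
    std::string first;
    std::string last;
};

// The generated move constructor is noexcept because std::string's is.
static_assert(std::is_nothrow_move_constructible<Name>::value,
              "generated move constructor is noexcept");

// A user-written move constructor is noexcept only if declared so.
struct Careless {
    Careless() { }
    Careless(Careless&&) { }    // not declared noexcept
};

static_assert(!std::is_nothrow_move_constructible<Careless>::value,
              "move constructor without noexcept is assumed to throw");
```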

noexcept is widely and systematically used in the standard library to improve performance and clarify requirements.

constexpr

The constexpr mechanism

  • provides more general constant expressions
  • allows constant expressions involving user-defined types
  • provides a way to guarantee that an initialization is done at compile time

Consider

    enum Flags { good=0, fail=1, bad=2, eof=4 };

    constexpr int operator|(Flags f1, Flags f2) { return Flags(int(f1)|int(f2)); }

    void f(Flags x)
    {
        switch (x) {
        case bad:         /* ... */ break;
        case eof:         /* ... */ break;
        case bad|eof:     /* ... */ break;
        default:          /* ... */ break;
        }
    }

Here constexpr says that the function must be of a simple form so that it can be evaluated at compile time if given constant expressions as arguments.

In addition to being able to evaluate expressions at compile time, we want to be able to require expressions to be evaluated at compile time; constexpr in front of a variable definition does that (and implies const):

    constexpr int x1 = bad|eof; // ok

    void f(Flags f3)
    {
        constexpr int x2 = bad|f3;  // error: can't evaluate at compile time
        int x3 = bad|f3;        // ok
    }

Typically we want the compile-time evaluation guarantee for global or namespace objects, often for objects we want to place in read-only storage.

This also works for objects for which the constructors are simple enough to be constexpr and expressions involving such objects:

    struct Point {
        int x,y;
        constexpr Point(int xx, int yy) : x(xx), y(yy) { }
    };

    constexpr Point origo(0,0);
    constexpr int z = origo.x;

    constexpr Point a[] = {Point(0,0), Point(1,1), Point(2,2) };
    constexpr int x = a[1].x;   // x becomes 1

Please note that constexpr is not a general purpose replacement for const (or vice versa):

  • const’s primary function is to express the idea that an object is not modified through an interface (even though the object may very well be modified through other interfaces). It just so happens that declaring an object const provides excellent optimization opportunities for the compiler. In particular, if an object is declared const and its address isn’t taken, a compiler is often able to evaluate its initializer at compile time (though that’s not guaranteed) and keep that object in its tables rather than emitting it into the generated code.
  • constexpr’s primary function is to extend the range of what can be computed at compile time, making such computation type safe and also usable in compile-time contexts (such as to initialize enumerators or integral template parameters). Objects declared constexpr have their initializer evaluated at compile time; they are basically values kept in the compiler’s tables and only emitted into the generated code if needed.
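
A sketch of the “usable in compile-time contexts” point. The names square, nine, table, and square_at_run_time are made up for this example; the function body is a single return statement, the simple form that C++11 requires.

```cpp
// Hypothetical constexpr function.
constexpr int square(int x) { return x * x; }

constexpr int nine = square(3);          // must be evaluated at compile time
static_assert(nine == 9, "evaluated by the compiler");

int table[square(4)];                    // usable as an array bound

// With a run-time argument the same function is an ordinary call.
int square_at_run_time(int n) { return square(n); }
```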

nullptr – a null pointer literal

nullptr is a literal denoting the null pointer; it is not an integer:

    char* p = nullptr;
    int* q = nullptr;
    char* p2 = 0;           // 0 still works and p==p2

    void f(int);
    void f(char*);

    f(0);                   // call f(int)
    f(nullptr);             // call f(char*)

    void g(int);
    g(nullptr);             // error: nullptr is not an int
    int i = nullptr;        // error: nullptr is not an int
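
The overload resolution above can be made observable by letting each overload report itself. In this sketch the functions which and is_null are hypothetical; std::nullptr_t, the type of nullptr, is declared in <cstddef>.

```cpp
#include <cstddef>
#include <string>

// Hypothetical overload set that reports which overload was chosen.
std::string which(int)   { return "int"; }
std::string which(char*) { return "char*"; }

// nullptr has its own type, std::nullptr_t, so it can be overloaded on too.
bool is_null(std::nullptr_t) { return true; }
bool is_null(const void* p)  { return p == nullptr; }
```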

Copying and rethrowing exceptions

How do you catch an exception and then rethrow it on another thread? Use a bit of library magic as described in the standard 18.8.5 Exception Propagation:

  • exception_ptr current_exception(); Returns: An exception_ptr object that refers to the currently handled exception (15.3) or a copy of the currently handled exception, or a null exception_ptr object if no exception is being handled. The referenced object shall remain valid at least as long as there is an exception_ptr object that refers to it. …
  • void rethrow_exception(exception_ptr p);
  • template<class E> exception_ptr make_exception_ptr(E e); (called copy_exception in earlier drafts) Effects: as if
    try {
        throw e;
    } catch(...) {
        return current_exception();
    }

This is particularly useful for transmitting an exception from one thread to another.
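
A sketch of the capture-and-rethrow pattern. To keep it short, both steps run on one thread; in a real program the exception_ptr would be handed from the worker thread to the thread that rethrows. The function names capture_failure and message_of are made up.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Runs some failing work and captures the exception instead of letting it escape.
std::exception_ptr capture_failure()
{
    std::exception_ptr p;
    try {
        throw std::runtime_error("worker failed");   // stand-in for real work
    } catch (...) {
        p = std::current_exception();                // refer to the exception in flight
    }
    return p;
}

// Rethrows a captured exception and reports its message; p must not be null.
std::string message_of(std::exception_ptr p)
{
    try {
        std::rethrow_exception(p);
    } catch (const std::exception& e) {
        return e.what();
    } catch (...) {
        return "unknown exception";
    }
}
```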

Inline namespaces

The inline namespace mechanism is intended to support library evolution by providing a mechanism that supports a form of versioning. Consider:

    // file V99.h:
    inline namespace V99 {
        void f(int);    // does something better than the V98 version
        void f(double); // new feature
        // ...
    }

    // file V98.h:
    namespace V98 {
        void f(int);    // does something
        // ...
    }

    // file Mine.h:
    namespace Mine {
    #include "V99.h"
    #include "V98.h"
    }

We here have a namespace Mine with both the latest release (V99) and the previous one (V98). If you want to be specific, you can:

    #include "Mine.h"
    using namespace Mine;
    // ...
    V98::f(1);  // old version
    V99::f(1);  // new version
    f(1);       // default version

The point is that the inline specifier makes the declarations from the nested namespace appear exactly as if they had been declared in the enclosing namespace.

This is a very “static” and implementer-oriented facility in that the inline specifier has to be placed by the designer of the namespaces – thus making the choice for all users. It is not possible for a user of Mine to say “I want the default to be V98 rather than V99.”
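
The same arrangement can be tried in a single file, without the #include lines. This is only a sketch: the functions return the name of their version so that the choice is visible.

```cpp
#include <string>

namespace Mine {
    inline namespace V99 {
        inline std::string f(int) { return "V99"; }   // the default version
    }
    namespace V98 {
        inline std::string f(int) { return "V98"; }   // the previous version
    }
}
```

Here Mine::f(1) finds V99's f, while Mine::V98::f(1) still reaches the old version.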

See

  • Standard 7.3.1 Namespace definition [7]-[9].

User-defined literals

C++ has always provided literals for a variety of built-in types (2.14 Literals):

    123     // int
    1.2     // double
    1.2F    // float
    'a'     // char
    1ULL    // unsigned long long
    0xD0    // int, written in hexadecimal
    "as"    // string literal (array of const char)

However, in C++98 there are no literals for user-defined types. This can be a bother and is also seen as a violation of the principle that user-defined types should be supported as well as built-in types are. In particular, people have requested:

    "Hi!"s          // std::string, not ``zero-terminated array of char''
    1.2i            // imaginary
    123.4567891234df    // decimal floating point (IBM)
    101010111000101b    // binary
    123s            // seconds
    123.56km        // not miles! (units)
    1234567890123456789012345678901234567890x   // extended-precision

C++11 supports “user-defined literals” through the notion of literal operators that map literals with a given suffix into a desired type. For example:

    constexpr complex<double> operator "" i(long double d)  // imaginary literal
    {
        return complex<double>(0, static_cast<double>(d));  // complex is a literal type; the cast avoids narrowing in {}
    }

    std::string operator""s (const char* p, size_t n)   // std::string literal
    {
        return string(p,n); // requires free store allocation
    }

Note the use of constexpr to enable compile-time evaluation. Given those, we can write

    template<class T> void f(const T&);
    f("Hello"); // pass a C-style string (array of const char)
    f("Hello"s);    // pass (5-character) string object
    f("Hello\n"s);  // pass (6-character) string object

    auto z = 2+1i;  // complex(2,1)

The basic (implementation) idea is that after parsing what could be a literal, the compiler always checks for a suffix. The user-defined literal mechanism simply allows the user to specify a new suffix and what is to be done with the literal before it. It is not possible to redefine the meaning of a built-in literal suffix or augment the syntax of literals. A literal operator can request to get its (preceding) literal passed “cooked” (with the value it would have had if the new suffix hadn’t been defined) or “uncooked” (as a string).

To get an “uncooked” string, simply request a single const char* argument:

    Bignum operator"" x(const char* p)
    {
        return Bignum(p);
    }

    void f(Bignum);
    f(1234567890123456789012345678901234567890x);

Here the C-style string "1234567890123456789012345678901234567890" is passed to operator"" x(). Note that we did not explicitly put those digits into a string.

There are four kinds of literals that can be suffixed to make a user-defined literal.

  • Integer literal: accepted by a literal operator taking a single unsigned long long or const char* argument.
  • Floating-point literal: accepted by a literal operator taking a single long double or const char* argument.
  • String literal: accepted by a literal operator taking a pair of (const char*, size_t) arguments.
  • Character literal: accepted by a literal operator taking a single char argument.
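
The parameter lists above can be sketched as follows. The type Meters and all four suffixes are hypothetical; suffixes defined in user code have to start with an underscore, since suffixes without one are reserved for the standard library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical distance type; the suffixes below are made up for this sketch.
struct Meters { long double value; };

// Cooked floating-point literal: receives the value as long double.
constexpr Meters operator""_km(long double d) { return Meters{d * 1000}; }

// Cooked integer literal: receives the value as unsigned long long.
constexpr Meters operator""_m(unsigned long long n)
{
    return Meters{static_cast<long double>(n)};
}

// String literal: receives a pointer and the number of characters.
std::string operator""_str(const char* p, std::size_t n) { return std::string(p, n); }

// Uncooked numeric literal: receives the digits as a C-style string.
std::string operator""_digits(const char* p) { return std::string(p); }
```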

Note that you cannot make a literal operator for a string literal that takes just a const char* argument (and no size). For example:

    string operator"" S(const char* p);     // warning: this will not work as expected

    "one two"S; // error: no applicable literal operator

The rationale is that if we want to have “a different kind of string” we almost always want to know the number of characters anyway.

Suffixes will tend to be short (e.g. s for string, i for imaginary, m for meter, and x for extended), so different uses could easily clash. Use namespaces to prevent clashes:

    namespace Numerics { 
        // ...
        class Bignum { /* ... */ }; 
        namespace literals { 
            Bignum operator"" X(char const*); 
        } 
    } 

    using namespace Numerics::literals; 
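
A self-contained sketch of the same arrangement. Here Bignum is a stand-in that merely stores the digits, and the suffix _big and the function digits_of_big_literal are made up; the using-directive is placed inside a function to keep the suffix out of the surrounding scope.

```cpp
#include <string>

namespace Numerics {
    // Hypothetical stand-in for a real arbitrary-precision type.
    struct Bignum { std::string digits; };

    namespace literals {
        // Uncooked literal operator: receives the digits as a C-style string.
        inline Bignum operator""_big(const char* p) { return Bignum{p}; }
    }
}

// The suffix is visible only where the literals namespace has been imported.
inline std::string digits_of_big_literal()
{
    using namespace Numerics::literals;
    return (123456789012345678901234567890_big).digits;
}
```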
