C++11 Language Extensions: Miscellaneous Language Features

What is the value of __cplusplus for C++11?

In C++11 the macro __cplusplus is set to the value 201103L. (Before C++11, it was 199711L.)

Suffix return type syntax

Consider:

    template<class T, class U>
    ??? mul(T x, U y)
    {
        return x*y;
    }

What can we write as the return type? It’s “the type of x*y”, of course, but how can we say that? First idea, use decltype:

    template<class T, class U>
    decltype(x*y) mul(T x, U y) // scope problem!
    {
        return x*y;
    }

That won’t work because x and y are not in scope. However, we can write:

    template<class T, class U>
    decltype(*(T*)(0)**(U*)(0)) mul(T x, U y)   // ugly! and error prone
    {
        return x*y;
    }

However, calling that “not pretty” would be overly polite.

The solution is to put the return type where it belongs, after the arguments:

    template<class T, class U>
    auto mul(T x, U y) -> decltype(x*y)
    {
        return x*y;
    }

We use the notation auto to mean “return type to be deduced or specified later.”

The suffix syntax is not primarily about templates and type deduction; it is really about scope.

    struct List {
        struct Link { /* ... */ };
        Link* erase(Link* p);   // remove p and return the link before p
        // ...
    };

    List::Link* List::erase(Link* p) { /* ... */ }

The first List:: is necessary only because the scope of List isn’t entered until the second List::. Better:

    auto List::erase(Link* p) -> Link* { /* ... */ }

Now neither Link needs explicit qualification.

Preventing narrowing

The problem: C and C++ implicitly truncate:

    int x = 7.3;        // Ouch!
    void f(int);
    f(7.3);         // Ouch!

However, in C++11, {} initialization doesn’t narrow:

    int x0 {7.3};   // error: narrowing
    int x1 = {7.3}; // error: narrowing
    double d = 7;
    int x2{d};      // error: narrowing (double to int)
    char x3{7};     // ok: even though 7 is an int, this is not narrowing
    vector<int> vi = { 1, 2.3, 4, 5.6 };    // error: double to int narrowing

C++11 avoids a lot of incompatibilities by relying, when it can, on the actual value of an initializer (such as 7 in the example above) and not just on its type when deciding what is a narrowing conversion. If a value can be represented exactly as the target type, the conversion is not narrowing.

    char c1{7};      // OK: 7 is an int, but it fits in a char
    char c2{77777};  // error: narrowing (assuming 8-bit chars)

Note that floating-point to integer conversions are always considered narrowing – even 7.0 to 7.

Right-angle brackets

Consider

    list<vector<string>> lvs;

In C++98 this is a syntax error because there is no space between the two >s. C++11 recognizes such two >s as a correct termination of two template argument lists.

Why was this ever a problem? A compiler front-end is organized into passes/stages. This is about the simplest model:

  • lexical analysis (make up tokens from characters)
  • syntax analysis (check the grammar)
  • type checking (find the type of names and expressions)

These stages are in theory and sometimes in practice strictly separate, so the lexical analyzer that determines that >> is a token (usually meaning right-shift or input) has no idea of its meaning; in particular, it has no idea of templates or nested template argument lists. However, to get that example “correct” the three stages must somehow cooperate. The key observation that led to the problem being resolved was that every C++ compiler already did understand the problem so that it could give decent error messages.

static_assert compile-time assertions

A static (compile time) assertion consists of a constant expression and a string literal:

    static_assert(expression,string);

The compiler evaluates the expression and writes the string as an error message if the expression is false (i.e., if the assertion failed). For example:

    static_assert(sizeof(long)>=8, "64-bit code generation required for this library.");
    struct S { X m1; Y m2; };
    static_assert(sizeof(S)==sizeof(X)+sizeof(Y),"unexpected padding in S");

A static_assert can be useful to make assumptions about a program and its treatment by a compiler explicit. Note that since static_assert is evaluated at compile time, it cannot be used to check assumptions that depend on run-time values. For example:

    int f(int* p, int n)
    {
        static_assert(p==0,"p is not null");    // error: static_assert() expression not a constant expression
        // ...
    }

(Instead, use a normal assert(p==0 && "p is not null"); or test and throw an exception in case of failure.)

Raw string literals

In many cases, such as when you are writing regular expressions for the use with the standard regex library, the fact that a backslash (\) is an escape character is a real nuisance, because in regular expressions backslash is used to introduce special characters representing character classes. Consider how to write the pattern representing two words separated by a backslash (\w\\\w):

    string s = "\\w\\\\\\w";    // I hope I got that right

Note that the backslash character is represented as two backslashes in a regular expression. Basically, a “raw string literal” is a string literal where a backslash is just a backslash so that our example becomes:

    string s = R"(\w\\\w)"; // I'm pretty sure I got that right

The original proposal for raw strings presents this as a motivating example

    "('(?:[^\\\\']|\\\\.)*'|\"(?:[^\\\\\"]|\\\\.)*\")|" // Are the five backslashes correct or not?
                            // Even experts become easily confused. 

The R"(...)" notation is a bit more verbose than the “plain” "..." but “something more” is necessary when you don’t have an escape character: How do you put a quote in a raw string? Easy, unless it is preceded by a ):

    R"("quoted string")"    // the string is "quoted string"

So, how do we get the character sequence )" into a raw string? Fortunately, that’s a rare problem, but "(...)" is only the default delimiter pair. We can add delimiters before and after the (...) in "(...)". For example

    R"***("quoted string containing the usual terminator (")")***"  // the string is "quoted string containing the usual terminator (")"

The character sequence after ) must be identical to the sequence before the (. This way we can cope with (almost) arbitrarily complicated patterns.

The initial R of a raw string can be preceded by an encoding-prefix: u8, u, U, or L. For example u8R"(fdfdfa)" is a UTF-8 string literal.

Attributes

“Attributes” is a new standard syntax aimed at providing some order in the mess of facilities for adding optional and/or vendor specific information into source code (e.g. __attribute__, __declspec, and #pragma). C++11 attributes differ from existing syntaxes by being applicable essentially everywhere in code and always relating to the immediately preceding syntactic entity. For example:

    void f [[noreturn]] ()  // f() will never return
    {
        throw "error";  // OK
    }

    struct foo* f [[carries_dependency]] (int i);   // hint to optimizer
    int* g(int* x, int* y [[carries_dependency]]);

As you can see, an attribute is placed within double square brackets: [[...]]. noreturn and carries_dependency are the two attributes defined in the C++11 standard.

There is a reasonable fear that attributes will be used to create language dialects. The recommendation is to use attributes to only control things that do not affect the meaning of a program but might help detect errors (e.g. noreturn) or help optimizers (e.g. carries_dependency).

One planned use for attributes is improved support for OpenMP. For example:

    for [[omp::parallel()]] (int i=0; i<v.size(); ++i) {
        // ...
    }

(Note that this very example again illustrates the concern that attributes will be (mis)used to hide language extensions dressed up as [[keywords]] … the semantics of a parallel loop are decidedly not the same as a sequential loop.)

As shown, attributes can be qualified.

Alignment

Occasionally, especially when we are writing code that manipulates raw memory, we need to specify a desired alignment for some allocation. For example:

    alignas(double) unsigned char c[1024];   // array of characters, suitably aligned for doubles
    alignas(16) char buf[100];      // align on 16 byte boundary

There is also an alignof operator that returns the alignment of its argument (which must be a type). For example

    constexpr int n = alignof(int);     // ints are aligned on n byte boundaries

C99 features

To preserve a high degree of compatibility, a few minor changes to the language were introduced in collaboration with the C standards committee:

  • long long.
  • Extended integral types (i.e. rules for optional longer int types).
  • UCN changes [N2170==07-0030] “lift the prohibitions on control and basic source universal character names within character and string literals.”
  • concatenation of narrow/wide strings.
  • Not VLAs (Variable Length Arrays), a better version of which is being considered for standardization.

Some extensions of the preprocessing rules were added:

  • __func__, a predefined function-local identifier (strictly speaking not a macro) that holds the name of the lexically current function
  • __STDC_HOSTED__
  • _Pragma: _Pragma( X ) expands to #pragma X
  • vararg macros (macros that accept a variable number of arguments), for example:
    #define report(test, ...) ((test)?puts(#test):printf(__VA_ARGS__))
  • empty macro arguments

A lot of standard library facilities were inherited from C99 (essentially all changes to the C99 library from its C89 predecessor).