Miscellaneous Technical Issues
What is a function object?
An object that in some way behaves like a function, of course. Typically, that would mean an object of a class that defines the application operator – operator().
A function object is a more general concept than a function because a function object can have state that persists across several calls (like a static local variable) and can be initialized and examined from outside the object (unlike a static local variable). For example:
class Sum {
int val;
public:
Sum(int i) :val(i) { }
operator int() const { return val; } // extract value
int operator()(int i) { return val+=i; } // application
};
void f(vector<int> v)
{
Sum s = 0; // initial value 0
s = for_each(v.begin(), v.end(), s); // gather the sum of all elements
cout << "the sum is " << s << "\n";
// or even:
cout << "the sum is " << for_each(v.begin(), v.end(), Sum(0)) << "\n";
}
Note that a function object with an inline application operator inlines beautifully because there are no pointers involved that might confuse optimizers. To contrast: current optimizers are rarely (never?) able to inline a call through a pointer to function.
Function objects are extensively used to provide flexibility in the standard library.
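As a rough illustration of how a standard algorithm accepts a function object, here is a small sketch that hands a stateful predicate to std::count_if. The names GreaterThan and countAbove are made up for this example; they are not part of the library.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical predicate: remembers a threshold between calls.
class GreaterThan {
    int limit;
public:
    explicit GreaterThan(int l) : limit(l) { }
    bool operator()(int x) const { return x > limit; }  // application
};

// Counts the elements of v that exceed limit.
inline std::ptrdiff_t countAbove(const std::vector<int>& v, int limit)
{
    return std::count_if(v.begin(), v.end(), GreaterThan(limit));
}
```

The threshold lives in the object, so the same class serves any limit without a global variable or an extra parameter on the algorithm.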
How do I convert a value (a number, for example) to a std::string?
Call to_string.
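A minimal sketch of that call, assuming a C++11 or later compiler. The function name describe is invented for the example. Note that the exact digits std::to_string produces for a double depend on the language version (historically six digits after the decimal point, as with printf's %f), so the comment below does not promise a particular tail.

```cpp
#include <string>

// Builds a message from an int and a double using std::to_string.
inline std::string describe(int count, double ratio)
{
    // The int part is exact; the double part is formatted by the library
    // (e.g. "1.500000" with the traditional %f-style rule).
    return "count=" + std::to_string(count) + ", ratio=" + std::to_string(ratio);
}
```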
For advanced and corner-case uses that aren’t covered by that answer, read on…
There are two easy ways to do this: you can use the <cstdio> facilities or the <iostream> library. In general, you should prefer the <iostream> library.
The <iostream> library allows you to convert pretty much anything to a std::string using the following syntax (the example converts a double, but you could substitute pretty much anything that prints using the << operator):
// File: convert.h
#include <iostream>
#include <sstream>
#include <string>
#include <stdexcept>
class BadConversion : public std::runtime_error {
public:
BadConversion(const std::string& s)
: std::runtime_error(s)
{ }
};
inline std::string stringify(double x)
{
std::ostringstream o;
if (!(o << x))
throw BadConversion("stringify(double)");
return o.str();
}
The std::ostringstream object o offers formatting facilities just like those for std::cout. You can use manipulators and format flags to control the formatting of the result, just as you can for std::cout.
In this example, we insert x into o via the overloaded insertion operator, <<. This invokes the iostream formatting facilities to convert x into a std::string. The if test makes sure the conversion works correctly; it should always succeed for built-in/intrinsic types, but the if test is good style.
The expression o.str() returns the std::string that contains whatever has been inserted into stream o, in this case the string value of x.
Here’s how to use the stringify() function:
#include "convert.h"
void myCode()
{
double x = /*...*/ ;
// ...
std::string s = "the value is " + stringify(x);
// ...
}
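The remark about manipulators can be made concrete with a small variant of stringify(). This is only a sketch: the name stringifyFixed and its digits parameter are assumptions made for the example, and it relies on the default "C" locale.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical variant of stringify(): fixed notation, caller-chosen precision.
inline std::string stringifyFixed(double x, int digits)
{
    std::ostringstream o;
    o << std::fixed << std::setprecision(digits) << x;  // same manipulators as for std::cout
    return o.str();
}
```

Because the manipulators are applied to the local stream o, they do not leak into std::cout or any other stream.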
How do I convert a std::string to a number?
Call stoi.
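Here is one way that call might be wrapped, as a sketch rather than a recommendation. The helper name parseInt and its bool-returning interface are assumptions made for the example; std::stoi itself reports failure by throwing std::invalid_argument or std::out_of_range, and it can report how many characters it consumed.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: parses s as an int, rejecting leftover characters.
// Returns true and stores the value in result on success.
inline bool parseInt(const std::string& s, int& result)
{
    try {
        std::size_t used = 0;
        int value = std::stoi(s, &used);   // throws if no digits or out of range
        if (used != s.size())
            return false;                  // e.g. "12abc"
        result = value;
        return true;
    } catch (const std::invalid_argument&) {
        return false;                      // no conversion could be performed
    } catch (const std::out_of_range&) {
        return false;                      // value does not fit in an int
    }
}
```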
For advanced and corner-case uses that aren’t covered by that answer, read on…
There are two easy ways to do this: you can use the <cstdio> facilities or the <iostream> library. In general, you should prefer the <iostream> library.
The <iostream> library allows you to convert a std::string to pretty much anything using the following syntax (the example converts a double, but you could substitute pretty much anything that can be read using the >> operator):
// File: convert.h
#include <iostream>
#include <sstream>
#include <string>
#include <stdexcept>
class BadConversion : public std::runtime_error {
public:
BadConversion(const std::string& s)
: std::runtime_error(s)
{ }
};
inline double convertToDouble(const std::string& s)
{
std::istringstream i(s);
double x;
if (!(i >> x))
throw BadConversion("convertToDouble(\"" + s + "\")");
return x;
}
The std::istringstream object i offers formatting facilities just like those for std::cin. You can use manipulators and format flags to control how the input is parsed, just as you can for std::cin.
In this example, we initialize the std::istringstream i passing the std::string s (for example, s might be the string "123.456"), then we extract i into x via the overloaded extraction operator, >>. This invokes the iostream formatting facilities to convert as much of the string as possible/appropriate based on the type of x.
The if test makes sure the conversion works correctly. For example, if the string starts with characters that are inappropriate for the type of x, the if test will fail.
Here’s how to use the convertToDouble() function:
#include "convert.h"
void myCode()
{
std::string s = /*...a string representation of a number...*/ ;
// ...
double x = convertToDouble(s);
// ...
}
You probably want to enhance convertToDouble() so it optionally checks that there aren’t any left-over characters:
inline double convertToDouble(const std::string& s,
bool failIfLeftoverChars = true)
{
std::istringstream i(s);
double x;
char c;
if (!(i >> x) || (failIfLeftoverChars && i.get(c)))
throw BadConversion("convertToDouble(\"" + s + "\")");
return x;
}
Can I templatize the above functions so they work with other types?
Yes, for any types that support iostream-style input/output.
For example, suppose you want to convert an object of class Foo to a std::string, or perhaps the reverse: from a std::string to a Foo. You could write a whole family of conversion functions based on the ones shown in the previous FAQs, or you could write a template function so the compiler does the grunt work.
For example, to convert an arbitrary type T to a std::string, provided T supports syntax like std::cout << x, you can use this:
// File: convert.h
#include <iostream>
#include <sstream>
#include <string>
#include <typeinfo>
#include <stdexcept>
class BadConversion : public std::runtime_error {
public:
BadConversion(const std::string& s)
: std::runtime_error(s)
{ }
};
template<typename T>
inline std::string stringify(const T& x)
{
std::ostringstream o;
if (!(o << x))
throw BadConversion(std::string("stringify(")
+ typeid(x).name() + ")");
return o.str();
}
Here’s how to use the stringify() function:
#include "convert.h"
void myCode()
{
Foo x;
// ...
std::string s = "this is a Foo: " + stringify(x);
// ...
}
You can also convert from any type that supports iostream input by adding this to file convert.h:
template<typename T>
inline void convert(const std::string& s, T& x,
bool failIfLeftoverChars = true)
{
std::istringstream i(s);
char c;
if (!(i >> x) || (failIfLeftoverChars && i.get(c)))
throw BadConversion(s);
}
Here’s how to use the convert() function:
#include "convert.h"
void myCode()
{
std::string s = /*...a string representation of a Foo...*/ ;
// ...
Foo x;
convert(s, x);
// ...
// ...code that uses x...
}
To simplify your code, particularly for light-weight easy-to-copy types, you probably want to add a return-by-value conversion function to file convert.h:
template<typename T>
inline T convertTo(const std::string& s,
bool failIfLeftoverChars = true)
{
T x;
convert(s, x, failIfLeftoverChars);
return x;
}
This simplifies your “usage” code somewhat. You call it by explicitly specifying the template parameter T:
#include "convert.h"
void myCode()
{
std::string a = /*...string representation of an int...*/ ;
std::string b = /*...string representation of an int...*/ ;
// ...
if (convertTo<int>(a) < convertTo<int>(b))
/*...*/ ;
}
Why do my compiles take so long?
You may have a problem with your compiler. It may be old, you may have it installed wrongly, or your computer might be an antique. I can’t help you with such problems.
However, it is more likely that the program that you are trying to compile is poorly designed, so that compiling it involves the compiler examining hundreds of header files and tens of thousands of lines of code. In principle, this can be avoided. If this problem is in your library vendor’s design, there isn’t much you can do (except changing to a better library/vendor), but you can structure your own code to minimize re-compilation after changes. Designs that do that are typically better, more maintainable, designs because they exhibit better separation of concerns.
Consider a classical example of an object-oriented program:
class Shape {
public: // interface to users of Shapes
virtual void draw() const;
virtual void rotate(int degrees);
// ...
protected: // common data (for implementers of Shapes)
Point center;
Color col;
// ...
};
class Circle : public Shape {
public:
void draw() const;
void rotate(int) { }
// ...
protected:
int radius;
// ...
};
class Triangle : public Shape {
public:
void draw() const;
void rotate(int);
// ...
protected:
Point a, b, c;
// ...
};
The idea is that users manipulate shapes through Shape’s public interface, and that implementers of derived classes (such as Circle and Triangle) share aspects of the implementation represented by the protected members.
There are three serious problems with this apparently simple idea:
- It is not easy to define shared aspects of the implementation that are helpful to all derived classes. For that reason, the set of protected members is likely to need changes far more often than the public interface. For example, even though “center” is arguably a valid concept for all Shapes, it is a nuisance to have to maintain a point “center” for a Triangle – for triangles, it makes more sense to calculate the center if and only if someone expresses interest in it.
- The protected members are likely to depend on “implementation” details that the users of Shapes would rather not have to depend on. For example, much (most?) code using a Shape will be logically independent of the definition of a “color”, yet the presence of Color in the definition of Shape will probably require compilation of header files defining the operating system’s notion of color.
- When something in the protected part changes, users of Shape have to recompile – even though only implementers of derived classes have access to the protected members.
Thus, the presence of “information helpful to implementers” in the base class that also acts as the interface to users is the source of instability in the implementation, spurious recompilation of user code (when implementation information changes), and excess inclusion of header files into user code (because the “information helpful to implementers” needs those headers). This is sometimes known as the “brittle base class problem.”
The obvious solution is to omit the “information helpful to implementers” for classes that are used as interfaces to users. That is, to make interfaces pure interfaces; in other words, to represent interfaces as abstract classes:
class Shape {
public: // interface to users of Shapes
virtual void draw() const = 0;
virtual void rotate(int degrees) = 0;
virtual Point center() const = 0;
// ...
// no data
};
class Circle : public Shape {
public:
void draw() const;
void rotate(int) { }
Point center() const { return cent; }
// ...
protected:
Point cent;
Color col;
int radius;
// ...
};
class Triangle : public Shape {
public:
void draw() const;
void rotate(int);
Point center() const;
// ...
protected:
Color col;
Point a, b, c;
// ...
};
The users are now insulated from changes to implementations of derived classes. I have seen this technique decrease build times by orders of magnitude.
But what if there really is some information that is common to all derived classes (or simply to several derived classes)? Simply make that information a class and derive the implementation classes from that also:
class Shape {
public: // interface to users of Shapes
virtual void draw() const = 0;
virtual void rotate(int degrees) = 0;
virtual Point center() const = 0;
// ...
// no data
};
struct Common {
Color col;
// ...
};
class Circle : public Shape, protected Common {
public:
void draw() const;
void rotate(int) { }
Point center() const { return cent; }
// ...
protected:
Point cent;
int radius;
};
class Triangle : public Shape, protected Common {
public:
void draw() const;
void rotate(int);
Point center() const;
// ...
protected:
Point a, b, c;
};
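To show what “insulated” means for user code, here is a cut-down, compilable sketch. Several things in it are assumptions made for the example: the two-int Point, the virtual destructor (added so shapes can be destroyed through a Shape pointer), the helper sumCenterX, and the trimmed-down Circle and Triangle. In a real project the user function would see only the header that defines Shape.

```cpp
#include <memory>
#include <vector>

struct Point { int x; int y; };        // stand-in for the FAQ's Point

class Shape {                          // pure interface, no data
public:
    virtual ~Shape() { }
    virtual Point center() const = 0;
};

// User code: written against Shape only; it never names Circle or Triangle.
inline int sumCenterX(const std::vector<std::unique_ptr<Shape>>& shapes)
{
    int total = 0;
    for (const auto& s : shapes)
        total += s->center().x;
    return total;
}

// Implementer code: changing these members does not touch sumCenterX().
class Circle : public Shape {
public:
    Circle(Point c, int r) : cent(c), radius(r) { }
    Point center() const override { return cent; }
    int getRadius() const { return radius; }
private:
    Point cent;
    int radius;
};

class Triangle : public Shape {
public:
    Triangle(Point p, Point q, Point r) : a(p), b(q), c(r) { }
    Point center() const override      // computed only when someone asks
    {
        Point m = { (a.x + b.x + c.x) / 3, (a.y + b.y + c.y) / 3 };
        return m;
    }
private:
    Point a, b, c;
};
```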
What should be done with macros that contain if?
Ideally you’ll get rid of the macro. Macros are evil in 4 different ways: evil#1, evil#2, evil#3, and evil#4, regardless of whether they contain an if (but they’re especially evil if they contain an if).
Nonetheless, even though macros are evil, sometimes they are the lesser of the other evils. When that happens, read this FAQ so you know how to make them “less bad,” then hold your nose and do what’s practical.
Here’s a naive solution:
// (Bad)
#define MYMACRO(a,b) \
    if (xyzzy) asdf()
This will cause big problems if someone uses that macro in an if statement:
if (whatever)
MYMACRO(foo,bar);
else
baz;
The problem is that the else baz nests with the wrong if: the compiler sees this:
if (whatever)
if (xyzzy) asdf();
else baz;
Obviously that’s a bug.
The easy solution is to require {...} everywhere, but there’s another solution that I prefer even if there’s a coding standard that requires {...} everywhere (just in case someone somewhere forgets): add a balancing else to the macro definition:
// (Good)
#define MYMACRO(a,b) \
    if (xyzzy) asdf(); \
    else (void)0
(The (void)0 causes the compiler to generate an error message if you forget to put the ; after the ‘call’.)
Your usage of that macro might look like this:
if (whatever)
    MYMACRO(foo,bar);   // This ; closes off the else (void)0 part
else
    baz;
which will get expanded into a balanced set of ifs and elses:
if (whatever)
    if (xyzzy)
        asdf();
    else
        (void)0;        // A do-nothing statement
else
    baz;
Like I said, I personally do the above even when the coding standard calls for {...} in all the ifs. Call me paranoid, but I sleep better at night and my code has fewer bugs.
There is another approach that old-line C programmers will remember:
// (Okay)
#define MYMACRO(a,b) \
    do { \
        if (xyzzy) asdf(); \
    } while (false)
Some people prefer the do {...} while (false) approach, though if you choose to use that, be aware that it might cause your compiler to generate less efficient code. Both approaches cause the compiler to give you an error message if you forget the ; after MYMACRO(foo,bar).
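If you want to convince yourself that the balanced form pairs each else with the right if, a tiny experiment along these lines should do it. The flag, the counters, and the function useMacro are invented so the effect can be observed; the macro itself is the “good” definition from this FAQ.

```cpp
// Hypothetical flag and counters, present only to make the expansion observable.
bool xyzzy = false;
int asdfCalls = 0;
int bazCalls = 0;

void asdf() { ++asdfCalls; }
void baz()  { ++bazCalls; }

// Balanced form: the macro carries its own else.
#define MYMACRO(a,b) \
    if (xyzzy) asdf(); \
    else (void)0

void useMacro(bool whatever)
{
    if (whatever)
        MYMACRO(foo, bar);   // arguments are unused by this macro body
    else
        baz();               // pairs with "if (whatever)", as intended
}
```

With the naive definition, the call useMacro(false) would do nothing at all, because the else baz() would have attached itself to the inner if.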
What should be done with macros that have multiple lines?
Avoid macros wherever possible. But yes, sometimes you need to use them anyway, and when you do, read this to learn some safe ways to write a macro that has multiple statements.
Here’s a naive solution:
// (Bad)
#define MYMACRO(a,b) \
    statement1; \
    statement2; \
    /*...*/ \
    statementN;
This can cause problems if someone uses the macro in a context that demands a single statement. E.g.,
while (whatever)
MYMACRO(foo, bar);
The naive solution is to wrap the statements inside {...}, such as this:
// (Bad)
#define MYMACRO(a,b) \
    { \
        statement1; \
        statement2; \
        /*...*/ \
        statementN; \
    }
But this will cause compile-time errors with things like the following:
if (whatever)
MYMACRO(foo, bar);
else
baz;
…since the compiler will see:
if (whatever)
{
    statement1;
    statement2;
    // ...
    statementN;
} ; else        // Compile-time error: the extra ; ends the if, so this else has no if
    baz;
One solution is to use a do { <statements go here> } while (false) pseudo-loop. This executes the body of the “loop” exactly once. The macro might look like this:
// (Okay)
#define MYMACRO(a, b) \
    do { \
        statement1; \
        statement2; \
        /*...*/ \
        statementN; \
    } while (false)     // Intentionally not adding a ; here!
The ; gets added by the macro’s user, such as:
if (whatever)
    MYMACRO(foo, bar);  // The user of MYMACRO() adds the ; here
else
    baz;
After expansion, the compiler will see this:
if (whatever)
    do {
        statement1;
        statement2;
        // ...
        statementN;
    } while (false);    // This ; comes from the user's code, not from MYMACRO() itself
else
    baz;
There is an unlikely but possible downside to the above approach: historically some C++ compilers have refused to inline-expand any function containing a loop. If your C++ compiler has that limitation, it will not inline-expand any function that uses MYMACRO(). Chances are this won’t be a problem, either because you don’t use MYMACRO() in any inline functions, or because your compiler (subject to all its other constraints) is willing to inline-expand functions containing loops. However, if you are concerned, do some tests with your compiler: examine the resulting assembly code and/or perform a few simple timing tests.
If you have any problems with your compiler’s willingness to inline-expand functions containing loops, you can change MYMACRO()’s definition to if (true) {…} else (void)0:
#define MYMACRO(a, b) \
    if (true) { \
        statement1; \
        statement2; \
        /*...*/ \
        statementN; \
    } else \
        (void)0         // Intentionally not adding a ; here!
After expansion, the compiler will see a balanced set of ifs and elses:
if (whatever)
    if (true) {
        statement1;
        statement2;
        // ...
        statementN;
    } else
        (void)0;        // A do-nothing statement
else
    baz;
The (void)0 in the macro definition forces users to remember the ; after any usage of the macro. If you forgot the ; like this…
foo();
MYMACRO(a, b)           // Whoops, forgot the ; here
bar();
baz();
…then after expansion the compiler would see this:
foo();
if (true) {
    statement1;
    statement2;
    /*...*/
    statementN;
} else
    (void)0 bar();      // Fortunately(!) this will produce a compile-time error-message
baz();
Even though the specific error message is likely to be confusing, it will at least cause the programmer to notice that something is wrong. That’s a lot better than the alternative: without the (void)0 in the MYMACRO() definition, the compiler would silently generate the wrong code: bar() would never be called, since it would erroneously be on the unreachable else branch of the if.
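For a concrete feel of the do {...} while (false) form, here is a small sketch with real statements in place of statement1 … statementN. The macro name ADD_AND_COUNT, the two counters, and the function accumulate are all hypothetical.

```cpp
int total = 0;   // hypothetical state touched by the macro
int calls = 0;

// Two statements wrapped so the macro behaves as a single statement.
#define ADD_AND_COUNT(n) \
    do { \
        total += (n); \
        ++calls; \
    } while (false)

void accumulate(int limit)
{
    for (int i = 1; i <= limit; ++i)
        ADD_AND_COUNT(i);        // both statements run on every iteration
    if (limit < 0)
        ADD_AND_COUNT(limit);
    else
        ADD_AND_COUNT(0);        // the else still pairs with "if (limit < 0)"
}
```

With the first naive definition, only total += (n) would be inside the for loop and ++calls would run once afterwards; with the brace-only definition, the if/else would not compile.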
What should be done with macros that need to paste two tokens together?
Groan. I really hate macros. Yes they’re useful sometimes, and yes I use them. But I always wash my hands afterwards. Twice. Macros are evil in 4 different ways: evil#1, evil#2, evil#3, and evil#4.
Here we go again, desperately trying to make an inherently evil thing a little less evil.
First, the basic approach is to use the ISO/ANSI C and ISO/ANSI C++ “token pasting” feature: ##. On the surface this would look like the following:
Suppose you have a macro called “MYMACRO”, and suppose you’re passing a token as the parameter of that macro, and suppose you want to concatenate that token with the token “Tmp” to create a variable name. For example, the use of MYMACRO(Foo) would create a variable named FooTmp and the use of MYMACRO(Bar) would create a variable named BarTmp. In this case the naive approach would be to say this:
#define MYMACRO(a) \
/*...*/ a ## Tmp /*...*/
However you need a double layer of indirection when you use ##. Basically you need to create a special macro for “token pasting” such as:
#define NAME2(a,b) NAME2_HIDDEN(a,b)
#define NAME2_HIDDEN(a,b) a ## b
Trust me on this: you really need to do this! (And please nobody write me saying it sometimes works without the second layer of indirection. Try concatenating a symbol with __LINE__ and see what happens then.)
Then replace your use of a ## Tmp with NAME2(a,Tmp):
#define MYMACRO(a) \
/*...*/ NAME2(a,Tmp) /*...*/
And if you have a three-way concatenation to do (e.g., to paste three tokens together), you’d create a NAME3() macro like this:
#define NAME3(a,b,c) NAME3_HIDDEN(a,b,c)
#define NAME3_HIDDEN(a,b,c) a ## b ## c
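If you’d like to see the __LINE__ experiment without taking my word for it, the sketch below turns the pasted token into a string so it can be inspected. PASTE_DIRECT, STRINGIZE, STRINGIZE_HIDDEN, and the two functions are helpers invented for this example; NAME2 and NAME2_HIDDEN are the macros shown above.

```cpp
#include <string>

#define NAME2(a,b)         NAME2_HIDDEN(a,b)
#define NAME2_HIDDEN(a,b)  a ## b

// Naive one-level paste, for comparison.
#define PASTE_DIRECT(a,b)  a ## b

// Helpers (not from the FAQ) that turn the pasted token into a string
// so the result can be inspected.
#define STRINGIZE(x)         STRINGIZE_HIDDEN(x)
#define STRINGIZE_HIDDEN(x)  #x

// With the extra layer, __LINE__ is replaced by the line number before pasting,
// giving something like "var17".
inline std::string pastedWithIndirection() { return STRINGIZE(NAME2(var, __LINE__)); }

// Without it, ## sees the unexpanded token and yields the identifier var__LINE__.
inline std::string pastedDirectly() { return STRINGIZE(PASTE_DIRECT(var, __LINE__)); }
```

The reason is that a macro argument is not expanded when it is an operand of ##, so the outer macro exists only to get the arguments expanded before the inner one pastes them.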
Why can’t the compiler find my header file in #include "c:\test.h"?
Because "\t" is a tab character.
You should use forward slashes ("/") rather than backslashes ("\") in your #include filenames, even on operating systems that use backslashes such as DOS, Windows, OS/2, etc. For example:
#if 1
#include "/version/next/alpha/beta/test.h" // RIGHT!
#else
#include "\version\next\alpha\beta\test.h" // WRONG!
#endif
Note that you should use forward slashes ("/") on all your filenames, not just on your #include files.
Note that your particular compiler might not treat a backslash within a header-name the same as it treats a backslash within a string literal. For instance, your particular compiler might treat #include "foo\bar\baz" as if the '\' chars were quoted. This is because header names and string literals are different: your compiler will always parse backslashes in string literals in the usual way, with '\t' becoming a tab character, etc., but it might not parse header names using those same rules. In any case, you still shouldn’t use backslashes in your header names since there’s something to lose but nothing to gain.
What are the C++ scoping rules for for loops?
Loop variables declared in the for statement proper are local to the loop body.
The following code used to be legal, but not any more, since i’s scope is now inside the for loop only:
for (int i = 0; i < 10; ++i) {
// ...
if ( /* something weird */ )
break;
// ...
}
if (i != 10) {
// We exited the loop early; handle this situation separately
// ...
}
If you’re working with some old code that uses a for loop variable after the for loop, the compiler will (hopefully!) give you a warning or an error message such as “Variable i is not in scope”.
Unfortunately there are cases when old code will compile cleanly, but will do something different: the wrong thing. For example, if the old code has a global variable i, the above code if (i != 10) silently changes in meaning from the for loop variable i under the old rule to the global variable i under the current rule. This is not good. If you’re concerned, you should check with your compiler to see if it has some option that forces it to use the old rules with your old code.
Note: You should avoid having the same variable name in nested scopes, such as a global i and a local i. In fact, you should avoid globals altogether whenever you can. If you abided by these coding standards in your old code, you won’t be hurt by a lot of things, including the scoping rules for for loop variables.
Note: If your new code might get compiled with an old compiler, you might want to put {...} around the for loop to force even old compilers to scope the loop variable to the loop. And please try to avoid the temptation to use macros for this. Remember: macros are evil in 4 different ways: evil#1, evil#2, evil#3, and evil#4.
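When the code after the loop really does need the loop variable, one straightforward fix is to declare it before the for statement. The following is a sketch of that idea; the function firstNegative and its “return the size when nothing is found” convention are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of the first negative element, or v.size() if there is none.
inline std::size_t firstNegative(const std::vector<int>& v)
{
    std::size_t i = 0;              // declared before the loop, so it outlives it
    for (; i < v.size(); ++i) {
        if (v[i] < 0)
            break;                  // early exit
    }
    return i;                       // still in scope here under both old and new rules
}
```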
Why can’t I overload a function by its return type?
If you declare both char f() and float f(), the compiler gives you an error message, since calling simply f() would be ambiguous.
What is “persistence”? What is a “persistent object”?
A persistent object can live after the program which created it has stopped. Persistent objects can even outlive different versions of the creating program, can outlive the disk system, the operating system, or even the hardware on which the OS was running when they were created.
The challenge with persistent objects is to effectively store their member function code out on secondary storage along with their data bits (and the data bits and member function code of all member objects, and of all their member objects and base classes, etc). This is non-trivial when you have to do it yourself. In C++, you have to do it yourself. C++/OO databases can help hide the mechanism for all this.
How can I create two classes that both know about each other?
Use a forward declaration.
Sometimes you must create two classes that use each other. This is called a circular dependency. For example:
class Fred {
public:
Barney* foo(); // Error: Unknown symbol 'Barney'
};
class Barney {
public:
Fred* bar();
};
The Fred class has a member function that returns a Barney*, and the Barney class has a member function that returns a Fred*. You may inform the compiler about the existence of a class or structure by using a “forward declaration”:
class Barney;
This line must appear before the declaration of class Fred. It simply informs the compiler that the name Barney is a class, and further it is a promise to the compiler that you will eventually supply a complete definition of that class.
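Putting the pieces together, a complete version of the two classes might look like the sketch below. The constructors, the setPartner() function, and the partner data members are additions made so the example compiles and can be exercised; only foo(), bar(), and the forward declaration come from the text above.

```cpp
class Barney;                 // forward declaration: "Barney is a class"

class Fred {
public:
    explicit Fred(Barney* b) : partner(b) { }
    Barney* foo() { return partner; }   // fine: only a pointer is used
private:
    Barney* partner;
};

class Barney {
public:
    Barney() : partner(nullptr) { }
    void setPartner(Fred* f) { partner = f; }
    Fred* bar() { return partner; }
private:
    Fred* partner;
};
```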
What special considerations are needed when forward declarations are used with member objects?
The order of class declarations is critical.
The compiler will give you a compile-time error if the first class contains an object (as opposed to a pointer to an object) of the second class. For example,
class Fred; // Okay: forward declaration
class Barney {
Fred x; // Error: The declaration of Fred is incomplete
};
class Fred {
Barney* y;
};
One way to solve this problem is to reverse the order of the classes so the “used” class is defined before the class that uses it:
class Barney; // Okay: forward declaration
class Fred {
Barney* y; // Okay: the first can point to an object of the second
};
class Barney {
Fred x; // Okay: the second can have an object of the first
};
Note that it is never legal for each class to fully contain an object of the other class since that would imply infinitely large objects. In other words, if an instance of Fred contains a Barney (as opposed to a Barney*), and a Barney contains a Fred (as opposed to a Fred*), the compiler will give you an error.
What special considerations are needed when forward declarations are used with inline functions?
The order of class declarations is critical.
The compiler will give you a compile-time error if the first class contains an inline function that invokes a member function of the second class. For example,
class Fred; // Okay: forward declaration
class Barney {
public:
void method()
{
x->yabbaDabbaDo(); // Error: Fred used before it was defined
}
private:
Fred* x; // Okay: the first can point to an object of the second
};
class Fred {
public:
void yabbaDabbaDo();
private:
Barney* y;
};
There are a number of ways to work around this problem. One workaround would be to define Barney::method() with the keyword inline below the definition of class Fred (though still within the header file). Another would be to define Barney::method() without the keyword inline in file Barney.cpp. A third would be to use nested classes. A fourth would be to reverse the order of the classes so the “used” class is defined before the class that uses it:
class Barney; // Okay: forward declaration
class Fred {
public:
void yabbaDabbaDo();
private:
Barney* y; // Okay: the first can point to an object of the second
};
class Barney {
public:
void method()
{
x->yabbaDabbaDo(); // Okay: Fred is fully defined at this point
}
private:
Fred* x;
};
Just remember this: Whenever you use a forward declaration, you can use only that symbol; you may not do anything that requires knowledge of the forward-declared class. Specifically you may not access any members of the second class.
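The first workaround mentioned above (keep the class order, but move the inline definition below class Fred) could be sketched like this. The constructors and the public calls counter are additions made purely so the example can be compiled and checked.

```cpp
class Fred;                        // forward declaration

class Barney {
public:
    explicit Barney(Fred* f) : x(f) { }
    void method();                 // declared only; the body needs Fred's definition
private:
    Fred* x;
};

class Fred {
public:
    Fred() : calls(0) { }
    void yabbaDabbaDo() { ++calls; }
    int calls;                     // public only so the example can be checked
};

// Fred is complete here, so the member access compiles.
inline void Barney::method()
{
    x->yabbaDabbaDo();
}
```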
Why can’t I put a forward-declared class in a std::vector<>?
Because the std::vector<> template needs to know the sizeof() its contained elements, plus the std::vector<> probably accesses members of the contained elements (such as the copy constructor, the destructor, etc.). For example,
class Fred; // Okay: forward declaration
class Barney {
std::vector<Fred> x; // Error: the declaration of Fred is incomplete
};
class Fred {
Barney* y;
};
One solution to this problem is to change Barney so it uses a std::vector<> of Fred pointers (raw pointers or smart pointers such as unique_ptr or shared_ptr) rather than a std::vector<> of Fred objects:
class Fred; // Okay: forward declaration
class Barney {
std::vector<std::unique_ptr<Fred>> x; // Okay: Barney can use Fred pointers
};
class Fred {
Barney* y;
};
Another solution to this problem is to reverse the order of the classes so Fred is defined before Barney:
class Barney; // Okay: forward declaration
class Fred {
Barney* y; // Okay: the first can point to an object of the second
};
class Barney {
std::vector<Fred> x; // Okay: Fred is fully defined at this point
};
Just remember this: Whenever you use a class as a template parameter, the declaration of that class must be complete and not simply forward declared.
Why do some people think x = ++y + y++ is bad?
Because it’s undefined behavior, which means the runtime system is allowed to do weird or even bizarre things.
The C++ language says you cannot modify a variable more than once between sequence points. Quoth the standard (section 5, paragraph 4):
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.
What’s the value of i++ + i++?
It’s undefined. Basically, in C and C++, if you read a variable twice in an expression where you also write it, the result is undefined. Don’t do that. Another example is:
v[i] = i++;
Related example:
f(v[i],i++);
Here, the result is undefined because the order of evaluation of function arguments is undefined.
Having the order of evaluation undefined is claimed to yield better performing code. Compilers could warn about such examples, which are typically subtle bugs (or potential subtle bugs). It’s disappointing that after decades, most compilers still don’t warn, leaving that job to specialized, separate, and underused tools.
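The cure is to split the expression so each variable is modified at most once per statement. The sketch below shows one plausible reading of what such code was meant to do; the function names are hypothetical, and the “intended meaning” in the comments is my guess at the author’s intent, not something the language defines.

```cpp
#include <cstddef>
#include <vector>

// One plausible intent of "v[i] = i++": store the old i, then advance.
inline void storeThenAdvance(std::vector<int>& v, std::size_t& i)
{
    v[i] = static_cast<int>(i);   // i is only read here
    ++i;                          // the single modification, in its own statement
}

// One plausible intent of "i++ + i++": add two successive values of i.
inline int sumOfTwoSuccessive(int& i)
{
    int first = i++;              // i is modified once per statement
    int second = i++;
    return first + second;
}
```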
What’s the deal with “sequence points”?
Note: The C++11 standard has expressed the same rules as below in a different way. It no longer refers to “sequence points,” but the effects should be the same as described below.
The C++98 standard said (1.9p7):
At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.
For example, if an expression contains the subexpression y++, then the variable y will be incremented by the next sequence point. Furthermore if the expression just after the sequence point contains the subexpression ++z, then z will not have yet been incremented at the moment the sequence point is reached.
The “certain specified points” that are called sequence points are (section and paragraph numbers are from the standard):
- the semicolon (1.9p16)
- the non-overloaded comma-operator (1.9p18)
- the non-overloaded || operator (1.9p18)
- the non-overloaded && operator (1.9p18)
- the ternary ?: operator (1.9p18)
- after evaluation of all a function’s parameters but before the first expression within the function is executed (1.9p17)
- after a function’s returned object has been copied back to the caller, but before the code just after the call has yet been evaluated (1.9p17)
- after the initialization of each base and member (12.6.2p3)
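Two of these sequence points are relied on in everyday code, usually without anyone naming them. The sketch below illustrates the built-in && and comma operators; the Node struct and both function names are invented for the example.

```cpp
struct Node { int value; };

// && sequences its left operand before its right one (and skips the right one
// when the left is false), so p is tested before it is dereferenced.
inline bool isPositive(const Node* p)
{
    return p != nullptr && p->value > 0;
}

// The built-in comma operator finishes ++i before i * 2 is evaluated.
inline int incrementThenDouble(int& i)
{
    int j = (++i, i * 2);
    return j;
}
```

Note that the list above says “non-overloaded” for a reason: an overloaded operator&& or operator, is an ordinary function call and does not short-circuit.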