Inline Functions
What’s the deal with inline functions?
When the compiler inline-expands a function call, the function’s code gets inserted into the caller’s code stream
(conceptually similar to what happens with a #define macro). This can, depending on a zillion other things, improve
performance, because the optimizer can procedurally integrate the called code, that is, optimize the called code into
the caller.
There are several ways to designate that a function is inline, some of which involve the inline keyword and some of
which do not. No matter how you designate a function as inline, it is a request that the compiler is allowed to ignore:
the compiler might inline-expand some, all, or none of the places where you call a function designated as inline.
(Don’t get discouraged if that seems hopelessly vague. The flexibility of the above is actually a huge advantage: it
lets the compiler treat large functions differently from small ones, plus it lets the compiler generate code that is
easy to debug if you select the right compiler options.)
What’s a simple example of procedural integration?
Consider the following call to function g():
void f()
{
    int x = /*...*/;
    int y = /*...*/;
    int z = /*...*/;
    // ...code that uses x, y and z...
    g(x, y, z);
    // ...more code that uses x, y and z...
}
Assuming a typical C++ implementation that has registers and a stack, the registers and parameters get written to the
stack just before the call to g(), then the parameters get read from the stack inside g() and read again to restore
the registers while g() returns to f(). But that’s a lot of unnecessary reading and writing, especially in cases
when the compiler is able to use registers for variables x, y and z: each variable could get written twice (as a
register and also as a parameter) and read twice (when used within g() and to restore the registers during the return
to f()).
void g(int x, int y, int z)
{
    // ...code that uses x, y and z...
}
If the compiler inline-expands the call to g(), all those memory operations could vanish. The registers wouldn’t need
to get written or read since there wouldn’t be a function call, and the parameters wouldn’t need to get written or read
since the optimizer would know they’re already in registers.
Naturally your mileage may vary, and there are a zillion variables that are outside the scope of this particular FAQ, but the above serves as an example of the sorts of things that can happen with procedural integration.
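As a rough, hand-written illustration of the idea, the sketch below shows a call to g() next to what its inline-expanded equivalent might conceptually look like. Real compilers do this on an internal representation rather than on source text, and the body of g() and the names fWithCall() and fAfterInlining() are hypothetical, chosen only so the example compiles.

```cpp
// Hypothetical body for g(); the FAQ text leaves it unspecified.
inline int g(int x, int y, int z)
{
    return x * y + z;
}

// Original form: f() calls g(), so the arguments have to be passed somehow
// (stack or registers, depending on the implementation).
int fWithCall(int a)
{
    int x = a;
    int y = a + 1;
    int z = a + 2;
    return g(x, y, z) + x;
}

// Conceptual result of inline expansion: g()'s body appears in the caller,
// so there is no call and no parameter passing left to pay for.
int fAfterInlining(int a)
{
    int x = a;
    int y = a + 1;
    int z = a + 2;
    int resultOfG = x * y + z;   // body of g() substituted in place
    return resultOfG + x;
}
```

Both functions compute the same value; whether a given compiler actually produces the second form from the first depends on its settings and heuristics.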
Do inline functions improve performance?
Yes and no. Sometimes. Maybe.
There are no simple answers. inline functions might make the code faster, they might make it slower. They might make
the executable larger, they might make it smaller. They might cause thrashing, they might prevent thrashing. And they
might be, and often are, totally irrelevant to speed.
- inline functions might make it faster: As shown above, procedural integration might remove a bunch of unnecessary instructions, which might make things run faster.
- inline functions might make it slower: Too much inlining might cause code bloat, which might cause “thrashing” on demand-paged virtual-memory systems. In other words, if the executable size is too big, the system might spend most of its time going out to disk to fetch the next chunk of code.
- inline functions might make it larger: This is the notion of code bloat, as described above. For example, if a system has 100 inline functions each of which expands to 100 bytes of executable code and is called in 100 places, that’s an increase of 1MB. Is that 1MB going to cause problems? Who knows, but it is possible that that last 1MB could cause the system to “thrash,” and that could slow things down.
- inline functions might make it smaller: The compiler often generates more code to push/pop registers/parameters than it would by inline-expanding the function’s body. This happens with very small functions, and it also happens with large functions when the optimizer is able to remove a lot of redundant code through procedural integration, that is, when the optimizer is able to make the large function small.
- inline functions might cause thrashing: Inlining might increase the size of the binary executable, and that might cause thrashing.
- inline functions might prevent thrashing: The working set size (number of pages that need to be in memory at once) might go down even if the executable size goes up. When f() calls g(), the code is often on two distinct pages; when the compiler procedurally integrates the code of g() into f(), the code is often on the same page.
- inline functions might increase the number of cache misses: Inlining might cause an inner loop to span across multiple lines of the memory cache, and that might cause thrashing of the memory-cache.
- inline functions might decrease the number of cache misses: Inlining usually improves locality of reference within the binary code, which might decrease the number of cache lines needed to store the code of an inner loop. This ultimately could cause a CPU-bound application to run faster.
- inline functions might be irrelevant to speed: Most systems are not CPU-bound. Most systems are I/O-bound, database-bound or network-bound, meaning the bottleneck in the system’s overall performance is the file system, the database or the network. Unless your “CPU meter” is pegged at 100%, inline functions probably won’t make your system faster. (Even in CPU-bound systems, inline will help only when used within the bottleneck itself, and the bottleneck is typically in only a small percentage of the code.)
- There are no simple answers: You have to play with it to see what is best. Do not settle for simplistic answers like, “Never use inline functions” or “Always use inline functions” or “Use inline functions if and only if the function is less than N lines of code.” These one-size-fits-all rules may be easy to write down, but they will produce sub-optimal results.
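“Playing with it” in practice means measuring. The following is only a minimal sketch of a timing helper built on std::chrono; the names elapsedMicroseconds(), addOne(), runLoop() and sink are hypothetical, and a real measurement would need repeated runs, realistic workloads, and builds with and without the inline designation to compare.

```cpp
#include <chrono>

// Hypothetical helper: runs work() once and returns the elapsed
// wall-clock time in microseconds.
template <typename Work>
long long elapsedMicroseconds(Work work)
{
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Hypothetical small function whose inline designation is being evaluated.
inline int addOne(int i)
{
    return i + 1;
}

// Writing the result to a volatile discourages the optimizer from
// deleting the loop outright (it does not guarantee a fair benchmark).
volatile int sink = 0;

int runLoop(int iterations)
{
    int total = 0;
    for (int i = 0; i < iterations; ++i)
        total = addOne(total);
    sink = total;
    return total;
}
```

A caller might time the loop with elapsedMicroseconds([] { runLoop(1000000); }) and compare the numbers across builds; the absolute values will vary from machine to machine.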
How can inline functions help with the tradeoff of safety vs. speed?
In straight C, you can achieve “encapsulated structs” by putting a void* in a struct, in which case the void* points to
the real data that is unknown to users of the struct. Therefore users of the struct don’t know how to interpret the
stuff pointed to by the void*, but the access functions cast the void* to the appropriate hidden type. This gives a
form of encapsulation.
Unfortunately it forfeits type safety, and also imposes a function call to access even trivial fields of the struct
(if you allowed direct access to the struct’s fields, anyone and everyone would be able to get direct access since
they would of necessity know how to interpret the stuff pointed to by the void*; this would make it difficult to
change the underlying data structure).
Function call overhead is small, but can add up. C++ classes allow function calls to be expanded inline. This lets
you have the safety of encapsulation along with the speed of direct access. Furthermore the parameter types of these
inline functions are checked by the compiler, an improvement over C’s #define macros.
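A minimal sketch of the C++ side of this tradeoff is shown below. The class Point and its members are hypothetical names used only for illustration: the data stays private, yet the accessors are candidates for inline expansion, so a call such as p.x() may compile down to a plain field read.

```cpp
// Hypothetical class used only for illustration.
class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}

    // Defined in the class body, so these accessors are implicitly inline.
    // The compiler is allowed (not required) to expand calls to them in place.
    int x() const { return x_; }
    int y() const { return y_; }
    void setX(int x) { x_ = x; }   // the argument type is checked by the compiler

private:
    int x_;   // callers cannot touch these directly, so the
    int y_;   // representation can change without breaking them
};

// Uses only the public interface; with inlining, the accessor calls
// need not cost more than direct field access.
int sumOfCoordinates(const Point& p)
{
    return p.x() + p.y();
}
```

Unlike the void* approach, nothing here is cast, so misuse of the hidden data is caught at compile time rather than at run time.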
Why should I use inline functions instead of plain old #define macros?
Because #define macros are evil in 4 different ways: evil#1, evil#2, evil#3, and evil#4. Sometimes you
should use them anyway, but they’re still evil.
Unlike #define macros, inline functions avoid infamous macro errors since inline functions always evaluate every
argument exactly once. In other words, invoking an inline function is semantically just like invoking a regular
function, only faster:
// A macro that returns the absolute value of i
#define unsafe(i) \
        ( (i) >= 0 ? (i) : -(i) )
// An inline function that returns the absolute value of i
inline
int safe(int i)
{
    return i >= 0 ? i : -i;
}
int f();
void userCode(int x)
{
    int ans;
    ans = unsafe(x++);   // Error! x is incremented twice
    ans = unsafe(f());   // Danger! f() is called twice
    ans = safe(x++);     // Correct! x is incremented once
    ans = safe(f());     // Correct! f() is called once
}
Also unlike macros, argument types are checked, and necessary conversions are performed correctly.
Macros are bad for your health; don’t use them unless you have to.
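The double evaluation can be observed directly by counting calls. The sketch below reuses the unsafe() macro and safe() function from the example above; the counter callCount and the helpers next(), callsViaMacro() and callsViaInlineFunction() are hypothetical names added for illustration.

```cpp
// The macro and the inline function from the FAQ example.
#define unsafe(i) \
        ( (i) >= 0 ? (i) : -(i) )

inline int safe(int i)
{
    return i >= 0 ? i : -i;
}

// Hypothetical counter: records how many times next() has been called.
int callCount = 0;

int next()
{
    ++callCount;
    return -5;
}

// Counts how often next() runs when it is passed through the macro.
// The macro pastes the argument text into the condition and into the
// chosen branch, so next() is evaluated in both places.
int callsViaMacro()
{
    callCount = 0;
    int ans = unsafe(next());
    (void)ans;
    return callCount;
}

// Counts how often next() runs when it is passed to the inline function.
// The argument is evaluated once, before the function body runs.
int callsViaInlineFunction()
{
    callCount = 0;
    int ans = safe(next());
    (void)ans;
    return callCount;
}
```

With these definitions, callsViaMacro() reports two calls and callsViaInlineFunction() reports one, regardless of whether the compiler actually inline-expands safe().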
How do you tell the compiler to make a non-member function inline?
When you declare an inline function, it looks just like a normal function:
void f(int i, char c);
But when you define an inline function, you prepend the function’s definition with the keyword inline, and you put
the definition into a header file:
inline
void f(int i, char c)
{
    // ...
}
Note: It’s imperative that the function’s definition (the part between the {...}) be placed in a header file, unless
the function is used only in a single .cpp file. In particular, if you put the inline function’s definition into a
.cpp file and you call it from some other .cpp file, you’ll get an “unresolved external” error from the linker.
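The layout described in the note can be sketched as follows. To keep the example compilable as one piece, three files are shown in a single listing, separated by comments; the file names util.h, a.cpp and b.cpp and the functions twice(), fromA() and fromB() are hypothetical.

```cpp
// ---- util.h (hypothetical header) ----
#ifndef UTIL_H
#define UTIL_H

int twice(int i);         // declaration: looks like any other function

inline
int twice(int i)          // definition: marked inline and kept in the header
{
    return 2 * i;
}

#endif

// ---- a.cpp (hypothetical): would begin with #include "util.h" ----
int fromA()
{
    return twice(20);     // the definition is visible here via the header
}

// ---- b.cpp (hypothetical): would also begin with #include "util.h" ----
int fromB()
{
    return twice(1);      // and here, so the linker has nothing left to resolve
}
```

Because every .cpp file that calls twice() sees the full definition through the header, no file depends on another file’s object code for it.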
How do you tell the compiler to make a member function inline?
The declaration of an inline member function looks just like the declaration of a non-inline member function:
class Fred {
public:
    void f(int i, char c);
};
But when you define an inline member function (the {...} part), you prepend the member function’s definition with
the keyword inline, and you (almost always) put the definition into a header file:
inline
void Fred::f(int i, char c)
{
    // ...
}
The reason you (almost always) put the definition (the {...} part) of an inline function in a header file is to
avoid “unresolved external” errors from the linker. That error will occur if you put the inline function’s definition
in a .cpp file and if that function is called from some other .cpp file.
Is there another way to tell the compiler to make a member function inline?
Yep: define the member function in the class body itself:
class Fred {
public:
    void f(int i, char c)
    {
        // ...
    }
};
This is often more convenient than the alternative of defining your inline functions outside the class
body. However, although it is easier on the person who writes the class, it is harder on all the
readers since it mixes what a class does (the external behavior) with how it does it (the implementation). Because
of this mixture, you should define all your member functions outside the class body if your class
is intended to be highly reused and your class’s documentation is the header file itself. This is another application of
Spock’s logic: the needs of the many (all the people reusing your class) outweigh the needs of the few (those who
maintain your class’s implementation) or the one (the class’s original author).
Of course if you are not writing a highly reused class, or if you are providing documentation of your class’s external
behavior outside the header files (e.g., HTML or PDF or whatever), then you should probably define your inline
functions inside the class body proper, as that will simplify your development as well as maintenance of the class’s
implementation.
This approach is further exploited in the next FAQ.
With inline member functions that are defined outside the class, is it best to put the inline keyword next to the
declaration within the class body, next to the definition outside the class body, or both?
Definition only.
Here is an example of an inline member function defined outside the class body:
class Foo {
public:
    void method();   // Best practice: Don't put the inline keyword here
    // ...
};
inline void Foo::method()   // Best practice: Put the inline keyword here
{
    // ...
}
Recall that you should define your inline member function outside the class body when your
class is intended to be highly reused and your reusers will read your header file to determine what the class does,
that is, its observable semantics or external behavior. In that case…
- The public: part of the class body is where you describe the observable semantics of the class, its public member functions, its friend functions, and any other features of the class to be reused by others. The goal is to keep this public: part public, that is, to drain the public: part of any inklings of anything that is unimportant to reusers. If “it” can’t be observed from the caller’s code, “it” shouldn’t be in the public: part of the class body.
- The other parts of the class, including the non-public: parts of the class body, the definitions of your member and friend functions, etc. are pure implementation. Try not to describe any observable semantics that were not already described in the class’s public: part. If “it” can be observed from the caller’s code, “it” should be described in the public: part of the class body; “it” might also appear in the non-public: parts of the class, but “it” should be specified, somehow, in the public: part.
From a practical standpoint, this separation makes life easier and safer for your class’s reusers. Say Chuck wants to
simply use your reusable class. Because you read this FAQ and used the above separation, Chuck will see, in the
public: part of your class, everything he needs to see and nothing he doesn’t need to see. Your class’s public:
part will be Chuck’s one-stop-shop for your class’s observable semantics AKA external behavior. By purifying your
class’s public: parts, you made Chuck’s life both easier (he needs to look in only one spot) and safer (his pure mind
isn’t polluted by implementation minutiae).
Back to inline-ness: the decision of whether a function is or is not inline is an implementation detail that does
not change the observable semantics (the “meaning”) of a call. Therefore the inline keyword should not go within the
class’s public: (or the protected: or private:) part, so it needs to go next to the function’s definition.
NOTE: most people use the terms “declaration” and “definition” to differentiate the above two places. For example,
they might say, “Should I put the inline keyword next to the declaration or the definition?” In case you’re talking to
a language lawyer, it would be more precise to talk about a non-defining declaration and a defining declaration,
since definitions are also declarations.