Preface

The Evolution of C++: from classical to modern

Since its inception as "C with classes", C++ has experienced numerous significant revisions and improvements. The language is now standardized by ISO JTC1/SC22/WG21, a working group composed of C++ experts from various countries. The first standardized version of C++ was ISO/IEC 14882:1998, commonly known as C++98. The next edition, ISO/IEC 14882:2003, was a minor revision that addressed issues found in C++98.

The true revolution of C++ arrived with ISO/IEC 14882:2011, known as C++11 and, during its development, as C++0x. Officially released in 2011, it had been delayed longer than originally planned, leading developers to joke about the delay by dubbing it C++0B, with the hexadecimal B representing the release year. C++11 is considered a watershed moment in the language's evolution, marking the transition from classical to modern C++. It introduced many important additions to both the core language and the standard library, including rvalue references/move semantics, auto type deduction, uniform initialization syntax using {} lists, lambdas, variadic templates, extended SFINAE rules, and various smart pointer classes, among other valuable features for crafting robust C++ programs.

A small extension to C++11 was introduced in ISO/IEC 14882:2014 (C++14). This was followed by another major revision, ISO/IEC 14882:2017 (C++17), which added notable features like the std::any, std::variant, and std::optional classes to the standard library.

C++20, i.e., ISO/IEC 14882:2020, was officially published on 15 December 2020, representing the latest major revision. The most welcomed core language features of C++20 include concepts for constraining generic types, modules for expressing the physical structure of a program, and coroutines for non-preemptive multitasking. Among the many new standard library features, the ranges library is particularly exciting, as it enables functional programming with "pipeable" functions similar to F#, my favorite .NET language.

Given the impact and changes brought about by C++11/14/17/20, it's clear that pre-2011 C++ and post-2011 C++ are fundamentally different languages. This distinction is reflected in the terms "Classical C++" represented by C++98 and "Modern C++" represented by C++11 and later. Learning the reimagined modern C++ as a new language is necessary, whether it's approached with enthusiasm or apprehension.

C++ was designed with backward compatibility to C, allowing developers to use C-style programming constructs such as raw pointers, arrays, and null-terminated strings. As C++ has evolved, the focus has shifted towards reducing the reliance on C-style idioms and sticking to the "zero overhead" principle. Modern C++ is simpler, safer, more elegant, and retains its speed.

Who this book is for

This book expects readers to have a basic knowledge of C++ and a genuine interest in developing their skills in modern C++. Most chapters are beginner-friendly, while some need extra focus. Advanced template metaprogramming topics may require multiple readings but can be skipped initially. Beginners should refer to other C++ books for fundamental guidance.

What this book covers

This book focuses on helping readers learn and understand new C++11 to C++20 features. Where necessary, it also explains how new features are implemented in compilers.

Fundamental Data Types

The fundamental types in C++ include integer types, character types, and floating-point types. These types are considered fundamental because they are built into the language itself and can be used to create more complex data structures and objects. Additionally, they are the building blocks for other C++ data types, such as arrays, structures, and classes.

The following table lists the type specifiers of the fundamental data types in C++.

Character Types	Integer Types	Floating-Point Types
`char`	`bool`	`float`
`wchar_t`	`short`	`double`
`char16_t`	`int`	`long double`
`char32_t`	`long`
`char8_t`	`long long`
	`unsigned short`
	`unsigned int`
	`unsigned long`
	`unsigned long long`
	`signed char`
	`unsigned char`

`void`

void is considered a fundamental type in C++. It represents the absence of a value and is used as a placeholder in function signatures and pointer declarations. It cannot be used to declare variables because it has no size or storage, but it is an important part of the C++ language and is often used in conjunction with other data types.

`bool`

bool is considered an integer type in C++, but it is often treated as a separate category due to its Boolean semantics.

`signed char` and `unsigned char`

In C++, char is a distinct type used to represent individual characters in a text string. It is not one of the standard signed or unsigned integer types, but it is an integral type: each character is stored as an integer code (for example, its ASCII value), so char values can take part in integer calculations.

Applying signed or unsigned to char creates a type for small integers: unsigned char typically holds values from 0 to 255, and signed char typically holds values from -128 to 127. Therefore, signed char and unsigned char are both considered integer types.

Note that char is a distinct type from signed char and unsigned char, and it is not guaranteed to be signed or unsigned. The signedness of char is implementation-defined, and it can vary depending on the platform and the compiler.
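The relationship between the three character types can be checked at compile time with type traits. The following is a minimal sketch; the helper name plain_char_is_signed is made up for this illustration and is not a standard facility.

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// char, signed char and unsigned char are three distinct types,
// even though char shares its representation with one of the other two.
static_assert(!std::is_same<char, signed char>::value,
              "char and signed char are distinct types");
static_assert(!std::is_same<char, unsigned char>::value,
              "char and unsigned char are distinct types");
static_assert(sizeof(char) == sizeof(signed char) &&
              sizeof(char) == sizeof(unsigned char),
              "all three occupy exactly one byte");

// Hypothetical helper: reports whether plain char is signed on this platform.
constexpr bool plain_char_is_signed() {
    return std::is_signed<char>::value;
}
```

The static assertions hold on every conforming compiler, while the value returned by plain_char_is_signed() depends on the platform and on compiler options.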

Type qualifiers and cv-correctness

Type specifiers can be combined with type qualifiers. In C++, there are two type qualifiers: const and volatile.

const indicates that a variable's value cannot be modified after it has been initialized.
volatile indicates that a variable's value can be modified by external factors such as hardware or other processes. Sometimes, volatile is applied to a variable to prevent compiler optimization.

CV-correctness is a programming concept in C++ that involves using the const and volatile type qualifiers to ensure that functions and data members behave correctly in the presence of const and volatile objects.

For example, a member function that does not modify the state of the object it operates on should be declared const. This ensures that the function can be called on const objects, and that it does not modify the state of the object.

class Example {
public:
    // Declared const because it does not modify the object state
    int getValue() const; 
private:
    int value_;
};

int Example::getValue() const {
    return value_;
}

A member variable can also be declared const if it should not be modified in any case:

class Example {
public:
    Example(int value) : value_(value) {}
    int getValue() const {
        // Cannot be modified because getValue is const
        return value_; 
    }
private:
    // Declared const to ensure it cannot be modified
    const int value_; 
};

The volatile qualifier can be applied to variables that can be changed by external factors, such as hardware or other processes. This ensures that the compiler does not optimize away accesses to the variable, which could cause incorrect behavior.

volatile int* ptr; // Pointer to a volatile int

Using CV-correctness can help prevent errors and improve code safety by ensuring that functions and data members behave correctly in the presence of const and volatile objects.

`mutable`

In C++, mutable is a storage class specifier that can be applied to a non-static data member so that the member can be modified even if the containing object is const. This is useful when the member represents a cache or temporary value that does not affect the observable state of the object.

class Example {
public:
    int getValue() const {
        // Marked const, so it cannot modify any non-mutable members.
        // However, it can modify mutable members such as cachedValue_.
        if (cachedValue_ == 0) {
            cachedValue_ = someExpensiveCalculation();
        }
        return cachedValue_;
    }

private:
    // Placeholder for a costly computation; assumed to return a non-zero value.
    static int someExpensiveCalculation() { return 42; }

    // Declared mutable to allow modification even 
    // if Example object is const. Zero means "not computed yet".
    mutable int cachedValue_ = 0;
};

In this example, cachedValue_ is declared as mutable, which allows it to be modified even if the containing object is declared const. The getValue() function is declared const, which means it cannot modify any non-mutable members of the Example object, but it can modify the mutable member cachedValue_.

Integer Types

Common integer types

C++ supports several integer types with varying sizes and ranges. Here is a list of the most commonly used integer types in C++, available since the earlier versions of the language. Note that char is listed here for practical reasons: it is an integral type, although it is not one of the standard signed or unsigned integer types.

Type name	Typical Size (in bytes)	Range
`bool`	1	Boolean literal `true` or `false`, added in C++98
`char`	1	[-128, 127] or [0, 255] depending on signedness
`short`	2	[-32,768, 32,767]
`int`	4	[-2,147,483,648, 2,147,483,647]
`long`	4 or 8	[-2,147,483,648, 2,147,483,647] or [-9,223,372,036,854,775,808, 9,223,372,036,854,775,807] depending on platform
`long long`	8	[-9,223,372,036,854,775,808, 9,223,372,036,854,775,807]
`unsigned char`	1	[0, 255]
`unsigned short`	2	[0, 65,535]
`unsigned int`	4	[0, 4,294,967,295]
`unsigned long`	4 or 8	[0, 4,294,967,295] or [0, 18,446,744,073,709,551,615] depending on platform
`unsigned long long`	8	[0, 18,446,744,073,709,551,615]

The C++ standard does not fix the exact size of these integer types; it only imposes the following constraints:

sizeof(char)      == 1                  // Rule 1
sizeof(char)      <= sizeof(short)      // Rule 2
sizeof(short)     <= sizeof(int)        // Rule 3
sizeof(int)       <= sizeof(long)       // Rule 4
sizeof(long)      <= sizeof(long long)  // Rule 5
sizeof(char)      *  CHAR_BIT >= 8      // Rule 6
sizeof(short)     *  CHAR_BIT >= 16     // Rule 7
sizeof(int)       *  CHAR_BIT >= 16     // Rule 8
sizeof(long)      *  CHAR_BIT >= 32     // Rule 9
sizeof(long long) *  CHAR_BIT >= 64     // Rule 10

CHAR_BIT represents the number of bits in a char, that is, in a byte. Most modern architectures use 8 bits per byte, but this is not always the case: some older machines used 9-bit bytes, and some digital signal processors use 16-bit or 32-bit bytes. A byte can never be narrower than 8 bits (Rule 6). Under Rule 4, C/C++ allows long and int to have the same size, but long must be at least 32 bits according to Rule 9.
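The rules above can be written down as compile-time checks. This is only a sketch: bit_width_of is a hypothetical helper introduced for this example, and it reports the storage width of a type, including any padding bits.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helper: storage width of T in bits.
template <typename T>
constexpr std::size_t bit_width_of() {
    return sizeof(T) * CHAR_BIT;
}

static_assert(sizeof(char) == 1, "Rule 1");
static_assert(sizeof(char) <= sizeof(short), "Rule 2");
static_assert(sizeof(short) <= sizeof(int), "Rule 3");
static_assert(sizeof(int) <= sizeof(long), "Rule 4");
static_assert(sizeof(long) <= sizeof(long long), "Rule 5");
static_assert(bit_width_of<char>() >= 8, "Rule 6");
static_assert(bit_width_of<short>() >= 16, "Rule 7");
static_assert(bit_width_of<int>() >= 16, "Rule 8");
static_assert(bit_width_of<long>() >= 32, "Rule 9");
static_assert(bit_width_of<long long>() >= 64, "Rule 10");
```

Because every assertion restates a guarantee of the standard, the file should compile on any conforming implementation; the actual widths can then be inspected with bit_width_of<int>() and similar calls.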

Fixed size integer types

The C++11 standard added the <cstdint> header, which provides fixed-width integer types such as int8_t, int16_t, int32_t, and int64_t, as well as their unsigned counterparts, uint8_t, uint16_t, uint32_t, and uint64_t. These type aliases are optional: an implementation provides them only if it has an integer type of exactly that width, which is the case on virtually all mainstream platforms.

The following table summarizes fixed size integer types - note that the intN_t and uintN_t types, where they are provided, have exactly N bits, where N is 8, 16, 32, or 64.

Type	Size (in bytes)	Range
`int8_t`	1	[-128, 127]
`uint8_t`	1	[0, 255]
`int16_t`	2	[-32,768, 32,767]
`uint16_t`	2	[0, 65,535]
`int32_t`	4	[-2,147,483,648, 2,147,483,647]
`uint32_t`	4	[0, 4,294,967,295]
`int64_t`	8	[-9,223,372,036,854,775,808, 9,223,372,036,854,775,807]
`uint64_t`	8	[0, 18,446,744,073,709,551,615]
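Fixed-width types are most useful when the exact layout of a value matters, for example when assembling a value from individual bytes. The sketch below assumes the platform provides the exact-width types; pack_be is a hypothetical helper written for this example.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <limits>

// The widths and ranges in the table can be verified at compile time.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t has exactly 32 bits");
static_assert(std::numeric_limits<std::int16_t>::max() == 32767, "int16_t maximum");
static_assert(std::numeric_limits<std::uint8_t>::max() == 255, "uint8_t maximum");

// Hypothetical helper: packs four bytes into one 32-bit value,
// most significant byte first.
constexpr std::uint32_t pack_be(std::uint8_t b0, std::uint8_t b1,
                                std::uint8_t b2, std::uint8_t b3) {
    return (static_cast<std::uint32_t>(b0) << 24) |
           (static_cast<std::uint32_t>(b1) << 16) |
           (static_cast<std::uint32_t>(b2) << 8) |
           static_cast<std::uint32_t>(b3);
}
```

Each byte is widened to std::uint32_t before shifting so that the shift never takes place in a narrower or signed type.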

128-bit integer types

The C++ standard does not define a 128-bit integer type, as of the latest version C++20.

However, some compilers and libraries provide extensions that define a 128-bit integer type. For example, the GCC and Clang compilers provide an __int128 type, which is a 128-bit signed integer type. The Boost Multiprecision library provides several integer types with arbitrary precision, including a boost::multiprecision::int128_t type.

Type name	Library/Compiler	Description
`__int128`	GCC, Clang	A 128-bit signed integer type
`unsigned __int128`	GCC, Clang	A 128-bit unsigned integer type
`int128_t`	Boost Multiprecision	A 128-bit signed integer type
`uint128_t`	Boost Multiprecision	A 128-bit unsigned integer type

It's important to note that the availability and behavior of non-standard integer types may vary depending on the platform and compiler used.
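Because these types are extensions, code that uses them is usually guarded by a feature test. The sketch below assumes that GCC and Clang define the __SIZEOF_INT128__ macro when __int128 is available, and falls back to plain 64-bit arithmetic otherwise; mul_high is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the upper 64 bits of the 128-bit product a * b.
inline std::uint64_t mul_high(std::uint64_t a, std::uint64_t b) {
#if defined(__SIZEOF_INT128__)
    // Compiler extension path (GCC, Clang).
    __extension__ typedef unsigned __int128 uint128_type;
    return static_cast<std::uint64_t>((static_cast<uint128_type>(a) * b) >> 64);
#else
    // Portable fallback: schoolbook multiplication on 32-bit halves.
    const std::uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    const std::uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;
    const std::uint64_t lo_lo = a_lo * b_lo;
    const std::uint64_t hi_lo = a_hi * b_lo;
    const std::uint64_t lo_hi = a_lo * b_hi;
    const std::uint64_t hi_hi = a_hi * b_hi;
    // Sum of the middle terms; this sum cannot overflow 64 bits.
    const std::uint64_t cross = (lo_lo >> 32) + (hi_lo & 0xFFFFFFFFu) + lo_hi;
    return hi_hi + (hi_lo >> 32) + (cross >> 32);
#endif
}
```

Both paths are meant to return the same result, so callers do not need to know whether the extension is present.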

Integer Type long long

History

Before long long was officially added to the C++ standard in 2011, C++ programmers had long been familiar with the long long integer type. It has been part of the C language since the C99 standard, and many major C++ compilers supported long long for compatibility with C.

As early as 1995, Roland Hartinger first proposed to add long long to C++. At the time, the C committee had not yet considered this type. As a result, the C++ committee was reluctant to add a fundamental type that was not also in C. After long long had been added to C99, Stephen Adamczyk proposed to reconsider its addition to C++ in 2005. Finally, long long was accepted as part of C++ in 2011, more than ten years after it was first included in the C standard.

Bit size

The C++ standard defines long long as an integer type that is at least 64 bits wide; it does not guarantee exactly 64 bits on every platform. The actual size depends on the architecture and the compiler, although on most modern platforms long long is a 64-bit type. Portable code should not assume the size: check it with the sizeof operator and CHAR_BIT, or query the range with std::numeric_limits<long long>.

Remember that in C++, long long is a signed data type, and its corresponding unsigned data type is unsigned long long. It's important to note that long long int and unsigned long long int have the same meaning as long long and unsigned long long, respectively, with the latter forms being shorthand for the former ones.

Literal suffix

The C++ standard defines LL and ULL as literal suffixes for long long and unsigned long long, respectively. When initializing a long long type variable, you can write it like this:

long long x = 65536LL;

The literal suffix LL can be omitted with the same result:

long long x = 65536;

When working with large integer values in C++, it is important to use literal suffixes to ensure that the code runs as intended. For example:

long long x = 65536 << 16; // Shift is done in int: overflows with 32-bit int (typically 0)
std::cout << "x = " << x << std::endl;
long long y = 65536LL << 16;
std::cout << "y = " << y << std::endl;

In long long x = 65536 << 16, the literal 65536 has type int, so the left shift is performed in int before the result is converted to long long. Where int is 32 bits wide, the mathematical result 4,294,967,296 does not fit: the shift overflows (undefined behavior before C++20, wrapping to 0 since C++20), and x does not receive the intended value.

To prevent the overflow, use the LL literal suffix so that the left operand, and therefore the shift itself, has type long long, as in long long y = 65536LL << 16. The result 4,294,967,296 then fits and is stored in y unchanged.
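The type of a literal can be inspected with decltype, which makes the effect of the suffix visible. The following sketch uses a hypothetical helper, shift_left_wide, whose parameter type forces the shift to be carried out in long long.

```cpp
#include <cassert>
#include <type_traits>

// The LL and ULL suffixes select the literal's type explicitly.
static_assert(std::is_same<decltype(65536LL), long long>::value,
              "LL yields long long");
static_assert(std::is_same<decltype(65536ULL), unsigned long long>::value,
              "ULL yields unsigned long long");

// Hypothetical helper: the parameter type guarantees that the shift
// is performed in long long, whatever the type of the argument was.
constexpr long long shift_left_wide(long long value, int bits) {
    return value << bits;
}
```

Calling shift_left_wide(65536, 16) converts the int argument to long long first, so the shift produces 4294967296 instead of overflowing.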

Numerical limits

The <climits> header provides the maximum and minimum values as macros, typically defined along these lines. We should avoid using macros as much as possible:

#define LLONG_MAX 9223372036854775807LL        // long long max value
#define LLONG_MIN (-9223372036854775807LL - 1) // long long min value
#define ULLONG_MAX 0xFFFFFFFFFFFFFFFFULL       // unsigned long long max value

Instead, we should use std::numeric_limits:

#include <climits>
#include <cstdio>
#include <iostream>
#include <limits>

int main(int argc, char *argv[])
{
    // Avoid these!
    std::cout << "LLONG_MAX = "  
            << LLONG_MAX  
            << std::endl;

    std::cout << "LLONG_MIN = "  
            << LLONG_MIN  
            << std::endl;

    std::cout << "ULLONG_MAX = " 
            << ULLONG_MAX 
            << std::endl;

    std::printf("LLONG_MAX  = %lld\n", LLONG_MAX);  // format specifier %lld
    std::printf("LLONG_MIN  = %lld\n", LLONG_MIN);  // format specifier %lld
    std::printf("ULLONG_MAX = %llu\n", ULLONG_MAX); // format specifier %llu

    // Use std::numeric_limits
    std::cout << "std::numeric_limits<long long>::max() = " 
            << std::numeric_limits<long long>::max() 
            << std::endl;

    std::cout << "std::numeric_limits<long long>::min() = "
            << std::numeric_limits<long long>::min()
            << std::endl;

    std::cout << "std::numeric_limits<unsigned long long>::max() = "
            << std::numeric_limits<unsigned long long>::max() 
            << std::endl;
}
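One practical advantage of std::numeric_limits over the macros is that it works in generic code, where the type is a template parameter and no macro name can be chosen in advance. The sketch below is one possible illustration for integer types; add_would_overflow is a hypothetical helper.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: reports whether a + b would overflow the integer type T.
// The check itself never overflows because it only subtracts in the safe direction.
template <typename T>
constexpr bool add_would_overflow(T a, T b) {
    return (b > 0) ? (a > std::numeric_limits<T>::max() - b)
                   : (a < std::numeric_limits<T>::min() - b);
}
```

The same template serves int, long long, and the other integer types, which would require a separate macro for each type otherwise.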

Character Types

In C++, char is never the same type as signed char, even on platforms where the two have the same representation and range.

The C++ standard defines char, signed char, and unsigned char as three distinct integral types, each with its own range of representable values. The C++ standard does not specify whether char is signed or unsigned by default, which means that it is implementation-defined.

On many platforms, such as x86, char is implemented as a signed type, and its range of representable values is the same as that of signed char. On others, such as Linux on ARM or PowerPC, char is implemented as an unsigned type, in which case it has the same range of representable values as unsigned char.

So, while char often behaves like signed char, this is not guaranteed by the standard, and the two are always distinct types. To ensure portability of code that relies on the signedness of char, it is recommended to use signed char or unsigned char explicitly.

Issue with `wchar_t`

wchar_t is a character type in C++ that is used to represent wide characters. It was introduced into C++ with the C++98 standard. Many Windows API functions have a wide character version that takes wchar_t strings as arguments. The wide character version of these functions has a suffix of W added to the function name. For example, the function CreateFile() in the Windows API has a wide character version named CreateFileW().

The C++ standard specifies that a string literal with an L prefix creates a wide character string literal.

#include <windows.h>

int main()
{
    LPCWSTR fileName = L"C:\\example\\test.txt";
    HANDLE hFile = CreateFileW(fileName, 
                               GENERIC_READ, 
                               FILE_SHARE_READ, 
                               NULL, 
                               OPEN_EXISTING, 
                               FILE_ATTRIBUTE_NORMAL, 
                               NULL);

    if (hFile == INVALID_HANDLE_VALUE) {
        // Handle error
        return 1;
    }
    // Do something with the file handle
    CloseHandle(hFile);
    return 0;
}

The issue with wchar_t is that its size is implementation-defined, which means that it can vary across different systems and compilers. The C++ standard does not specify the size of wchar_t, leaving it up to the implementation to decide. For example, on Windows systems, wchar_t is 16 bits (2 bytes), while on Unix-like systems, it is typically 32 bits (4 bytes).

This lack of standardization has led to portability issues when writing cross-platform code. Code that relies on wchar_t may not work as expected when compiled on a system with a different wchar_t size: a character outside the Basic Multilingual Plane takes two wchar_t units where wchar_t is 16 bits but only one where it is 32 bits, and wchar_t data written to a file or sent over a network on one system cannot be read back unchanged on the other.

To address this issue, the C++11 standard introduced new character types, char16_t and char32_t, which are at least 16 and 32 bits wide and are intended to hold UTF-16 and UTF-32 code units, respectively. These types are recommended for use in portable code, rather than wchar_t.
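The difference between the platforms can be observed directly. The sketch below uses the WCHAR_MAX macro from <cwchar>; the helper names wchar_bits and wchar_holds_any_code_point are made up for this example.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cwchar>

// The newer character types are wide enough for their code units everywhere.
static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "char16_t holds a UTF-16 code unit");
static_assert(sizeof(char32_t) * CHAR_BIT >= 32, "char32_t holds a UTF-32 code unit");

// Hypothetical helper: storage width of wchar_t in bits on this platform.
constexpr std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// Hypothetical helper: true if a single wchar_t can hold any Unicode
// code point (up to 0x10FFFF), false if wchar_t is too narrow.
constexpr bool wchar_holds_any_code_point() {
    return WCHAR_MAX >= 0x10FFFF;
}
```

With a 16-bit wchar_t the second helper returns false, which is a compact way to detect that wide strings may contain surrogate pairs.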

Character Sets and Encodings

Character set

A character set, also known as a character repertoire, is a collection of characters and symbols that are used to represent written language in computing. Each character in a character set is assigned a unique code point, which is a numerical value that represents that character in digital form.

Character sets can include characters from many different writing systems and languages, such as the Latin alphabet used in English, or the Chinese characters used in Mandarin Chinese. Some character sets are designed for specific languages or scripts, while others are designed to be universal and include characters from many different languages.

Examples of character sets include ASCII, which includes characters commonly used in the English language, and Unicode, which is a universal character set that can represent all characters used in modern computing, including characters from many different writing systems.

Code point

A code point is a numerical value that represents a single character or symbol in a character set. Each character in a character set is assigned a unique code point, which is a specific number that identifies that character.

Code points are typically expressed as hexadecimal numbers, which means that they use a base-16 numbering system. For example, the code point for the letter "A" in the ASCII character set is 0x41, while the code point for the Greek letter "α" in the Unicode character set is 0x03B1.

Unicode comprises 1,114,112 code points in the range [0, 1,114,111]. The maximum value of Unicode code point is 1,114,111 (0x10FFFF).
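The range can be expressed in a few constant expressions. In this sketch all names are hypothetical; it also separates out the surrogate code points 0xD800 to 0xDFFF, which are reserved for the UTF-16 encoding and never stand for a character on their own.

```cpp
#include <cassert>

constexpr char32_t kMaxCodePoint = 0x10FFFF;

static_assert(kMaxCodePoint + 1 == 1114112u, "Unicode has 1,114,112 code points");

// True for every value in the Unicode code space [0, 0x10FFFF].
constexpr bool is_code_point(char32_t c) {
    return c <= kMaxCodePoint;
}

// True for code points reserved for UTF-16 surrogate pairs.
constexpr bool is_surrogate(char32_t c) {
    return c >= 0xD800 && c <= 0xDFFF;
}

// True for code points that can be encoded as a character
// (called "Unicode scalar values" by the Unicode Standard).
constexpr bool is_scalar_value(char32_t c) {
    return is_code_point(c) && !is_surrogate(c);
}
```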

Encodings

Encoding involves mapping each code point to a specific sequence of bits or bytes that can be used to represent that character in digital form.

The Unicode standard defines a code space of 1,114,112 code points, not all of which are assigned to characters, and provides several encoding schemes for them: UTF-8 and UTF-16 represent a code point with a variable number of code units, while UTF-32 always uses a single fixed-size code unit.

UTF-8 encoding

UTF-8 is a variable-length encoding scheme. It works by mapping each Unicode code point to a sequence of 1 to 4 bytes, depending on the code point value.

Code Point Range	Number of Bytes	Binary Format
0 to 127	1 byte	`0xxx'xxxx`
128 to 2047	2 bytes	`110x'xxxx 10xx'xxxx`
2048 to 65535	3 bytes	`1110'xxxx 10xx'xxxx 10xx'xxxx`
65536 to 1114111	4 bytes	`1111'0xxx 10xx'xxxx 10xx'xxxx 10xx'xxxx`

Here's how UTF-8 encoding works:

If the code point value is between 0 and 127 (inclusive), the code point is represented as a single byte with the same value. This means that ASCII characters (which have code point values between 0 and 127) can be represented in UTF-8 encoding using a single byte.
If the code point value is between 128 and 2047 (inclusive), the code point is represented as 2 bytes. The first byte starts with the binary value 110, followed by 5 bits that represent the most significant bits of the code point value. The second byte starts with the binary value 10, followed by 6 bits that represent the least significant bits of the code point value.
If the code point value is between 2048 and 65535 (inclusive), the code point is represented as 3 bytes. The first byte starts with the binary value 1110, followed by 4 bits that represent the most significant bits of the code point value. The second and third bytes start with the binary value 10, followed by 6 bits each that represent the remaining bits of the code point value.
If the code point value is between 65536 and 1114111 (inclusive), the code point is represented as 4 bytes. The first byte starts with the binary value 11110, followed by 3 bits that represent the most significant bits of the code point value. The second, third, and fourth bytes start with the binary value 10, followed by 6 bits each that represent the remaining bits of the code point value.
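The four cases above translate almost line by line into code. The following is a minimal sketch, not a production encoder: encode_utf8 is a hypothetical helper that returns the bytes in a std::string and returns an empty string for values that cannot be encoded (surrogates and values above 0x10FFFF).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encodes one code point as a UTF-8 byte sequence.
inline std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp <= 0x7F) {                       // 1 byte: 0xxx'xxxx
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {               // 2 bytes: 110x'xxxx 10xx'xxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {              // 3 bytes: 1110'xxxx + 2 trailing bytes
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return out;                     // surrogates are not encodable
        }
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {            // 4 bytes: 1111'0xxx + 3 trailing bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, the code point 0x4F60 falls into the three-byte case and is encoded as the bytes 0xE4 0xBD 0xA0.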

By using a variable-length encoding scheme, UTF-8 encoding can represent all Unicode code points using a sequence of 1 to 4 bytes. This allows UTF-8 to be a compact and efficient encoding scheme. UTF-8 is a superset of ASCII and fully compatible with it.

UTF-8 uses a distinct bit pattern for the first byte of each sequence length and a single fixed pattern for all trailing bytes. This makes it easy to validate a UTF-8 sequence and, after jumping to an arbitrary byte position, to resynchronize quickly on the position where a character starts.
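This property can be exploited with a single bit test, since every trailing byte has the form 10xx'xxxx. The sketch below assumes its input is already valid UTF-8; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A trailing (continuation) byte has the bit pattern 10xx'xxxx.
inline bool is_continuation_byte(unsigned char byte) {
    return (byte & 0xC0) == 0x80;
}

// Hypothetical helper: counts code points by counting the bytes
// that start a character. Assumes valid UTF-8 input.
inline std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if (!is_continuation_byte(byte)) {
            ++count;
        }
    }
    return count;
}

// Hypothetical helper: moves back from an arbitrary byte offset
// to the offset where the enclosing character starts.
inline std::size_t sync_to_start(const std::string& utf8, std::size_t pos) {
    assert(pos < utf8.size());
    while (pos > 0 && is_continuation_byte(static_cast<unsigned char>(utf8[pos]))) {
        --pos;
    }
    return pos;
}
```

Neither function has to decode any character, which is what makes scanning and resynchronizing UTF-8 text cheap.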

UTF-16 encoding

Code Point Range	Number of Code Units	Binary Format
0 to 65535	1 code unit (2 bytes)	`xxxxxxxx xxxxxxxx`
65536 to 1114111	2 code units (4 bytes)	`110110yy yyyyyyyy 110111xx xxxxxxxx`

For code points in the range of 0 to 65535, UTF-16 encoding represents each code point using a single 16-bit code unit.
For code points in the range of 65536 to 1114111, UTF-16 encoding represents each code point using a pair of 16-bit code units, known as a surrogate pair. The first 16-bit code unit (known as the high surrogate) has a value in the range of 0xD800 to 0xDBFF, while the second 16-bit code unit (known as the low surrogate) has a value in the range of 0xDC00 to 0xDFFF.
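A surrogate pair is computed by subtracting 0x10000 from the code point, which leaves a 20-bit value; the upper 10 bits go into the high surrogate and the lower 10 bits into the low surrogate. The sketch below shows both directions. The function names are hypothetical, and values that cannot be encoded yield an empty string.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encodes one code point as UTF-16 code units.
inline std::u16string encode_utf16(char32_t cp) {
    std::u16string out;
    if (cp <= 0xFFFF) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return out;  // a lone surrogate is not a valid character
        }
        out += static_cast<char16_t>(cp);
    } else if (cp <= 0x10FFFF) {
        const char32_t v = cp - 0x10000;  // 20 significant bits remain
        out += static_cast<char16_t>(0xD800 + (v >> 10));    // high surrogate
        out += static_cast<char16_t>(0xDC00 + (v & 0x3FF));  // low surrogate
    }
    return out;
}

// Hypothetical helper: combines a surrogate pair back into a code point.
inline char32_t decode_surrogate_pair(char16_t high, char16_t low) {
    assert(high >= 0xD800 && high <= 0xDBFF);
    assert(low >= 0xDC00 && low <= 0xDFFF);
    return 0x10000 + ((static_cast<char32_t>(high) - 0xD800) << 10)
                   + (static_cast<char32_t>(low) - 0xDC00);
}
```

For the code point 0x1F600 the pair is 0xD83D followed by 0xDE00.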

UTF-32 encoding

Code Point Range	Number of Code Units	Binary Format
0 to 1114111	1 code unit (4 bytes)	`00000000 000xxxxx xxxxxxxx xxxxxxxx`

UTF-32 encoding represents each code point using a single 32-bit code unit, which means that every Unicode code point is represented using exactly 4 bytes of memory.

Why not UTF-24 encoding

Although it is theoretically possible to create a fixed-length encoding scheme using 3 bytes to represent each Unicode code point, such a scheme would not provide any significant advantages over existing ones like UTF-8, UTF-16, or UTF-32 in terms of processing or space efficiency. Many software systems and programming languages are optimized for these standard Unicode encoding schemes, making them more convenient and widely supported.

Furthermore, most of the commonly used Unicode code points are smaller than 65536, which means that using three bytes per code point would result in unnecessary wastage of space. Therefore, despite the theoretical possibility of a 3-byte fixed-length encoding scheme, it is not practical to use it in most real-world scenarios.

Byte order mark

The Unicode encoding of a text file can be determined by examining the byte order mark (BOM) at the beginning of the file, or by analyzing the byte sequences of the file.

Encoding	Byte Order Mark
UTF-8	`EF BB BF` (optional)
UTF-16	`FE FF` (big-endian) or `FF FE` (little-endian)
UTF-32	`00 00 FE FF` (big-endian) or `FF FE 00 00` (little-endian)
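A BOM check is a comparison against the byte sequences in the table. In the sketch below, the Bom enumeration and the function names are hypothetical. The UTF-32 marks are tested first because the little-endian UTF-32 mark FF FE 00 00 begins with the little-endian UTF-16 mark FF FE; a UTF-16 file that starts with a null character would be misclassified, so the result is a heuristic.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class Bom { None, Utf8, Utf16BE, Utf16LE, Utf32BE, Utf32LE };

// True if data begins with the first n bytes of prefix.
inline bool starts_with_bytes(const std::string& data, const char* prefix, std::size_t n) {
    return data.size() >= n && data.compare(0, n, prefix, n) == 0;
}

// Hypothetical helper: guesses the encoding from the first bytes of a file.
inline Bom detect_bom(const std::string& data) {
    // Longer marks first, so that FF FE 00 00 is not taken for UTF-16.
    if (starts_with_bytes(data, "\x00\x00\xFE\xFF", 4)) return Bom::Utf32BE;
    if (starts_with_bytes(data, "\xFF\xFE\x00\x00", 4)) return Bom::Utf32LE;
    if (starts_with_bytes(data, "\xEF\xBB\xBF", 3)) return Bom::Utf8;
    if (starts_with_bytes(data, "\xFE\xFF", 2)) return Bom::Utf16BE;
    if (starts_with_bytes(data, "\xFF\xFE", 2)) return Bom::Utf16LE;
    return Bom::None;
}
```

A file without a BOM yields Bom::None, in which case the encoding has to be determined by analyzing the byte sequences themselves.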

Code page

The legacy term "code page" originated from IBM's EBCDIC-based mainframe systems. Originally, the code page numbers referred to the page numbers in the IBM standard character set manual.

Vendors that use a code page system allocate their own code page number to a character set and its encoding, even if it is better known by another name; for example, UTF-8 has been assigned page numbers 1208 at IBM, 65001 at Microsoft, and 4110 at SAP.

The following table lists Windows code pages used by Microsoft in its own Windows operating system.

Microsoft Code Page	Code Page Number	Description
Windows-1252	1252	Western European languages
Windows-1250	1250	Central and Eastern European languages
Windows-1251	1251	Cyrillic languages
Windows-1253	1253	Greek language
Windows-1254	1254	Turkish language
Windows-1255	1255	Hebrew language
Windows-1256	1256	Arabic language
Windows-1257	1257	Baltic languages
Windows-1258	1258	Vietnamese language
UTF-8	65001	8-bit Unicode
UTF-16LE	1200	16-bit Unicode, Little Endian
UTF-16BE	1201	16-bit Unicode, Big Endian
UTF-32LE	12000	32-bit Unicode, Little Endian
UTF-32BE	12001	32-bit Unicode, Big Endian
UTF-7	65000	7-bit Unicode

New Character Types

Why `char` is not good for UTF-8

In C++, char is a fundamental type that represents a byte-sized unit of data. Historically, it has been used to represent both ASCII characters and other narrow character sets, depending on the execution environment.

Suppose we have the following C++ code (in C++11), with the source file saved as UTF-8 text:

// "你吃饭了吗?" literal is treated as a plain array of bytes, which the
// compiler may interpret as Windows-1252 single byte encoding.
const char* utf8_str = "你吃饭了吗?";

If the source file containing the Chinese characters "你吃饭了吗?" is saved as UTF-8 text, then the bytes of the literal in the file are UTF-8. What ends up in utf8_str, however, depends on the compiler: the variable is declared as a plain char array, which relies on the execution environment to provide the encoding context. If the platform where the code is compiled uses a different encoding, such as Windows-1252, the compiler may read or re-encode the Chinese characters as single-byte characters in that encoding.

For example, the Chinese character "你" is represented by three bytes in UTF-8, which are 0xE4 0xBD 0xA0. If the compiler takes the file for Windows-1252 text, it sees three unrelated single-byte characters (0xE4, for instance, is "ä") and the text is silently misread. If the compiler recognizes the file as UTF-8 but the execution character set is Windows-1252, "你" has no representation in that character set, and a compiler such as MSVC may replace it with the question mark 0x3F and issue a warning. In either case, the mismatched data can cause unexpected results and errors in the program.

Execution environment explained

The "execution character set of the platform" refers to the character encoding scheme used by the operating system and/or the compiler to represent text data internally in a computer program.

In C and C++, the execution character set determines how characters are represented in the char data type. The specific character set used can vary depending on the platform, compiler, and locale settings.

For example, on Windows systems, MSVC by default bases the execution character set on the system's active code page, such as Windows-1252, which is a superset of ASCII that includes characters for Western European languages. On Unix-like systems, GCC and Clang use UTF-8 as the default execution character set.

char8_t was introduced in C++20 to provide a distinct type that is guaranteed to represent an 8-bit code unit of UTF-8 encoded Unicode text. This allows for safer and more efficient handling of UTF-8 strings, as developers can use char8_t to represent individual code units of the UTF-8 encoding. This can help to avoid issues such as misinterpreting multi-byte sequences or incorrectly handling invalid code points.

In the following code, utf8_str will hold the correct UTF-8 code unit values, regardless of the execution character set of the platform.

// char8_t is a new C++20 type. The "u8" prefix makes sure the string literal is 
// interpreted as UTF-8 encoded text while enforcing type safety with char8_t.
// Without the "u8" prefix, the string literal has type "const char[N]",
// which cannot be converted to "const char8_t*", so compilation fails.
const char8_t* utf8_str = u8"你吃饭了吗?"; 
// std::cout << utf8_str << std::endl; // This won't compile

In C++20, there are no char8_t-aware I/O streams: the operator<< overloads that would write char8_t, char16_t, and char32_t data to std::cout are marked as deleted. It is expected that the issue will be resolved in a future revision of the standard.

char16_t and char32_t were introduced in C++11 to provide support for Unicode text encoding. char16_t represents a 16-bit code unit of UTF-16 encoded Unicode text, while char32_t represents a 32-bit code unit of UTF-32 encoded Unicode text.

Type	Introduced in	Main Reason for Introduction	Literal Prefix	Sample Code
`char8_t`	C++20	UTF-8 encoding	`u8`	`const char8_t* str = u8"吃了吗";`
`char16_t`	C++11	UTF-16 encoding	`u`	`const char16_t* str = u"吃了吗";`
`char32_t`	C++11	UTF-32 encoding	`U`	`const char32_t* str = U"吃了吗";`
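The difference between the encodings shows up in the number of code units a string occupies. The sketch below uses std::char_traits to count code units up to the terminating null; code_units is a hypothetical helper, and the characters are written as universal character names so that the example does not depend on the encoding of the source file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of code units before the terminating null.
template <typename CharT>
std::size_t code_units(const CharT* text) {
    return std::char_traits<CharT>::length(text);
}
```

For the two-character text u"\u4F60\u597D" the helper returns 2, and for the single character U+1F600 it returns 2 with a char16_t literal (a surrogate pair) but 1 with a char32_t literal.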

The string literal prefixes u8, u, and U were introduced in C++11, together with the character literal prefixes u and U. The u8 prefix could not be applied to a character literal until C++17, so the following code fails to compile as C++11:

char utf8c = u8'a'; // Fails in C++11/14; compiles in C++17/20

The following code also fails to compile, because the character cannot be encoded in a single UTF-8 code unit.

char utf8c = u8'好';

Print UTF-8 string to console

std::cout cannot output a char8_t string to the console directly. One workaround is to reinterpret the data as char and use printf instead. On Windows, remember to set the active code page of the Windows commandline console to UTF-8 by running the chcp command first.

chcp 65001

The following code uses printf to output a UTF-8 string.

#include <cstdio>
#include <iostream>
#include <string>

using namespace std;

// Remember to run Windows commandline command "chcp 65001" first to set the active
// code page to UTF-8.

int main() {
  // Null terminator automatically appended.
  char8_t utf8Chars[] = u8"你好世界";
  // Will have two null terminators. 
  char8_t utf8CharsWithNull[] = u8"你好世界\0"; 

  auto len_1 = std::char_traits<char8_t>::length(utf8Chars);
  auto len_2 = std::char_traits<char8_t>::length(utf8CharsWithNull);

  cout << "length(utf8Chars) = " 
       << len_1 
       << endl; // output 12

  cout << "length(utf8CharsWithNull) = " 
       << len_2 
       << endl; // output 12

  cout << "sizeof(char8_t) = " 
       << sizeof(char8_t) 
       << endl; // output 1
  
  // std::cout << utf8Chars << std::endl; // This would fail to compile.
  printf("%s\n", reinterpret_cast<const char*>(utf8Chars));

  /*
  for (std::size_t i = 0; i < len_1; i++) {
    std::cout << utf8Chars[i] << '\n'; // This would fail to compile.
  }
  */

  return 0;
}

Print a character of UTF-8 text to console

In C++20, the use of the std::codecvt facet is deprecated and discouraged. To display the individual characters of a UTF-8 string on the Windows command-line console, we can use the platform-specific MultiByteToWideChar function provided by Windows. It converts the UTF-8 text to wide characters (UTF-16 on Windows), which can then be output using std::wcout. The same approach applies when we need to access a particular character by its position: convert the text to UTF-16 or UTF-32 first, then index the result.

#include <iostream>
#include <locale>
#include <string>
#include <Windows.h>

using namespace std;

// Remember to run Windows commandline command "chcp 65001" first to set the active
// code page to UTF-8.

int main() {
    u8string my_string = u8"こんにちは";

    // my_string[0] is the byte value of the UTF-8 text at byte position 0.
    // The actual character could have multiple bytes.
    // std::cout << my_string[0] << std::endl; would fail compiling.

    // Get the required buffer size  
    int len = MultiByteToWideChar(CP_UTF8,
                                  0, 
                                  reinterpret_cast<const char*>(my_string.data()), 
                                  static_cast<int>(my_string.size()), 
                                  nullptr, 
                                  0);

    // Create a buffer of the required size
    wstring my_wstring(len, 0);

    // Convert to UTF-16 
    MultiByteToWideChar(CP_UTF8, 
                        0, 
                        reinterpret_cast<const char*>(my_string.data()), 
                        static_cast<int>(my_string.size()), 
                        &my_wstring[0], 
                        len); 

    locale::global(locale("en_US.UTF-8"));

    // Output the string
    wcout << my_wstring << endl; 

    for (int i = 0; i < len; i++) {
       wcout << my_wstring[i] << endl;    
    }

    return 0;
}

Automatic String Literal Concatenation

Automatic concatenation of adjacent string literals is a feature present in both C and C++ programming languages. It allows the compiler to automatically merge two or more string literals that are placed next to each other, without any explicit concatenation operator. This can be useful for breaking long strings into shorter, more manageable pieces, while still treating them as a single string constant.

Here is an example:

const char* my_string = "Hello,"
                        "World!";

The compiler will automatically concatenate the two string literals, resulting in the following:

const char* my_string = "Hello,World!";

This feature has its roots in the C programming language. It was inherited by C++ in the early 1980s.

Notes on automatic string literal concatenation

Some nuances and caveats of using automatic concatenation of adjacent string literals:

Whitespace not strictly required

Adjacent string literals can be separated by whitespace, such as a space, a tab, or a newline. However, whitespace between the literals is not strictly required, so the following is still valid in both C and C++:

const char* my_string = "Hello,""World";

The compiler will automatically concatenate the adjacent string literals, resulting in the following:

const char* my_string = "Hello,World";

It's a good practice to include whitespace between adjacent string literals for better readability and maintainability.

Compile time concatenation

The concatenation happens at compile-time, not at runtime, which means it has no performance overhead.

Variables or expressions not allowed

Automatic concatenation can only be used with string literals, not with variables or other expressions.

Mixed encodings

Be aware that trying to concatenate string literals with different character encodings may lead to compilation errors or unexpected behavior. For example, the following code will result in compiler error "concatenation of string literals with conflicting encoding prefixes".

const char8_t* utf8Chars = u8"Hello," 
                           L"World!";

If one of the string literals has no prefix, it is treated as having the same prefix as the other, so the following is a valid operation:

const char8_t* utf8Chars = u8"Hello," 
                           "World!"; // Equivalent to u8"World!"

The `+` operator

Using the + operator for concatenation works differently than automatic concatenation of adjacent string literals. In C++, the + operator can be used to concatenate std::string objects or a std::string object and a string literal. However, the + operator cannot be used to concatenate two string literals directly.

Here is an example:

#include <iostream>
#include <string>

int main() {
    std::string str1 = "Hello, ";
    std::string str2 = "World!";
    
    std::string result = str1 + str2 + "Oh Yeah"; // Valid in C++
    
    std::cout << result << std::endl;
    return 0;
}

In the example above, the + operator concatenates two std::string objects and then appends a string literal to the result. However, applying + to string literals directly leads to a compilation error, because both operands are then plain character arrays:

const char* result = "Hello, " + "World!" + "Oh Yeah"; // NOT valid in C++ (or C)

C has neither the std::string class nor a + operator for concatenation. Instead, use functions such as strcat or strncat from <string.h> to concatenate null-terminated character arrays. Make sure the destination buffer is large enough to hold the concatenated result, including its null terminator.

Here's an example of using strcat and strncat functions in C:

#include <stdio.h>
#include <string.h>

int main() {
    char str1[20] = "Hello, ";
    char str2[] = "world!";
    char str3[20] = "I am a string.";

    // Using strcat
    strcat(str1, str2);
    printf("str1 after strcat: %s\n", str1);

    // Using strncat
    strncat(str3, str2, 4);
    printf("str3 after strncat: %s\n", str3);

    return 0;
}

In the above code, we have used two different functions for concatenating strings.

strcat function concatenates str2 to the end of str1 and modifies str1. After the strcat operation, str1 will contain the concatenated string.
strncat function concatenates a specified number of characters (in this case, 4) from str2 to the end of str3 and modifies str3. After the strncat operation, str3 will contain the concatenated string.

The output of the above code will be:

str1 after strcat: Hello, world!
str3 after strncat: I am a string.worl

Library Support

Deprecated library support

Component	Purpose	Status
`template<class InternT, class ExternT, class StateT> class codecvt` defined in header `<locale>`	Provides a template class for converting between different character encodings	Deprecated in C++20
`<codecvt>` header	Provides a set of templates for character encoding conversion, including `std::codecvt_utf8`, `std::codecvt_utf16`, and `std::codecvt_utf8_utf16`	Deprecated in C++17
`std::wstring_convert`	Provides a higher-level interface for converting between wide character strings (`std::wstring`) and narrow character strings (`std::string`)	Deprecated in C++17

New string types

String Type	Description	Basic Definition	Introduced in C++
u8string	A string of 8-bit characters encoded in UTF-8	`std::basic_string<char8_t>`	C++20
u16string	A string of 16-bit characters encoded in UTF-16	`std::basic_string<char16_t>`	C++11
u32string	A string of 32-bit characters encoded in UTF-32	`std::basic_string<char32_t>`	C++11

`std::pmr::u8string`

std::pmr::u8string is an alias for std::basic_string<char8_t> that uses std::pmr::polymorphic_allocator, so it represents a sequence of 8-bit code units encoded in UTF-8 and allocates through a user-supplied memory resource. The alias was added in C++20; the underlying Polymorphic Memory Resource library (std::pmr) dates from C++17.

To use std::pmr::u8string, you need to include the <string> and <memory_resource> headers, and create a std::pmr::memory_resource object to use as the memory allocator. You can then create an instance of std::pmr::u8string by passing the memory allocator as a constructor argument.

Here's an example of how to use std::pmr::u8string:

#include <cstdio>
#include <string>
#include <memory_resource>

int main()
{
    // create a memory pool using std::pmr::monotonic_buffer_resource
    std::pmr::monotonic_buffer_resource pool(1024);

    // create an std::pmr::u8string using the memory pool
    std::pmr::u8string str(u8"Hello, world!", &pool);

    // print the string to the console; pass it as an argument to "%s"
    // rather than as the format string itself
    std::printf("%s\n", reinterpret_cast<const char*>(str.c_str()));

    return 0;
}

C11 way

Function	Description
`mbrtoc16`	Converts a multibyte sequence to a 16-bit wide character
`c16rtomb`	Converts a 16-bit wide character to a multibyte sequence
`mbrtoc32`	Converts a multibyte sequence to a 32-bit wide character
`c32rtomb`	Converts a 32-bit wide character to a multibyte sequence

These are C11 functions.

In the function name mbrtoc16, "mb" indicates that the input is a multibyte character sequence, "r" stands for "restartable" (the conversion state is kept in an mbstate_t object, so a conversion can be resumed across calls), "to" is the preposition, and "c16" indicates that the output is a 16-bit character (char16_t).

Here's an example of using the mbrtoc16 function to convert a multibyte sequence to a 16-bit wide character:

#include <stdio.h>
#include <uchar.h>
#include <locale.h>
#include <wchar.h>

int main() {
    setlocale(LC_ALL, "en_US.UTF-8");

    char mbstr[] = "Hello, world!"; // Note: char8_t is not part of C11; it was added to C in C23.
    char16_t wc16;
    mbstate_t state = { 0 };
    size_t res = mbrtoc16(&wc16, mbstr, sizeof(mbstr), &state);
    if (res == (size_t)-1 || res == (size_t)-2) {
        printf("Error: invalid multibyte sequence\n");
        return 1;
    }
    printf("The first character is: %lc\n", (wint_t)wc16);

    return 0;
}

Initialization

Default Initialization for Non-Static Data Members

Default Member Initializers

Before C++11, non-static data members had to be initialized using constructor initializer lists. This often led to repetitive and error-prone code, especially when dealing with many data members or overloaded constructors.

Here's an example that illustrates the issue:

class Person {
public:
    Person() : age_(0), height_(170.0), name_("John Doe") 
    {}

    Person(int age) : age_(age), height_(170.0), name_("John Doe")
    {}

    Person(double height) : age_(0), height_(height), name_("John Doe") 
    {}

    Person(const std::string& name) : age_(0), height_(170.0), name_(name) 
    {}

private:
    int age_;
    double height_;
    std::string name_;
};

This class Person has multiple constructors that all repeat default values for members not being set. Maintaining such duplication across constructors is tedious and invites bugs when defaults need to be updated.

C++11 allows us to move those default values to the member declarations themselves using either = or {} syntax:

class Person {
public:
    Person() 
    {}

    Person(int age) : age_(age) 
    {}

    Person(double height) : height_(height)
    {}

    Person(const std::string& name) : name_(name) 
    {}

private:
    int age_ = 0;
    double height_{170.0};
    std::string name_{"John Doe"};
};

This version is cleaner. Each constructor focuses only on what it needs to initialize, and the rest rely on the defaults declared with the member.

For example, the constructor Person(double height) just sets height_; both age_ and name_ are initialized using their declared defaults.

Initialization Precedence

If a member is initialized in both the declaration and a constructor initializer list, the constructor's initializer list takes precedence. That means the default is only used when the constructor doesn't override it.

Common Mistakes to Avoid

1. Do not use parentheses `()` for default member initialization

This will cause a compiler error:

struct Invalid {
    int x(42);  // ❌ Error: Invalid syntax
};

Instead, use = or {}:

struct Valid {
    int x = 42;     // ✅ OK
    int y{100};     // ✅ OK
};

2. Do not use `auto` in member declarations

Although auto is fine for local variables, it is not allowed for member declarations with default initialization:

struct Bad {
    auto n = 5;  // ❌ Error: `auto` not allowed in this context
};

You must explicitly specify the type:

struct Good {
    int n = 5;  // ✅ OK
    int y {5};  // ✅ OK
};

However, auto is allowed for a static inline data member (C++17):

struct Good {
    static inline auto n = 5; // ✅ OK  
};

Default Initialization for Bitfields (C++20)

In C++, bitfields allow you to define struct members that occupy a specified number of bits, enabling compact storage of small data like flags or codes. For example:

struct Status {
    unsigned int ready : 1;
    unsigned int error : 1;
    unsigned int mode  : 2;
};

Here, the three fields of Status together occupy just 4 bits instead of three full unsigned ints, which makes the struct memory-efficient and well suited to embedded systems or hardware register mappings. The struct as a whole is still padded to the size of its underlying type, typically one unsigned int. Bitfields improve clarity when working with individual bits, avoiding manual bitmasks. However, they come with drawbacks: layout and alignment are implementation-defined, so bitfields are not portable across compilers; you can't take the address of a bitfield member; and access may be slower due to the extra masking logic.

Bitfields are powerful for space-constrained or hardware-close applications, but should be avoided when portability or precise control is required.

C++20 expands this feature to allow default initializers for bitfields as well:

struct Flags {
    /*
    is_valid is a 1-bit-wide field. It can store either 0 or 1 (only two possible values).
    = 1 is the default member initializer, which means:
    If you create an instance of Flags without explicitly setting is_valid, it will default to 1.
    In practice, this could serve as a default "enabled" or "valid" flag.
    */
    unsigned int is_valid : 1 = 1;

    /* 
    error_code is a 3-bit-wide field. It can represent integer values from 0 to 7 (3 bits give 2^3 = 8 possibilities).
    {4} is the brace form of a default member initializer; for bitfields it is valid from C++20 on.
    So, if you don't explicitly assign error_code, it defaults to 4.
    This might represent a default error type or status code in a compact system.
    */
    unsigned int error_code : 3 {4};
};

Here, is_valid is a 1-bit field default-initialized to 1, and error_code is a 3-bit field default-initialized to 4.

Bitfield initialization is simple, but you must be careful with expressions that include conditionals or operators, which can confuse the compiler’s parsing rules.

Consider this broken example:

int config;
struct Settings {
    int level : true ? 4 : config = 3;
    int mode : 1 || new int{1};
};

This compiles, but neither level nor mode gets a default initializer, because the parser treats = 3 and {1} as part of the bitfield width expressions.

To fix this, use parentheses to clarify parsing:

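int config;
struct Settings {
    int level : (true ? 4 : config) = 3; // Bitfield widths must be constant expressions; config is never evaluated here.
    unsigned int mode : (1 || new int){1}; // new int is never evaluated. unsigned, so that the 1-bit field can hold 1.
};

Now level defaults to 3 and mode defaults to 1. Note that mode is declared unsigned here: a signed 1-bit field can only hold 0 and -1, so it could not store the value 1.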

Summary

Default member initializers introduced in C++11 and enhanced in C++20 help make class definitions cleaner and more robust:

They reduce redundancy in constructors.
Constructors can focus only on members they care about.
Code becomes easier to read and maintain.
Bitfields can now also benefit from default values in C++20.

List Initialization in C++

"Traditional" Initialization Styles

Before list initialization, C++ supported two main initialization styles: copy initialization (=) and direct initialization (()).

struct Widget {
    Widget(int a) {}
};

int main() {
    // Copy initialization
    int a = 10;

    // Direct initialization
    int b(20);
    
    // Copy initialization: creating a new object and initializing it
    // using a constructor that can take 5 as an argument
    Widget w1 = 5;
    
    // Direct initialization
    Widget w2(5);      
}

Copy Initialization

Copy initialization (=) can trigger implicit constructor calls. Using explicit on a constructor disables implicit conversions:

struct Widget {
    explicit Widget(int a) {}
};

Widget w = 10;  // Error: explicit constructor blocks implicit conversion

Important note: In C++, "copy initialization" refers to the syntax, not necessarily the action of copying:

MyClass a = 5;   // Copy initialization syntax

This syntax looks like assignment (=), but it actually initializes a rather than assigning to an existing object. The name "copy initialization" is historical: it suggests "initializing an object using a value". Compilers have long optimized away the copy through copy elision, and since C++17 the elision is guaranteed in cases like this one.

Copy initialization also occurs in other contexts. The following assumes the non-explicit Widget(int) constructor from the first example:

void process(Widget w) {}    // Parameter initialization

Widget create() { 
    return 10;               // Return value initialization
}

int main() {
    process(7);              // Copy initialization of parameter
    Widget w = create();     // Copy initialization of w
}

For primitive types like int, there is no difference under the hood between copy initialization and direct initialization. Both generate identical assembly code.

int a = 10;    // Copy initialization, C-style
int b(20);     // Direct initialization, function call style

Both will typically compile to something like:

mov dword ptr [a], 10
mov dword ptr [b], 20

Direct Initialization

Direct initialization (()) considers every constructor, including those marked explicit. The arguments themselves may still be implicitly converted to the parameter types.

Term	Description
Copy Initialization	`T x = value;` — Can use implicit conversions
Direct Initialization	`T x(value);` — Calls constructor directly
Copy Assignment	`x = other;` — Modifies an existing object (not initialization)

List Initialization

The C++ Standard defines list initialization as "initialization of an object or reference from a braced-init-list". It's also informally called "brace initialization".

C++11 introduced list initialization using {}:

int a = {42};      // Copy list initialization
int b{42};         // Direct list initialization

struct Widget {
    Widget(std::string, int) {}
    Widget(int) {}
};

Widget w1 = {7};                  // Copy list initialization
Widget w2{7};                     // Direct list initialization
process({7});                     // Copy list initialization (parameter)
process({"hello", 5});            // Copy list initialization (parameter)
Widget w3 = create();             // Copy initialization (return value)
Widget* w4 = new Widget{"hi", 9}; // Direct list initialization

Two Forms of List Initialization

List initialization comes in two forms:

Direct list initialization: T obj{args...} (no =)
Copy list initialization: T obj = {args...} (with =)

The key difference is how they interact with explicit constructors:

#include <string>

struct Widget {
    explicit Widget(std::string, int) {}
    Widget(int) {}
};

int main() {
   // Works: Direct list initialization of temporary + copy initialization
   // Copy elision ensures no actual copying occurs
   auto w1 = Widget{"hello", 5}; 

   // Error: Copy list initialization blocked by explicit constructor
   // The compiler cannot implicitly convert {"hello", 5} to Widget
   Widget w2 = {"hello", 5};    

   // Works: Direct list initialization
   Widget w3{"hello", 5};
}

How List Initialization Works

List initialization selects among several mechanisms. Simplified, they are tried in roughly this order:

Aggregate initialization (for aggregate types)
Value initialization (for empty braces {} on a class with a default constructor)
std::initializer_list constructor (if available and viable)
Regular constructor matching (overload resolution)

Examples of different mechanisms:

// Arrays and containers
int arr[] = {1, 2, 3};                                // Aggregate initialization (copy-list form)
std::vector<int> v{1, 2, 3};                          // Uses initializer_list constructor
std::set<int> s{4, 5, 6};                             // Uses initializer_list constructor
std::map<std::string, int> m{{"dog", 1}, {"cat", 2}}; // Uses initializer_list constructor

// Regular constructor matching
Widget w{"hello", 5};                                 // Uses Widget(std::string, int)

Using `std::initializer_list`

Many standard containers support std::initializer_list constructors:

std::vector<int> v{1, 2, 3, 4, 5};  // Uses initializer_list<int> constructor

You can add std::initializer_list support to custom types:

#include <initializer_list>
#include <iostream>
#include <string>

struct MyCollection {
    MyCollection(std::initializer_list<std::string> items) {
        for (const auto& item : items) {
            std::cout << item << " ";
        }
    }
};

MyCollection c{"alpha", "beta", "gamma"};  // Uses initializer_list constructor

When you write MyCollection c{"alpha", "beta", "gamma"}, the compiler roughly transforms it to:

// Conceptual transformation (implementation-defined details)
const std::string temp_array[] = {"alpha", "beta", "gamma"};
MyCollection c(std::initializer_list<std::string>(temp_array, temp_array + 3));

Advantages and Pitfalls

Preventing Narrowing Conversions

List initialization disallows implicit narrowing conversions, improving type safety:

int x = 300;
char a1 = x;      // OK with traditional initialization (potentially unsafe)
char a2{x};       // Error: narrowing conversion not allowed

unsigned int u1 = {-1};   // Error: negative to unsigned
int i1 = {2.5};           // Error: double to int
float f1{3};              // OK: the constant 3 is exactly representable as float
double d = 3.14159;
float f2{d};              // Error: potential precision loss

Narrowing includes:

Floating-point to integer conversion
Larger to smaller integer types (when value doesn't fit)
Integer to floating-point (when not exactly representable)
Signed to unsigned (when negative)

Constructor Preference Gotcha

When both regular and initializer_list constructors match, the initializer_list version takes precedence:

std::vector<int> v1(3, 2);    // Regular constructor: 3 elements, each with value 2
std::vector<int> v2{3, 2};    // initializer_list constructor: 2 elements with values 3 and 2

// Be careful with this difference!
std::vector<int> empty1(0);   // Empty vector
std::vector<int> empty2{0};   // Vector with one element: 0

Nested Initialization

For nested types like std::map, list initialization works hierarchically:

std::map<std::string, int> m{{"fox", 1}, {"owl", 2}};
//                            ^^^^^^^^^  ^^^^^^^^^
//                            Each creates a std::pair
//                           ^^^^^^^^^^^^^^^^^^^^^^^^
//                           Outer braces create initializer_list<pair>

Designated Initialization (C++20)

C++20 introduced designated initialization for aggregate types:

struct Point { int x; int y; };
Point p{.x = 4, .y = 2};  // Named field initialization

This is especially useful for structs with many fields:

struct Config {
    int width = 800;
    int height = 600;
    bool fullscreen = false;
    int samples = 4;
};

Config cfg{.width = 1920, .height = 1080, .fullscreen = true};
// Unspecified fields keep their default values

Designated Initialization Rules

Requirements:

Only works with aggregate types (no user-declared constructors, virtual functions, etc.)
Only non-static data members can be designated

Must follow declaration order:

Point p{.y = 1, .x = 2};  // Error: wrong order

Restrictions:

Each field can be initialized only once:

Point p{.x = 1, .x = 2};  // Error: duplicate initialization

Cannot mix designated and positional initialization:

Point p{.x = 1, 2};       // Error: mixed styles

For unions, only one member can be designated:

union Data { int a; double b; };
Data d{.a = 10, .b = 3.14};  // Error: multiple members

No direct nested access (use nested braces instead):

struct Line { Point start, end; };
Line l{.start.x = 1};         // Error: nested access
Line l{.start{.x = 1}};       // OK: nested initialization

Best Practices

Prefer list initialization for its safety benefits (prevents narrowing)
Use direct list initialization ({}) over copy list initialization (= {}) when possible
Be aware of constructor precedence with std::initializer_list
Use designated initialization for aggregate types with many fields
Mark single-argument constructors explicit unless implicit conversion is specifically desired

Summary

List initialization is a powerful C++ feature that provides:

Uniform syntax for initialization
Type safety through narrowing prevention
Flexibility through multiple initialization mechanisms
Readability improvements, especially with designated initialization (C++20)

Understanding the distinction between direct and copy list initialization, along with their interaction with explicit constructors, is crucial for effective modern C++ programming.

Structured Binding

Memory Alignment

Compile Time Evaluation

In C++, compile-time evaluation refers to the ability to evaluate expressions and perform computations at compile-time, rather than at runtime. This can be achieved using keywords such as constexpr, consteval, and constinit.

Keyword	Introduced in	Usage
`constinit`	C++20	Asserts that a variable with static or thread storage duration is initialized at compile time (constant initialization).
`constexpr`	C++11	Indicates that a function or object can be evaluated at compile-time.
`consteval`	C++20	Similar to `constexpr`, but functions marked with `consteval` must be evaluated at compile-time.

In addition to these keywords, C++ also includes several other features that enable compile-time evaluation, such as template metaprogramming and the std::integral_constant class template. These features allow for complex computations and logic to be performed at compile-time, leading to more efficient and optimized code.

Performance boost with compile time evaluation

The ability to perform compile-time evaluation is an important part of the C++ language, as it enables developers to create more efficient and optimized code. The C++ standard includes a number of requirements and guidelines for how these features should be implemented and used. These guidelines help ensure that code that uses compile-time evaluation is portable and can be used across different platforms and architectures.

Compile-time evaluation can help performance in several ways:

Reduce runtime overhead: When values or expressions are evaluated at compile time, the resulting code can be optimized by the compiler. This can reduce the amount of runtime overhead that would be incurred if the same calculations were performed at runtime.
Eliminate runtime errors: By evaluating values or expressions at compile time, potential runtime errors can be caught and eliminated before the program is even executed. This can help improve the stability and reliability of the program.
Enable constant propagation: When values are known at compile time, they can be propagated throughout the code as constants. This can eliminate unnecessary memory accesses and reduce the number of instructions that need to be executed, leading to faster program execution.
Allow for more aggressive optimization: By providing the compiler with information about values and expressions at compile time, the compiler can perform more aggressive optimizations, such as loop unrolling, instruction scheduling, and register allocation. These optimizations can improve program performance by reducing the number of instructions that need to be executed and by maximizing the use of hardware resources.

A real-life sample

The following shows a picture of NEMA-TS2 16-channel Malfunction Management Unit (MMU). Credit: Rob Klug

A Malfunction Management Unit (MMU) is a device utilized in the traffic signal control industry to detect conflicts that may arise when conflicting traffic flows are given right of way simultaneously. This is achieved through the use of a soldering board at the hardware level, which defines the compatibility of each pair of different channels. Essentially, each channel is physically connected to the signal head in the field through load switches, and the compatibility between the channels is relayed to the MMU through this hardware board.

The following illustrates an application of compile-time evaluation. It is part of the open source C++ Virtual Traffic Cabinet Framework (VTC), which is developed in modern C++20.

Looking up the start position of a given channel costs O(1) at run time, because the template functions are evaluated entirely at compile time and add no runtime overhead. Apart from the performance benefit, the implementation is concise and generic enough to cover current or future MMU compatibility cards of any size.

/*!
 * The size of channel compatibility set. For example, for Channel 1 of MMU16,
 * its compatibility set includes 1-2, 1-3, 1-4, ..., 1-16, thus the size is 15.
 * @tparam Channel - The given MMU channel.
 * @tparam MaxChannel - Max number of channels the MMU supports.
 * @return The size of the compatibility set of the given channel.
 */
template<size_t Channel, size_t MaxChannel> requires ((Channel >= 1) && (Channel <= MaxChannel))
constexpr size_t ChannelSegmentSize()
{
  return (MaxChannel - Channel);
}

/*!
 * The start position (0-based) for the given MMU channel in the fixed-size MMU channel compatibility byte array.
 * @tparam Channel - The given MMU channel.
 * @tparam MaxChannel - Max number of channels the MMU supports.
 * @return The start position (0-based) for the given MMU channel.
 * @remarks MMU channel compatibility is represented by a fixed-size byte array, for
 * MMU16, the byte array has 120 bytes. Each channel has a start position and total number of relevant
 * bytes in the stream describing the channel's compatibility.
 */
template<size_t Channel, size_t MaxChannel = 16> requires ((Channel >= 1) && (Channel <= MaxChannel))
constexpr size_t ChannelSegmentStartPos()
{
  if constexpr (Channel == 1) {
    return 0;
  } else if constexpr (Channel == 2) {
    return ChannelSegmentSize<1, MaxChannel>();
  } else {
    return ChannelSegmentSize<Channel - 1, MaxChannel>() + ChannelSegmentStartPos<Channel - 1, MaxChannel>();
  }
}

`constexpr`

constexpr is a C++ keyword that was introduced in C++11 to allow the evaluation of expressions at compile time. It specifies that the value of a variable or function can be computed at compile time, and therefore can be used in places where a constant expression is required.

`constexpr` vs. `const`

const only guarantees that the value of a variable cannot be changed after it is initialized, whereas constexpr guarantees that the value of a variable can be computed at compile time. Therefore, constexpr is more powerful than const because it enables the use of constant expressions in more contexts.

Here are some examples of how constexpr can be used:

constexpr int square(int x) {
    return x * x;
}

constexpr int x = 5;

// y is computed at compile time
constexpr int y = square(x); 

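// const alone does not require compile-time evaluation; here the
// initializer happens to be a constant expression, so z is still
// usable where a constant expression is required
const int z = square(6); 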

constexpr int arr_size = 10;

// arr_size is a constant expression
int arr[arr_size]; 

constexpr char c = 'A' + 1;

// static_assert is a compile-time assertion
static_assert(c == 'B', "c should be equal to 'B'");

`constexpr` function

To make a function constexpr, it must meet the following conditions:

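Must have a literal return type. In C++11 this ruled out void; since C++14, void counts as a literal type and a constexpr function may return void.

// Returns a literal type, int in this case
constexpr int square(int x) { 
    return x * x;
}

A constexpr function that returns void cannot produce a value, but it can still be called during constant evaluation, for example to modify an object created within that evaluation.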

Must be defined with constexpr keyword.

// Use the 'constexpr' keyword before the function definition
constexpr int factorial(int n) { 
    return (n <= 1) ? 1 : n * factorial(n - 1);
}

Before C++23, must not define variables of non-literal type or of static or thread storage duration. Before C++20, every local variable must also be initialized at its definition. Local variables do not have to be const-qualified (since C++14):

// A const local variable is allowed.
constexpr int sum(int a, int b) {
    const int result = a + b; 
    return result;
}

// Non-const local variables are allowed as well (since C++14).
// The initializer (a + b) does not have to be a constant
// expression by itself; it only has to be evaluable at compile
// time when the function is called with constant arguments.
constexpr int add(int a, int b) {
    // 'sum' is an ordinary, modifiable local variable
    int sum = a + b; 
    return sum;
}

May include control structures such as if, switch, for, while, and do-while loops (since C++14), provided they don't violate other constexpr constraints. static_assert, typedef, using, if constexpr (since C++17), and return are also allowed.

#include <iostream>

constexpr int factorial(int n) {
    int result = 1;
    for (int i = 1; i <= n; ++i) {
        result *= i;
    }
    return result;
}

int main() {
    constexpr auto a = factorial(5);
    return 0;
}

The generated assembly code (from an unoptimized build) confirms that variable a is evaluated at compile time: the value 120 appears as an immediate operand.

main:                                 
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], 0
        mov     dword ptr [rbp - 8], 120
        xor     eax, eax
        pop     rbp
        ret

When evaluated at compile time, can only call other constexpr functions.

constexpr int square(int x) {
    return x * x;
}

// Only call other constexpr functions
constexpr int square_sum(int a, int b) {
    return square(a) + square(b); 
}
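
The rule applies to the path that is actually evaluated. A constexpr function may contain an operation that is not allowed at compile time, such as a throw, as long as a compile-time call never reaches it. A minimal sketch, using a hypothetical isqrt helper that is not part of the original text:

```cpp
#include <stdexcept>

// Hypothetical helper: integer square root by linear search (C++14 or later).
constexpr int isqrt(int n) {
    if (n < 0) {
        // A throw cannot be evaluated in a constant expression, so reaching
        // this line during compile-time evaluation is a compile error.
        throw std::invalid_argument("isqrt: negative input");
    }
    int r = 0;
    while ((r + 1) * (r + 1) <= n) {
        ++r;
    }
    return r;
}

static_assert(isqrt(17) == 4, "floor(sqrt(17)) is 4");
// constexpr int bad = isqrt(-1); // does not compile: the throw is reached
```

Called with a runtime argument, the same function throws normally for negative input.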

Must produce constant expressions when called with constant expressions.

#include <iostream>

constexpr int power(int base, int exponent) {
    int result = 1;
    for (int i = 0; i < exponent; ++i) {
        result *= base;
    }
    return result;
}

int main() {
    constexpr auto b = power(2, 5);
    return 0;
}

The following assembly code confirms that no run time computation is performed when calculating power(2, 5).

main:
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], 0
        mov     dword ptr [rbp - 8], 32
        xor     eax, eax
        pop     rbp
        ret

Can modify objects whose lifetime began within the evaluation, such as local variables and by-value parameters (since C++14).

constexpr int next(int x)
{
    return ++x;
}

char buffer[next(5)] = { 0 };

Constructor

constexpr constructors in C++ are used to create constant expressions of user-defined types during compile-time. They are useful because they allow for more efficient code by performing computations at compile-time and enabling the usage of user-defined types in other constexpr contexts.

constexpr constructors were introduced in C++11, along with the general constexpr specifier.

Conditions (or constraints) for constexpr constructors:

The class must not have virtual base classes.
Every parameter must be of a literal type.
Every constructor selected to initialize a base class or a non-static data member must itself be a constexpr constructor.
Before C++20, every base class subobject and non-static data member must be initialized, and the body cannot be a function-try-block.
Copy and move constructors may also be constexpr; the implicitly generated ones are constexpr whenever they qualify.

Here's an example of a constexpr constructor:

class Point {
public:
    constexpr Point(int x, int y) : x_(x), y_(y) {
        // Since C++14, the body of a constexpr constructor can include
        // other constructs like if statements and loops, as long as they
        // meet the constexpr requirements.
        if (x_ < 0) { x_ = 0; }
        if (y_ < 0) { y_ = 0; }
    }

    constexpr int getX() const { return x_; }
    constexpr int getY() const { return y_; }

private:
    int x_;
    int y_;
};

int main() {
    constexpr Point p1(1, 2);
    constexpr int x = p1.getX();
    constexpr int y = p1.getY();
}

Member initializer

When defining a constexpr constructor, every initializer in the member initializer list must be evaluable at compile time whenever the constructor is called with constant arguments. In particular, the base class and member constructors that the list invokes must themselves be constexpr. This is required to guarantee that the object can be constructed as part of a constant expression.

Here's an example to illustrate this requirement:

class Base {
public:
    constexpr Base(int value) : value_(value) {}

private:
    int value_;
};

class Derived : public Base {
public:
    // Both initializers can be evaluated at compile time when the
    // arguments are constant expressions
    constexpr Derived(int baseValue, int derivedValue) 
        : Base(baseValue), derivedValue_(derivedValue) {}

private:
    int derivedValue_;
};

int main() {
    // Constructed as a constant expression during compile-time
    constexpr Derived d(1, 2); 
}

Destructor

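If a class has a constexpr constructor and is meant to be used in a constexpr context, then before C++20 the destructor must be trivial; otherwise the class is not a literal type. A trivial destructor does not perform any custom actions, allowing the object to be safely used in a constexpr context. Since C++20, a destructor can itself be declared constexpr, which lifts this restriction.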

A destructor is considered trivial if:

It is not user-provided (i.e., the compiler generates the destructor implicitly).

The destructor is not virtual (neither declared virtual nor inherited from a base class with a virtual destructor).

All direct base classes have trivial destructors.

For all non-static data members of the class that are of class type (or array thereof), each such class has a trivial destructor.

Here's an example of a class with a constexpr constructor and a trivial destructor:

class Point {
public:
    constexpr Point(int x, int y) : x_(x), y_(y) {}

    // Destructor is trivial (not user-provided and no custom actions)
    // ~Point() = default;

    constexpr int getX() const { return x_; }
    constexpr int getY() const { return y_; }

private:
    int x_;
    int y_;
};

int main() {
    constexpr Point p(1, 2);
}
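
Whether a destructor is trivial can be checked at compile time with the std::is_trivially_destructible type trait. The sketch below uses two hypothetical types, PointLike and Named, that are not part of the original example:

```cpp
#include <string>
#include <type_traits>

struct PointLike {   // hypothetical stand-in for a simple value class
    int x;
    int y;
};

struct Named {       // std::string has a non-trivial destructor
    std::string name;
};

static_assert(std::is_trivially_destructible<PointLike>::value,
              "only int members, so the implicit destructor is trivial");
static_assert(!std::is_trivially_destructible<Named>::value,
              "a member with a non-trivial destructor makes the class non-trivial");
```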

`constexpr` function returning `void`

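Since C++14, a constexpr function can have a return type of void, which is useful for performing a sequence of modifications at compile time. This applies to member functions as well. For example:

class MyClass {
public:
    constexpr void doSomething() {
        myData = 42; // Modify a non-static data member
    }

    constexpr int getMyData() const {
        return myData; // Return the current value of the data member
    }

private:
    int myData = 0; // Ordinary data member with a default initializer
};

constexpr MyClass makeObject() {
    MyClass obj;       // Non-const local object
    obj.doSomething(); // Evaluated at compile time when makeObject() is
    return obj;
}

int main() {
    constexpr MyClass obj = makeObject();
    static_assert(obj.getMyData() == 42, "Unexpected value of myData");
}

Note that constexpr void doSomething() is not qualified with const. Since C++14, constexpr no longer implies const for member functions, so doSomething() can modify the object. For the same reason it cannot be called on a constexpr object, which is const; it is called on the local object inside makeObject() instead.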
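
The same applies to free functions. As a small sketch (the names sort_pair and smaller_of are hypothetical), a constexpr function returning void can modify its arguments through references while another constexpr function is being evaluated at compile time:

```cpp
// Orders two values in place; has no return value (C++14 or later).
constexpr void sort_pair(int& lo, int& hi) {
    if (hi < lo) {
        int tmp = lo;
        lo = hi;
        hi = tmp;
    }
}

// Uses the void function on its own by-value parameters.
constexpr int smaller_of(int a, int b) {
    sort_pair(a, b);
    return a;
}

static_assert(smaller_of(7, 3) == 3, "the values are swapped before returning");
static_assert(smaller_of(2, 9) == 2, "already ordered, nothing to swap");
```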

Precision of floating-point `constexpr`

In C++11 and later, constexpr functions can compute floating-point expressions and return floating-point values as constant expressions.

One limitation concerns the standard math library. Before C++26, functions such as std::sin and std::sqrt are not declared constexpr by the standard, so portable code cannot call them in a constant expression. Some compilers, GCC in particular, accept such calls as a non-standard extension.

The standard imposes no restrictions on the accuracy of floating-point operations. As a consequence, it is unspecified whether a floating-point expression evaluated at compile time yields the same result as the same expression evaluated at run time.

In practice, compilers evaluate constexpr floating-point expressions using the target's floating-point format and the default rounding mode. Differences from runtime results can still arise, for example when the program changes the rounding mode at run time, when the hardware computes in extended-precision registers, or when the optimizer contracts a multiplication and an addition into a fused multiply-add.
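
When a math function is not available as constexpr, one common workaround is to write the computation by hand. The following is a minimal sketch of a square root using Newton's method; the name sqrt_newton, the iteration limit of 64, and the handling of non-positive input are assumptions made for illustration, not a replacement for std::sqrt:

```cpp
// Newton's method: repeatedly average the guess with x / guess (C++14 or later).
constexpr double sqrt_newton(double x) {
    if (x <= 0.0) {
        return 0.0; // assumption: non-positive input is mapped to 0 for brevity
    }
    double guess = x;
    for (int i = 0; i < 64; ++i) {
        double next = 0.5 * (guess + x / guess);
        if (next == guess) {
            break; // converged: the estimate no longer changes
        }
        guess = next;
    }
    return guess;
}

constexpr double root2 = sqrt_newton(2.0);
static_assert(root2 > 1.414213 && root2 < 1.414214, "sqrt(2) is about 1.4142135");
```

The assertion checks a range rather than an exact value, which is the safer habit for floating-point results computed at compile time.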

std::numeric_limits

std::numeric_limits is a class template defined in the C++ standard library that provides information about the properties of arithmetic types, such as minimum and maximum representable values, number of significant digits, and whether the type is signed or unsigned.

The std::numeric_limits class template has the following abridged synopsis (the initializers of the constants are omitted):

template<typename T>
class numeric_limits {
public:
    static constexpr bool is_specialized;
    static constexpr T min() noexcept;
    static constexpr T max() noexcept;
    static constexpr T lowest() noexcept;
    static constexpr int digits;
    static constexpr int digits10;
    static constexpr int max_digits10;
    static constexpr bool is_signed;
    static constexpr bool is_integer;
    static constexpr bool is_exact;
    static constexpr int radix;
    static constexpr T epsilon() noexcept;
    static constexpr T round_error() noexcept;
    static constexpr int min_exponent;
    static constexpr int min_exponent10;
    static constexpr int max_exponent;
    static constexpr int max_exponent10;
    static constexpr bool has_infinity;
    static constexpr bool has_quiet_NaN;
    static constexpr bool has_signaling_NaN;
    static constexpr float_denorm_style has_denorm;
    static constexpr bool has_denorm_loss;
    static constexpr T infinity() noexcept;
    static constexpr T quiet_NaN() noexcept;
    static constexpr T signaling_NaN() noexcept;
    static constexpr T denorm_min() noexcept;
};

The std::numeric_limits class template provides a set of static member functions and constants that can be used to query the properties of the template parameter type T. These functions and constants are all constexpr, which means that they can be evaluated at compile-time and used in constant expressions.

The constexpr specifier is useful for several reasons in the context of std::numeric_limits. For one, it allows the properties of a type to be determined at compile-time, which can be useful for optimization purposes. Additionally, it enables the use of these properties in other constexpr contexts, such as defining other constexpr functions or variables. This can help improve the efficiency and readability of code. For example:

#include <iostream>
#include <limits>

template<typename T>
constexpr bool is_power_of_two(T value) {
    return value != 0 && (value & (value - 1)) == 0;
}

template<typename T>
constexpr T next_power_of_two(T value) {
    static_assert(std::numeric_limits<T>::is_integer, "Type must be an integer");
    static_assert(std::numeric_limits<T>::is_signed == false, "Type must be unsigned");

    if (is_power_of_two(value)) {
        return value;
    } else {
        T result = 1;
        while (result < value) {
            result <<= 1;
        }
        return result;
    }
}

int main() {
    constexpr unsigned int x = 31;
    constexpr auto y = next_power_of_two(x);
    std::cout << "The next power of two after " << x << " is " << y << '\n';
    return 0;
}
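
Another typical use is a compile-time range check. The sketch below (the function name add_overflows is hypothetical) uses std::numeric_limits<T>::max() and min() to decide whether an addition would overflow, without performing the overflowing addition itself:

```cpp
#include <limits>

// Returns true if a + b cannot be represented in T.
template <typename T>
constexpr bool add_overflows(T a, T b) {
    static_assert(std::numeric_limits<T>::is_integer, "Type must be an integer");
    return (b > 0 && a > std::numeric_limits<T>::max() - b) ||
           (b < 0 && a < std::numeric_limits<T>::min() - b);
}

static_assert(add_overflows(std::numeric_limits<int>::max(), 1),
              "INT_MAX + 1 is not representable");
static_assert(!add_overflows(1, 2), "1 + 2 is fine");
```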

`constexpr` math functions

Contrary to a common assumption, C++20 did not make the <cmath> functions constexpr. The standard added constexpr to them in two steps: C++23 covers basic operations such as abs, ceil, floor, fmod, and fma, and C++26 adds the transcendental functions such as sqrt, pow, exp, log, and sin. Some compilers, GCC in particular, have long evaluated many of these functions at compile time as a non-standard extension, so code that relies on them may compile with one toolchain and fail with another.

The main advantage of constexpr math functions is that calculations can happen at compile time rather than at run time. The results can then be used in places where a constant expression is needed, such as array sizes and template arguments, and the compiler can optimize the surrounding code based on the known values.

Here are some important points to remember about constexpr math functions:

Only a subset of the <cmath> functions is constexpr in any given standard version. The remaining functions can only be evaluated at run time.
The arguments provided to a constexpr function must be constant expressions themselves. Otherwise, the function call will not be evaluated at compile time.
constexpr math functions are subject to the same floating-point rounding and accuracy limitations as their runtime counterparts. Be aware of potential floating-point inaccuracies when using them in calculations.
Standard library support is still incomplete in some implementations. Check the documentation of the compiler and standard library being used to ensure that the specific functions are supported.

Here is a list of selected math functions, sorted by name, with the standard version in which each became constexpr. The list is not exhaustive. Entries marked "Non-standard" are POSIX or vendor extensions that are not part of namespace std. Note that min and max are declared in <algorithm>, and div in <cstdlib>, rather than in <cmath>.

Function	Description	constexpr since
`abs`	Absolute value	C++23
`acos`	Arc cosine function	C++26
`acosh`	Inverse hyperbolic cosine function	C++26
`asin`	Arc sine function	C++26
`asinh`	Inverse hyperbolic sine function	C++26
`atan`	Arc tangent function	C++26
`atan2`	Arc tangent function with two parameters	C++26
`atanh`	Inverse hyperbolic tangent function	C++26
`cbrt`	Cube root	C++26
`ceil`	Ceiling function	C++23
`copysign`	Copy sign of a number	C++23
`cos`	Cosine function	C++26
`cosh`	Hyperbolic cosine function	C++26
`div`	Integral division (<cstdlib>)	C++23
`drem`	Obsolete; use remainder instead	Non-standard
`erf`	Error function	C++26
`erfc`	Complementary error function	C++26
`exp`	Exponential function	C++26
`exp2`	Base-2 exponential function	C++26
`expm1`	Exponential function minus 1	C++26
`fdim`	Positive difference	C++23
`floor`	Floor function	C++23
`fma`	Fused multiply-add	C++23
`fmax`	Maximum of two floating-point values	C++23
`fmin`	Minimum of two floating-point values	C++23
`fmod`	Floating-point remainder (modulo)	C++23
`frexp`	Break floating-point number into fraction and exponent	C++23
`gamma`	Obsolete; use lgamma or tgamma instead	Non-standard
`gamma_r`	Obsolete; use lgamma instead	Non-standard
`hypot`	Hypotenuse	C++26
`ilogb`	Extract the exponent as an integer	C++23
`j0`	Bessel function of the first kind of order 0	Non-standard
`j1`	Bessel function of the first kind of order 1	Non-standard
`jn`	Bessel function of the first kind of order n	Non-standard
`ldexp`	Multiply by integral power of 2	C++23
`lgamma`	Natural logarithm of the absolute value of the gamma function	C++26
`llrint`	Round to long long integral value in current rounding mode	Not constexpr in C++23
`llround`	Round to nearest long long integer	C++23
`log`	Natural logarithm	C++26
`log10`	Base-10 logarithm	C++26
`log1p`	Natural logarithm of 1 plus argument	C++26
`log2`	Base-2 logarithm	C++26
`logb`	Extract the exponent as a floating-point value	C++23
`lrint`	Round to long integral value in current rounding mode	Not constexpr in C++23
`lround`	Round to nearest long integer	C++23
`max`	Maximum of two values (<algorithm>)	C++14
`min`	Minimum of two values (<algorithm>)	C++14
`modf`	Decompose a floating-point number into its integer and fractional parts	C++23
`nan`	Generate quiet NaN	Not constexpr in C++23
`nearbyint`	Round to integral value in current rounding mode	Not constexpr in C++23
`nextafter`	Next representable floating-point value	C++23
`nexttoward`	Next representable floating-point value toward a long double	C++23
`pow`	Power function	C++26
`remainder`	Remainder of the floating-point division	C++23
`remquo`	Remainder and quotient of the floating-point division	C++23
`rint`	Round to integral value in current rounding mode	Not constexpr in C++23
`round`	Round to nearest integer	C++23
`scalb`	Obsolete; use scalbn or scalbln instead	Non-standard
`scalbln`	Scale floating-point number by a power of FLT_RADIX as a long integer	C++23
`scalbn`	Scale floating-point number by a power of FLT_RADIX	C++23
`significand`	Get the significand of a floating-point number	Non-standard
`sin`	Sine function	C++26
`sinh`	Hyperbolic sine function	C++26
`sqrt`	Square root	C++26
`tan`	Tangent function	C++26
`tanh`	Hyperbolic tangent function	C++26
`tgamma`	Gamma function	C++26
`trunc`	Truncate function	C++23
`y0`	Bessel function of the second kind of order 0	Non-standard
`y1`	Bessel function of the second kind of order 1	Non-standard
`yn`	Bessel function of the second kind of order n	Non-standard
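
Because library support varies, portable code can test for it before relying on a constexpr math function. The sketch below checks the standard feature-test macro __cpp_lib_constexpr_cmath and falls back to a hand-written implementation otherwise; the wrapper name abs_portable is hypothetical:

```cpp
#include <cmath>

constexpr double abs_portable(double x) {
#if defined(__cpp_lib_constexpr_cmath)
    return std::abs(x);      // the standard library declares it constexpr
#else
    return x < 0 ? -x : x;   // hand-written fallback for older libraries
#endif
}

static_assert(abs_portable(-2.5) == 2.5, "absolute value computed at compile time");
```

The fallback is deliberately simple and does not treat negative zero specially.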

Since C++17, lambda expressions can be used in constant expressions. This feature enables developers to perform computations at compile time with locally defined functions, reducing runtime overhead in certain cases.

Lambda expressions are anonymous functions that can be defined and used within code. They have the following general syntax:

[capture](parameters) -> return_type { function_body }

Using lambda as `constexpr` in C++17:

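Since C++17, a lambda's function call operator is implicitly constexpr whenever it satisfies the requirements of a constexpr function, and it can also be marked constexpr explicitly. Such lambdas can be used in constant expressions, as long as the lambda body and its captures are constexpr-compatible. Here's an example to illustrate this: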

#include <iostream>

int main() {
    constexpr auto square = [](int x) {
        return x * x;
    };

    constexpr int result = square(5);
    static_assert(result == 25, "Square of 5 should be 25");

    std::cout << "Square of 5: " << result << std::endl;
    return 0;
}

Benefits of using lambda as `constexpr`

Compile-time computation: Using constexpr lambdas can shift computation from runtime to compile-time, potentially improving performance for computationally expensive operations.
Readability and expressiveness: By using lambdas, one can write more expressive and readable code, as functions can be defined and used in-place, right where they are needed.
Type inference: Lambdas can deduce the return type automatically, making the code shorter and easier to understand.
Better optimization: When the lambda is evaluated at compile time, its result becomes a constant that the compiler can fold into the surrounding code.
Enhanced safety: Initializing a constexpr variable or a static_assert from the lambda forces compile-time evaluation, where undefined behavior such as signed integer overflow is reported as a compile error. The lambda itself remains usable at run time.

Runtime degrading

A constexpr lambda can degrade into a runtime lambda when it's used in a context that doesn't require a constant expression or when it doesn't meet the requirements for a constexpr function. In such cases, the lambda will be evaluated at runtime instead of compile-time.

Here are some conditions that can cause a constexpr lambda to degrade into a runtime lambda:

Non-constexpr parameters or captures: If the lambda captures or accepts non-constexpr variables as parameters, the lambda will not be able to be evaluated at compile-time. For example:

int non_const_var = 10;
auto lambda = [non_const_var](int x) {
    return x * non_const_var;
};
int result = lambda(5); // This will be evaluated at runtime

Non-constexpr expressions in the lambda body: If the lambda body contains expressions that cannot be evaluated at compile-time, the lambda will not be constexpr. For example:

#include <iostream>
#include <cmath>

constexpr auto sqrt_lambda = [](double x) {
    return std::sqrt(x); // std::sqrt is not constexpr in standard C++ before C++26
};

int main() {
    double result = sqrt_lambda(25.0); // This will be evaluated at runtime
    std::cout << "Square root of 25: " << result << std::endl;
    return 0;
}

Using the lambda in a non-constexpr context: Even if the lambda itself is constexpr, if it is used in a context that doesn't require a constant expression, it will be evaluated at runtime. For example:

#include <iostream>

constexpr auto square = [](int x) {
    return x * x;
};

int main() {
    int input = 0;
    std::cout << "Enter an integer: ";
    std::cin >> input;

    int result = square(input); // This will be evaluated at runtime
    std::cout << "Square of " << input << ": " << result << std::endl;
    return 0;
}

In this example, although the square lambda is constexpr, it is used with a runtime input value, so it's evaluated at runtime.

When a constexpr lambda degrades into a runtime lambda, it doesn't cause any errors or warnings. It simply means that the lambda is evaluated at runtime, and the performance advantages and safety guarantees of a constexpr lambda are not achieved.
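
To rule out silent degradation, the result can be bound to a constexpr variable or checked with static_assert; both require a constant expression. A short sketch, assuming C++17 (the names add_offset, forced, and square_at_runtime are made up for illustration):

```cpp
constexpr auto square = [](int x) { return x * x; };

// A capture is fine as long as its initializer is a constant expression.
constexpr auto add_offset = [offset = 10](int x) { return x + offset; };

// Both of these fail to compile if the lambda cannot run at compile time.
constexpr int forced = square(7);
static_assert(forced == 49, "square(7) evaluated at compile time");
static_assert(add_offset(5) == 15, "captured constant is usable at compile time");

// The same lambda still works with a runtime argument.
int square_at_runtime(int input) {
    return square(input);
}
```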

Inlining constexpr

In C++17, a constexpr static data member is implicitly inline. This means that the static data member has the same address in every translation unit that uses it, and there is no need to provide a separate definition for the data member in a source file.

The following example can produce a linker error before C++17:

// MyClass.h
class MyClass {
public:
    static constexpr int myConstExpr = 42;
};

// main.cpp
#include "MyClass.h"
#include <iostream>

void printAddress(const int *ptr);

int main() {
    // Taking the address of myConstExpr, this requires a definition.
    printAddress(&MyClass::myConstExpr); 
    return 0;
}

void printAddress(const int *ptr) {
    std::cout << "Address of myConstExpr: " << ptr << std::endl;
}

In this case, the address of MyClass::myConstExpr is required, so a separate definition is needed in a source file for pre-C++17:

// MyClass.cpp (pre-C++17)
#include "MyClass.h"

// Definition in source file required to avoid linker errors
const int MyClass::myConstExpr;

However, in C++17, the separate definition is not necessary, as the constexpr static data member is implicitly inlined:

// MyClass.h (C++17)
class MyClass {
public:
    // Automatically inlined, no separate definition required
    static constexpr int myConstExpr = 42; 
};

The following code typically does not produce a linker error before C++17. This is because the statement std::cout << "Value of myConstExpr: " << MyClass::myConstExpr << std::endl; only reads the value of MyClass::myConstExpr, so the compiler replaces it directly with 42. The member is not odr-used (its address is never needed), hence no linker error.

// MyClass.h
class MyClass {
public:
    static constexpr int myConstExpr = 42;
};

// main.cpp
#include "MyClass.h"
#include <iostream>

int main() {
    std::cout << "Value of myConstExpr: " << MyClass::myConstExpr << std::endl;
    return 0;
}
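
Taking the address is not the only way to require a definition. Binding the member to a reference has the same effect, and that happens implicitly with functions such as std::min, which take their arguments by const reference. A sketch with a hypothetical Limits class:

```cpp
#include <algorithm>

class Limits {
public:
    static constexpr int kMax = 42;
};

// std::min takes const int&, so Limits::kMax is bound to a reference here.
// Before C++17, a runtime call of this function needs the out-of-class
// definition `constexpr int Limits::kMax;` in exactly one source file.
// Since C++17 the member is implicitly inline and no definition is needed.
// The function is constexpr only so that it can be checked with static_assert.
constexpr int clamp_to_max(int v) {
    return std::min(v, Limits::kMax);
}

static_assert(clamp_to_max(100) == 42, "values above the limit are clamped");
static_assert(clamp_to_max(7) == 7, "values below the limit are unchanged");
```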

Conditional Compilation

`if constexpr` and `#if`

C++'s if constexpr is not directly intended to replace conditional defines (e.g., #ifdef or #if). While they serve somewhat similar purposes, they have different use cases and operate at different stages of the compilation process.

#ifdef and #if are preprocessor directives in C++ that allow conditional compilation. They operate at the preprocessing stage, which occurs before the actual compilation. Conditional defines are typically used to conditionally include or exclude sections of code based on compile-time conditions or macros.

On the other hand, if constexpr is a feature introduced in C++17 that selects between branches based on a compile-time constant expression. It can appear in any function body, but it is most useful inside templates. It is part of the regular C++ code and is evaluated during the compilation process, not the preprocessing stage, so its condition can depend on types and template arguments, which the preprocessor cannot see.

Here's an example to illustrate the difference:

#include <iostream>

#define USE_FEATURE

void doSomething() {
#ifdef USE_FEATURE
    std::cout << "Feature is enabled." << std::endl;
#else
    std::cout << "Feature is disabled." << std::endl;
#endif
}

template <bool UseFeature>
void doSomethingTemplate() {
    if constexpr (UseFeature) {
        std::cout << "Feature is enabled." << std::endl;
    } else {
        std::cout << "Feature is disabled." << std::endl;
    }
}

int main() {
    doSomething();  // Output depends on the USE_FEATURE macro.

    doSomethingTemplate<true>();  // Output depends on the template argument.
    doSomethingTemplate<false>();

    return 0;
}

In this example, doSomething() uses a conditional define to determine which section of code to compile based on the USE_FEATURE macro. On the other hand, doSomethingTemplate() is a function template that utilizes if constexpr to conditionally choose between different code branches at compile time based on the template argument.

While if constexpr can sometimes be used to achieve similar conditional behavior as conditional defines, their usage and capabilities are different. Conditional defines are more flexible and can be controlled externally via macros or command-line options, while if constexpr operates within the confines of the C++ code and allows compile-time decision making based on template arguments or constexpr conditions.
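
A condition on a type is something only if constexpr can express, because the preprocessor runs before types exist. The following sketch (the function name element_count is hypothetical) returns the number of elements for built-in arrays and 1 for everything else:

```cpp
#include <cstddef>
#include <type_traits>

template <typename T>
constexpr std::size_t element_count(const T&) {
    if constexpr (std::is_array_v<T>) {
        return std::extent_v<T>;   // number of elements of a built-in array
    } else {
        return 1;                  // any non-array object counts as one element
    }
}

constexpr int values[4] = {1, 2, 3, 4};
constexpr double single = 2.5;

static_assert(element_count(values) == 4, "T is deduced as int[4]");
static_assert(element_count(single) == 1, "T is deduced as double");
```

There is no #if equivalent of std::is_array_v<T>; a macro-based version would need a separate function name for each case.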

Short-circuit behavior

In a regular if statement, && and || short-circuit at run time, and every operand only has to compile. In an if constexpr statement the condition is a constant expression: its evaluation still short-circuits, but every operand must be well-formed for the given template arguments, because the whole condition is instantiated before it is evaluated. The condition also cannot depend on runtime values such as function parameters.

In this example, T::is_special is a hypothetical static member:

template <typename T>
void foo(T value) {
    // T::is_special is instantiated even when std::is_class_v<T> is false
    if constexpr (std::is_class_v<T> && T::is_special) {
        // Code specific to class types that opt in
        // ...
    } else {
        // Code for other cases
        // ...
    }
}

Both std::is_class_v<T> and T::is_special are instantiated during compilation, regardless of the value of the first operand. Calling foo(42) therefore fails to compile, because int::is_special is not a valid expression, even though std::is_class_v<int> is already false.
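
One way to keep a type-dependent test from being instantiated for unsuitable types is to nest if constexpr statements, so that the inner condition lives in a branch that is discarded. A sketch with hypothetical types Tagged and Plain and a hypothetical static member enabled:

```cpp
#include <type_traits>

struct Tagged { static constexpr bool enabled = true; };
struct Plain  { static constexpr bool enabled = false; };

template <typename T>
constexpr bool is_enabled_class() {
    // Writing `std::is_class_v<T> && T::enabled` as a single condition would
    // fail to compile for T = int, because T::enabled is still instantiated.
    if constexpr (std::is_class_v<T>) {
        if constexpr (T::enabled) {   // only instantiated when T is a class
            return true;
        } else {
            return false;
        }
    } else {
        return false;
    }
}

static_assert(is_enabled_class<Tagged>(), "class type that opts in");
static_assert(!is_enabled_class<Plain>(), "class type that opts out");
static_assert(!is_enabled_class<int>(), "non-class type never touches T::enabled");
```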

Branch elimination

In an if constexpr statement, the condition is evaluated at compile time, and the branch that is not taken becomes a discarded statement. Inside a template, a discarded statement is not instantiated, so code that would be invalid for the current template arguments does not cause an error. The discarded branch is still parsed, however, so it must be syntactically valid, and outside a template it is fully checked.

This compile-time evaluation and branch elimination make if constexpr useful for selecting code based on compile-time conditions.

Because the discarded branch is never instantiated, the compiler generates no object code for it, which can reduce the size of the resulting binary executable.

Always provide `else` branch

It is generally a good practice to provide an else branch or alternative handling for all possible cases in an if constexpr statement, so that unsupported cases are rejected at compile time with a clear message.

#include <cmath>
#include <type_traits>

// Helper whose value depends on T, so the static_assert below is only
// evaluated when the final else branch is actually instantiated.
template<class T>
struct always_false : std::false_type {};

template<class T>
auto subtract(T a, T b) {
    if constexpr (std::is_same<T, double>::value) {
        if (std::abs(a - b) < 0.0001) {
            return 0.0;
        } else {
            return a - b;
        }
    } else if constexpr (std::is_integral<T>::value) {
        return a - b;
    } else {
        static_assert(always_false<T>::value, "Non-handled type for subtract function");
    }
}

In this code, both double and integral types are explicitly handled. If a type is used that is neither double nor an integral type, the static_assert will trigger a compile-time error with a clear message, which is generally preferable to a more obscure error about invalid operations. This is a more defensive programming strategy that makes sure all potential types are handled.

`constexpr` virtual method

In C++20, virtual methods can be declared as constexpr, which allows them to be called, including through virtual dispatch, during constant evaluation.

Note - constexpr is not what removes the call overhead in ordinary runtime code. That comes from devirtualization and inlining, which a compiler can apply when the dynamic type of the object is known at compile time.

Consider an example where the base class has a non-constexpr virtual method, but the derived class overrides it as constexpr:

class Base {
public:
    virtual int getValue() { return 42; }
};

class Derived : public Base {
public:
    constexpr int getValue() override { return 10; }
};

Suppose an object of the derived class with the static type known at compile time:

Derived der = Derived();
int value = der.getValue();

With optimizations enabled, the call to getValue can be devirtualized and inlined, because der is known to be a Derived. The resulting assembly code might resemble the following:

mov DWORD PTR [ebp-4], 10

This assembly code demonstrates a direct assignment of the constant value 10 to the variable value without any function call involved. Because value is not a constexpr variable, this is an optimization rather than a guaranteed compile-time evaluation.

The specific optimization and resulting assembly code vary depending on the compiler and the flags used. To guarantee compile-time evaluation, the call has to appear in a context that requires a constant expression, such as the initializer of a constexpr variable or a static_assert.

try-catch

In C++20, the language standard introduced the ability to write try-catch blocks inside constexpr functions. Prior to C++20, a try block anywhere in a constexpr function made the function ill-formed, even if no exception could ever be thrown during constant evaluation.

The relaxation is narrower than it may appear. A try block may now appear in a constexpr function, but in C++20 and C++23 a throw expression still cannot be evaluated during constant evaluation. At compile time the try block therefore behaves like a plain block, and its handlers never run. If evaluation reaches a throw, the call is not a constant expression, and the compiler reports an error wherever a constant expression is required.

The primary motivation is to let one function serve both compile-time and runtime callers: code that needs exception handling at run time, such as member functions of standard containers, can be marked constexpr without being rewritten.

Here's an example that demonstrates the usage of try-catch blocks inside a constexpr function:

#include <stdexcept>

constexpr int divide(int a, int b) {
    try {
        if (b == 0) {
            // Integer division by zero is undefined behavior and does not
            // throw by itself, so the error is reported explicitly.
            throw std::invalid_argument("division by zero");
        }
        return a / b;
    } catch (const std::invalid_argument&) {
        return 0; // fallback value, reachable only at run time
    }
}

int main() {
    constexpr int result = divide(10, 2);
    static_assert(result == 5, "Division failed at compile time!");
    // constexpr int bad = divide(10, 0); // error: the throw is evaluated
    return 0;
}

In the above example, the divide function checks the divisor and throws if it is zero. At run time, divide(10, 0) throws, the handler catches the exception, and the function returns the fallback value of 0. At compile time, divide(10, 0) is rejected by the compiler instead.

It's important to note a few caveats and considerations when using try-catch blocks in constexpr functions:

Handlers never run during constant evaluation. A compile-time call either completes without throwing or is not a constant expression.
For the same reason, an exception cannot propagate out of a constant evaluation to the calling context. It turns into a compile error.
Dynamic memory allocation is a separate C++20 feature: new and delete are allowed during constant evaluation, as long as the memory is released before the evaluation ends. malloc and free are not constexpr.

Overall, the addition of try-catch blocks in constexpr functions in C++20 makes it possible to share exception-handling code between compile-time and runtime use, while errors during constant evaluation are still reported by the compiler.

Default Initialization of `constexpr` Objects

C++20 permits trivial default initialization in constexpr contexts: a local variable in a constexpr function may be declared without an initializer, as long as its value is not read before it has been assigned.

Here is an example that demonstrates the usage of trivial default construction in a constexpr function:

struct X {
    bool val;
};

constexpr void f() {
    X x;
}

The above code compiles only in C++20 or later. C++17 requires every variable defined in a constexpr function to be initialized at its definition. Here's the C++17 form, which initializes the object explicitly:

struct X {
    bool val;
};

constexpr void f() {
    X x{true}; // Explicit initialization required in C++17
}

The following example demonstrates the usage of trivial default construction in a more practical scenario:

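#include <array>
#include <cstddef>

constexpr std::array<int, 5> createArray() {
    std::array<int, 5> arr;  // no initializer: allowed in a constexpr function since C++20
    for (std::size_t i = 0; i < arr.size(); ++i) {
        arr[i] = static_cast<int>(i * i);  // every element is assigned before it is read
    }
    return arr;
}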

int main() {
    constexpr std::array<int, 5> result = createArray();
    // Use the constexpr array at compile time
    static_assert(result[2] == 4, "Unexpected value at compile time!");
    return 0;
}

In this example, the constexpr function createArray creates an array of integers and assigns values to its elements using a loop. The array arr needs no initializer because std::array<int, 5> is an aggregate with a trivial default constructor; default-initializing it leaves the elements indeterminate, and the loop assigns every element before any of them is read. The function returns the resulting array, which can then be used at compile time.

By allowing trivial default initialization in constexpr functions, C++20 removes the need for dummy initializers and makes constexpr code more concise. It is particularly useful for trivial types and for buffers whose elements are assigned before they are read.

`consteval` and `constinit`

`consteval`

The consteval keyword was introduced in C++20 to declare immediate functions. Every call to a consteval function must produce a compile-time constant; unlike a constexpr function, it can never be called at runtime.

A consteval function must satisfy the same requirements as a constexpr function; in particular, its parameter types and return type must be literal types. In addition, each call must be a constant expression, so its arguments must be known at compile time. If these requirements are not met, the compiler reports an error.

Here's an example of a consteval function:

consteval int square(int x) {
    return x * x;
}

Difference between `consteval` and `constexpr`

constexpr int add(int x, int y) {
    return x + y;
}

consteval int multiply(int x, int y) {
    return x * y;
}

int main() {
    constexpr int result1 = add(3, 4);        // Evaluates at compile-time
    constexpr int result2 = multiply(5, 6);   // Always evaluates at compile-time

    int x = 2, y = 3;
    int result3 = add(x, y);                  // Evaluates at runtime
    // int result4 = multiply(x, y);          // Error: x and y are not constant expressions

    return 0;
}

In the code above, the add function is declared as constexpr, allowing it to be evaluated at both compile-time and runtime. The multiply function is declared as consteval, ensuring that it is evaluated strictly at compile-time within constant expressions.

`constinit`

The constinit specifier was introduced in C++20 and applies to variables with static or thread storage duration. A variable marked constinit must be constant-initialized: its initializer must be a constant expression, and the initialization happens during static initialization, before any dynamic initialization runs. If the variable would require dynamic (runtime) initialization, the program is ill-formed.

constinit cannot be combined with constexpr: a constexpr variable is already constant-initialized and is additionally const. consteval applies only to functions, so it cannot be combined with constinit either.
constinit forces constant initialization of static or thread-local variables. It helps to avoid the static initialization order fiasco, because a constant-initialized variable already holds its value before any dynamic initializer in any translation unit runs.
constinit does not make the object immutable, and a constinit variable cannot, in general, be used in constant expressions.

#include <array>

// init at compile time
constexpr int compute(int v) { return v*v*v; }
constinit int global = compute(10);

// won't work:
// constinit int another = global;

int main() {
    // but allow to change later...
    global = 100;

    // global is not constant!
    // std::array<int, global> arr;
}

Compiler output for main (x86-64): the only access to global is the runtime store of 100 (0x64).

main:
 push   rbp
 mov    rbp,rsp
 mov    DWORD PTR [rip+0x2efc],0x64        # 404010 <global>
 mov    eax,0x0
 pop    rbp
 ret
 nop    DWORD PTR [rax+rax*1+0x0]

The following table summarizes all const specifiers (credit: Bartłomiej Filipek):

Keyword	On Auto Variables	On Static/Thread_Local Variables	On Functions	On Constant Expressions
`const`	Yes	Yes	As const member functions	Sometimes
`constexpr`	Yes or Implicit (in constexpr functions)	Yes	To indicate constexpr functions	Yes
`consteval`	No	No	To indicate consteval functions	Yes (as a result of a function call)
`constinit`	No	To force constant initialization	No	No, a `constinit` variable is not a constexpr variable

`std::is_constant_evaluated`

The std::is_constant_evaluated function was introduced in C++20 in the <type_traits> header. It returns true when the call occurs during an evaluation that is manifestly constant-evaluated, such as the initializer of a constexpr variable or a static_assert condition, and false otherwise. This lets a function behave differently during compile-time evaluation and at runtime.

The typical motivation is to select an implementation per context: a constexpr-friendly algorithm for constant evaluation, and a faster or runtime-only alternative (for example, a library call that is not constexpr) for ordinary execution. Both paths should normally produce the same result.

Here's an example that demonstrates the usage of std::is_constant_evaluated:

#include <iostream>
#include <type_traits>  // std::is_constant_evaluated

void printEvaluationContext() {
    if (std::is_constant_evaluated()) {
        std::cout << "Constant expression evaluation" << std::endl;
    } else {
        std::cout << "Runtime execution" << std::endl;
    }
}

constexpr int doubleValue(int value) {
    if (std::is_constant_evaluated()) {
        return value * 2;  // Constant expression evaluation
    } else {
        std::cout << "Runtime evaluation" << std::endl;
        return value;      // Runtime execution
    }
}

int main() {
    printEvaluationContext();

    constexpr int result1 = doubleValue(10);
    std::cout << "Result 1: " << result1 << std::endl;

    int value = 20;
    int result2 = doubleValue(value);
    std::cout << "Result 2: " << result2 << std::endl;

    return 0;
}

Other C++20 Enhancements

C++20 made several further enhancements to constexpr: the active member of a union can be changed during constant evaluation, dynamic_cast and polymorphic typeid can be used in constant expressions, and inline assembly may appear inside a constexpr function as long as it is not evaluated at compile time.

Modifying members of a union in constexpr

In earlier versions of C++, changing the active member of a union during constant evaluation was not allowed. Starting with C++20, it is possible. Reading a member other than the active one is still undefined behavior and is rejected at compile time. Here's an example that demonstrates this:

#include <iostream>

union MyUnion {
    int i;
    float f;
};

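constexpr int modifyUnionMember(int value) {
    MyUnion u{value};                 // u.i is the active member
    const int saved = u.i;            // read i while it is still active
    u.f = static_cast<float>(saved);  // C++20: the assignment makes f the active member
    return static_cast<int>(u.f);     // reading f is fine; reading u.i here would not be
}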

int main() {
    constexpr int modifiedValue = modifyUnionMember(42);
    std::cout << "Modified value: " << modifiedValue << std::endl;
    return 0;
}

`dynamic_cast` and `typeid` within `constexpr`

C++20 also introduced the ability to use dynamic_cast and polymorphic typeid within constexpr functions. This allows dynamic type checks and type information retrieval during compile-time evaluation, provided the objects involved are themselves usable in constant expressions. Here's an example:

#include <iostream>
#include <typeinfo>

struct Base {
    constexpr virtual ~Base() = default;  // constexpr destructor keeps the type literal (C++20)
};

struct Derived : Base {};

constexpr bool isDerivedFromBase(const Base* obj) {
    return dynamic_cast<const Derived*>(obj) != nullptr;
}

constexpr const std::type_info& getTypeInfo(const Base* obj) {
    return typeid(*obj);
}

int main() {
    // An object with static storage duration: its address is a constant expression.
    // (The result of new cannot be used to initialize a constexpr pointer.)
    static constexpr Derived derived{};
    constexpr const Base* basePtr = &derived;
    constexpr bool isDerived = isDerivedFromBase(basePtr);
    constexpr const std::type_info& typeInfo = getTypeInfo(basePtr);

    std::cout << "Is Derived from Base? " << isDerived << std::endl;
    std::cout << "Type info: " << typeInfo.name() << std::endl;

    return 0;
}

Inlined assembly within `constexpr`

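C++20 also allows an inline assembly statement to appear inside a constexpr function. It cannot be executed during constant evaluation: if compile-time evaluation reaches the asm statement, the call is not a constant expression. The function therefore needs a separate compile-time path. Here's an example, using GCC/Clang extended asm syntax for x86: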

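#include <iostream>
#include <type_traits>

constexpr int addNumbersInlineAssembly(int a, int b) {
    if (std::is_constant_evaluated()) {
        return a + b;  // compile-time path: the asm statement must not be reached
    }
    int result = a;
    asm("add %[b], %[result]"  // runtime path: result += b
        : [result] "+r" (result)
        : [b] "r" (b)
    );
    return result;
}

int main() {
    constexpr int sum = addNumbersInlineAssembly(10, 20);
    std::cout << "Sum: " << sum << std::endl;
    return 0;
}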

Modern C++ Type Deduction Mechanisms

Type deduction allows the compiler to infer types automatically, making code more concise, expressive, and easier to maintain.

From C++11 through C++20, the language has introduced a variety of type deduction capabilities.

This chapter provides a comprehensive overview of modern C++ type deduction. It also highlights best practices and rules to help developers use auto, decltype, and related features effectively.

Mastering type deduction is essential for writing clean, robust, and modern C++ code.

Introduction to Type Deduction

What is Type Deduction?

The compiler uses type deduction to determine the type of a variable or return value automatically. It eliminates the need for explicitly specifying types.

The most common interface to type deduction is the auto keyword, introduced in C++11. The compiler infers the type from the initializer:

auto x = 42; // x is deduced to be int

The utility of type deduction goes beyond simplifying type declarations. In modern C++, it plays a significant role in templates, decltype, structured bindings, and function return types as well.

Why Use Type Deduction?

1. Prevents Type Mismatches and Narrowing Conversions

Using auto ensures the deduced type exactly matches the initializer, avoiding silent type conversions or loss of precision.

int x = 0.1;     // Compiles, but silently truncates to 0
auto y = 0.1;    // y is double — no truncation

2. Encourages Consistent Initialization

Because auto requires initialization, it reduces the chance of uninitialized variables:

auto value;         // ❌ Error — must be initialized
int value;          // ✅ Legal, but uninitialized!

3. Improves Code Maintainability

If the return type of a function or container changes, auto adapts automatically:

auto result = myMap.find(key);
// No need to know if it's an iterator or const_iterator

This makes code more resilient to API or type changes.

4. Simplifies Opaque and Long-Name Types

auto comp = [](const std::pair<int, int>& a, const std::pair<int, int>& b) {
    return a.second > b.second;
};

// Creating a priority queue of std::pair<int, int> elements, where:
//  - underlying container is a std::vector<std::pair<int, int>>
//  - comparison function is a custom lambda stored in comp
std::priority_queue<std::pair<int, int>, std::vector<std::pair<int, int>>, decltype(comp)> pq(comp);

With C++17’s class template argument deduction (CTAD), the same declaration becomes more concise, using auto:

auto pq = std::priority_queue{
    comp,                                // the comparator comes first,
    std::vector<std::pair<int, int>>{}   // then the underlying container
};

auto helps avoid repeating long type names:

std::unordered_map<std::string, std::vector<int>>::iterator it = map.begin();
// becomes
auto it = map.begin();

5. Enables Modern C++ Idioms (C++11–C++23)

auto is fundamental for the following idioms:

Range-based for loops

for (auto& value : container) {
    // Clean and safe iteration
}

Structured bindings

for (auto& [key, value] : myMap) { ... }

Lambdas and closures

auto adder = [](int a, int b) { return a + b; };
std::cout << adder(2, 3); // 5

Trailing return types

template<typename T, typename U>
auto add(T t, U u) -> decltype(t + u) {
    return t + u;
}

decltype(auto) for perfect forwarding

template<typename T>
decltype(auto) forwardValue(T&& val) {
    return std::forward<T>(val);
}

Type Deduction Mechanisms

The table below summarizes C++ type deduction features and their respective introductions into the language standard:

Mechanism	Keyword(s)	Description	Introduced
Function template deduction	template parameters	Deduces template types from function arguments	C++98
Auto type deduction	`auto`	Deduces type from initializer	C++11
Exact expression type	`decltype`, `decltype(auto)`	Queries the exact type of an expression (w/o evaluating)	C++11/14
Return type deduction	`auto`, `decltype(auto)`	Deduces function return type	C++14
Lambda parameter deduction	`auto` in lambda	Deduces parameter types in generic lambdas	C++14
Structured bindings	`auto` with `[ ]`	Unpacks structured types like tuples	C++17
Class template arg deduction	CTAD	Deduces template types from constructor args	C++17
Non-type template deduction	`auto`	Deduces the type of a non-type template parameter from its argument	C++17
Abbreviated function templates	`auto` in function param	Template parameter deduction in normal function syntax	C++20
Constrained deduction	Concepts + `auto`	Adds semantic constraints to type deduction	C++20
Compile-time enforcement	`consteval`, `constinit`	Not deduction mechanisms themselves; require compile-time evaluation or initialization and combine with `auto`	C++20

Examples

Auto Type Deduction

int i = 42;
auto x = i;   // x is deduced as int

The auto keyword causes the compiler to deduce x as int, based on the initializer.

Decltype Type Query

int i = 42;
decltype(i) y = i;   // y is also int
auto z = (i);         // auto is int, decltype((i)) is int&

decltype determines the type of an expression without evaluating it. Parentheses can influence whether a value or reference type is deduced.
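
The difference between auto and decltype, and the effect of the extra parentheses, can be checked with static_assert. This is a small sketch; the variable names are hypothetical:

```cpp
#include <type_traits>

int counter = 42;
const int& counter_cref = counter;

auto by_value = counter_cref;                // auto: reference and const are dropped -> int
decltype(counter_cref) same_type = counter;  // decltype: const int& is kept
decltype((counter)) paren_ref = counter;     // (counter) is an lvalue expression -> int&

static_assert(std::is_same_v<decltype(by_value), int>);
static_assert(std::is_same_v<decltype(same_type), const int&>);
static_assert(std::is_same_v<decltype(paren_ref), int&>);
static_assert(std::is_same_v<decltype(counter), int>);     // unparenthesized name
static_assert(std::is_same_v<decltype((counter)), int&>);  // parenthesized name
```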

Function Template Deduction

template<typename T>
void print(T value) {
    std::cout << value << std::endl;
}

print(10);   // T is deduced as int

Template arguments are deduced from the types of the arguments in the function call.

Return Type Deduction

auto add(int a, int b) {
    return a + b;   // return type deduced as int
}

The compiler infers the return type from the return expression when auto is used.
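
Plain auto deduces the return type by value, while decltype(auto) keeps the reference-ness of the returned expression. A short sketch of the difference, with hypothetical function names:

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical accessors; only the declared return type differs.
auto first_copy(std::vector<int>& v) {
    return v[0];   // plain auto: deduces int, the caller gets a copy
}

decltype(auto) first_ref(std::vector<int>& v) {
    return v[0];   // decltype(auto): v[0] is an lvalue, so the return type is int&
}

using Vec = std::vector<int>&;
static_assert(std::is_same_v<decltype(first_copy(std::declval<Vec>())), int>);
static_assert(std::is_same_v<decltype(first_ref(std::declval<Vec>())), int&>);

// All return statements must deduce the same type:
// auto mixed(bool b) { if (b) return 1; return 2.0; }  // error: int vs. double
```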

Structured Bindings

std::tuple<int, double> t{1, 2.0};
auto [a, b] = t;  // a is int, b is double

Structured bindings destructure compound types into named variables with deduced types.
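
Whether the bindings refer to a copy or to the original object depends on the form of auto. The following sketch uses a hypothetical Point struct and hypothetical function names:

```cpp
struct Point { int x; double y; };  // hypothetical aggregate

constexpr int modify_through_copy() {
    Point p{1, 2.0};
    auto [x, y] = p;    // binds to a hidden copy of p
    x = 10;
    return p.x;         // p is unchanged
}

constexpr int modify_through_reference() {
    Point p{1, 2.0};
    auto& [x, y] = p;   // binds to p itself
    x = 10;
    return p.x;         // p.x was modified
}

static_assert(modify_through_copy() == 1);
static_assert(modify_through_reference() == 10);
```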

Lambda Parameter Deduction

auto lambda = [](auto a, auto b) {
    return a + b;
};

Generic lambdas deduce parameter types during invocation, functioning similarly to templated callables.

Class Template Argument Deduction (CTAD)

template<typename T>
struct Wrapper {
    T value;
    Wrapper(T v) : value(v) {}
};

Wrapper w(123);  // T deduced as int

Constructor arguments guide the deduction of template parameters, eliminating the need for explicit specification.
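
When the implicitly deduced type is not the desired one, a user-defined deduction guide can redirect it. The sketch below uses a hypothetical Holder class that mirrors Wrapper above; the guide makes string literals deduce std::string instead of const char*:

```cpp
#include <string>
#include <type_traits>

template<typename T>
struct Holder {               // hypothetical class, same shape as Wrapper
    T value;
    Holder(T v) : value(v) {}
};

// User-defined deduction guide: a const char* argument selects Holder<std::string>.
Holder(const char*) -> Holder<std::string>;

Holder from_int(123);         // Holder<int>, deduced from the constructor
Holder from_literal("text");  // Holder<std::string>, selected by the guide

static_assert(std::is_same_v<decltype(from_int), Holder<int>>);
static_assert(std::is_same_v<decltype(from_literal), Holder<std::string>>);
```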

Non-Type Template Parameter (NTTP) Deduction

#include <iostream>

template<auto N>
void f() {
    std::cout << N << std::endl;
}

int main() {
    f<5>();     // OK: N is deduced as int
    f<'c'>();   // OK: N is deduced as char
    f<5.0>();   // OK since C++20: N is deduced as double (an error in C++17)
}

Starting with C++17, a non-type template parameter can be declared with auto, so that its type is deduced from the template argument. C++20 extended non-type template parameters (NTTPs) to all structural types: scalar types, including the floating-point types float, double, and long double, lvalue references, and literal class types whose base classes and non-static data members are all public, non-mutable, and themselves of structural type.

For example, the following class Color is a structural type: it is a literal class whose data members are all public and of type int, so it can be used as an NTTP:

struct Color {
    int r, g, b;
    constexpr bool operator==(const Color&) const = default;
};

template<Color C>
struct Widget {
    void print() {
        std::cout << C.r << ", " << C.g << ", " << C.b << "\n";
    }
};

int main() {
    Widget<Color{255, 255, 0}> w; // OK in C++20!
    w.print();
}

But the following cannot, because one of its data members is private:

struct NonStructural {
    constexpr NonStructural(int v) : secret(v) {}
private:
    int secret;  // ❌ a private (or mutable) data member makes the class non-structural
};

template<NonStructural N>
struct T {};  // ❌ Error: NonStructural is not a structural type

Abbreviated Function Templates

void log(auto x) {
    std::cout << x;
}

Function templates can be expressed using auto in parameter declarations, reducing boilerplate syntax.

Concepts and Constrained Deduction

template<typename T>
concept Printable = requires(T t) { std::cout << t; };

void log(Printable auto x) {
    std::cout << x;
}

Concepts restrict template parameters to types satisfying specified requirements. The example ensures that x is printable to an output stream.

`consteval` and `constinit` Impact

consteval int square(int x) { return x * x; }

The consteval specifier requires every call to the function to be evaluated at compile time. Unlike constexpr, which merely permits compile-time evaluation, consteval guarantees it.

consteval does not itself cause type deduction, but it may participate in deduction contexts. For example, if the return value of a consteval function is used to initialize a variable declared with auto, then type deduction will occur based on the result:

auto y = square(4);  // y deduced as int, square(4) evaluated at compile time

So here, deduction still happens, just as with any function returning a known type. The twist is: the result must be known at compile time.

On the other hand, constinit ensures that a variable with static storage duration (like globals, static members, etc.) is initialized at compile time. It does not mean the variable is constant (unlike const). It ensures that no dynamic initialization will occur — useful for avoiding the static initialization order fiasco.

Similar to consteval, constinit does not perform type deduction itself. But it can interact with deduction:

constinit auto z = square(5);  // auto deduces int

Again, auto deduces the type from the value returned by a consteval function, which satisfies constinit's compile-time requirement.

Feature	Purpose	Role in Type Deduction
`consteval`	Requires every call to be evaluated at compile time	None by itself; `auto` can deduce from the returned value
`constinit`	Requires constant (compile-time) initialization of static or thread-local variables	None by itself; can be combined with `auto`
`auto`	Deduces the type from the initializer	The initializer may be a call to a `consteval` function, and the variable may be `constinit`

The following example illustrates consteval and constinit used together:

consteval int factorial(int n) {
    return (n <= 1) ? 1 : (n * factorial(n - 1));
}

constinit auto fact5 = factorial(5);  // fact5 is int, initialized at compile time

Pitfall: Object Slicing

Base* d = new Derived();
auto b = *d;  // b is Base, object slicing occurs
b.f();        // Calls Base::f(), not Derived::f()

When a variable declared with plain auto is initialized from a dereferenced base pointer, the object is copied as a Base, so object slicing occurs and the derived-type behavior is lost.

To preserve polymorphic behavior:

auto& b = *d; // b is Base&
b.f();        // Calls Derived::f()

Type Deduction Rules in C++

The auto keyword instructs the compiler to deduce the type of a variable based on its initializer.

However, deduction follows a set of specific rules, especially regarding references, const qualifiers, value categories, and initializer forms.

The following summarizes the key rules with examples and explanations.

Rule 1: Top-Level CV Qualifiers Are Discarded When Deducing by Value

When a variable is declared with plain auto (without & or *), it is initialized by copy, and any top-level const or volatile qualifiers on the initializer's type are ignored.

const int i = 5;
auto j = i;        // deduced as int, not const int
auto& m = i;       // deduced as const int&, reference preserves cv-qualifier
auto* p = &i;      // deduced as const int*, pointer type retains cv-qualifier
const auto n = j;  // deduced as const int

Explanation:

j is deduced as int since top-level const is ignored.
When auto is used with reference (&) or pointer (*), the const qualifier is preserved in the deduced type.
const auto applies a new const to the deduced type of auto.

Rule 2: Reference Qualifiers in Initializers Are Ignored in Value Declarations

When a variable is initialized using a reference, but the declaration uses auto without a reference, the reference in the initializer is not preserved.

int i = 5;
int& ref = i;
auto m = ref;  // deduced as int, not int&

Explanation:
The type of m is int because the reference in ref is discarded during value deduction.

Rule 3: Universal References Deduce Lvalue/Rvalue Appropriately

When auto&& is used (also known as a universal or forwarding reference), the deduced type depends on the value category of the initializer:

int i = 5;
auto&& x = i;  // deduced as int&, because i is an lvalue
auto&& y = 10; // deduced as int&&, because 10 is an rvalue

Explanation:
This behavior uses the reference collapsing rules. Lvalues result in T&, and rvalues result in T&&.

Rule 4: Array and Function Types Decay into Pointers

When auto is used to deduce the type of an array or function, the type is deduced as a pointer.

int arr[5];
auto a = arr;  // deduced as int*

int sum(int, int);
auto b = sum;  // deduced as int (*)(int, int)

Explanation:
Array names decay to pointers to the first element, and function names decay to function pointers.

Rule 5: Deduction with List Initialization

C++17 introduces more precise rules for auto with list-initialization. The deduction behavior differs between direct-list-initialization (braces only) and copy-list-initialization (braces with =).

Case 1: Direct List Initialization (`auto x{...}`)

auto x1{1};       // deduced as int
auto x2{1, 2};    // error: more than one element

If a single element is used, the type is deduced from that element.
Multiple elements are not permitted—this results in a compilation error.

Case 2: Copy List Initialization (`auto x = {...}`)

auto y1 = {1};       // deduced as std::initializer_list<int>
auto y2 = {1, 2};    // deduced as std::initializer_list<int>
auto y3 = {1, 2.0};  // error: conflicting types, cannot deduce common T

If multiple elements of the same type are used, the type is deduced as std::initializer_list<T>.
If the types differ, deduction fails due to type mismatch.

Pitfall: Object Slicing with `auto`

class Base {
public:
    virtual void f() { std::cout << "Base::f()" << std::endl; }
};

class Derived : public Base {
public:
    void f() override { std::cout << "Derived::f()" << std::endl; }
};

Base* d = new Derived();
auto b = *d;   // deduced as Base (value)
b.f();         // calls Base::f() due to object slicing

Explanation:

*d is an lvalue of type Base, but since b is declared with auto (not auto&), b is a new Base object copy-initialized from it.
This results in object slicing, where the Derived part of the object is sliced off, and b becomes a pure Base object.
As a result, the virtual function call resolves to Base::f().

To preserve polymorphism:

auto& b = *d;  // deduced as Base&
b.f();         // correctly calls Derived::f()

Summary Table

Scenario	Deduction Result
`auto j = const int`	`int` (cv removed)
`auto& j = const int`	`const int&` (cv preserved)
`auto m = ref` where `ref` is `int&`	`int`
`auto&& m = lvalue`	`T&`
`auto&& m = rvalue`	`T&&`
`auto m = array`	`pointer to element type`
`auto m = function`	`function pointer`
`auto x = {1, 2}`	`std::initializer_list<int>`
`auto x{1, 2}`	error
`auto x = {1, 2.0}`	error
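
The rows of the table can be checked directly with static_assert. The sketch below uses hypothetical variable names; each declaration corresponds to one row, and the error cases are omitted because they do not compile:

```cpp
#include <initializer_list>
#include <type_traits>

const int ci = 5;
int value = 5;
int& value_ref = value;
int numbers[3] = {1, 2, 3};
int sum(int a, int b) { return a + b; }

auto   r1 = ci;         // top-level const dropped
auto&  r2 = ci;         // const preserved behind the reference
auto   r3 = value_ref;  // reference dropped
auto&& r4 = value;      // lvalue initializer
auto&& r5 = 10;         // rvalue initializer
auto   r6 = numbers;    // array decays to a pointer
auto   r7 = sum;        // function decays to a function pointer
auto   r8 = {1, 2};     // copy-list-initialization
auto   r9{1};           // direct-list-initialization, single element

static_assert(std::is_same_v<decltype(r1), int>);
static_assert(std::is_same_v<decltype(r2), const int&>);
static_assert(std::is_same_v<decltype(r3), int>);
static_assert(std::is_same_v<decltype(r4), int&>);
static_assert(std::is_same_v<decltype(r5), int&&>);
static_assert(std::is_same_v<decltype(r6), int*>);
static_assert(std::is_same_v<decltype(r7), int (*)(int, int)>);
static_assert(std::is_same_v<decltype(r8), std::initializer_list<int>>);
static_assert(std::is_same_v<decltype(r9), int>);
```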

Best Practices

Using auto can greatly improve code clarity and reduce verbosity — but it should be used judiciously. Here are guidelines for when auto is beneficial:

When to Use `auto`

When the Type Is Obvious from the Initializer

Use auto when the type is clear and unambiguous:

auto i = 10;                      // Clearly an int
auto name = std::string("John"); // Obvious string construction

Also ideal for range-based loops and iterator declarations:

for (auto it = container.begin(); it != container.end(); ++it) {
    // Avoids long iterator type
}

When the Type Is Long or Tedious to Write

auto helps avoid unnecessarily verbose or complex type declarations:

auto pair = std::make_pair(42, "answer");
auto mapIter = wordIndex.begin();  // wordIndex (hypothetical) is a std::unordered_map<int, std::vector<std::string>>

When Dealing with Lambdas or Callable Objects

auto lambda = [](int x, int y) { return x + y; };

auto boundFunc = std::bind(sum, 5, std::placeholders::_1);

When Working with Templates, STL Iterators, or Ranges

Using auto prevents clutter from deeply nested or templated types:

auto result = someTemplateFunction<T, U>(arg1, arg2);

for (auto& [key, value] : myMap) {
    // Structured bindings with auto make this much cleaner
}

When to Avoid `auto`

When It Makes the Code Ambiguous or Unclear:

auto x = getValue();  // What type is x? Unclear without looking up getValue()

When Explicit Typing is Critical for Readability or Correctness:

int count = 0;  // More readable than auto when you want to emphasize the type

Type Query

This chapter presents a focused exploration of type query mechanisms in modern C++, emphasizing decltype and its interaction with value categories and type deduction.

The Introduction outlines the motivation and typical use cases for querying types at compile time.
The Mechanisms section details core tools such as decltype, typeid, std::declval, and type traits, with practical examples.
The Rules section formalizes the deduction behavior of decltype(e) through five canonical rules and edge cases involving cv-qualifiers and value categories.
Finally, Best Practices offers guidance on when and how to use decltype effectively, especially in generic programming and library design, where exact type preservation is essential.

Introduction to Type Query in C++

Type query in C++ refers to the ability to inspect the type of an expression at compile time without evaluating it. This capability is essential in template metaprogramming and generic programming, where exact type information influences code generation and correctness. For example:

int x = 42;
decltype(x) y = x; // y is deduced as int

Type query mechanisms serve several key purposes:

Determine the exact type of an expression without requiring evaluation.

int getValue();
// result has the type returned by getValue(), without calling it
decltype(getValue()) result;

Enable compile-time type reflection, useful in diagnostics, code synthesis, or metaprogramming.

template<typename T>
void printTypeInfo(const T& val) {
  std::cout << "Type: " << typeid(decltype(val)).name() << '\n';
}

Preserve type qualifiers such as references and const for accurate type handling.

const int ci = 10;
const int& ref = ci;
// alias is of type const int&, ref and const are preserved
decltype(ref) alias = ci;

Type Query Mechanisms

C++ provides several mechanisms to query the type of expressions at compile time. The primary ones include decltype, typeid, std::declval, and type traits from <type_traits>.

1. Type Specifier `decltype`

decltype(expr) yields the type of the expression expr without evaluating it. This makes it useful for examining the type of variables, function calls, or even complex expressions in a safe way during compilation.

Example:

int a = 5;
decltype(a) b = 10; // b is int

Used for a data member:

struct S1 {
    int x1;
    decltype(x1) x2;
    double x3;
    decltype(x2 + x3) x4;
};

Used in a function parameter list:

int x1 = 0;
decltype(x1) sum(decltype(x1) a1, decltype(a1) a2)
{
    return a1 + a2;
}
auto x2 = sum(5, 10);

Note on Reference and `const` Preservation:

const int& r = a;
decltype(r) x = a; // x is const int&

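The following code fails to compile: with a plain auto return type, the reference is dropped during deduction, so return_ref returns int by value and the static_assert fails: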

template<class T>
auto return_ref(T& t)
{
    return t;
}

int x1 = 0;

static_assert(
    std::is_reference_v<decltype(return_ref(x1))>
);

The following is OK, because the trailing return type decltype(t) is T&, so the function returns a reference:

template<class T>
auto return_ref(T& t)->decltype(t)
{
    return t;
}

int x1 = 0;

static_assert(
    std::is_reference_v<decltype(return_ref(x1))>
);

decltype preserves the exact type of the expression, including reference and cv-qualifiers.

2. Type Identification Operator `typeid` (Runtime)

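typeid(expr) yields a reference to a std::type_info object representing the type of the expression. For a glvalue of polymorphic class type, the operand is evaluated and the result reflects the dynamic type, which is determined at runtime; in all other cases the operand is not evaluated and the result is known at compile time.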

Example:

#include <iostream>
#include <typeinfo>

void printType(int x) {
    std::cout << "Type: " << typeid(x).name() << '\n';
}

Note: When used on polymorphic types through a base pointer or reference, typeid reveals the dynamic type. Otherwise, it yields the static type.
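
A brief sketch of the difference between the two cases. The types Shape, Circle, Plain, and PlainDerived and the function typeid_demo are hypothetical and serve only as illustration:

```cpp
#include <cassert>
#include <typeinfo>

struct Shape { virtual ~Shape() = default; };  // polymorphic (has a virtual function)
struct Circle : Shape {};

struct Plain {};                               // non-polymorphic
struct PlainDerived : Plain {};

// Hypothetical demonstration function.
inline void typeid_demo() {
    Circle circle;
    Shape& shape = circle;
    assert(typeid(shape) == typeid(Circle));   // dynamic type, found at runtime

    PlainDerived derived;
    Plain& plain = derived;
    assert(typeid(plain) == typeid(Plain));    // static type of the expression
}
```

Only the polymorphic hierarchy carries the runtime type information needed to recover Circle through a Shape reference.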

Note

Return Value Lifetime
The result of typeid is an lvalue referring to a const std::type_info object. The object has static storage duration and exists until the end of the program, so it is safe to store a reference or pointer to it.

No Copy Constructor
std::type_info has a deleted copy constructor, so it cannot be copied. Attempting to assign it directly as a value will result in a compilation error.

auto t1 = typeid(int);     // ❌ Error: copy constructor is deleted
auto& t2 = typeid(int);    // ✅ OK: t2 is a const std::type_info&
auto* t3 = &typeid(int);   // ✅ OK: t3 is a const std::type_info*

CV-Qualifiers Ignored
typeid ignores top-level const and volatile qualifiers:
```
const int ci = 42;
bool same = (typeid(int) == typeid(ci)); // true
```
This means that typeid(T), typeid(const T), typeid(volatile T), and typeid(const volatile T) all refer to the same type.

3. Function Template `std::declval<T>()`

std::declval<T>() is a utility from <utility> that simulates an rvalue of type T in unevaluated contexts. It is primarily used in conjunction with decltype to query types that depend on operations without needing an actual object of type T.

Example:

#include <utility>

/*
This works even if T has no default constructor, because the expression is unevaluated
— std::declval<T>() just returns a value of type T&& without constructing anything.
*/
template <typename T>
auto getReturnType() -> decltype(std::declval<T>().foo());

/*
This would be invalid because:
    - T is a type, and T.foo() is not a valid syntax (you can't call .foo() on a type).
    - There's no instance of T to call foo() on.
    - Even if T had a static member function foo(), that would be accessed as T::foo().
*/
template <typename T>
auto getReturnType() -> decltype(T.foo()); // ❌ Error

This technique is common in SFINAE and type trait definitions.

Expression	Works?	Reason
`decltype(std::declval<T>().foo())`	✅	Simulates an rvalue of type `T` in unevaluated context
`decltype(T.foo())`	❌	Invalid syntax: `T` is a type, not an object
`decltype(T::foo())`	✅ (only if `foo()` is static)	Accesses static member function
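
As an illustration of the SFINAE use mentioned above, here is a minimal detection trait built from std::declval, decltype, and std::void_t. The names has_size and NoDefault are hypothetical:

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Primary template: assume T has no usable size() member.
template<typename T, typename = void>
struct has_size : std::false_type {};

// Selected when the expression inside decltype is well-formed for T.
template<typename T>
struct has_size<T, std::void_t<decltype(std::declval<const T&>().size())>>
    : std::true_type {};

struct NoDefault {                 // hypothetical type without a default constructor
    explicit NoDefault(int) {}
    int size() const { return 0; }
};

static_assert(has_size<std::vector<int>>::value);
static_assert(has_size<NoDefault>::value);   // no object is ever constructed
static_assert(!has_size<int>::value);
```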

4. Type Traits (`<type_traits>`)

The C++ standard library provides a wide range of type traits in the <type_traits> header for compile-time type inspection and transformation.

Examples:

#include <type_traits>

std::is_integral<int>::value          // true
std::is_same<int, long>::value        // false
std::remove_reference<int&>::type     // int
std::decay<const int&>::type          // int

Type traits enable generic code to adapt behavior based on type properties or to transform types as needed.
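
For instance, a trait can select a branch with if constexpr. The function name half below is hypothetical; the sketch assumes C++17 or later:

```cpp
#include <type_traits>

// Hypothetical helper: halves a number, choosing the arithmetic by type category.
template<typename T>
constexpr auto half(T value) {
    static_assert(std::is_arithmetic_v<T>, "half() expects a number");
    if constexpr (std::is_integral_v<T>) {
        return value / 2;      // integer division
    } else {
        return value / T(2);   // floating-point division
    }
}

static_assert(half(9) == 4);
static_assert(half(9.0) == 4.5);

// Transformation traits produce types rather than values.
static_assert(std::is_same_v<std::remove_reference_t<int&>, int>);
static_assert(std::is_same_v<std::decay_t<const int&>, int>);
```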

Summary

Mechanism	Compile-Time	Runtime	Key Use Cases
`decltype(expr)`	✅	❌	Exact type inference of expressions
`typeid(expr)`	❌	✅	Runtime type information, polymorphism
`std::declval<T>()`	✅	❌	Simulated expressions in decltype
Type Traits	✅	❌	Type inspection, manipulation, SFINAE

Type Query Rules in C++

This section outlines the official deduction rules for decltype, along with examples, clarifications about cv-qualifiers, and the role of decltype(auto).

`decltype(e)` Deduction Rules

When e is an expression and T is its type, the type deduced by decltype(e) follows five core rules:

1. Identifier or Class Member Access (without parentheses): If e is an unparenthesized identifier or class member access, decltype(e) is the declared type of the entity named by e. This excludes overloaded function names and structured bindings.
2. Function or Functor Call: If e is a function call or functor invocation, decltype(e) is the function's return type.
3. Lvalue: Otherwise, if e is an lvalue of type T, decltype(e) is T&.
4. Xvalue (expiring value): Otherwise, if e is an xvalue of type T, decltype(e) is T&&.
5. Prvalue (pure rvalue): In all other cases, decltype(e) is simply T.

Standard Examples

const int&& foo();
int i;
struct A { double x; };
const A* a = new A();

decltype(foo());    // const int&& (rules 2 and 4)
decltype(i);        // int         (rule 1)
decltype(a->x);     // double      (rule 1)
decltype((a->x));   // const double& (rule 3 — parenthesized, so it's an lvalue)

Additional Deduction Examples

int i;
int *j;
int n[10];
const int&& foo();

decltype(static_cast<short>(i)); // short (prvalue)
decltype(j);                     // int*
decltype(n);                     // int[10]
decltype(foo);                   // const int&&() (function type)

struct A {
    int operator() () { return 0; }
};
A a;
decltype(a()); // int (functor call)

More Complex Cases

int i;
int *j;
int n[10];

decltype(i = 0);                 // int& (assignment returns lvalue)
decltype(0, i);                  // int& (comma operator, result is i — an lvalue)
decltype(i, 0);                  // int  (comma operator, result is 0 — a pure rvalue)
decltype(n[5]);                  // int& (array element is an lvalue)
decltype(*j);                    // int& (dereference of pointer is lvalue)
decltype(static_cast<int&&>(i)); // int&& (xvalue)
decltype(i++);                   // int  (post-increment yields prvalue)
decltype(++i);                   // int& (pre-increment yields lvalue)
decltype("hello world");         // const char(&)[12] (string literal is lvalue array)
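The deductions listed above can be checked mechanically. The following sketch, which assumes only the standard std::is_same trait, turns several of them into static_assert checks so that the compiler confirms each claim. foo is declared but never defined, which is fine because the operand of decltype is not evaluated.

```cpp
#include <type_traits>

int i = 0;
int* j = &i;
int n[10] = {};
const int&& foo();  // declaration only; never called at run time

static_assert(std::is_same<decltype(i), int>::value, "rule 1: unparenthesized identifier");
static_assert(std::is_same<decltype((i)), int&>::value, "rule 3: parenthesized lvalue");
static_assert(std::is_same<decltype(foo()), const int&&>::value, "rules 2 and 4");
static_assert(std::is_same<decltype(i = 0), int&>::value, "assignment yields an lvalue");
static_assert(std::is_same<decltype(i++), int>::value, "post-increment yields a prvalue");
static_assert(std::is_same<decltype(++i), int&>::value, "pre-increment yields an lvalue");
static_assert(std::is_same<decltype(n[5]), int&>::value, "array element is an lvalue");
static_assert(std::is_same<decltype(*j), int&>::value, "dereference is an lvalue");
static_assert(std::is_same<decltype("hello world"), const char(&)[12]>::value,
              "string literal is an lvalue array of 12 chars");
```

If any comment in the earlier listings were wrong, the corresponding assertion would stop compilation.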

cv-Qualifier Deduction Behavior

In general, decltype(e) preserves the const and volatile (cv) qualifiers of e. For example:

const int i = 0;
decltype(i); // const int

However, there is an exception for class member access. If e is an unparenthesized member access expression, decltype(e) yields the declared type of the member, so the cv-qualifiers of the object expression are not propagated:

struct A { double x; };
const A* a = new A();
decltype(a->x);   // double (cv-qualifier on `a` ignored)
decltype((a->x)); // const double& (parenthesized — now cv is considered)

In summary:

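Unparenthesized member access: yields the member's declared type; the object's cv-qualifiers are not propagated.
Parenthesized member access: treated as an lvalue expression, so the object's cv-qualifiers are reflected in the result.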
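A small sketch of this difference, using a const object rather than a pointer; the object a and its initial value are chosen only for illustration.

```cpp
#include <type_traits>

struct A { double x; };
const A a{1.5};

// Unparenthesized member access: the declared type of A::x.
static_assert(std::is_same<decltype(a.x), double>::value,
              "const on the object is not propagated");

// Parenthesized: an lvalue expression reached through a const object.
static_assert(std::is_same<decltype((a.x)), const double&>::value,
              "const on the object is reflected");
```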

`decltype(auto)`

Introduced in C++14, decltype(auto) merges the behavior of decltype and auto. It tells the compiler to deduce the type using decltype rules, not auto rules.

Note: decltype(auto) must be used alone in a declaration. It cannot be combined with pointer/reference/cv-qualifiers.

Comparison Examples:

int i;
int&& f();

auto x1 = i;                // int
decltype(auto) x2 = i;      // int

auto x3 = (i);              // int
decltype(auto) x4 = (i);    // int&

auto x5 = f();              // int
decltype(auto) x6 = f();    // int&&

auto x7 = {1, 2};           // std::initializer_list<int>
decltype(auto) x8 = {1, 2}; // ❌ Error: not a single expression

auto* p1 = &i;              // int*
decltype(auto)* p2 = &i;    // ❌ Error: decltype(auto) must appear alone

Return Type Use Case

In C++11, a function declared with auto needs a trailing return type, and that type has to spell out the reference explicitly:

template<class T>
auto return_ref(T& t) -> T& { return t; }

With decltype(auto), this becomes:

template<class T>
decltype(auto) return_ref(T& t) {
    return t; // preserves reference type
}
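To see why this matters, the sketch below contrasts plain auto with decltype(auto) for an element accessor. get_copy and get_ref are hypothetical names; the checks assume a std::vector<int> argument.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Plain auto follows the auto rules: the reference is dropped, a copy is returned.
template <class C>
auto get_copy(C& c, std::size_t i) { return c[i]; }

// decltype(auto) follows the decltype rules: the reference returned by c[i] is kept.
template <class C>
decltype(auto) get_ref(C& c, std::size_t i) { return c[i]; }

using IntVec = std::vector<int>;

static_assert(std::is_same<decltype(get_copy(std::declval<IntVec&>(), 0)), int>::value,
              "auto returns by value");
static_assert(std::is_same<decltype(get_ref(std::declval<IntVec&>(), 0)), int&>::value,
              "decltype(auto) returns a reference");
```

A caller can therefore write get_ref(v, 0) = 42; to modify the element in place, whereas the same assignment through get_copy does not compile because it returns a prvalue.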

C++17: `decltype(auto)` as a Non-Type Template Parameter

In C++17, decltype(auto) can also be used as a non-type template parameter, with deduction rules matching decltype.

#include <iostream>

template<decltype(auto) N>
void f() {
    std::cout << N << std::endl;
}

static const int x = 11;
static int y = 7;

int main() {
    f<x>();     // N deduced as const int
    f<(x)>();   // N deduced as const int&
    f<y>();     // ❌ Error: y is not a constant expression
    f<(y)>();   // N deduced as int&
}

Best Practices for Using `decltype`

In typical application development, decltype may not be used extensively. However, it becomes highly valuable in the context of library development and generic programming, where it significantly enhances C++'s ability to support advanced metaprogramming patterns.

When to Use `decltype`

When writing template functions that need to deduce return types precisely.
When combined with std::declval to form expressions for SFINAE or concepts.
When building generic utilities where preserving exact types (e.g., reference or const-ness) matters.

Practical Guidelines

Prefer auto for readability when exact type preservation is not critical.
Use decltype when querying the result of complex expressions, especially in templates.
Wrap expressions in parentheses when necessary to ensure correct cv/ref deduction.
Remember that the operand of decltype is never evaluated; combine it with std::declval to form expressions without having to construct an object.

Advanced Use Cases

Combine decltype with SFINAE (Substitution Failure Is Not An Error) to enable or disable overloads based on expression validity.
In C++14 and later, prefer decltype(auto) to preserve exact return types without trailing-return syntax.
In C++17, decltype(auto) can also serve as a non-type template parameter, enhancing flexibility in metaprogramming.
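The first item can be sketched as a simple detection trait. has_size, WithSize and WithoutSize are hypothetical names; the technique relies on decltype and std::declval appearing in a partial specialization, so an invalid expression silently falls back to the primary template.

```cpp
#include <type_traits>
#include <utility>

// Primary template: chosen when the expression below is ill-formed.
template <typename T, typename = void>
struct has_size : std::false_type {};

// Partial specialization: viable only if `t.size()` compiles for a const T.
// decltype(void(...)) is void when the expression is valid, matching the default argument.
template <typename T>
struct has_size<T, decltype(void(std::declval<const T&>().size()))>
    : std::true_type {};

struct WithSize { int size() const { return 3; } };
struct WithoutSize {};

static_assert(has_size<WithSize>::value, "WithSize has a callable size()");
static_assert(!has_size<WithoutSize>::value, "WithoutSize has no size()");
```

The same pattern can select between overloads, for example through std::enable_if, based on whether an expression is valid for a given type.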

Overall, decltype is a precise and powerful tool that plays a critical role in modern C++ library design. Developers who need maximum control over type behavior—especially in templates—will find it indispensable.

Namespace

C++ namespaces provide a way to group related declarations and definitions, such as classes, functions, and variables, under a common name. This helps to avoid naming conflicts between different parts of a program or different libraries that may be used together.

Namespaces were introduced into the C++ standard with the release of C++98. The syntax for declaring and defining namespaces is similar to that used for classes. Here's an example:

// Declaration of the namespace's contents
namespace MyNamespace {
    extern int x; // declaration only; a plain "int x;" would already be a definition
    void foo();
}

// Definition of the namespace's contents
namespace MyNamespace {
    int x = 42;
    void foo() {
        // Implementation of the function
    }
}

In this example, MyNamespace is declared and defined to contain an integer variable x and a function foo(). The namespace's contents can be accessed using the scope resolution operator ::, like this:

int main() {
    MyNamespace::x = 10;
    MyNamespace::foo();
    return 0;
}

Inline Namespace

What is `inline namespace`

When a namespace is declared as inline, it means that its members are automatically injected into the enclosing parent namespace, as if they were defined directly in the parent namespace. This allows clients of the namespace to refer to its members without needing to qualify them with the namespace name.

For example, consider the following code:

namespace outer {
    inline namespace inner {
        void foo() {}
    }
}

Here, inner is an inline namespace that is declared within the outer namespace. This means that foo() can be accessed either as outer::inner::foo() or simply as outer::foo().

Use case

C++ inline namespaces were introduced in the C++11 standard to provide a mechanism for versioning and incremental updates of libraries, without breaking backward compatibility.

An inline namespace can be used to provide an updated version of a library's interface, while still allowing old code to use the previous version. By using an inline namespace, the new version of the library can be introduced without breaking the existing code that depends on the old version.

Here is an example of how an inline namespace can be used:

#include <iostream>

/*
// Initial version of the library
namespace MyLib {
    void foo() {
        std::cout << "Hello, world!" << std::endl;
    }
}
*/

// Versioned library: v1 is inline, so it stays the default; v2 must be named explicitly
namespace MyLib {
    inline namespace v1 {
        void foo() {
            std::cout << "Hello, World!" << std::endl;
        }
    }
    
    namespace v2 {
        void foo() {
            std::cout << "Hello, C++11!" << std::endl;
        }
    }
}

// Usage of the library
int main() {
    MyLib::foo();     // calls the initial version of foo
    MyLib::v2::foo(); // calls the updated version of foo
    return 0;
}

This code demonstrates how backward compatibility is maintained in a library called MyLib, which defines two versions of a function named foo(). The output of the program will be:

Hello, World!
Hello, C++11!
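The arrangement can also be reversed, which is a common way to roll out a new default: the newest version is made inline, and code that must keep the old behavior names the old version explicitly. The sketch below assumes a hypothetical library Lib with a hypothetical function api_version.

```cpp
namespace Lib {
    // Old version: still available, but only by explicit qualification.
    namespace v1 {
        constexpr int api_version() { return 1; }
    }

    // New version: inline, so Lib::api_version() resolves here.
    inline namespace v2 {
        constexpr int api_version() { return 2; }
    }
}

static_assert(Lib::api_version() == 2, "unqualified version picks the inline namespace");
static_assert(Lib::v1::api_version() == 1, "old version remains reachable");
```

Moving the inline keyword from one version namespace to another changes the default for every client without touching their source code.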

New Nested Namespace Syntax

Prior to C++17, nested namespaces are defined like this:

namespace A {
    namespace B {
        namespace C {
            int foo() { return 5; }
        }
    }
}

With C++17, the same nested namespaces can be defined concisely using the nested namespace definition syntax:

namespace A::B::C {
    int foo() { return 5; }
}

Both of these code snippets achieve the same result: defining a function foo() in the namespace A::B::C. The nested namespace definition syntax introduced in C++17 allows for a more compact and readable way to define nested namespaces.

Nested inline namespace

The combination of the nested namespace definition syntax (introduced in C++17) and the inline namespace declaration is allowed in C++20.

The following is valid in C++20:

namespace A::B::inline C {
    int foo() { return 5; }
}

In this code, the inline keyword is applied to the C namespace within the nested namespace definition A::B. This declares C as an inline namespace within the enclosing namespace B.

Note that the inline keyword can appear before any namespace name in the list except the first one (A in this example).

Unnamed Namespace

The unnamed namespace (or anonymous namespace) has been part of C++ since the C++98 standard. It provides a way to declare identifiers (e.g., functions, variables, or types) that are local to a translation unit (i.e., the source file): since C++11 its members have internal linkage, so they can be used within that translation unit, including the enclosing namespace, but cannot be referred to from any other translation unit.

Unnamed namespaces can be declared using the namespace keyword, followed by a pair of braces, like this:

namespace {
    // Your code here
}

For example, a helper function or a constant that is only needed within a single source file, can be put in an unnamed namespace to prevent it from being accessible in other parts of the program:

// File: my_file.cpp
#include "my_file.h"

namespace {
    const int someConstant = 42;

    void helperFunction() {
        // Implementation here
    }
}

void myPublicFunction() {
    helperFunction();
    // Other implementation details
}

In this example, someConstant and helperFunction are only visible within my_file.cpp and won't conflict with any other code using the same names.

Another example:

namespace my_namespace {
    namespace {
        void helperFunction() {
            // Implementation here
        }
    }

    void publicFunction() {
        helperFunction(); // This is allowed since helperFunction() is in the same parent namespace
    }
}

In this example, helperFunction() is declared within an unnamed namespace inside my_namespace. Although helperFunction() has internal linkage and is not visible outside of the translation unit, it can still be accessed by other functions within the same parent namespace (my_namespace), such as publicFunction().

Merged Namespace

If a namespace is defined multiple times, its contents are merged together. For example:

#include <iostream>

// First definition of namespace MyNamespace
namespace MyNamespace {
    int x = 1;
    void foo() {
        // Implementation of the function
    }
}

// Second definition of namespace MyNamespace, with different contents
namespace MyNamespace {
    int y = 2;
    void bar() {
        // Implementation of the function
    }
}

// Usage of the namespace contents
int main() {
    MyNamespace::foo();
    MyNamespace::bar();
    std::cout << MyNamespace::x + MyNamespace::y << std::endl;
    return 0;
}

However, if the same variable is defined multiple times, a redefinition error will occur:


#include <iostream>

namespace Namespace1 {
    int x = 1;
}

namespace Namespace1 {
    int x = 2;
}

int main() {
    std::cout << Namespace1::x << std::endl;
    std::cout << Namespace2::x << std::endl;
    return 0;
}

We'll see the following compiler errors (the second one is reported because Namespace2 was never declared):

<source>:8:9: error: redefinition of 'int Namespace1::x'
    8 |     int x = 2;
      |         ^
<source>:4:9: note: 'int Namespace1::x' previously defined here
    4 |     int x = 1;
      |         ^
<source>: In function 'int main()':
<source>:13:18: error: 'Namespace2' has not been declared
   13 |     std::cout << Namespace2::x << std::endl;

Global Namespace

In C++, the global namespace is the outermost namespace that encompasses all the code in a program. When you define a variable, function, or type without explicitly placing it in a named or unnamed namespace, it becomes part of the global namespace. The global namespace is accessible from anywhere in the program, making its members visible across different translation units.

Although using the global namespace can make it easier to access identifiers without needing to specify a particular namespace, it is generally not recommended to place many identifiers in the global namespace, as it can lead to name clashes and reduced code maintainability. In large projects, putting too many identifiers in the global namespace can make it difficult to determine the purpose or origin of a particular identifier.

Instead, it's usually better to use named namespaces to organize and encapsulate your code, which helps prevent name collisions and improve code readability.

Here's an example that demonstrates the difference between global and named namespaces:

// Global namespace
int globalVariable = 10;

void globalFunction() {
    // Implementation here
}

// Named namespace
namespace my_namespace {
    int myVariable = 20;

    void myFunction() {
        // Implementation here
    }
}

int main() {
    globalFunction(); // Accessing a function in the global namespace
    my_namespace::myFunction(); // Accessing a function in a named namespace

    return 0;
}

In this example, globalVariable and globalFunction() are defined in the global namespace, while myVariable and myFunction() are defined within the named namespace my_namespace. To access members of a named namespace, qualify the name with the namespace name and the scope resolution operator ::, as in my_namespace::myFunction().

Scope resolution operator `::`

The global namespace can be accessed explicitly by using the scope resolution operator ::. This can be helpful when an identifier in the global namespace shares the same name as an identifier in a different namespace, or it is desirable to explicitly refer to the global namespace version of an identifier.

Here's an example demonstrating the use of :: to access the global namespace:

#include <iostream>

// Global namespace
int myVariable = 10;

namespace my_namespace {
    int myVariable = 20;

    void printVariables() {
        std::cout << "Global namespace myVariable: " << ::myVariable << std::endl;
        std::cout << "my_namespace myVariable: " << myVariable << std::endl;
    }
}

int main() {
    my_namespace::printVariables();
    return 0;
}

In this example, there are two variables with the same name myVariable, one in the global namespace and another in the named namespace my_namespace. Inside the printVariables() function, the scope resolution operator :: with nothing on its left selects myVariable from the global namespace, while the unqualified myVariable refers to the one in my_namespace.

Program Structure

Control Flow

Exceptions and Assertion

Understanding `static_assert` in Modern C++

Static assertions enable compile-time validation of program logic.

Introduced in C++11 and enhanced in C++17 and later standards, static_assert is used to catch programming errors early in the development cycle—during compilation, rather than at runtime.

Motivation

Before static_assert, C++ developers relied on runtime assertions using assert() from <cassert>. These runtime checks serve as a debugging aid and are evaluated only during program execution:

They do not prevent compilation of incorrect code.
They can be disabled in Release builds using the NDEBUG macro.
They are unsuitable for verifying template logic or constant expressions.

For example:

#include <cassert>

void resize_buffer(void* buffer, int new_size) {
    assert(buffer != nullptr);   // Valid: internal check for program invariants
    assert(new_size > 0);        // Valid: internal logic
}

// Avoid assert() for user input, file format, environment conditions or anything not under
// direct program control. Assert is a DEBUG aid, not error handling.
bool handle_user_input(char c) {
    assert(c == '\r');           // Not recommended: external or user input, not controlled by developer
    return c == '\r';
}

Runtime assertions help catch developer mistakes, but they cannot verify correctness of types, templates, or values at compile time.

Basic Syntax of `static_assert` (C++11)

C++11 introduced static_assert to allow assertions at compile time.

static_assert(constexpr_condition, "error message");

The first argument must be a constant expression.
The second is a string literal shown during compilation if the assertion fails.

For example:

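#include <type_traits>

struct Base {}; // placeholder base class so the example is self-contained

template <typename T>
struct IsDerivedFromBase {
    // Checked when the template is instantiated with a concrete T
    static_assert(std::is_base_of<Base, T>::value, "T must derive from Base");
};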

If T does not inherit from Base, compilation fails with the specified message.

Single-Argument Version (C++17)

C++17 simplified static_assert by making the message optional. If omitted, the compiler displays the failed expression.

Syntax (C++17):

// MSVC  - error C2338: static_assert failed: 'sizeof(int) >= 4'
// Clang - static_assert failed due to requirement 'sizeof(int) >= 4'
static_assert(constexpr_condition);

For example:

static_assert(sizeof(int) >= 4);

Use Cases and Best Practices

Valid Uses:

Verifying template arguments.
Ensuring platform or compiler constraints (e.g., word size).
Asserting invariants within class or function templates.

Invalid Uses:

Runtime values (e.g., function arguments or user input).
Conditions that depend on external input or file contents.

Example of Invalid Use:

int main(int argc, char* argv[]) {
    static_assert(argc > 0, "argc must be > 0");  // Invalid: not a compile-time constant
}
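By contrast, here is a minimal sketch of the valid uses listed above: one platform constraint at namespace scope and one template-argument check inside a function template. clamp_to_byte is a hypothetical helper introduced only for this illustration.

```cpp
#include <climits>
#include <type_traits>

// Platform constraint: checked once, when this translation unit is compiled.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Template-argument check: evaluated for each instantiation of the template.
template <typename T>
constexpr T clamp_to_byte(T v) {
    static_assert(std::is_integral<T>::value, "clamp_to_byte() expects an integral type");
    return v < 0 ? T(0) : (v > 255 ? T(255) : v);
}

static_assert(clamp_to_byte(300) == 255, "values above 255 are clamped");
```

Calling clamp_to_byte(1.5) would be rejected at compile time with the message given in the assertion.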

Advanced Compile-Time Constraints

Custom Macros (Pre-C++11)

Before C++11, libraries like Boost used templates to simulate static assertions:

template<bool>
struct static_assertion; // Primary template - intentionally left undefined.

template<> 
struct static_assertion<true> {}; // Specialization only for `true`

// Creates a temporary of type static_assertion<true> when the condition holds.
// When it is false, static_assertion<false> is an incomplete type, so compilation fails.
#define STATIC_ASSERT(expr) static_assertion<(expr)>()

These techniques are now obsolete due to static_assert.

Enhancements in C++20 and Beyond

Concepts (C++20)

C++20 introduces concepts, a powerful way to constrain template parameters. This is often used in place of static_assert.

Example:

template<typename T>
concept Integral = std::is_integral_v<T>;

template<Integral T>
T add(T a, T b) {
    return a + b;
}

This eliminates the need for static_assert(std::is_integral_v<T>).

`consteval` and `constinit` (C++20)

consteval enforces compile-time evaluation of functions.
constinit ensures static variables are initialized at compile time.

These provide compile-time safety in contexts where static_assert might be too coarse.

Example with consteval:

consteval int square(int x) {
    return x * x;
}

static_assert(square(5) == 25);

`static_assert` with Type Traits (C++23/26 Context)

With growing support for constexpr-friendly type traits, static_assert is increasingly used in generic programming. Libraries and frameworks leverage it to enforce type invariants:

template<typename T>
void serialize(const T& obj) {
    static_assert(std::is_trivially_copyable_v<T>, "T must be trivially copyable");
}

Summary

static_assert enables compile-time validation, avoiding runtime surprises.
Introduced in C++11, improved in C++17 (single-argument form).
C++20 and later expand compile-time programming with concepts, consteval, and more expressive constexpr support.
Should be used to enforce logic that must always be true during compilation.
Avoid using it for checking inputs or runtime states.

Static assertions improve code robustness, help detect logic errors early, and are an essential tool in template metaprogramming and modern C++ design.

`noexcept` in Modern C++

C++11 introduced noexcept as a replacement and improvement over the older throw()-style exception specifications. It plays a vital role in optimization, particularly in generic programming and move semantics. Later revisions of C++ reinforced its importance, culminating in significant changes by C++20.

What is `noexcept`?

noexcept is both:

An exception specification (like noexcept or noexcept(true)), telling the compiler a function won't throw.
A compile-time operator (noexcept(expression)) that returns true if the expression is known not to throw.
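The two roles can be seen side by side in a short sketch. The function names are hypothetical; the functions are only declared, which is sufficient because the operand of the noexcept operator is not evaluated.

```cpp
void may_throw();                      // no specification: potentially throwing
void no_throw() noexcept;              // specifier: promises not to throw
void explicit_throw() noexcept(false); // specifier with a constant: same as no specification

// Operator: a compile-time bool describing whether the expression can throw.
static_assert(noexcept(no_throw()), "declared noexcept");
static_assert(!noexcept(may_throw()), "no exception specification");
static_assert(!noexcept(explicit_throw()), "noexcept(false)");
static_assert(noexcept(1 + 2), "built-in arithmetic cannot throw");
```

Note that the operator inspects declarations only: it reports what a function promises, not what its body actually does.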

Why Not Use `throw()`?

Before C++11:

void foo() throw();                   // Not supposed to throw any exceptions
void bar() throw(std::runtime_error); // Supposed to throw only specific exceptions

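These had weak compiler enforcement and inconsistent support. Worse, a violation called std::unexpected() at run time, which required the stack to be unwound up to the offending function. Dynamic exception specifications such as throw(std::runtime_error) were deprecated in C++11 and removed in C++17; throw() became an alias for noexcept(true) in C++17 and was removed entirely in C++20.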

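In contrast, when an exception escapes a noexcept function, std::terminate() is called directly, and the implementation is not required to unwind the stack first. This avoids complex runtime behavior and enables better optimizations.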

Basic `noexcept` Usage

int f() noexcept {
    return 42;
}

struct X {
    int g() const noexcept {
        return 58;
    }

    void h() noexcept {}
};

Declaring functions noexcept helps the compiler generate better code, especially in templates and STL containers.

Conditional `noexcept` with Templates

You often want to declare noexcept only if the operations inside a template won’t throw:

#include <type_traits>

template <typename T>
T copy(const T& o) noexcept(std::is_nothrow_copy_constructible<T>::value) {
    return T(o); // Calls the copy ctor of T
}

// std::is_nothrow_copy_constructible<T>::value is a compile time boolean constant, and is 
// true if the type T has a copy ctor that is declared noexcept

Or more generally:

template <typename T>
T copy(const T& o) noexcept(noexcept(T(o))) {
    return T(o);
}

Here, the outer noexcept(...) is the specifier, and the inner is the operator.
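To illustrate the effect, the sketch below applies the same copy template to two hypothetical types, Quiet and Loud, that differ only in whether their copy constructor is noexcept.

```cpp
#include <type_traits>
#include <utility>

template <typename T>
T copy(const T& o) noexcept(noexcept(T(o))) {
    return T(o);
}

// Hypothetical types that differ only in the copy constructor's exception specification.
struct Quiet {
    Quiet() = default;
    Quiet(const Quiet&) noexcept {}
};

struct Loud {
    Loud() = default;
    Loud(const Loud&) {}  // potentially throwing
};

static_assert(noexcept(copy(std::declval<const Quiet&>())), "copy<Quiet> is noexcept");
static_assert(!noexcept(copy(std::declval<const Loud&>())), "copy<Loud> is not noexcept");
```

The exception specification of each instantiation is computed from the type it is instantiated with, so the template never promises more than T can deliver.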

`noexcept` and Move Semantics

Using move operations inside containers is risky if the move constructor/assignment might throw. noexcept helps guide the compiler to choose moves over copies safely.
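The standard utility std::move_if_noexcept makes this choice visible: it yields an rvalue reference when the move constructor is noexcept (or the type cannot be copied), and a const lvalue reference otherwise. The sketch below checks that behavior for two hypothetical types, SafeMove and RiskyMove.

```cpp
#include <type_traits>
#include <utility>

struct SafeMove {
    SafeMove() = default;
    SafeMove(const SafeMove&) = default;
    SafeMove(SafeMove&&) noexcept = default;
};

struct RiskyMove {
    RiskyMove() = default;
    RiskyMove(const RiskyMove&) {}
    RiskyMove(RiskyMove&&) {}  // not noexcept: a move might throw
};

// noexcept move constructor: the object will be moved.
static_assert(std::is_same<decltype(std::move_if_noexcept(std::declval<SafeMove&>())),
                           SafeMove&&>::value,
              "moves when the move constructor is noexcept");

// Potentially throwing move constructor, but copyable: falls back to a copy.
static_assert(std::is_same<decltype(std::move_if_noexcept(std::declval<RiskyMove&>())),
                           const RiskyMove&>::value,
              "copies when the move constructor may throw");
```

Containers that offer the strong exception guarantee during reallocation, such as std::vector, apply the same kind of check when relocating their elements.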

Example: Safe `swap` with `noexcept`

#include <utility>
#include <type_traits>

template <typename T>
void swap(T& a, T& b)
noexcept(noexcept(T(std::move(a))) && noexcept(a = std::move(b))) {
    T tmp(std::move(a));
    a = std::move(b);
    b = std::move(tmp);
}

/*
The swap function is declared noexcept only if both:
  T's move constructor T(std::move(a)) is noexcept
  T's move assignment a = std::move(b) is noexcept

If either operation could throw, the whole swap function is not noexcept, 
preventing false promises.
*/

Example: Safe `swap` with Conditional Overload

#include <type_traits>
#include <utility>

template<typename T>
void swap_impl(T& a, T& b, std::true_type) noexcept {
    T tmp(std::move(a));
    a = std::move(b);
    b = std::move(tmp);
}

template<typename T>
void swap_impl(T& a, T& b, std::false_type) {
    T tmp(a);
    a = b;
    b = tmp;
}

template<typename T>
void swap(T& a, T& b)
noexcept(noexcept(swap_impl(a, b,
    std::integral_constant<bool,
        std::is_nothrow_move_constructible<T>::value &&
        std::is_nothrow_move_assignable<T>::value>()))) {
    swap_impl(a, b,
        std::integral_constant<bool,
            std::is_nothrow_move_constructible<T>::value &&
            std::is_nothrow_move_assignable<T>::value>());
}

Destructor and `delete` are `noexcept` by Default

Destructors, including user-defined ones, are implicitly noexcept unless they are explicitly marked otherwise or a base class or member has a potentially throwing destructor. Example:

// A's dtor might throw exception.
struct A { ~A() noexcept(false) {} }; 

// B's dtor by default is noexcept, but its member a is not noexcept, 
// hence B's dtor is not noexcept
struct B { A a; };  

// This will pass. Note: noexcept(B()) is testing both the dtor and ctor.
static_assert(!noexcept(B()), "B's destructor is not noexcept");

`noexcept` in the Type System (C++17)

From C++17 onwards, exception specifications are part of the function type:

void foo();           // May throw
void bar() noexcept;  // No-throw

void (*fp)() noexcept = foo; // ERROR in C++17, not compatible type

The two function types are not compatible anymore, enhancing type safety.
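The conversion is one-directional, which the following sketch checks with standard type traits (it assumes compilation as C++17 or later). A pointer to a noexcept function may be converted to a pointer to a potentially throwing one, but not the other way around.

```cpp
#include <type_traits>

void foo() {}           // may throw
void bar() noexcept {}  // no-throw

void (*plain_ptr)() = bar;            // OK: dropping the noexcept guarantee is allowed
void (*strict_ptr)() noexcept = bar;  // OK: the types match exactly

static_assert(!std::is_same<decltype(foo), decltype(bar)>::value,
              "noexcept is part of the function type");
static_assert(std::is_convertible<void (*)() noexcept, void (*)()>::value,
              "noexcept pointer converts to plain pointer");
static_assert(!std::is_convertible<void (*)(), void (*)() noexcept>::value,
              "plain pointer does not convert to noexcept pointer");
```

Templates and overloads can therefore distinguish the two function types, for example to pick a faster path for callbacks that are known not to throw.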

`noexcept` with Lambdas

auto f = []() noexcept { return 42; };
static_assert(noexcept(f()));

Lambdas have accepted a noexcept specifier since C++11; it is written after the parameter list. Before C++23, the parameter list (even an empty one) could not be omitted when noexcept was present.

Support for `consteval` and Immediate Functions (C++20)

C++20's consteval and constinit features pair well with noexcept, allowing better compile-time enforcement:

consteval int f() noexcept {
    return 42;
}

In C++20 and C++23, if evaluating a call to such a function reaches a throw expression, the call is not a constant expression and the compiler reports an error, reinforcing that throwing during constant evaluation is forbidden.

When to Use `noexcept`

Use noexcept when:

You guarantee the function won't throw (e.g., simple math, memory deallocation).
Throwing would be catastrophic, and std::terminate is acceptable.
You aim to enable move optimizations in STL containers.

Avoid using noexcept if there's a possibility of future changes that might introduce exceptions.

Summary

noexcept is essential for writing robust, optimized C++ code.
Use it wisely to guide compiler optimizations and avoid surprises during template instantiations or container operations.
From C++17 onwards, noexcept becomes part of the type system, enhancing type safety.
