Understanding Branch Prediction in C++: likely and unlikely
When writing performance-critical applications, understanding and optimizing for branch prediction can lead to noticeable improvements. In this post, we'll explore how branch prediction works and demonstrate its impact in C++ using the likely and unlikely macros.
What is Branch Prediction?
Modern CPUs process instructions using pipelines. When a conditional branch (like an if statement) is encountered, the CPU predicts the outcome and speculatively executes the instructions along the predicted path.
Why Does Branch Prediction Matter?
If the CPU predicts correctly, the program continues smoothly.
If it mispredicts, the pipeline is flushed, and execution restarts at the correct branch. This "branch misprediction" can cost 10-20 cycles or more, depending on the CPU architecture.
Branch prediction works well for branches that follow a pattern, but it can fail for branches whose outcome is effectively random, leading to performance penalties.
Introducing likely and unlikely
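GCC and Clang provide a builtin, __builtin_expect, that hints the expected outcome of a branch to the compiler. It is commonly wrapped in two macros (C++20 also offers the standard [[likely]] and [[unlikely]] attributes):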
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
How Do They Work?
likely(x): Tells the compiler that x is most likely to evaluate to true.
unlikely(x): Tells the compiler that x is most likely to evaluate to false.
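These hints allow the compiler to optimize the generated assembly code, typically by placing the likely case on the straight-line (fall-through) path and moving the unlikely case out of the way. They influence code layout; the CPU's branch predictor still learns each branch's behavior at run time.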
Example Program: Demonstrating likely and unlikely
Below is a C++ program that compares the performance of code with and without branch prediction hints.
Code: Likely vs. Unlikely
#include <iostream>
#include <chrono>
#include <cstdlib>

#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

const int NUM_ITERATIONS = 100000000; // Large number for measurable performance.

int branch_without_hints() {
    int count = 0;
    for (int i = 0; i < NUM_ITERATIONS; ++i) {
        if (rand() % 100 < 95) { // 95% chance of true
            count += 1;
        } else {
            count -= 1;
        }
    }
    return count;
}

int branch_with_hints() {
    int count = 0;
    for (int i = 0; i < NUM_ITERATIONS; ++i) {
        if (likely(rand() % 100 < 95)) { // Hint: this branch is likely.
            count += 1;
        } else {
            count -= 1;
        }
    }
    return count;
}

int main() {
    srand(42); // Fixed seed for reproducibility.

    // Measure performance without hints.
    auto start_without = std::chrono::high_resolution_clock::now();
    int result_without = branch_without_hints();
    auto end_without = std::chrono::high_resolution_clock::now();
    double duration_without = std::chrono::duration<double>(end_without - start_without).count();

    // Measure performance with hints.
    auto start_with = std::chrono::high_resolution_clock::now();
    int result_with = branch_with_hints();
    auto end_with = std::chrono::high_resolution_clock::now();
    double duration_with = std::chrono::duration<double>(end_with - start_with).count();

    // Output results.
    std::cout << "Result without hints: " << result_without << "\n";
    std::cout << "Time without hints: " << duration_without << " seconds\n";
    std::cout << "Result with hints: " << result_with << "\n";
    std::cout << "Time with hints: " << duration_with << " seconds\n";
    return 0;
}
How It Works
Branch Without Hints
if (rand() % 100 < 95)
This condition checks whether a random value is less than 95, which happens 95% of the time.
The compiler has no prior knowledge of the likelihood of this condition, so it lays out the code using its default heuristics, and the CPU's branch predictor has to learn the pattern at run time.
Branch With Hints
if (likely(rand() % 100 < 95))
This adds a hint, telling the compiler that the condition is likely to be true most of the time.
The compiler rearranges the assembly code, optimizing for the common case (true).
Probability of the Condition
rand() % 100 generates a random number in the range [0, 99].
The condition rand() % 100 < 95 is true for numbers {0, 1, ..., 94} (95 values).
Probability:
True: 95%
False: 5%
Results
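When you compile and run the program, you'll see output along these lines. Exact timings depend on your compiler, optimization flags, and CPU:
Example Output
Result without hints: 95000000
Time without hints: 2.15 seconds
Result with hints: 95000000
Time with hints: 1.95 seconds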
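Results: Both functions execute the same logic, so they report essentially the same count (95000000 in the output above). Note that with the +1/-1 update in the listing, the expected count is about 90% of NUM_ITERATIONS, and the two counts can differ slightly because the second function continues the same rand() sequence.
Execution Time: In this run, the version with likely was faster. The hint lets the compiler place the common case on the fall-through path. Because the call to rand() accounts for much of the work in each iteration, the difference may be small or absent on other compilers and machines.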
Why Does It Work?
Branch Prediction in CPUs
CPUs guess the outcome of branches based on patterns. If the prediction is correct, execution proceeds smoothly. If not, the pipeline stalls, causing a delay.
The penalty for a misprediction can be significant, especially in performance-critical loops.
Impact of Hints
The likely macro tells the compiler which path to treat as the common case, so the hot path can run as straight-line code while the rarely taken path is moved out of the way.
When Should You Use likely and unlikely?
Use Cases:
Error Handling: Rare error cases can be marked as unlikely.
Hot Paths: Frequently executed code paths can use likely.
Performance-Critical Code: Systems like real-time applications, networking, or databases where branch misprediction penalties matter.
Avoid Overuse: Misusing these hints in code where branch probabilities are uncertain can degrade performance or make the code harder to maintain.
Key Takeaways
Branch prediction hints (likely and unlikely) help the compiler optimize code for better performance.
They are most effective in performance-critical applications with predictable branch behavior.
Use them sparingly and only when you're certain about the likelihood of branch conditions.
Try the program yourself and see how likely and unlikely affect performance on your machine!