Original benchmark by John Sokol (1998-2005)
Updated with modern hardware: January 2026
Overview
Budmark is a CPU benchmark built around a brute-force search for optimal error-correcting codes (ECC). The search looks for sets of codewords whose minimum Hamming distance is 5, which makes it a pure integer, bit-manipulation workload that fits entirely in cache.
Unless a row says otherwise (the IBM Power2 entry was built with XLC -O2), all results use unoptimized compilation (-O0), so the numbers reflect the CPU itself rather than compiler optimizations.
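To make the workload concrete, here is a minimal sketch of a distance-constrained codeword search. It is not the code in ecc4.c: the function names, the 8-bit word size, and the greedy scan (simpler than an exhaustive search for an optimal code) are all assumptions chosen for illustration. It does show why the inner loop is nothing but XOR and bit counting.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two codewords differ."""
    return bin(a ^ b).count("1")


def greedy_code(n_bits: int, min_distance: int) -> list[int]:
    """Scan every n_bits-wide word in ascending order and keep each word
    that is at least min_distance away from all words kept so far.

    Hypothetical helper for illustration; a true brute-force search for an
    optimal code would explore many more combinations than this single pass.
    """
    codewords: list[int] = []
    for candidate in range(1 << n_bits):
        if all(hamming_distance(candidate, kept) >= min_distance
               for kept in codewords):
            codewords.append(candidate)
    return codewords


if __name__ == "__main__":
    code = greedy_code(n_bits=8, min_distance=5)
    print(len(code), [format(word, "08b") for word in code])
```

Every candidate is compared against every kept codeword, so the working set is a short list of integers and the cost is dominated by integer ALU operations rather than memory traffic.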
Modern Hardware Results (2026)
Single-Core Performance
| CPU | Architecture | Clock | iter/sec | vs Xeon PII 450MHz |
|---|---|---|---|---|
| Intel i5-4590 | x86-64 Haswell | 3.3GHz | 32,631 | 251,000x |
| Pi 5 Cortex-A76 | ARMv8 | 2.4GHz | 18,454 | 142,000x |
| Pi 4 Cortex-A72 | ARMv8 | 1.5GHz | 8,049 | 62,000x |
| Pi Zero ARM1176 | ARMv6 | 1.0GHz | 1,326 | 10,200x |
Multi-Core Performance
| CPU | Cores | Single | Multi | Scaling | vs Xeon PII 450MHz |
|---|---|---|---|---|---|
| Intel i5-4590 | 4 | 32,631 | 121,500 | 3.73x | 935,000x |
| Pi 5 Cortex-A76 | 4 | 18,454 | 73,000 | 3.96x | 562,000x |
| Pi 4 Cortex-A72 | 4 | 8,049 | 30,200 | 3.75x | 232,000x |
| Pi Zero ARM1176 | 1 | 1,326 | — | — | 10,200x |
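The Scaling and "vs Xeon PII" columns can be rederived from the raw iter/sec figures. The sketch below assumes the baseline is the Xeon PII 450MHz entry (0.13 iter/sec) from the historical table; the function and variable names are illustrative only.

```python
XEON_PII_450_ITER_PER_SEC = 0.13  # baseline taken from the historical table

# (single-core iter/sec, 4-core iter/sec) as reported in the tables above
results = {
    "Intel i5-4590": (32_631, 121_500),
    "Pi 5 Cortex-A76": (18_454, 73_000),
    "Pi 4 Cortex-A72": (8_049, 30_200),
}


def scaling(single: float, multi: float) -> float:
    """Multi-core throughput as a multiple of single-core throughput."""
    return multi / single


def speedup_vs_baseline(iter_per_sec: float,
                        baseline: float = XEON_PII_450_ITER_PER_SEC) -> float:
    """How many times faster a result is than the baseline CPU."""
    return iter_per_sec / baseline


if __name__ == "__main__":
    for cpu, (single, multi) in results.items():
        print(f"{cpu}: scaling {scaling(single, multi):.2f}x, "
              f"multi-core {speedup_vs_baseline(multi):,.0f}x the baseline")
```

Because the Multi figures are rounded, the recomputed values can differ from the table in the last digit.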
Historical Results (1998-2005)
From the original Budmark benchmark page.
| CPU | Clock | Run Time (s) | iter/sec | Efficiency | OS |
|---|---|---|---|---|---|
| P4 | 3800MHz | 1.27 | 0.79 | 71.6% | WinXP Cygwin |
| P4 | 3000MHz | 1.57 | 0.64 | 73.7% | Linux 2.6.11 |
| P4 | 2266MHz | 2.29 | 0.44 | 66.7% | FBSD 4.4 |
| P4 | 1800MHz | 4.23 | 0.24 | 45.5% | Win2K Cygwin |
| P3 | 1150MHz | 3.06 | 0.33 | 98.4% | OBSD 3.1 |
| VIA Eden | 1000MHz | 8.07 | 0.12 | 42.9% | Win2K Cygwin |
| P3 | 866MHz | 3.99 | 0.25 | 100.2% | Slackware |
| Celeron | 766MHz | 4.67 | 0.21 | 96.8% | FBSD 4.6.2 |
| AMD-K7 | 550MHz | 7.28 | 0.14 | 86.5% | RedHat |
| Celeron | 533MHz | 6.72 | 0.15 | 96.7% | FBSD 4.6.2 |
| Xeon PII | 450MHz | 7.70 | 0.13 | 100% | FBSD 3.0 |
| AMD-K6 | 450MHz | 8.47 | 0.12 | 90.9% | FBSD 2.2.7 |
| Xeon PII | 400MHz | 8.69 | 0.12 | 99.6% | FBSD |
| AMD-K6 | 350MHz | 10.31 | 0.10 | 95.9% | FBSD |
| Intel PII | 333MHz | 10.75 | 0.09 | 96.8% | FBSD 4.6.2 |
| AMD-K6 | 300MHz | 12.64 | 0.08 | 91.3% | FBSD |
| Cyrix GXm | 233MHz | 32.31 | 0.03 | 46.0% | FBSD |
| Pentium | 166MHz | 38.21 | 0.026 | 54.6% | FBSD 2.1.0 |
| IBM Power2 | 135MHz | 38.61 | 0.026 | 66.4% | AIX XLC -O2 |
| Pentium | 133MHz | 47.90 | 0.021 | 54.4% | FBSD |
| Pentium | 120MHz | 52.92 | 0.019 | 54.5% | FBSD |
| 486DX2 | 66MHz | 115.46 | 0.0087 | 45.4% | FBSD 2.2.7 |
| 486DX | 66MHz | 153.38 | 0.0065 | 34.2% | FBSD 3.1 |
| 486 | 33MHz | 230.42 | 0.0043 | 45.5% | FBSD 3.1 |
| 386DX | 40MHz | 537.78 | 0.0019 | 16.1% | FBSD 3.1 |
| 386 | 40MHz | 784.51 | 0.0013 | 11.0% | FBSD 3.1 |
| 386 | 16MHz | 1997.80 | 0.0005 | 10.8% | FBSD 3.1 |
Efficiency Analysis
Efficiency measures work-per-clock-cycle, normalized to Xeon PII 450MHz = 100%.
| CPU | Clock | Efficiency | Notes |
|---|---|---|---|
| P3 866MHz | 866MHz | 100% | Peak efficiency era |
| Xeon PII | 450MHz | 100% | Baseline |
| P4 3800MHz | 3800MHz | 72% | NetBurst penalty |
| P4 1800MHz | 1800MHz | 46% | Early P4 very inefficient |
| i5-4590 | 3300MHz | ~3,420,000% (~34,200x) | Modern IPC gains |
| Pi 5 A76 | 2400MHz | ~2,660,000% (~26,600x) | ARM efficiency |
Key observation: The Pentium 4 (NetBurst) architecture traded efficiency for clock speed. A P4 at 3.8GHz was only ~6x faster than a Xeon PII at 450MHz, despite having 8.4x the clock speed.
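The efficiency figures appear to follow from dividing the speedup over the baseline by the clock ratio. That formula is inferred from the table rather than stated by the original page, so treat the sketch below as an assumption; it reproduces the published percentages to within a few tenths of a point.

```python
BASE_CLOCK_MHZ = 450.0   # Xeon PII 450MHz, defined as 100% efficiency
BASE_RUNTIME_S = 7.70    # its run time from the historical table


def efficiency_pct(clock_mhz: float, runtime_s: float) -> float:
    """Work per clock cycle relative to the baseline, in percent.

    Inferred formula: (baseline run time / run time) gives the speedup,
    and dividing by (clock / baseline clock) removes the clock advantage.
    """
    speedup = BASE_RUNTIME_S / runtime_s
    clock_ratio = clock_mhz / BASE_CLOCK_MHZ
    return 100.0 * speedup / clock_ratio


if __name__ == "__main__":
    # P4 3800MHz: roughly 6x the speed for roughly 8.4x the clock
    print(f"speedup     {BASE_RUNTIME_S / 1.27:.2f}x")
    print(f"clock ratio {3800 / BASE_CLOCK_MHZ:.2f}x")
    print(f"efficiency  {efficiency_pct(3800, 1.27):.1f}%")
```

The same function applied to the P3 866MHz row (3.99 s) lands just above 100%, matching the "peak efficiency" entry.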
Modern CPUs have recovered efficiency through:
- Shorter pipelines than NetBurst, paired with much better branch prediction
- Larger caches (L1/L2/L3)
- Out-of-order execution improvements
- Better memory controllers
Raspberry Pi Comparison
| Model | CPU | Clock | Cores | Price | Multi iter/sec | Value (iter/sec per $) |
|---|---|---|---|---|---|---|
| Pi Zero | ARM1176 | 1.0GHz | 1 | $5 | 1,326 | 265 |
| Pi 4 | Cortex-A72 | 1.5GHz | 4 | $35 | 30,200 | 863 |
| Pi 5 | Cortex-A76 | 2.4GHz | 4 | $60 | 73,000 | 1,217 |
The Pi 5 offers the best performance per dollar of the three boards for compute workloads.
Both quad-core Pi models scale almost linearly (3.75x and 3.96x on 4 cores); the Pi Zero has a single core, so scaling does not apply to it.
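The Value column is simply throughput divided by price. A short sketch, using the prices and iter/sec figures from the table above (the tuple layout and function name are illustrative):

```python
# (model, price in USD, best iter/sec) copied from the comparison table
boards = [
    ("Pi Zero", 5, 1_326),
    ("Pi 4", 35, 30_200),
    ("Pi 5", 60, 73_000),
]


def value_per_dollar(iter_per_sec: float, price_usd: float) -> float:
    """Benchmark throughput bought by each dollar of board price."""
    return iter_per_sec / price_usd


if __name__ == "__main__":
    for model, price, rate in boards:
        print(f"{model}: {value_per_dollar(rate, price):,.0f} iter/sec per $")
    best = max(boards, key=lambda b: value_per_dollar(b[2], b[1]))
    print("best value:", best[0])
```

This counts the board price only; power supplies, storage, and cooling would shift the ratios.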
Test Commands
Single-core test
gcc -O0 -o ecc4_original ecc4.c -lm
time ./ecc4_original 100000
Multi-core test (4 cores)
time (./ecc4_original 100000 & ./ecc4_original 100000 & ./ecc4_original 100000 & ./ecc4_original 100000 & wait)
Equivalent workload to 1998 Xeon (7.7s)
# On i5-4590: ~245,000 iterations
time ./ecc4_original 245000
Summary
| Era | Best CPU | iter/sec | Improvement |
|---|---|---|---|
| 1988 | 386 16MHz | 0.0005 | — |
| 1998 | Xeon PII 450MHz | 0.13 | 260x |
| 2005 | P4 3800MHz | 0.79 | 1,580x |
| 2014 | i5-4590 (single) | 32,631 | 65M x |
| 2023 | Pi 5 (multi) | 73,000 | 146M x |
| 2014 | i5-4590 (multi) | 121,500 | 243M x |
With all four cores in use, a $60 Raspberry Pi 5 is about 562,000x faster than a 1998 enterprise Xeon server (Xeon PII 450MHz).
Benchmark and original data: John Sokol, 1998-2026
https://www.dnull.com/cpubenchmark/budmark3.html