STREAM Benchmark

The STREAM benchmark is a standard tool for measuring memory bandwidth.

Using it is simple: download the source (stream.c) and compile it.

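When experts do this seriously, they apparently use the Intel compiler instead of gcc and tune parameters such as the array size. Even so, my lazy approach of just adding an optimization flag (-O3, plus -fopenmp for the multithreaded build) yields roughly 92 GB/s to 105 GB/s, which is almost exactly what Intel's documentation says. The commands and the output of the 24-thread run follow.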
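On the array-size parameter: as far as I recall, the comments in stream.c recommend making each array at least 4 times the total size of the last-level cache, and the size is set at compile time through the STREAM_ARRAY_SIZE macro. The following is only a rough sketch of that arithmetic. The 60 MiB cache figure and the function name are placeholders I made up for illustration, not values measured on this machine.

```python
# Sketch: derive a STREAM array size from the cache size, using the
# "each array at least 4x the last-level cache" guideline.
# The cache size used below is a hypothetical placeholder.

BYTES_PER_ELEMENT = 8  # STREAM reports 8 bytes per array element (double)


def min_stream_array_size(total_cache_bytes: int, factor: int = 4) -> int:
    """Return the smallest element count whose array is factor x the cache."""
    needed_bytes = factor * total_cache_bytes
    # Ceiling division, so the array is never smaller than the target.
    return -(-needed_bytes // BYTES_PER_ELEMENT)


if __name__ == "__main__":
    hypothetical_cache = 60 * 1024 * 1024  # assumption: 60 MiB of LLC in total
    n = min_stream_array_size(hypothetical_cache)
    print(f"gcc -O3 -fopenmp -DSTREAM_ARRAY_SIZE={n} stream.c -o stream_openmp")
```

Substitute the real cache size of your CPU(s) before trusting the result. The default of 10000000 elements (76.3 MiB per array) is what the run below uses.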

# gcc -O3 stream.c -o stream
#
# gcc -O3 -fopenmp stream.c -o stream_openmp
#
# export OMP_NUM_THREADS=24
# ./stream_openmp 
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 10000000 (elements), Offset = 0 (elements)
Memory per array = 76.3 MiB (= 0.1 GiB).
Total memory required = 228.9 MiB (= 0.2 GiB).
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 24
Number of Threads counted = 24
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 24683 microseconds.
   (= 24683 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:          100312.2     0.002189     0.001595     0.004063
Scale:          92691.8     0.001851     0.001726     0.002420
Add:           103541.8     0.002493     0.002318     0.002596
Triad:         105219.3     0.002473     0.002281     0.002672
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
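As a sanity check on how to read the table, the "Best Rate MB/s" column can be recomputed from the array size and the "Min time" column. STREAM counts two arrays of traffic per element for Copy and Scale and three for Add and Triad, and it reports MB as 10^6 bytes. Here is a minimal sketch of that arithmetic using the values printed above; the function and variable names are my own.

```python
# Sketch: recompute STREAM's "Best Rate MB/s" from array size and Min time.
# Copy/Scale touch 2 arrays per element, Add/Triad touch 3.
# MB here means 10^6 bytes, as in STREAM's own report.

ARRAY_SIZE = 10_000_000   # "Array size" from the output above
BYTES_PER_ELEMENT = 8     # "8 bytes per array element"

# kernel -> (arrays touched, Min time in seconds copied from the table)
KERNELS = {
    "Copy": (2, 0.001595),
    "Scale": (2, 0.001726),
    "Add": (3, 0.002318),
    "Triad": (3, 0.002281),
}


def best_rate_mb_s(arrays: int, min_time: float) -> float:
    """Bandwidth in MB/s for one kernel, given its fastest iteration."""
    bytes_moved = arrays * BYTES_PER_ELEMENT * ARRAY_SIZE
    return 1.0e-6 * bytes_moved / min_time


for name, (arrays, min_time) in KERNELS.items():
    print(f"{name:6s} {best_rate_mb_s(arrays, min_time):10.1f} MB/s")
```

The recomputed figures differ slightly from the printed ones because the Min time column is rounded to six decimal places, but they land in the same range of roughly 92 GB/s to 105 GB/s.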