Intel Memory Latency Checker (mlc)

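Measures memory latency and bandwidth for each NUMA node. A single command runs everything, so it is simple and convenient. It can be downloaded from Intel's website.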

On a two-socket machine, local access comes in at around 85 ns and remote access at roughly 130 to 140 ns.

# ./mlc 
Intel(R) Memory Latency Checker - v3.9
Measuring idle latencies (in ns)...
        Numa node
Numa node        0       1  
       0      84.6   132.5  
       1     137.2    85.0  

Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        :  214676.0    
3:1 Reads-Writes :  198112.2    
2:1 Reads-Writes :  196484.1    
1:1 Reads-Writes :  180014.4    
Stream-triad like:  176706.4    

Measuring Memory Bandwidths between nodes within system 
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
        Numa node
Numa node           0           1
       0     108625.7     34419.7
       1      34478.1    106305.6

Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency Bandwidth
Delay   (ns)    MB/sec
==========================
 00000  193.25   215455.4
 00002  193.47   215368.1
 00008  193.03   215284.4
 00015  188.18   215231.4
 00050  158.86   206498.9
 00100  115.37   156563.4
 00200  105.77   107812.6
 00300  101.93    77927.1
 00400   99.50    61162.8
 00500   97.91    50090.8
 00700   96.64    36768.6
 01000   94.84    26342.3
 01300   94.32    20592.4
 01700   93.30    16004.3
 02500   92.33    11171.7
 03500   91.60     8209.9
 05000   91.14     5974.3
 09000   90.24     3648.2
 20000   89.55     2039.8

Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency 49.0
Local Socket L2->L2 HITM latency 49.0
Remote Socket L2->L2 HITM latency (data address homed in writer socket)
            Reader Numa Node
Writer Numa Node     0       1  
            0        -   111.3  
            1    111.1       -  
Remote Socket L2->L2 HITM latency (data address homed in reader socket)
            Reader Numa Node
Writer Numa Node     0       1  
            0        -   178.5  
            1    180.7       -
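
To put a number on the local versus remote gap, the idle latency matrix can be parsed and averaged. The following is only a minimal sketch in Python, not part of mlc itself: the function names `parse_numa_matrix` and `local_remote_summary` are made up for illustration, and the parser assumes the whitespace-separated layout seen in the v3.9 output above (other versions may print the matrix differently).

```python
# Idle-latency section copied from the mlc output above.
SAMPLE = """\
Measuring idle latencies (in ns)...
        Numa node
Numa node        0       1
       0      84.6   132.5
       1     137.2    85.0
"""


def parse_numa_matrix(text):
    """Return {(row_node, col_node): value} for the first NUMA matrix in text.

    Hypothetical helper: assumes a header line "Numa node 0 1 ..." followed
    by one row per node, as in the output shown in this post.
    """
    matrix = {}
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if fields[:2] == ["Numa", "node"] and len(fields) > 2:
            # Header row: the remaining fields are the column node ids.
            columns = [int(f) for f in fields[2:]]
        elif columns and len(fields) == len(columns) + 1 and fields[0].isdigit():
            # Data row: first field is the row node id, the rest are values.
            row = int(fields[0])
            for col, value in zip(columns, fields[1:]):
                matrix[(row, col)] = float(value)
        elif columns and matrix:
            break  # the first non-matching line after the rows ends the matrix
    return matrix


def local_remote_summary(matrix):
    """Average the diagonal (local) and off-diagonal (remote) entries."""
    local = [v for (r, c), v in matrix.items() if r == c]
    remote = [v for (r, c), v in matrix.items() if r != c]
    local_avg = sum(local) / len(local)
    remote_avg = sum(remote) / len(remote)
    return local_avg, remote_avg, remote_avg / local_avg


latencies = parse_numa_matrix(SAMPLE)
local_avg, remote_avg, ratio = local_remote_summary(latencies)
print(f"local {local_avg:.1f} ns, remote {remote_avg:.1f} ns, ratio {ratio:.2f}x")
```

For the numbers above, remote idle latency works out to a bit under 1.6 times the local latency. The same parser should also work on the node-to-node bandwidth matrix, since it uses the same layout; there, remote bandwidth is roughly a third of local.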