I have finally had a chance to carry out a few HPCC benchmark runs. As with the Linpack benchmarks, I have compiled the HPCC benchmark with both MKL and ATLAS.
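Generally speaking, MKL continues to perform better than ATLAS, most visibly in the HPL and DGEMM results. Some of the benchmarks, however, are hardly affected at all by the choice of library. This should not be surprising, as tests such as STREAM, RandomAccess and the ping-pong measurements relate to memory and network performance rather than to the BLAS routines.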
It is worth bearing in mind that this two-node cluster does not have its own separate network switch, so the results, particularly the latency and bandwidth figures, will vary more from run to run than in a cluster with dedicated networking.
Metric | MKL | ATLAS |
HPL_Tflops | 0.0689043 | 0.025042 |
StarDGEMM_Gflops | 6.98082 | 1.95127 |
SingleDGEMM_Gflops | 8.489 | 2.0767 |
PTRANS_GBs | 0.285695 | 0.282756 |
MPIRandomAccess_LCG_GUPs | 0.0107743 | 0.0106545 |
MPIRandomAccess_GUPs | 0.00980471 | 0.011238 |
StarRandomAccess_LCG_GUPs | 0.00292826 | 0.00305111 |
SingleRandomAccess_LCG_GUPs | 0.0202582 | 0.0209147 |
StarRandomAccess_GUPs | 0.00294884 | 0.0030755 |
SingleRandomAccess_GUPs | 0.0222603 | 0.0219117 |
StarSTREAM_Copy | 0.424375 | 0.49285 |
StarSTREAM_Scale | 0.445834 | 0.52021 |
StarSTREAM_Add | 0.596884 | 0.629714 |
StarSTREAM_Triad | 0.640154 | 0.684154 |
SingleSTREAM_Copy | 3.0544 | 3.06379 |
SingleSTREAM_Scale | 3.03374 | 3.03051 |
SingleSTREAM_Add | 3.339 | 3.332 |
SingleSTREAM_Triad | 3.32887 | 3.31208 |
StarFFT_Gflops | 0.242265 | 0.233275 |
SingleFFT_Gflops | 0.894628 | 0.93583 |
MPIFFT_Gflops | 0.55462 | 0.402139 |
MaxPingPongLatency_usec | 717.918 | 499.696 |
RandomlyOrderedRingLatency_usec | 393.943 | 201.655 |
MinPingPongBandwidth_GBytes | 0.0386536 | 0.0312215 |
NaturallyOrderedRingBandwidth_GBytes | 0.0129542 | 0.012228 |
RandomlyOrderedRingBandwidth_GBytes | 0.021249 | 0.016799 |
MinPingPongLatency_usec | 0.384119 | 0.357628 |
AvgPingPongLatency_usec | 332.504 | 291.221 |
MaxPingPongBandwidth_GBytes | 3.87644 | 3.36689 |
AvgPingPongBandwidth_GBytes | 0.839076 | 0.479826 |
NaturallyOrderedRingLatency_usec | 585.89 | 293.112 |
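To put a number on how much each test depends on the library, the two result sets can be reduced to an MKL/ATLAS ratio per metric. The following is only a rough sketch in Python, not something I ran as part of the benchmarks: it assumes each run's summary is available as plain KEY=value lines, and the function names `parse_summary` and `ratios` are made up for illustration. Only a handful of the figures from the table are included as sample input.

```python
def parse_summary(text):
    """Turn KEY=value lines into a dict of floats, ignoring anything else."""
    results = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # not a KEY=value line
        try:
            results[key] = float(value)
        except ValueError:
            continue  # value is not numeric, skip it
    return results


def ratios(mkl, atlas):
    """MKL/ATLAS ratio for every metric present in both result sets."""
    return {k: mkl[k] / atlas[k] for k in mkl if k in atlas and atlas[k] != 0}


# A few of the values from the table above (hypothetical variable names).
MKL_SAMPLE = """\
HPL_Tflops=0.0689043
StarDGEMM_Gflops=6.98082
SingleDGEMM_Gflops=8.489
PTRANS_GBs=0.285695
SingleSTREAM_Triad=3.32887
"""

ATLAS_SAMPLE = """\
HPL_Tflops=0.025042
StarDGEMM_Gflops=1.95127
SingleDGEMM_Gflops=2.0767
PTRANS_GBs=0.282756
SingleSTREAM_Triad=3.31208
"""

sample_ratios = ratios(parse_summary(MKL_SAMPLE), parse_summary(ATLAS_SAMPLE))

# Largest ratio first. For throughput metrics a ratio above 1 favours MKL;
# for the *_usec latency metrics (not in this sample) lower is better,
# so the ratio would have to be read the other way round.
for name, r in sorted(sample_ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:24s} {r:5.2f}x")
```

With the figures above, the HPL ratio works out at about 2.75 and SingleDGEMM at about 4.1, while PTRANS and SingleSTREAM_Triad stay close to 1, which matches the observation that only the BLAS-heavy tests care much about the library.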