RISC-V Public Beta Platform Released · Stream Bandwidth Full Test

RISC-V Public Beta Platform Stream Program Path: /public/benchmark/stream/5.10

Introduction

“Stream” is a benchmark tool used to evaluate the memory bandwidth of a computer system. It runs simple vector operations over large contiguous arrays and measures how fast the system can read and write that memory, which reflects its sustained data transfer rate rather than its peak compute performance.
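
To make the access patterns concrete, the sketch below mirrors the four kernels that Stream times (Copy, Scale, Add, Triad) and the number of bytes each one is counted as moving. It is a plain-Python illustration and not the benchmark itself: the function and constant names are made up for this example, the element size assumes the default 8-byte double, and the real stream.c runs these loops in C over far larger arrays.

```python
# Illustrative sketch of the four Stream kernels. All names here are
# hypothetical, chosen for this example; the real benchmark is written in C.

ELEMENT_BYTES = 8  # assumption: the default element type is an 8-byte double

# Arrays touched per element: Copy and Scale read one array and write one;
# Add and Triad read two arrays and write one.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def run_kernels(a, b, c, scalar):
    """Apply the four kernels in order, updating the lists in place."""
    n = len(a)
    for j in range(n):          # Copy:  c = a
        c[j] = a[j]
    for j in range(n):          # Scale: b = scalar * c
        b[j] = scalar * c[j]
    for j in range(n):          # Add:   c = a + b
        c[j] = a[j] + b[j]
    for j in range(n):          # Triad: a = b + scalar * c
        a[j] = b[j] + scalar * c[j]


def bandwidth_mb_per_s(kernel, n_elements, seconds):
    """Bandwidth in MB/s (1 MB = 10**6 bytes) for one timed kernel pass."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * ELEMENT_BYTES * n_elements
    return bytes_moved / seconds / 1e6
```

The time passed to bandwidth_mb_per_s would come from a timer wrapped around a single kernel pass. Because Add and Triad touch three arrays instead of two, the same elapsed time corresponds to 1.5 times the bandwidth of Copy or Scale.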

Platform Information
Hardware Specification
Processor: SOPHON SG2042
DDR: 128GB, 3200MT/s

Chip Specification
Clock Frequency: 2.0GHz
Number of Cores: 64 cores
L1 Cache: I: 64KB and D: 64KB (per core)
L2 Cache: 1MB per cluster (16 clusters)
L3 Cache: 64MB System Cache

Software Specification
Linux Version: Ubuntu 22.10
GCC Version: 12.2.0 (GNU)


Explanation of Parameters

Let’s first understand the specific usage of the test parameters.

1. ARRAY_SIZE:
Used to specify the size of each array used during testing. The value is a number of elements, not bytes: it is passed to the compiler as STREAM_ARRAY_SIZE, and with the default 8-byte double element type each array occupies 8 × ARRAY_SIZE bytes. By changing the value of ARRAY_SIZE, the system's performance can be evaluated under different memory workloads. For the size setting, the official documentation gives the following guidance:

The general rule for STREAM is that each array must be at least 4x the size of the sum of all the last-level caches used in the run.

In other words, each array must be at least 4 times the combined size of the last-level caches.
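
As a worked example of this rule, the sketch below converts a last-level cache size into a minimum array size. It assumes that the array size counts elements of the default 8-byte double type and that 64MB means 64 × 1024 × 1024 bytes; the function names are hypothetical and only serve this illustration.

```python
def min_stream_array_size(last_level_cache_bytes, element_bytes=8, factor=4):
    """Smallest element count whose single-array footprint is factor x the cache."""
    needed_bytes = factor * last_level_cache_bytes
    # Ceiling division, so the array is never smaller than the target footprint.
    return -(-needed_bytes // element_bytes)


def total_footprint_bytes(n_elements, element_bytes=8, arrays=3):
    """Memory used by the three Stream arrays (a, b, c) for a given element count."""
    return arrays * element_bytes * n_elements


llc = 64 * 1024 * 1024  # the 64MB L3 system cache listed for this platform
n = min_stream_array_size(llc)
print(n)                                  # 33554432
print(total_footprint_bytes(n) // 2**20)  # 768 (MiB across the three arrays)
```

For the 64MB L3 cache of this platform the result is 33554432 elements per array, or 256MiB per array and 768MiB for all three arrays together.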

2. OpenMP:
Multi-threading support is not a Stream parameter as such; it is enabled at build time by passing the “-fopenmp” option to the GCC compiler.

Single-Thread Stream Test
Since our L3 Cache size is 64MB, we follow the recommendation from the official documentation and, for testing accuracy, use arrays four times that size, i.e. 256MB each. With 8-byte elements this corresponds to an array size of 33554432, which we take as the base. We then increase the array size in steps of 2621440 and observe the impact on the test results. The single-thread test is compiled with GCC, without the OpenMP option.

Single-thread test command:

ubuntu@perfxlab:~/STREAM$ gcc -O3 -DSTREAM_ARRAY_SIZE=【ARRAY_SIZE】 stream.c
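
The sweep over array sizes can be scripted. The sketch below generates the array sizes from the base and step given above and builds the matching gcc command line for each. The helper names are hypothetical, and the number of steps is an arbitrary choice for illustration, since the number of sizes tested is not stated here.

```python
BASE_SIZE = 33554432   # starting array size from the text
STEP = 2621440         # increment from the text


def sweep_sizes(steps, base=BASE_SIZE, step=STEP):
    """Array sizes for a sweep: base, base + step, base + 2*step, ..."""
    return [base + i * step for i in range(steps)]


def compile_command(array_size, openmp=False):
    """Build the gcc command line shown in this article for one array size."""
    flags = ["-O3"]
    if openmp:
        flags.append("-fopenmp")
    flags.append(f"-DSTREAM_ARRAY_SIZE={array_size}")
    return " ".join(["gcc", *flags, "stream.c"])


# steps=3 is only an example value, not the number of sizes actually tested.
for size in sweep_sizes(3):
    print(compile_command(size))
```

Passing openmp=True to compile_command adds the -fopenmp flag, which gives the command line for the multi-thread build.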

Test results are shown in the following table:


Multi-Thread Stream Test

As in the single-thread Stream test, we run the multi-threaded Stream test over a range of array sizes, using the same sizes as in the single-thread test.

Multi-thread test command:

ubuntu@perfxlab:~/STREAM$ gcc -O3 -fopenmp -DSTREAM_ARRAY_SIZE=【ARRAY_SIZE】 stream.c
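
The number of threads used by an OpenMP build is normally controlled at run time through the standard OMP_NUM_THREADS environment variable. The thread count used for these tests is not stated here, so the sketch below is only an assumption about how such a run could be set up: it prepares an environment with one thread per core, and the helper name and the ./a.out binary name (gcc's default output file) are illustrative.

```python
import os


def openmp_run_env(threads, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(threads)
    return env


# Assumption: one thread per core on the 64-core SG2042.
env = openmp_run_env(64)
print(env["OMP_NUM_THREADS"])

# To launch the compiled benchmark with this setting (binary name assumed):
#   import subprocess
#   subprocess.run(["./a.out"], env=env, check=True)
```

Varying the thread count this way, with the array size held fixed, would show at what point the memory bandwidth stops growing with additional cores.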

Test results are shown in the following table:


Conclusion


The results show that the multi-core test achieves significantly higher bandwidth than the single-core test, because multiple cores issue memory accesses in parallel. In the multi-core test, the bandwidth of the Copy operation is approximately 5 times that of the single-core test, while the bandwidth of the Scale operation is around 6 times that of the single-core test.
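
To put these ratios in perspective, the sketch below relates a speedup factor to the number of threads. It uses the approximate factors quoted above (5 for Copy, 6 for Scale) and assumes the multi-core run used all 64 cores, which is not stated explicitly; the function names are hypothetical.

```python
def speedup(multi_bandwidth, single_bandwidth):
    """Ratio of multi-thread to single-thread bandwidth."""
    return multi_bandwidth / single_bandwidth


def parallel_efficiency(speedup_factor, threads):
    """Fraction of ideal linear scaling achieved with the given thread count."""
    return speedup_factor / threads


THREADS = 64  # assumption: one thread per core

# Approximate speedup factors quoted in the text.
for kernel, factor in {"Copy": 5.0, "Scale": 6.0}.items():
    print(f"{kernel}: {parallel_efficiency(factor, THREADS):.1%} of linear scaling")
# Copy: 7.8% of linear scaling
# Scale: 9.4% of linear scaling
```

Under that assumption the scaling is far below linear, which is what one would expect from a benchmark limited by shared memory bandwidth and not by the number of cores.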

In conclusion, multi-core testing exhibits significant bandwidth advantages in the Copy and Scale operations, while the improvement in the Add and Triad operations is relatively small. One likely factor is the memory traffic of each kernel: Add and Triad read two arrays and write one, whereas Copy and Scale read one array and write one, so the four kernels do not load the shared memory system in the same way. Overall, the multi-core results demonstrate the advantages of multi-core processors in parallel, memory-intensive data processing.

Reprinted from RISC-V Public Beta Platform Released · Stream Bandwidth Full Test