Micron 6500 ION provides massive WEKA performance on AMD-based servers

Sujit Somandepalli | July 2023

With the launch of the Micron 6500 ION NVMe SSD, we recently had the opportunity to run some interesting scaling studies with WEKA™ [1] on 4th Gen AMD EPYC™ 9554-based (64-core) Supermicro platforms.

WEKA is a high-performance, NVMe™-based, software-defined storage solution that is commonly used in large-scale file storage deployments for a variety of use cases, including high-performance computing (HPC) and artificial intelligence (AI).

We tested on a cluster of six Supermicro AS-1115CS-TNR [2] single-socket servers based on AMD EPYC 9004 Series processors, with Micron DDR5 memory and 400GbE-capable, PCIe® Gen5 networking. This server is well-suited for designing a clustered storage system with WEKA due to its high-performance Zen 4 cores and simplified single-socket design. Each server supports up to 10 NVMe SSDs, for a total of 60 drives in the test cluster.

This six-node WEKA storage cluster was connected to 12 clients, each of which ran the flexible input/output tester (fio) [3] with 32 jobs at various IO depths (queue depths).
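As a rough illustration of what such a client-side sweep might look like, the sketch below builds fio command lines for the workloads discussed in this post. Only the 32 jobs, the block sizes and the workload types come from the text. The queue-depth list, the `libaio` engine, direct IO, the runtime, the file size and the `/mnt/weka/fio` mount path are assumptions made for illustration, not the settings used in our testing. The script only prints the commands; it does not run fio.

```python
import shlex

# Workloads named in this post: (fio --rw mode, block size).
WORKLOADS = [
    ("read", "1m"),
    ("read", "128k"),
    ("write", "128k"),
    ("randread", "4k"),
    ("randwrite", "4k"),
]

# Hypothetical queue-depth sweep; the post only says "various IO depths".
IO_DEPTHS = [1, 4, 16, 32]


def build_fio_command(rw, block_size, iodepth, directory,
                      numjobs=32, runtime_s=60):
    """Return the fio argument list for one workload at one queue depth."""
    return [
        "fio",
        f"--name={rw}-{block_size}-qd{iodepth}",
        f"--rw={rw}",
        f"--bs={block_size}",
        f"--numjobs={numjobs}",      # 32 jobs per client, as in the post
        f"--iodepth={iodepth}",
        "--ioengine=libaio",         # assumed engine
        "--direct=1",                # assumed: bypass the client page cache
        "--time_based",
        f"--runtime={runtime_s}",    # assumed runtime in seconds
        "--size=1g",                 # assumed per-job file size
        f"--directory={directory}",  # assumed WEKA mount point
        "--group_reporting",
    ]


def build_sweep(directory):
    """Return one command per (workload, queue depth) combination."""
    return [
        build_fio_command(rw, bs, qd, directory)
        for rw, bs in WORKLOADS
        for qd in IO_DEPTHS
    ]


for cmd in build_sweep("/mnt/weka/fio"):
    print(shlex.join(cmd))
```

Each printed line would be run on every client at the same time, with results aggregated across the 12 clients.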

Our initial testing with this cluster used 36 drives (six drives in each of the six nodes), which we then expanded to 60 drives (10 drives in each node). The results speak for themselves.

Sequential performance

In the 1MB sequential read workloads, going from six drives to 10 drives per node raised throughput from about 164GB/s to as much as 229GB/s. This result represents a nearly 40% improvement in sequential reads. For 128KB sequential read workloads, we observed almost double the performance of the six-drive configuration.

Write performance for the sequential workloads peaks at about 106GB/s, which is limited by the compute on the WEKA backends.
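The sequential figures above reduce to simple arithmetic. The short sketch below recomputes the read improvement and splits the cluster-wide numbers evenly across the six nodes. The even split is a simplifying assumption for illustration; per-node throughput was not reported separately.

```python
NODES = 6  # storage nodes in the test cluster


def percent_gain(before, after):
    """Relative improvement from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def per_node(total_gb_s, nodes=NODES):
    """Cluster throughput divided evenly across nodes (an assumption)."""
    return total_gb_s / nodes


read_6_drives = 164.0   # GB/s, 1MB sequential read, six drives per node
read_10_drives = 229.0  # GB/s, 1MB sequential read, 10 drives per node
write_peak = 106.0      # GB/s, sequential write ceiling

gain = percent_gain(read_6_drives, read_10_drives)
print(f"1MB sequential read gain: {gain:.1f}%")
print(f"Per-node read: {per_node(read_6_drives):.1f} -> "
      f"{per_node(read_10_drives):.1f} GB/s")
print(f"Per-node write ceiling: {per_node(write_peak):.1f} GB/s")
```

The computed gain of about 39.6% matches the "nearly 40%" quoted above.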

1M Seq Read - 12 Clients Graph
128K Seq Read - 12 Clients Graph
128K Seq Write - 12 Clients Graph

4KB random performance

We also ran small-block random IO testing (four corners, 4KB) with fio. In addition to delivering higher input/output operations per second (IOPS), the 10-drive configuration provided lower latency per operation. For the 100% random read workload, the 10-drive configuration achieved more than 16 million IOPS at 0.59ms average read latency, and the 100% random write workload achieved more than 3.1 million IOPS at 3.19ms average write latency.
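One way to sanity-check IOPS and latency figures like these is Little's law, which relates the average number of requests in flight to throughput multiplied by average latency. The sketch below applies it to the numbers above. It treats the "more than" figures as exact lower bounds, assumes 4KB means 4,096 bytes, and assumes all 12 clients x 32 jobs were active at these data points, so the outputs are estimates rather than measured queue depths.

```python
CLIENTS = 12
JOBS_PER_CLIENT = 32
BLOCK_BYTES = 4096  # assumed: 4KB = 4,096 bytes


def ios_in_flight(iops, avg_latency_ms):
    """Little's law: concurrency = throughput x average latency."""
    return iops * (avg_latency_ms / 1000.0)


def bandwidth_gb_s(iops, block_bytes=BLOCK_BYTES):
    """Data rate implied by an IOPS figure, in decimal GB/s."""
    return iops * block_bytes / 1e9


total_jobs = CLIENTS * JOBS_PER_CLIENT

for name, iops, latency_ms in [
    ("random read", 16e6, 0.59),
    ("random write", 3.1e6, 3.19),
]:
    in_flight = ios_in_flight(iops, latency_ms)
    print(f"{name}: ~{in_flight:.0f} IOs in flight, "
          f"~{in_flight / total_jobs:.1f} per job, "
          f"~{bandwidth_gb_s(iops):.1f} GB/s")
```

Under these assumptions, both workloads work out to roughly 25 outstanding IOs per fio job.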

4K Random Read and Avg Latency (ms) - 12 Clients Graph
4K Random Write and Avg Latency (ms) - 12 Clients Graph

Conclusion

We saw that WEKA provides near-linear performance scaling as we moved from six to 10 drives per node. This result points to an easy way to grow your WEKA deployments on 4th Gen AMD EPYC 9004 Series processors by using cost-competitive 30TB Micron 6500 ION NVMe SSDs.

  1. For more information on WEKA, see https://www.weka.io/
  2. For details on the Supermicro AS-1115CS-TNR platform, see https://www.supermicro.com/en/products/system/clouddc/1u/as-1115cs-tnr
  3. For details on fio, see https://fio.readthedocs.io/en/latest/fio_doc.html

Sujit Somandepalli is Principal Storage Solutions Engineer at Micron Technology.