
Architecture guide

Architecture guide for HPE servers and WekaIO Matrix

Executive summary

A storage bottleneck problem

For High Performance Computing (HPC) and deep learning clusters, the rapid improvement in CPU and GPU compute speed poses a problem: compute performance has dramatically outpaced the bandwidth improvements of the storage solutions these applications have historically used.
A good example of the issue is seen in parallel file system design, a storage paradigm frequently used for technical computing. Parallel file systems provide storage built from fabric-connected servers, typically using server-attached SAS/SATA hard drives or SAN/NAS storage solutions built on the same media. To efficiently deliver tens or hundreds of GB/s of bandwidth to a compute cluster with SAS/SATA devices, each building block must contain a large number of storage devices, or the solution must use many less-dense building blocks with a correspondingly larger number of network ports.
Even if an application's capacity demands aren't high, a significant number of storage devices may still be needed to meet high bandwidth requirements, particularly with hard drive-based solutions, because each device provides only a small fraction of the total performance needed. Moreover, the parallel file systems the industry has used for decades weren't designed to exploit the latency and IOPS performance that the latest block storage technology can deliver. So yesterday's solution may not meet newer technical compute requirements, regardless of the hardware it runs on.
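
As a rough illustration of this point, the short Python sketch below estimates how many devices are needed to reach a target aggregate bandwidth. The per-device throughput figures (about 200 MB/s per hard drive, about 3 GB/s per NVMe SSD) are illustrative assumptions, not measurements from this guide.

    import math

    # Back-of-the-envelope estimate of how many storage devices are needed
    # to reach a target aggregate bandwidth. Per-device throughput values
    # are assumed for illustration only.
    TARGET_BANDWIDTH_GBPS = 100.0   # desired aggregate bandwidth, GB/s
    HDD_GBPS = 0.2                  # ~200 MB/s sustained per SAS/SATA hard drive (assumed)
    NVME_GBPS = 3.0                 # ~3 GB/s per NVMe SSD (assumed)

    def devices_needed(target_gbps: float, per_device_gbps: float) -> int:
        """Minimum number of devices whose combined throughput meets the target."""
        return math.ceil(target_gbps / per_device_gbps)

    print("HDDs needed:", devices_needed(TARGET_BANDWIDTH_GBPS, HDD_GBPS))    # 500
    print("NVMe SSDs needed:", devices_needed(TARGET_BANDWIDTH_GBPS, NVME_GBPS))  # 34

Under these assumed figures, roughly 500 hard drives would be required to sustain 100 GB/s, versus a few dozen NVMe SSDs, regardless of how much capacity the application actually needs.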