AIME A8004 - Multi GPU HPC Rack Server
The AIME A8004 is the enterprise Deep Learning server based on the ASUS ESC8000A-E12 barebone, configurable with up to 8 of the most advanced deep learning accelerators and GPUs.
Enter the petaFLOPS range of HPC computing with more than 8 Peta TensorOps of Deep Learning performance. The A8004 is the ultimate multi-GPU server: dual EPYC Genoa CPUs with up to 2 TB DDR5 main memory, the fastest PCIe 5.0 bus speeds and up to 100 GBE network connectivity.
Built to run 24/7 for reliable high-performance computing, whether in your in-house data center, at a co-location facility, or as a hosted solution.
AIME A8004 - Deep Learning Server
If you are looking for a server built for maximum deep learning training and inference performance and for the highest demands in HPC computing, the AIME A8004 multi-GPU 4U rack server delivers.
The AIME A8004 is based on the new ASUS ESC8000A-E12 barebone, powered by two AMD EPYC™ Genoa processors with up to 96 cores each, for a total of up to 384 parallel CPU threads.
Its GPU-optimized design with high airflow cooling allows the use of eight high-end double-slot GPUs like the latest NVIDIA H100, NVIDIA A100, NVIDIA L40S, RTX 6000 Ada, and RTX 5000 Ada GPU models.
Definable GPU Configuration
Choose the desired configuration among the most powerful NVIDIA GPUs for Deep Learning and Rendering:
Up to 8x NVIDIA H100
The NVIDIA H100 is the flagship of the NVIDIA Hopper processor generation and the successor to the common NVIDIA A100 accelerator cards. The NVIDIA H100 is based on the GH-100 processor in TSMC 4N manufacturing with 14,592 CUDA cores, 456 fourth-generation Tensor cores, and 80 GB HBM2 memory. A single NVIDIA H100 GPU already breaks the peta-fp16 performance barrier. Eight accelerators of this type add up to more than 6,000 teraFLOPS tf32 and up to 72,000 teraTOPS fp8 performance. The NVIDIA H100 is currently the most efficient and fastest deep learning accelerator card available. The H100 is also the first accelerator card that benefits from up to 64 GB/s transfer rates through the PCIe 5.0 bus of the AIME A8004.
Up to 8x NVIDIA A100
The NVIDIA A100 is the flagship of the NVIDIA Ampere processor generation and the successor to the legendary NVIDIA Tesla V100 accelerator cards. The NVIDIA A100 is based on the GA-100 processor in 7nm manufacturing with 6,912 CUDA cores, 432 third-generation Tensor cores, and 40 or 80 GB HBM2 memory with the highest data transfer rates. A single NVIDIA A100 GPU already breaks the peta-TOPS performance barrier. Eight accelerators of this type add up to more than 2,000 teraFLOPS fp32 performance.
Up to 8x NVIDIA L40S 48GB
The NVIDIA L40S is built on the latest NVIDIA GPU architecture: Ada Lovelace. It is the direct successor of the RTX A40 and the passive-cooled version of the RTX 6000 Ada. The L40S combines 568 fourth-generation Tensor Cores and 18,176 next-gen CUDA® cores with 48GB GDDR6 graphics memory for unprecedented rendering, AI, graphics, and compute performance.
Up to 8x NVIDIA RTX 6000 Ada
The RTX™ 6000 Ada is built on the latest NVIDIA GPU architecture: Ada Lovelace. It is the direct successor of the RTX A6000 and the Quadro RTX 8000. The RTX 6000 Ada combines 568 fourth-generation Tensor Cores and 18,176 next-gen CUDA® cores with 48GB of graphics memory for unprecedented rendering, AI, graphics, and compute performance.
Up to 8x NVIDIA RTX 5000 Ada
The RTX™ 5000 Ada is built on the latest NVIDIA GPU architecture: Ada Lovelace. It is the direct successor of the RTX A5000/A5500 and the Quadro RTX 6000. The RTX 5000 Ada combines 400 fourth-generation Tensor Cores and 12,800 next-gen CUDA® cores with 32GB of graphics memory for convincing rendering, AI, graphics, and compute performance.
Up to 8x NVIDIA RTX A5000
With its 8,192 CUDA and 256 Tensor cores of the Ampere generation, the NVIDIA RTX A5000 has about the performance of an RTX 3090. However, with its 230 watts of power consumption and 24 GB of memory, it is a very efficient accelerator card and an especially interesting option for inference tasks.
All NVIDIA GPUs are supported by NVIDIA’s CUDA-X AI SDK, including cuDNN and TensorRT, which power nearly all popular deep learning frameworks.
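To compare what the configurations above add up to, the following sketch sums the per-card memory figures quoted in the model descriptions. The names `GPU_MEMORY_GB` and `total_gpu_memory_gb` are illustrative, and the H100 and A100 entries assume the 80 GB variants.

```python
# Per-GPU memory in GB, taken from the model descriptions above.
# Assumption: the 80 GB variants of the H100 and A100 are used.
GPU_MEMORY_GB = {
    "NVIDIA H100": 80,
    "NVIDIA A100": 80,
    "NVIDIA L40S": 48,
    "NVIDIA RTX 6000 Ada": 48,
    "NVIDIA RTX 5000 Ada": 32,
    "NVIDIA RTX A5000": 24,
}


def total_gpu_memory_gb(model: str, count: int = 8) -> int:
    """Return the combined GPU memory of `count` identical cards."""
    if not 1 <= count <= 8:
        raise ValueError("the A8004 takes 1 to 8 GPUs")
    return GPU_MEMORY_GB[model] * count


if __name__ == "__main__":
    for model in GPU_MEMORY_GB:
        print(f"8x {model}: {total_gpu_memory_gb(model)} GB")
```

The result is the sum over separate cards, not a single address space. A model or batch still has to be split across the GPUs by the framework.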
Dual EPYC CPU Power
The latest AMD EPYC server CPU with support for DDR5 and PCIe 5.0 delivers up to 96 cores with a total of 192 threads per CPU with an unbeaten price-performance ratio.
The available 2x 128 PCIe 5.0 CPU lanes of the AMD EPYC CPUs allow the highest interconnect and data transfer rates between the CPUs and the GPUs and ensure that all GPUs are connected with full x16 PCIe 5.0 bandwidth.
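The lane and bandwidth figures can be checked with a little arithmetic. The sketch below assumes the published PCIe signalling rates (16 GT/s per lane for 4.0, 32 GT/s for 5.0) and 128b/130b line encoding. It gives theoretical one-direction rates before protocol overhead, so measured throughput will be lower. The function names are illustrative.

```python
# Raw per-lane signalling rates in GT/s.
GT_PER_S = {"4.0": 16.0, "5.0": 32.0}
# PCIe 3.0 and later encode 128 payload bits in 130 transmitted bits.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(generation: str, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    gbit_per_s = GT_PER_S[generation] * lanes * ENCODING_EFFICIENCY
    return gbit_per_s / 8  # bits -> bytes


def lanes_needed(gpus: int, lanes_per_gpu: int = 16) -> int:
    """PCIe lanes consumed when every GPU gets a full-width slot."""
    return gpus * lanes_per_gpu


if __name__ == "__main__":
    print(f"PCIe 5.0 x16: {pcie_bandwidth_gb_s('5.0', 16):.1f} GB/s per GPU")
    print(f"PCIe 4.0 x16: {pcie_bandwidth_gb_s('4.0', 16):.1f} GB/s per GPU")
    print(f"Lanes for 8 GPUs at x16: {lanes_needed(8)}")
```

Eight GPUs at x16 need 128 lanes, which is why a dual-socket platform with this many lanes can give every card a full-width link.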
A large number of available CPU cores can improve performance dramatically when the CPU is used for preprocessing and delivering data to optimally feed the GPUs with workloads.
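One place where the core count shows up in practice is the number of data-loading worker processes per GPU. The helper below is a rule-of-thumb sketch, not AIME's tuning. It divides the available CPU threads evenly across the GPUs, and `reserved_threads` is a hypothetical knob for threads you want to keep free for the training processes themselves.

```python
import os
from typing import Optional


def loader_workers_per_gpu(
    num_gpus: int,
    cpu_threads: Optional[int] = None,
    reserved_threads: int = 0,
) -> int:
    """Split the CPU threads evenly across GPUs as a starting worker count."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if cpu_threads is None:
        # Fall back to what the operating system reports on this machine.
        cpu_threads = os.cpu_count() or 1
    # Never go below one worker per GPU, even on small machines.
    usable = max(cpu_threads - reserved_threads, num_gpus)
    return usable // num_gpus


if __name__ == "__main__":
    # Two 96-core CPUs with 2 threads per core and 8 GPUs.
    print(loader_workers_per_gpu(num_gpus=8, cpu_threads=2 * 96 * 2))
```

The returned value is only a starting point, for example for the `num_workers` argument of a PyTorch `DataLoader`. The best setting depends on how expensive the preprocessing is and should be measured.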
Up to 30 TB Direct NVMe SSD Storage
Deep learning usually involves large amounts of data that have to be processed and stored. High throughput and fast access times to the data are essential for fast turnaround times.
The AIME A8004 can be configured with up to two exchangeable U.2 NVMe triple-level cell (TLC) SSDs with a capacity of up to 15.36 TB each, adding up to a total capacity of 30 TB of the fastest NVMe SSD storage.
Since each of the SSDs is directly connected to the CPU and the main memory via PCIe 4.0 lanes, they achieve consistently high read and write rates of more than 4000 MB/s.
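To get a feel for what such a rate means for turnaround times, the sketch below estimates how long one sequential pass over a dataset takes. It assumes decimal units (1 TB = 1,000,000 MB) and a sustained rate equal to the 4000 MB/s figure above. Real workloads with many small files will be slower. The function name is illustrative.

```python
def read_seconds(size_tb: float, rate_mb_s: float = 4000.0) -> float:
    """Seconds needed to stream `size_tb` terabytes at `rate_mb_s` MB/s."""
    if rate_mb_s <= 0:
        raise ValueError("rate_mb_s must be positive")
    return size_tb * 1_000_000 / rate_mb_s


if __name__ == "__main__":
    for size_tb in (1.0, 15.36):
        minutes = read_seconds(size_tb) / 60
        print(f"{size_tb} TB at 4000 MB/s: {minutes:.1f} minutes")
```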
Optional RAID: Up to 60 TB NVMe SSD Storage
Large datasets and training checkpoints often require additional storage capacity. The A8004 offers the option to use its six additional drive bays in a reliable hardware RAID configuration, providing up to 60 TB of the fastest NVMe SSD storage with RAID levels 0, 1, 5, 10 and 50.
As usual in the server sector, the SSDs have an MTBF of 2,000,000 hours with 1 DWPD (drive write per day) and a 5-year manufacturer's warranty.
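How much of the raw capacity remains usable depends on the RAID level. The sketch below applies the textbook formulas for identical drives and ignores controller and file system overhead. It assumes that RAID 1 mirrors the same data onto every drive and that RAID 50 uses two RAID 5 spans by default. The usage example takes the 6x 7.68 TB option from the technical details.

```python
MIN_DRIVES = {"0": 2, "1": 2, "5": 3, "10": 4, "50": 6}


def raid_usable_tb(level: str, drives: int, drive_tb: float, spans: int = 2) -> float:
    """Usable capacity in TB for a RAID set of identical drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "10" and drives % 2:
        raise ValueError("RAID 10 needs an even number of drives")
    if level == "50" and drives % spans:
        raise ValueError("RAID 50 needs equally sized spans")

    if level == "0":      # striping, no redundancy
        data_drives = drives
    elif level == "1":    # assumption: every drive holds the same data
        data_drives = 1
    elif level == "5":    # one drive's worth of parity
        data_drives = drives - 1
    elif level == "10":   # striped mirror pairs
        data_drives = drives // 2
    else:                 # "50": one parity drive per RAID 5 span
        data_drives = drives - spans
    return data_drives * drive_tb


if __name__ == "__main__":
    for level in ("0", "1", "5", "10", "50"):
        print(f"RAID {level}: {raid_usable_tb(level, 6, 7.68):.2f} TB usable")
```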
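The DWPD rating translates into a total write volume over the warranty period. The sketch below assumes the usual reading of DWPD, namely that the full drive capacity may be written that many times per day for the length of the warranty. The function name is illustrative.

```python
def rated_writes_tb(capacity_tb: float, dwpd: float = 1.0, years: float = 5.0) -> float:
    """Total terabytes that may be written within the rated endurance."""
    return capacity_tb * dwpd * 365 * years


if __name__ == "__main__":
    # One 15.36 TB drive at 1 DWPD over a 5-year warranty.
    total_tb = rated_writes_tb(15.36)
    print(f"{total_tb:.0f} TB, about {total_tb / 1000:.1f} PB")
```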
High Connectivity and Management Interface
The A8004 can be fitted with the standard 2x 10 Gbit/s RJ45/SFP+ LAN ports or with up to 2x 100 Gbit/s (GBE) network adapters for the fastest connection to NAS resources and big data collections. The highest available LAN connectivity is also a must-have for data interchange in a distributed computing cluster.
The AIME A8004 is fully remotely manageable through ASMB11-iKVM IPMI/BMC, powered by the AST2600, which makes it possible to integrate the AIME A8004 into larger server clusters.
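The difference between the link speeds is easiest to see as a transfer time. The sketch below computes the theoretical time to move a dataset over one or more links at line rate, assuming decimal units and no protocol overhead. Real transfers from a NAS will take longer. The function name is illustrative.

```python
def transfer_seconds(size_tb: float, link_gbit_s: float, links: int = 1) -> float:
    """Seconds to move `size_tb` terabytes over `links` links at line rate."""
    if link_gbit_s <= 0 or links < 1:
        raise ValueError("link speed and link count must be positive")
    bits = size_tb * 8e12  # 1 TB = 8 * 10^12 bits
    return bits / (link_gbit_s * 1e9 * links)


if __name__ == "__main__":
    print(f"1 TB over 1x 10 Gbit/s:  {transfer_seconds(1.0, 10):.0f} s")
    print(f"1 TB over 1x 100 Gbit/s: {transfer_seconds(1.0, 100):.0f} s")
```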
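Remote management of this kind is typically scripted with a generic IPMI client. The sketch below assumes the open-source `ipmitool` utility is installed on the admin machine and that IPMI over LAN is enabled on the BMC. Neither is stated on this page. The host address and user name are hypothetical, and the password is read from the `IPMI_PASSWORD` environment variable through ipmitool's `-E` option.

```python
import subprocess
from typing import List


def ipmi_command(host: str, user: str, *args: str) -> List[str]:
    """Build an ipmitool call for a remote BMC using the lanplus interface."""
    # -E makes ipmitool read the password from IPMI_PASSWORD,
    # so it does not appear in the process list.
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *args]


def power_status(host: str, user: str) -> str:
    """Ask the BMC for the chassis power state."""
    result = subprocess.run(
        ipmi_command(host, user, "chassis", "power", "status"),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With `IPMI_PASSWORD` set, a call such as `power_status("10.0.0.5", "admin")` queries one server. Looping over a list of BMC addresses extends the same check to a whole cluster.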
Optimized for Multi GPU Server Applications
The AIME A8004 offers energy efficiency with redundant Titanium grade power supplies, which enable long-time fail-safe operation.
Its thermal control technology provides more efficient power consumption for large-scale environments.
Everything is set up, configured, and tuned by AIME for optimal multi-GPU and deep learning performance.
The A8004 comes with a preinstalled Linux OS configured with the latest drivers and frameworks like TensorFlow, Keras, and PyTorch. After boot-up, it is ready to start accelerating your deep learning applications right away.
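A quick first check after boot is to confirm that all installed GPUs are visible to the driver. The sketch below assumes the NVIDIA driver's `nvidia-smi` tool is on the path and parses its CSV query output. The function names are illustrative, and parsing is kept separate from the call so that it can be tried on a machine without GPUs.

```python
import subprocess
from typing import Dict, List, Union

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_list(csv_text: str) -> List[Dict[str, Union[int, str]]]:
    """Turn 'index, name, memory' lines into a list of dictionaries."""
    gpus = []
    for line in csv_text.strip().splitlines():
        parts = [part.strip() for part in line.split(",")]
        gpus.append({
            "index": int(parts[0]),
            # Rejoin the middle fields in case a name ever contains a comma.
            "name": ", ".join(parts[1:-1]),
            "memory_mib": int(parts[-1]),
        })
    return gpus


def list_gpus() -> List[Dict[str, Union[int, str]]]:
    """Query the driver for all visible GPUs."""
    output = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_list(output)
```

On a fully populated system, `len(list_gpus())` should equal the number of GPUs ordered. A lower count points to a card that the driver does not see.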
Technical Details
Type | Rack Server 4U, 80cm depth |
CPU (configurable) |
EPYC Bergamo
2x EPYC 9754 (128 cores, 2.25 / 3.1 GHz)
2x EPYC 9734 (112 cores, 2.2 / 3.0 GHz)
EPYC Genoa
2x EPYC 9124 (16 cores, 3.0 / 3.7 GHz)
2x EPYC 9224 (24 cores, 2.5 / 3.7 GHz)
2x EPYC 9354 (32 cores, 3.25 / 3.8 GHz)
2x EPYC 9454 (48 cores, 2.75 / 3.8 GHz)
2x EPYC 9554 (64 cores, 3.1 / 3.75 GHz)
2x EPYC 9654 (96 cores, 2.4 / 3.7 GHz) |
RAM | 256 / 512 / 1024 / 1536 / 2048 / 3072 GB DDR5 ECC memory |
GPU Options |
1 to 8x NVIDIA H100 NVL 94GB or
1 to 8x NVIDIA H100 80GB or
1 to 8x NVIDIA A100 80GB or
1 to 8x NVIDIA L40S 48GB or
1 to 8x NVIDIA RTX 6000 Ada 48GB or
1 to 8x NVIDIA RTX 5000 Ada 32GB or
1 to 8x NVIDIA RTX A5000 24GB |
Cooling | GPUs are cooled with an air stream provided by 5 high-performance, temperature-controlled system fans (> 100,000 h MTBF). CPUs and mainboard are cooled with an air stream provided by 6 independent high-performance, temperature-controlled fans (> 100,000 h MTBF) |
Storage |
Up to 4 TB built-in M.2 NVMe PCIe 4.0 SSD (optional)
Up to 2x 15.36 TB U.2 NVMe PCIe 4.0 SSD
Triple Level Cell (TLC) quality
6800 MB/s read, 4000 MB/s write
MTBF of 2,000,000 hours and 5 years manufacturer's warranty with 1 DWPD
Optional Hardware RAID:
Up to 6x SSD 7.68 TB SATA RAID 0/1/5/10 or
Up to 6x SSD 3.84 TB NVMe RAID 0/1/5/10 or
Up to 6x SSD 7.68 TB NVMe RAID 0/1/5/10 or
Up to 6x SSD 15.36 TB NVMe RAID 0/1/5/10 |
Network |
1 x IPMI LAN
2 x 10 GBE LAN RJ45 or
2 x 10 GBE LAN SFP+ or
2 x 25 GBE LAN SFP28 or
1 x 100 GBE QSFP28 |
USB | 2 x USB 3.2 ports (front) |
PSU | 2+2x 3000W redundant power 80 PLUS Titanium certified (96% efficiency) |
Noise-Level | 88dBA |
Dimensions (WxHxD) | 440mm x 176mm (4U) x 800mm
17.3" x 6.93" x 31.5" |
Operating Environment | Operating temperature: 10℃ ~ 35℃
Non-operating temperature: -40℃ ~ 70℃ |