NVIDIA Unveils DGX-2H Server with 450W Tesla V100 GPUs
by Anton Shilov on November 20, 2018 4:30 PM EST
NVIDIA has introduced a new version of its DGX-2 server that is outfitted with higher-performing CPUs and GPUs. The DGX-2H server is powered by 16 Tesla V100 GPUs that run at higher clocks and feature a 450 W TDP each. The whole system consumes up to 12 kW of power and delivers 2.1 PetaFLOPS of compute horsepower.
NVIDIA’s DGX-2H is an updated version of the DGX-2 machine the company introduced earlier this year. The new system is based on two 24-core Intel Xeon Platinum 8174 processors accompanied by 1.5 TB of DDR4 memory and 30 TB of NVMe storage. The key improvement over the previous server is its faster NVIDIA Tesla V100 GPUs, which feature 512 GB of HBM2 memory in total. Meanwhile, the new DGX-2H has similar networking capabilities: 10/25/40/50/100 GbE.
UPDATE 11/29: NVIDIA has reached out to clarify a number of data points regarding the DGX servers, so the story has been updated.
NVIDIA DGX Series (with Volta)

| | DGX-2H | DGX-2 | DGX-1 |
|---|---|---|---|
| CPUs | 2 x Intel Xeon Platinum 8174 | 2 x Intel Xeon Platinum 8168 | 2 x Intel Xeon E5-2600 v4 |
| GPUs | 16 x NVIDIA Tesla V100 32 GB HBM2 (450 W) | 16 x NVIDIA Tesla V100 32 GB HBM2 (350 W) | 8 x NVIDIA Tesla V100 32 GB HBM2 |
| System Memory | Up to 1.5 TB DDR4 | Up to 1.5 TB DDR4 | Up to 0.5 TB DDR4 |
| GPU Memory | 512 GB HBM2 (16 x 32 GB) | 512 GB HBM2 (16 x 32 GB) | 256 GB HBM2 (8 x 32 GB) |
| Storage | 30 TB NVMe, up to 60 TB | 30 TB NVMe, up to 60 TB | 4 x 1.92 TB NVMe |
| Networking | 8 x InfiniBand or dual 100 GbE | 8 x InfiniBand or dual 100 GbE | 4 x InfiniBand + 2 x 10 GbE |
| Power | 12 kW | 10 kW | 3.5 kW |
| Weight | 360 lbs | 360 lbs | 134 lbs |
| GPU Throughput | Tensor: 2100 TFLOPS, FP16: ? TFLOPS, FP32: ? TFLOPS, FP64: ? TFLOPS | Tensor: 1920 TFLOPS, FP16: 480 TFLOPS, FP32: 240 TFLOPS, FP64: 120 TFLOPS | Tensor: 960 TFLOPS, FP16: 240 TFLOPS, FP32: 120 TFLOPS, FP64: 60 TFLOPS |
| Cost | ? | $399,000 | $149,000 |
Thanks to faster graphics processors with a 450 W TDP each, the system can now deliver 2.1 PFLOPS of compute performance, up from roughly 2 PFLOPS before. To support the higher power draw, it looks like NVIDIA had to change the cooling: ServeTheHome believes that the DGX-2H uses a new cooling subsystem because it weighs 20 pounds more than its predecessor (360 pounds vs. 340 pounds), though the company has not confirmed this. Along with the performance improvements, NVIDIA had to lower the maximum operating temperature of the DGX-2H from 35°C to 25°C.
NVIDIA has not disclosed pricing of the DGX-2H, though it is likely to cost more than $399,000, the price of the DGX-2. What remains to be seen is whether NVIDIA's customers find the DGX-2H's performance gain worth an extra 2 kW of power consumption.
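To put that trade-off in numbers, here is a minimal Python sketch that compares the two servers using only the peak figures quoted in this article (tensor TFLOPS, maximum system power, GPU count and per-GPU TDP). The dictionary layout and the helper names (`tflops_per_kw`, `gpu_power_kw`, `pct_change`) are illustrative assumptions, and the inputs are rated maximums rather than measured results, so treat the output as a rough guide only.

```python
# Peak figures as quoted in the article; the structure and names are illustrative.
systems = {
    "DGX-2H": {"tensor_tflops": 2100, "system_kw": 12, "gpus": 16, "gpu_tdp_w": 450},
    "DGX-2": {"tensor_tflops": 1920, "system_kw": 10, "gpus": 16, "gpu_tdp_w": 350},
}


def tflops_per_kw(spec):
    """Peak tensor throughput per kW of maximum system power."""
    return spec["tensor_tflops"] / spec["system_kw"]


def gpu_power_kw(spec):
    """Combined TDP of all GPUs in the system, in kW."""
    return spec["gpus"] * spec["gpu_tdp_w"] / 1000


def pct_change(new, old):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


new, old = systems["DGX-2H"], systems["DGX-2"]

print(f"Tensor throughput: {pct_change(new['tensor_tflops'], old['tensor_tflops']):+.1f}%")
print(f"System power:      {pct_change(new['system_kw'], old['system_kw']):+.1f}%")
print(f"GPU power only:    {pct_change(gpu_power_kw(new), gpu_power_kw(old)):+.1f}%")
print(f"TFLOPS per kW:     {pct_change(tflops_per_kw(new), tflops_per_kw(old)):+.1f}%")
```

On these rated numbers, the DGX-2H offers about 9% more peak tensor throughput for 20% more system power, so peak throughput per kW goes down rather than up. Workloads that were thermally limited on the original DGX-2 could behave differently, which is why real-world benchmarks would be needed to settle the question.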
Related Reading:
- NVIDIA’s DGX-2: Sixteen Tesla V100s, 30 TB of NVMe, only $400K
- NVIDIA Unveils & Gives Away New Limited Edition 32GB Titan V "CEO Edition"
- GIGABYTE Launches Two 4U NVIDIA Tesla GPU Servers: High Density for Deep Learning
Sources: NVIDIA, ServeTheHome
14 Comments
mode_13h - Wednesday, November 21, 2018
Got it. Thanks. BTW, I previously read the mezzanine V100's were rated at 300 W. Maybe the DGX-1 was already overclocking them.

DanNeely - Wednesday, November 21, 2018
I'm not sure if the newer model really makes a lot of sense unless you need the better networking. 10% faster, 20% (28% if you just look at the tesla cards share, - might be relevant if running a workload that has the CPU and network at idle) more power used isn't an attractive option unless there're scalability issues with spreading workloads across multiple boxes.

Yojimbo - Wednesday, November 21, 2018
In situations where the performance is being bound by thermal constraints in the original DGX-2 the increase in the theoretical throughput is not useful to compare the utility of the new system's higher thermal allowance. I think it's safe to assume that it is exactly those situations this new system is meant to target. We would need real world benchmarks to draw any conclusions, but the safer assumption would be that NVIDIA didn't make this system just to keep their systems engineers and salesmen busy because they had no other work to do.

Impetuous - Wednesday, November 21, 2018
you know you're getting old when no one has asked if it can run Crysis yet...