Comments on: Sandia To Push Both HPC And AI With Cerebras “Kingfisher” Cluster https://www.nextplatform.com/2024/11/14/sandia-to-push-both-hpc-and-ai-with-cerebras-kingfisher-cluster/ In-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds. Fri, 22 Nov 2024 19:21:19 +0000

By: Aiden Lee https://www.nextplatform.com/2024/11/14/sandia-to-push-both-hpc-and-ai-with-cerebras-kingfisher-cluster/#comment-240108 Mon, 18 Nov 2024 08:14:44 +0000 https://www.nextplatform.com/?p=144996#comment-240108 I believe this article has significant implications for us.

In my view, utilizing transformer models for scientific advancement could create more value than their application in assistant-like functions. However, I think computational accuracy is crucial for this purpose. The current FP16 or lower precision used in LLMs may result in lower accuracy of generated molecular structures or materials.
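
To make the precision concern concrete, here is a minimal Python sketch (not from the article) that rounds a running sum to IEEE 754 half precision after every addition and compares it with a double-precision sum. The names `to_fp16` and `accumulate`, the increment, and the step count are hypothetical choices for illustration. Real LLM kernels often accumulate in a wider format than they store, so this only shows the failure mode of a pure FP16 accumulation.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float (FP64) to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def accumulate(increment: float, steps: int, fp16: bool) -> float:
    """Add `increment` to a running total `steps` times.

    With fp16=True the increment and the total are rounded to half
    precision after every addition, imitating an FP16 accumulator.
    """
    total = 0.0
    inc = to_fp16(increment) if fp16 else increment
    for _ in range(steps):
        total = total + inc
        if fp16:
            total = to_fp16(total)
    return total


# Illustrative values only (assumptions, not measurements).
fp64_sum = accumulate(0.001, 10_000, fp16=False)
fp16_sum = accumulate(0.001, 10_000, fp16=True)
print(f"FP64 sum: {fp64_sum:.6f}")
print(f"FP16 sum: {fp16_sum:.6f}")
```

Once the running total is large enough that the increment is less than half the spacing between adjacent FP16 values, each addition rounds back to the same total and the sum stops growing, while the FP64 sum stays close to the exact answer.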

If we were to widen the data format to FP32 or FP64, I believe we would see a dramatic decrease in floating-point operations per second due to the bottleneck between GPU and memory. A wafer-scale AI accelerator is one way to reduce this penalty, but I think another viable alternative would be a structure that connects SRAM to memory components like DRAM/NAND and to Processing Elements (PEs) through packaging.
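
The bandwidth argument can be sketched with a simple roofline-style estimate: attainable throughput is the smaller of the compute peak and memory bandwidth times arithmetic intensity (FLOPs per byte moved). The device numbers and the function name `attainable_flops` below are hypothetical, and the sketch assumes the same compute peak for every format, which real hardware does not have. It only illustrates how a memory-bound kernel slows down as elements get wider.

```python
def attainable_flops(peak_flops: float, bandwidth_bytes_per_s: float,
                     flops_per_element: float, bytes_per_element: int) -> float:
    """Roofline-style bound: min(compute peak, bandwidth * arithmetic intensity)."""
    intensity = flops_per_element / bytes_per_element  # FLOPs per byte moved
    return min(peak_flops, bandwidth_bytes_per_s * intensity)


# Hypothetical device and kernel (assumptions, not vendor figures).
PEAK = 100e12            # 100 TFLOP/s compute peak, assumed equal for all formats
BANDWIDTH = 2e12         # 2 TB/s memory bandwidth
FLOPS_PER_ELEMENT = 2.0  # e.g. one multiply and one add per element loaded

rates = {
    name: attainable_flops(PEAK, BANDWIDTH, FLOPS_PER_ELEMENT, width)
    for name, width in (("FP16", 2), ("FP32", 4), ("FP64", 8))
}
for name, rate in rates.items():
    print(f"{name}: {rate / 1e12:.2f} TFLOP/s attainable")
```

In this memory-bound regime each doubling of the element width halves the attainable rate, which is why designs that keep operands in on-chip or in-package SRAM target the bandwidth term rather than the compute peak.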

By: Eric Olson https://www.nextplatform.com/2024/11/14/sandia-to-push-both-hpc-and-ai-with-cerebras-kingfisher-cluster/#comment-240046 Sun, 17 Nov 2024 05:07:11 +0000 https://www.nextplatform.com/?p=144996#comment-240046 In reply to Hubert.

The Maryland-Berkeley entry is titled “Open-source Scalable LLM Training on GPU-based Supercomputers.”

To me, using a computer designed for science to train AI seems the opposite of using a computer designed for AI to do science. Does the wolf dwell with the lamb? It seems all mixed up. Luckily I don’t have a vote either.

By: Eric Olson https://www.nextplatform.com/2024/11/14/sandia-to-push-both-hpc-and-ai-with-cerebras-kingfisher-cluster/#comment-239984 Sat, 16 Nov 2024 06:19:40 +0000 https://www.nextplatform.com/?p=144996#comment-239984 In reply to Donna Hartd.

Strong scaling is difficult, and it is needed for dynamical systems when the time window is increased rather than the size of the phase space. Since the scaling on Frontier is limited to 32 GPUs, that means 8 nodes. So forget about Ultra Ethernet; will CXL on the next-generation PCIe fabric allow hundreds of GPUs to sit in the same node?

I’m as skeptical about latency when clustering lots of wafer-scale engines together as I was when Los Alamos National Laboratory built a huge cluster of Raspberry Pi computers to test fault-tolerant scheduling and error recovery. On the other hand, since new functionality doesn’t evolve when environmental pressures are too high, maybe low-stakes hobby projects are needed for high-stakes gains.

By: Hubert https://www.nextplatform.com/2024/11/14/sandia-to-push-both-hpc-and-ai-with-cerebras-kingfisher-cluster/#comment-239954 Fri, 15 Nov 2024 16:05:09 +0000 https://www.nextplatform.com/?p=144996#comment-239954 Great work by the Tri-Labs Vanguard and Cerebras … and yes to wafer-scale FP64 dataflow SIMDs (for the future)! Breaking through “the 1 second simulation barrier” would be revolutionary and I hope they can make it.

But, the 6 Gordon Bell Prize Finalists for 2024 ( https://sc24.supercomputing.org/2024/10/presenting-the-finalists-for-the-2024-gordon-bell-prize/ ) include a team from University of Maryland College Park (with Max Planck and Berkeley) … so, ahem, I might have to give them my vote (if I had one that is)!

By: Timothy Prickett Morgan https://www.nextplatform.com/2024/11/14/sandia-to-push-both-hpc-and-ai-with-cerebras-kingfisher-cluster/#comment-239949 Fri, 15 Nov 2024 14:02:33 +0000 https://www.nextplatform.com/?p=144996#comment-239949 In reply to Donna Hartd.

Thank you for the chuckles. . . .

By: Donna Hartd https://www.nextplatform.com/2024/11/14/sandia-to-push-both-hpc-and-ai-with-cerebras-kingfisher-cluster/#comment-239932 Fri, 15 Nov 2024 05:48:32 +0000 https://www.nextplatform.com/?p=144996#comment-239932 If this works out, Sandia and Cerebras may single-handedly reinvigorate 450 mm wafers. Imagine what those could do. Probably not, but one can dream!

We may even get nuclear fusion power in 47 instead of 50 years. And then use that power to bootstrap HPC for running Ansys, I mean, developing even better chips.
