Comments on: Argonne Aurora A21: All’s Well That Ends Better https://www.nextplatform.com/2023/06/27/argonne-aurora-a21-alls-well-that-ends-better/ In-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds. Fri, 11 Aug 2023 12:47:38 +0000

By: Timothy Prickett Morgan https://www.nextplatform.com/2023/06/27/argonne-aurora-a21-alls-well-that-ends-better/#comment-212266 Fri, 11 Aug 2023 12:47:38 +0000 https://www.nextplatform.com/?p=142599#comment-212266 In reply to peter j connell.

I don’t recall saying that. I have written the pieces over the years chronicling the pain. What I am saying is that it is built, it has a lot of performance, and Intel adjusted the price down such that A21 provides the best price/performance in the world.

By: peter j connell https://www.nextplatform.com/2023/06/27/argonne-aurora-a21-alls-well-that-ends-better/#comment-212249 Fri, 11 Aug 2023 04:35:23 +0000 https://www.nextplatform.com/?p=142599#comment-212249 In reply to JayN.

So you say Intel DID NOT screw DOE around? You need to re-read the piece and the comments.

I wonder why they took a $300m hit on the $500m deal?

By: Timothy Prickett Morgan https://www.nextplatform.com/2023/06/27/argonne-aurora-a21-alls-well-that-ends-better/#comment-210933 Mon, 10 Jul 2023 02:13:14 +0000 https://www.nextplatform.com/?p=142599#comment-210933 In reply to JayN.

It is running a year behind, just as Frontier was at Oak Ridge. We are many years late with any exascale machines, except for China, which apparently had two in the field in early 2022.

By: JayN https://www.nextplatform.com/2023/06/27/argonne-aurora-a21-alls-well-that-ends-better/#comment-210930 Sun, 09 Jul 2023 21:09:58 +0000 https://www.nextplatform.com/?p=142599#comment-210930 In reply to emerth.

El Capitan is now expected to be accepted in mid-2024. So is AMD two years late?

https://insidehpc.com/2019/08/cray-to-build-el-capitan-exascale-supercomputer-at-llnl/

“To be hosted at LLNL, El Capitan will have a peak performance of more than 1.5 exaflops and an anticipated delivery in late 2022”

By: JayN https://www.nextplatform.com/2023/06/27/argonne-aurora-a21-alls-well-that-ends-better/#comment-210925 Sun, 09 Jul 2023 17:01:17 +0000 https://www.nextplatform.com/?p=142599#comment-210925 In reply to hoohoo.

Nvidia has software that can run on their GPUs. What Intel provided is a heterogeneous programming solution, including DPC++, Fortran, Python, SYCL, their performance analysis tools, OpenMP, MPI, CUDA conversion tools, their whole oneAPI group of tools … oneDNN, oneTBB, oneVPL, oneDAL, oneMKL … converted to run on CPUs and/or GPUs.

Much of this is open source, and that includes examples to port to other accelerators.

By: Timothy Prickett Morgan https://www.nextplatform.com/2023/06/27/argonne-aurora-a21-alls-well-that-ends-better/#comment-210902 Sat, 08 Jul 2023 16:10:37 +0000 https://www.nextplatform.com/?p=142599#comment-210902 In reply to emerth.

It’s a little different if, as we suspect, Intel wrote off $300 million of the $500 million price tag on the A21 contract and Argonne got a 2 exaflops machine for $200 million. That’s four years late, approximately 2X the performance of the second or third iteration of the contract, at 60 percent off the cost.
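The arithmetic behind that comparison can be sketched in a few lines. This is only an illustration: the inputs are the suspected figures quoted in the comment, not confirmed contract terms, and the function name `effective_deal` is made up for the example.

```python
def effective_deal(contract_price_musd, writeoff_musd, peak_exaflops):
    """Return (effective cost in $M, discount in percent, $M per peak exaflops).

    All inputs are whatever figures the caller supplies; nothing here is
    official contract data.
    """
    effective_cost = contract_price_musd - writeoff_musd
    discount_pct = 100 * writeoff_musd / contract_price_musd
    cost_per_exaflop = effective_cost / peak_exaflops
    return effective_cost, discount_pct, cost_per_exaflop


# Suspected figures from the comment: $500M contract, $300M written off,
# 2 exaflops peak.
cost, discount, per_ef = effective_deal(500, 300, 2)
print(f"effective cost: ${cost}M, discount: {discount:.0f}%, "
      f"${per_ef:.0f}M per peak exaflops")
```

With those inputs the write-off works out to 60 percent of the contract price, which matches the "60 percent off" figure in the comment.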

By: hoohoo https://www.nextplatform.com/2023/06/27/argonne-aurora-a21-alls-well-that-ends-better/#comment-210901 Sat, 08 Jul 2023 15:39:21 +0000 https://www.nextplatform.com/?p=142599#comment-210901 In reply to Matt.

I should add, there are niche players that are not as hidebound as AWS, but most are not quite up to the level of useful either. For example, one company was offering GTX 1080 Ti, RTX A6000 and also A100, and was going to roll out Ada GPUs as well, perfect for some samizdat AI training and experimentation. But until very recently the company did not offer attachable/composable storage. You got to pay GPU prices while uploading your training data, and when the run was complete and you shut down the VM, your data would go away.

By: hoohoo https://www.nextplatform.com/2023/06/27/argonne-aurora-a21-alls-well-that-ends-better/#comment-210900 Sat, 08 Jul 2023 15:33:27 +0000 https://www.nextplatform.com/?p=142599#comment-210900 In reply to Matt.

Can only speak to AWS. You have to build up credibility there and get them to increase your quota; you cannot simply make an account and fire up 1000 GPUs. If you want to stand up accelerated spot instances on AWS, you can get enough quota for a set of four 8-GPU A10G machines without too much problem, but beyond that the gotchas start. Getting quota for on-demand GPU/AI training instances is an extremely bureaucratic and BS-laden process.

Also, take a look at the hourly cost of, say, an on-demand 8-GPU A100 instance… AWS is not actually cheap. You can buy GPU servers outright and they’ll pay for themselves in a year or two compared to similar capacity on AWS.
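A rough way to check a "pays for itself in a year or two" claim is a break-even calculation. The sketch below is a simplification that ignores depreciation, staffing, and discounts; the function name `breakeven_months` and every number in the example call are made-up placeholders, not real AWS or hardware prices.

```python
def breakeven_months(purchase_cost, monthly_running_cost, cloud_hourly_rate,
                     hours_per_month=730, utilization=1.0):
    """Months until buying a server is cheaper than renting equivalent cloud capacity.

    Returns None if the cloud option is never more expensive per month
    than running the purchased server.
    """
    cloud_monthly = cloud_hourly_rate * hours_per_month * utilization
    monthly_saving = cloud_monthly - monthly_running_cost
    if monthly_saving <= 0:
        return None
    return purchase_cost / monthly_saving


# Placeholder inputs only: substitute a real server quote, power/hosting
# cost, and the provider's current hourly rate.
months = breakeven_months(purchase_cost=200_000,
                          monthly_running_cost=2_000,
                          cloud_hourly_rate=30.0)
print(months)
```

Lowering `utilization` lengthens the break-even time, which is the main reason bursty workloads can still favor renting.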

By: emerth https://www.nextplatform.com/2023/06/27/argonne-aurora-a21-alls-well-that-ends-better/#comment-210899 Sat, 08 Jul 2023 15:22:43 +0000 https://www.nextplatform.com/?p=142599#comment-210899 In reply to Timothy Prickett Morgan.

They should have ditched Intel and gone with AMD after the third year of no chips. Didn’t they build an Epyc-based test bed cluster at one point just because with Intel they had exactly no CPUs and a handful of frighteningly expensive & barely manufacturable FrankenGPUs?

This is a bit rich; it’s a lot like the Pentagon talking about how great its new fighter/carrier/tank is now that it is finally delivered years late and 300% over budget.

Intel should be sued and some managers at Argonne should be fired.

By: hoohoo https://www.nextplatform.com/2023/06/27/argonne-aurora-a21-alls-well-that-ends-better/#comment-210898 Sat, 08 Jul 2023 15:14:13 +0000 https://www.nextplatform.com/?p=142599#comment-210898 In reply to JayN.

Well, Nvidia had it ready at the beginning, didn’t it?

Uncle Sam had to pay AMD and Intel for the code.
