• dog-gone-@alien.topB · 10 months ago

    Wonder how expensive these cards will be. The cost is getting out of hand and after all these years, I am not excited about new GPUs.

  • JuanElMinero@alien.topB · 10 months ago

    Am I reading those CUDA core projections right?

    GA102 to AD102 increased by about 80%, but the jump from AD102 to GB202 is only slightly above 30%, on top of no large gains from moving to 3nm?

    Might not turn out that impressive after all.

    • ResponsibleJudge3172@alien.topB · 10 months ago

      It’s expected to be like Ampere. Ampere was a 17% increase in SMs (RTX 3090 Ti vs RTX Titan), but the SM itself was improved such that it yielded about a 33% improvement per SM in ‘raster’ and massive improvements in occupancy for RT workloads. So the 3090 Ti ended up 46% faster in ‘raster’ than the RTX Titan.
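      As a back-of-the-envelope check, the total speedup can be treated as the product of the SM-count ratio and the per-SM ratio. This is a simplification that ignores clock differences, and the inputs below are the figures quoted above, not measured data:

```python
# Back-of-the-envelope: if total speedup = (SM-count ratio) x (per-SM ratio),
# the per-SM raster gain implied by the two quoted endpoint figures is:
total_speedup = 1.46  # 3090 Ti vs RTX Titan in 'raster' (quoted above)
sm_ratio = 1.17       # 17% more SMs (quoted above)

implied_per_sm = total_speedup / sm_ratio
print(f"Implied per-SM gain: {implied_per_sm - 1:.0%}")  # ~25% before accounting for clocks
```

      The gap between this implied figure and the quoted ~33% per-SM number is a reminder that clock speeds, memory bandwidth, and workload choice all muddy simple core-count comparisons.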

      The TPC and GPC of Blackwell are rumored to be overhauled, with a more hesitant rumor that the SM is also being improved.

    • Qesa@alien.topB · 10 months ago

      It’s highly likely to be a major architecture update, so core count alone won’t be a good indicator of performance.

      • ResponsibleJudge3172@alien.topB · 10 months ago

        ‘Ampere Next’ referred to the datacenter lineup, which ended up being the biggest architectural change in datacenter GPUs since Volta succeeded GP100. And ‘Ampere Next Next’ referred to datacenter Blackwell, which is MCM, so again a big change.

      • Eitan189@alien.topB · 10 months ago

        It isn’t a major architecture update. Nvidia’s slides from Ampere’s release stated that the next two architectures after Ampere would be part of the same family.

        Performance gains will be had by improving the RT & tensor cores, by using an improved node, probably N4X, to facilitate clock speed increases at the same voltages, and by increasing the number of SMs across the product stack. The maturity of the 5nm process will allow Nvidia to use larger dies than they could with Ada.

        • rorschach200@alien.topB · 10 months ago

          by improving the RT & tensor cores

          and HW support for DLSS features and CUDA as a programming platform.

          It might be “a major architecture update” in terms of the engineering work Nvidia will have to put in to pull off all the new features and RT/TC/DLSS/CUDA improvements without regressing PPA; that’s where the years of effort will be sunk. The result could be large perf improvements in selected application categories and operating modes, but a very minor improvement in “perf per SM per clock” in no-DLSS rasterization on average.

    • Baalii@alien.topB · 10 months ago

      You should be looking at transistor count, if anything at all; “CUDA cores” is only somewhat useful when comparing different products within the same generation.

      • ResponsibleJudge3172@alien.topB · 10 months ago

        Still very accurate if you know what to look for.

        For example, understanding why Ampere and Turing CUDA cores scale differently lets you predict how an Ampere GPU scales against a Turing GPU.

        It’s also why we knew Ada would scale linearly, except for the 4090, which was nerfed to be more efficient.

  • DevAnalyzeOperate@alien.topB · 10 months ago

    I honestly don’t know how well a 24GB 5090 will move, no matter how fast it is. I feel like gamers will go for stuff like the 4080 Super, 4070 Ti Super, or next-gen AMD. For productivity users, there’s the 3090, 4090, and A6000.

    Maybe I’m wrong and the card doesn’t need to be very good to sell because GPUs are so burning hot right now.

    • soggybiscuit93@alien.topB · 10 months ago

      For productivity users, there’s… A6000.

      The A6000 is a lot of money. For productivity users in, say, Blender, you can get the same 48GB of VRAM and more compute for a lower cost if you go with dual 4090s.

    • JuanElMinero@alien.topB · 10 months ago

      GDDR7 memory chips will be in production in 2GB and 3GB capacities, which means 36GB of VRAM on a 384-bit bus could be a possibility for next gen.

      • rorschach200@alien.topB · 10 months ago

        Why actually build the 36 GB one, though? What gaming application will be able to take advantage of more than 24 GB during the lifetime of the 5090? The 5090 will be irrelevant by the time the next gen of consoles releases, and the current one has 16 GB for VRAM and system RAM combined. 24 GB is basically perfect for a top-end gaming card.

        And 36 GB would cannibalize the professional card market even more.

        So it’s unnecessary, expensive, and cannibalizing. Not happening.

        • FloorEntire7762@alien.topB · 10 months ago

          Don’t think so. The RTX Titan from 2018 is much faster than the PS5 GPU from 2020. I suppose the next-gen console GPU will get RTX 4070-level performance or slightly above. The PS4 had HD 7850 performance in 2013, so…

        • soggybiscuit93@alien.topB · 10 months ago

          36GB is certainly a possibility. VRAM demand is high across multiple markets. Currently you can get a 24GB 4090 or a 48GB A6000 Ada, so we could well see a 36GB 5090 and a 72GB A6000 Blackwell (B6000?).

        • Flowerstar1@alien.topB · 10 months ago

          Gaming applications didn’t take advantage of 24GB when it debuted on the 3090, and they still don’t on the 4090 now. That’s not what drives these decisions.

          • rorschach200@alien.topB · 10 months ago

            The bus width needed to be what it needed to be. That left two possibilities: 12 GB and 24 GB. The former was way too low for the 4090 to work in its target applications, so 24 it became.

            This is exactly what drives these decisions.

            What do you think drives them?

        • ZaadKanon69@alien.topB · 10 months ago

          32GB 5090 and 24GB 5080 is the most realistic configuration.

          Also expect both of them to have ridiculous prices. $1500+ for the 5080 and a $2500 FE MSRP for the 5090 wouldn’t surprise me. AMD is skipping the high end for a generation, so the most they’ll compete with will likely be a $1000 5070 Ti. The 7900XTX, or a refresh of it, will be AMD’s flagship until RDNA5. They have their valid reasons for that, but it’s very bad news for Nvidia customers, as much as they like to bash AMD.

          Nvidia also wants to protect their far more expensive professional lineup, so the 32GB 5090 especially will be priced to the moon.

          • lusuroculadestec@alien.topB · 10 months ago

            Rumors have the 5090 with GDDR7 and a 384-bit bus. Micron has GDDR7 modules on their roadmap in 2GB and 3GB densities. That means the possible memory configurations are 24 or 48GB with 2GB modules, and 36 or 72GB with 3GB modules.

            32GB would imply a 256 or 512-bit bus, neither of which is very likely for an xx90. I could see them maybe going as low as a 320-bit bus for 30GB. Even 33GB with a 352-bit bus is more likely.

            The 5080 is another matter: 24GB would imply a 256-bit bus with 3GB modules. Nvidia has been all over the map with xx80 memory width, so it’s anyone’s guess. If they prioritize memory bandwidth and use a 320-bit bus, a 20GB card is most likely.

            GDDR prevents having arbitrary memory sizes.
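            The arithmetic here: each GDDR module sits on its own 32-bit channel, so capacity = (bus width ÷ 32) × module size, doubled in a clamshell layout. A minimal sketch of those configurations; the bus widths and module densities are the rumored ones from this thread, not confirmed specs:

```python
def vram_gb(bus_width_bits: int, module_gb: int, clamshell: bool = False) -> int:
    """Total VRAM for a GDDR configuration: one module per 32-bit
    channel, or two per channel in a clamshell layout."""
    channels = bus_width_bits // 32
    return channels * module_gb * (2 if clamshell else 1)

# Rumored 384-bit bus with 2GB or 3GB GDDR7 modules:
print(vram_gb(384, 2))                  # 24 GB
print(vram_gb(384, 3))                  # 36 GB
print(vram_gb(384, 2, clamshell=True))  # 48 GB
print(vram_gb(384, 3, clamshell=True))  # 72 GB
# The 32GB figure would need, e.g., a 512-bit bus with 2GB modules:
print(vram_gb(512, 2))                  # 32 GB
print(vram_gb(352, 3))                  # 33 GB (the 352-bit case above)
```

            Clamshell mounts two modules per channel (front and back of the board), which is how 48GB workstation cards double a 24GB gaming card’s capacity on the same bus width.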

            • ZaadKanon69@alien.topB · 10 months ago

              A 20GB 5080, probably at an even higher MSRP than the 4080 due to a lack of competition, would be criminal… There was supposed to be a 20GB 3080, for crying out loud. And games will definitely go over 16GB before the next next gen, so 4080 owners will face a VRAM bottleneck, and then their upgrade option is a $1500 5080, omg.

              I heard the 512-bit rumor and thought Nvidia was FINALLY fixing their VRAM issue across their entire product stack… Sigh.

          • Keulapaska@alien.topB · 10 months ago

            32GB 5090 and 24GB 5080 is the most realistic configuration.

            A 32GB 5090 would mean a 512-bit bus, which this rumor says it will not have, contrary to the original rumor. So it’ll be either 24GB or 36GB (48GB is also possible, but probably not happening), as there will be 24Gb GDDR7 modules in addition to 16Gb ones. (I don’t know if mixing memory modules of different capacities is a thing, to get something like 30GB out of a 384-bit bus; I’m guessing no.) Maybe they do both versions, or leave the 36GB as a potential 5090 Ti for later on, who knows.

    • Wfing@alien.topB · 10 months ago

      The current lineup doesn’t even use GDDR6X and it’s selling like shit. There’s just no way.

    • ZaadKanon69@alien.topB · 10 months ago

      Maybe, but it would be unnecessary, because RDNA4 caps out at midrange, so GDDR6 would suffice. AMD decided to cut RDNA4’s high end to produce more AI chips and earn more money, and to give their engineers more time to get multiple GPU chiplets on one card working for RDNA5, which is where the real performance boom is. RDNA3 still has one graphics die; only the memory controller and cache are on separate chiplets.

      So the 7900XTX will remain AMD’s flagship until RDNA5.

      There might be some kind of refresh of the 7900XT(X) with slightly better performance and efficiency, maybe those would use GDDR7 if possible and economical.

      The good news for current owners is that the 7900 cards have plenty of VRAM to last until RDNA5. The bad news is there will be no competition for the 5080 and 5090, so expect even higher MSRPs than the 4000 series. A $2500 MSRP for a 32GB 5090 wouldn’t surprise me, and $1500 for the 5080, the “gaming flagship”.

      If you were waiting for next gen hoping value would improve vs the 4000 series… I hope you have even more patience.

      The moment I heard the news that RDNA4’s high end was scrapped and the monster chiplet design moved to RDNA5, along with the high AI demand and lack of production capacity, I pulled the trigger on a 7900XT. Next gen is going to be absolutely bonkers on the Nvidia side, and nothing better will be released on the AMD side other than software improvements, maybe a refresh of the 7900 cards, but that’s it. This card with 20GB of VRAM will last me until RDNA5/RTX 6000.

      Jayz2cents also made a video a while ago voicing his opinion to buy a GPU now, because it’s only going to get worse in the coming years: a situation arguably worse than the crypto boom, combined with a lack of competition for the 80 and 90 series and Nvidia’s Apple approach… Bad news.

      Intel won’t have a truly viable product for general gaming within this timeframe either. Even today their drivers are light-years behind both AMD and Nvidia, with performance all over the place depending on the individual game. And Intel too is making AI chips based on GPUs; the consumer GPUs are like a proof of concept.

      • Jeep-Eep@alien.topB · 10 months ago

        It might be a way to improve perf, either straight up or by cutting wattage. And even with the big models scrapped (assuming a possible AI crash doesn’t have those tapeouts pulled from storage), they likely have RDNA4’s GDDR7 controller taped out if the line was going to use it.

        • ZaadKanon69@alien.topB · 10 months ago

          It would also increase cost, and Navi43/44 don’t need that extra performance. They will probably be slower than, or at best the same speed as, a 7800XT. On the flip side they’ll be dirt cheap too, probably with a good chunk of VRAM, so really good low-to-midrange cards that actually work in games, unlike Intel’s.

          Unless AMD is already going for multiple graphics chiplets and just slaps four Navi43 chiplets together to create a flagship. That would be pretty epic and is an option still on the table; they already have a proper functional chiplet design for their AI cards. But I think we won’t see that until RDNA5.