Since at least 2010 we’ve had laptops with integrated GPUs inside the chipset. These GPUs have historically been very lacking (I’d say extremely so, to the point that a Tiger Lake i5-11400H CPU, which is quite powerful, wouldn’t reach 60fps in CSGO at 1080p with its iGPU). AMD SoCs fared better in that respect, but are still very lacking, even in their most recent RDNA3-based iterations, due to the poor bandwidth these laptops usually ship with (at best dual-channel DDR5 RAM, but mostly dual-channel DDR4). As such, dedicated GPUs with their own GDDR6 RAM and big dies have been necessary for both laptops and desktops whenever performance is a requirement, and low-end dedicated GPUs have been the choice for manufacturers that want slim, performant laptops with decent battery life.
At the same time, there have been four important milestones for the APU market:
- In 2010 the Xbox 360 shifted from a dedicated GPU + CPU combo to a single chip combining both on the same die. The PS3 kept the usual architecture of separate GPU and CPU.
- Both Sony and Microsoft released the PS4 and Xbox One (and their future successors) with an APU combining both. The Xbox One is fed with DDR3 RAM (I don’t know how many channels) plus a small pool of fast ESRAM, and it seems the bandwidth was a huge problem and part of the reason why it performed worse than the PS4.
- Apple released the Apple-silicon-powered MacBooks, shipping powerful GPUs inside the laptops on a single die. Powerful at the expense of being extremely big chips (see the M2 Max and Ultra), and maybe not as powerful as a 3070 mobile in most cases, but still quite powerful (and pricey, but I wonder if this is because of Apple or because APUs are, for the same performance level, more expensive; we’ll get to that).
- The Steam Deck was released, featuring a 4-core/8-thread CPU + RDNA2 GPU paired with quad-channel LPDDR5 RAM at 5.5GHz, totalling 88GB/s.
Now, for price-sensitive products (such as the Steam Deck, or the other game consoles), APUs seem to be the way to go. You can even make powerful ones, as long as they have enough bandwidth. It’d seem to me that it’d be clear that APUs provide a much better bang for the buck for manufacturers and consumers, as long as they’re paired with a nice memory architecture. I understand desktop PCs care about upgradability and modularity, but why are gaming APUs not a thing in laptops/cheap gaming OEM desktops? With 16GB of 4-channel DDR5 or even GDDR6 RAM, those things would compete really well against game consoles, while avoiding all the duplicated costs that are incurred when pairing a laptop with a dGPU. And in the end, laptops already have custom motherboards, so what’s the issue at all? What are the reasons why even cheap gaming laptops pick RTX 3050s instead of getting some love from AMD?
Bonus question: How come the DDR3 RAM at 1066MHz in the Xbox One reaches 68.3GB/s, while the Steam Deck, with much newer 5500MHz RAM and quad channels, is able to provide just 88GB/s?
Simple reason. An entry level gaming laptop with a decent CPU and an acceptable NVIDIA dGPU for 1080p 60Hz gaming will typically cost less than $1000.
A premium AMD ultrabook with the fastest LPDDR5/x RAM with at least 16 GB memory that will actually not limit the iGPU, and with acceptable power limits will be closer to $1500, if not more.
The dGPU laptop will run circles around the fastest laptop of the second type you can get.
I mean, comparing an entry-level gaming laptop to a premium non-gaming laptop on price and performance is WTF?
A cheap non-gaming laptop with a top-tier APU like the 5800H/6800HS is going to be $600-800. Sure, it will still be weaker than a laptop with a dGPU, but price is not the reason. AMD and Intel simply don’t do chips with a powerful iGPU included cuz it doesn’t make sense: too little memory bandwidth, and monolithic dies cost way more the bigger they are.
People could think, so why don’t they use GDDR6 like the consoles? Well, cuz GDDR6 provides a lot of bandwidth but has much worse latency, which makes the CPU part slower.
The ultimate move would be to use 64-bit (aka single-channel) LP/DDR5 for the CPU and then also put in a GDDR6 memory controller for the GPU. No need to unify memory (we’re still not there; shared memory is different).
That would make the best of both worlds, though it would only work on laptops where everything is soldered down. On PCs it would be hell: either the mobo would include GDDR6 memory soldered down even when no chip that could use it is installed, which doesn’t make sense, or it could get even worse at some point (there would probably be boards with 4/8GB of GDDR6 or without any), which is hell for customers when buying mobos.
About the bonus question, it’s all about bus width.
The Steam Deck has a 128-bit memory bus, whereas the Xbox One has a 256-bit memory bus.
Channels are also mostly desktop terminology; they are technically part of the DDR spec, but they change around, so it’s easier to just compare bus width. The Steam Deck has 4 32-bit channels, while the Xbox One has 4 64-bit channels.
You are also confusing the clock speed with the transfer rate: the memory in the Xbox One switches at 1.066GHz but transfers at 2.1GT/s, since it’s “DDR”, or “double data rate”, which means it does a transfer on both the rising and the falling edge of the clock.
End result:
- 256bit * 2.1GT/s / 8 = 67GB/s
- 128bit * 5.5GT/s / 8 = 88GB/s
(the division by 8 is there to go from bits to bytes)
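For anyone who wants to plug in their own numbers, here’s a minimal sketch of that same formula (the figures are just the rounded ones quoted above):

```python
# Peak memory bandwidth (GB/s) = bus width (bits) / 8 * transfer rate (GT/s)
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_gt_s: float) -> float:
    return bus_width_bits / 8 * transfer_rate_gt_s

print(peak_bandwidth_gb_s(256, 2.133))  # Xbox One: 256-bit DDR3 -> ~68.3 GB/s
print(peak_bandwidth_gb_s(128, 5.5))    # Steam Deck: 128-bit LPDDR5-5500 -> 88.0 GB/s
```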
It’s GDDR6
Not DDR1
Clock in DDR memory is complicated… GDDR5 used in the Xbox One is DDR memory. It just means Double Data Rate.
GDDR6 used in the Series X however isn’t. It’s technically Quad Data Rate, but they decided to keep the naming scheme.
How can someone research so much but also completely miss basic specifications like bus width? Like really how does someone type this out and not ask themselves, ‘are these devices free to make or do they have a cost?’
Fun little trivia - when you cap fps, a dGPU will have significantly less power usage than an iGPU at the same cap.
If you are on a laptop, a 4060 will have nearly 2x the battery life of a 780M if both are capped to 30 fps.
Surely that’s game dependent? Try it with something ancient like HL1 and I imagine the iGPU would use less power. Interesting that there is a crossover somewhere, though.
The bandwidth issues of these APUs can be solved with large on-die cache. But the problem with SRAM is that it takes up a lot of die space. And with CPU + GPU already on the die, there isn’t much space left for the cache.
However, this problem can be solved with chiplets. Though I’m not sure if there’s a market for ‘high-end’ APUs!
Such 3D cache also brings real issues with cooling, and laptops don’t handle that efficiently.
Maybe chiplet tech allows a much better yield of GPU + CPU [ + NPU ] on the same chip, with the resulting benefits of low latency / fast interconnect, shared cache etc.
Also offers a cheaper more flexible way to mix-n-match compute subunits / cores to suit market niches.
We’ll see a lot more of this in the future.
Laptops use the APU’s GPU to save power instead of keeping the larger dGPU active; having them separate also spreads out the heat.
For larger chips for desktops, memory bandwidth/latency and yields become an issue.
By combining a GPU and CPU, each part can have its own defects, increasing your possible product stack for a small market segment.
CPUs like low latency, GPUs like high bandwidth. Quad-channel DDR5 is still only around 150GB/s. A 6600 XT and a 7600M are 256GB/s. You’d want 300-400GB/s of memory bandwidth while staying low latency, otherwise you’re losing performance for no reason.
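As a rough sanity check on those figures (a sketch only; the DDR5-4800 speed and the 128-bit bus at 16 GT/s for the 6600 XT / 7600M are assumptions, not stated above):

```python
# Peak bandwidth (GB/s) = channels * channel width (bits) / 8 * transfer rate (MT/s) / 1000
def peak_bandwidth_gb_s(channels: int, width_bits: int, mt_s: float) -> float:
    return channels * width_bits / 8 * mt_s / 1000

# Quad-channel DDR5-4800 (assumed speed), 64-bit channels -> ~153.6 GB/s, i.e. the ~150GB/s above
print(peak_bandwidth_gb_s(4, 64, 4800))
# 6600 XT / 7600M (assumed 128-bit GDDR6 at 16 GT/s) -> 256 GB/s
print(peak_bandwidth_gb_s(1, 128, 16000))
```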
APUs have come a long way, but bandwidth is still a bottleneck. They’re great for price-sensitive products like the Steam Deck, but dedicated GPUs are necessary for performance. Not sure why gaming APUs aren’t more common in laptops/desktops, but maybe it’s due to upgradability concerns. The whole RAM comparison thing is confusing, but MT/s and bus width are better metrics than MHz and channel numbers.
I think you already made the main argument clear. It is a cost cutting measure, and a very effective one at that!
Today’s hardware is powerful enough that an APU is “enough” and again the Steam Deck is the most impressive of all of the examples.
But for a more premium product, there are 3 main pluses for CPU+dGPU. First, disparity in needed performance: if I’m gaming, most of the power I need is GPU-based, but if I’m doing CAD or CFD I’m suddenly CPU-bound (ST for CAD, MT for CFD), and if I’m buying a 1.5k€ system, I do prefer it to be optimised for my use case; being able to choose CPU and GPU separately allows for that, while for an APU it’d become a SKU nightmare! Second, upgradeability: as we have seen with things like 4th-gen Intel processors staying relevant for gaming until practically 3rd-gen Ryzen took over, the main upgrade needed was the GPU, and for a more premium product I do consider that important. And third, repairability: even if in some categories like laptops dGPU vs iGPU is not the main bottleneck for it, there is a good number of computers I’ve rescued with a GPU swap, for sure.
Now, for price-sensitive products (such as the Steam Deck, or the other game consoles), APUs seem to be the way to go. You can even make powerful ones, as long as they have enough bandwidth. It’d seem to me that it’d be clear that APUs provide a much better bang for the buck for manufacturers and consumers
It’s actually the other way round.
APUs make sense in a power-constrained product, not a price-sensitive one.
The Steam Deck is a good example. It has a pretty weak CPU/GPU combo (4 Zen 2 cores and 8 RDNA 2 CUs), but this doesn’t matter, because what matters is being able to run games on battery for a decent amount of time.
When everything is on one chip, power requirements are lower, because there’s no need for inter-chip communication. Space is saved because there’s only one chip to use. This is great for small mobile products.
What about price?
APUs sell for cheap on the desktop because their performance is lower than other CPUs, but they aren’t cheap to make.
For example, Raven Ridge was 210 mm², while Summit Ridge / Pinnacle Ridge were 213 mm². So the chip price was about the same, but the Ryzen 1800X debuted at $500 and then dropped to $330, where the 2700X also sold, but the top of the range Raven Ridge, the 2400G, was sold for $170.
So even though it cost AMD the same to make these chips, Raven Ridge was sold for half the price (or a third for the 1800X). AMD therefore made a lot less money on each Raven Ridge chip.
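Putting rough numbers on that margin gap, using only the die sizes and prices quoted above (a back-of-the-envelope sketch that ignores yields, packaging, and binning):

```python
# Revenue per mm² of die, using only the prices and die sizes quoted above
chips = {
    "Ryzen 7 1800X (Summit Ridge, 213 mm²)": (330, 213),  # post-drop price
    "Ryzen 5 2400G (Raven Ridge, 210 mm²)": (170, 210),
}
for name, (price_usd, die_mm2) in chips.items():
    print(f"{name}: ${price_usd / die_mm2:.2f}/mm²")
# -> roughly $1.55/mm² vs $0.81/mm²: about half the revenue for the same amount of silicon
```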
The console prices are deceptive. Microsoft and Sony subsidise the consoles because they make money on game sales and subscriptions. The consoles are often sold at a loss. If the consoles were not subsidised, they’d have sold for double the price, and would have likely lost to a similarly priced PC.
Flexibility
Even though laptops aren’t user-expandable, they are still configurable. When it comes to gaming laptops, there are a lot of CPU/GPU combinations. It’s impossible to create a special chip for each combination, and binning is hardly enough to create them.
Without having separate CPUs and GPUs, you’d get an ecosystem similar to Apple’s, or the consoles, where there is a very small number of models available for purchase. That would kill one of the benefits of the Windows ecosystem, the ability to make your purchase fit performance and price targets.
A silver lining
Chiplets do make it possible to pair different CPUs and GPUs on the same chip, even together with RAM. You could call that an APU.
You’re probably the most right. Thank you for your excellent insight. It makes total sense :)
Speaking of powerful APUs, the rumor mill points to a 40 CU (basically a 6700 XT) APU being in the works. I imagine AMD will be insisting on shared GDDR memory, because otherwise, it won’t have enough bandwidth to be worth it.
Integrated graphics are still lacking. To get 12 CUs you need to buy a 5GHz 8-core, even though realistically a CPU like that would not bottleneck even a 60 CU GPU. So they are still really imbalanced in the CPU-to-GPU ratio.
That issue can be solved with Intel’s tile approach, where they no longer use a monolithic die for everything but a separate die for the GPU. So if the demand from partners is there, they can do something like a 4+8 CPU tile and a 128 EU GPU tile, instead of just making the expected 6+8 and 128 EU configuration. It allows far more freedom for product designs than just binning a monolithic die, but it’s on partners and consumers to ask for such combinations.
These GPUs have historically been very lacking (I’d say extremely so, to the point that a Tiger Lake i5-11400H CPU, which is quite powerful, wouldn’t reach 60fps in CSGO at 1080p with its iGPU).
Serious error in the second sentence.
i5-11400H was intended for use with a discrete GPU.
It had a ‘MS Office’-grade IGPU.
It was part of the H45 series of CPUs, which consumed 45W.
Intel also made H35 (35W), 28W, and 15W CPUs.
For the i5-11400H you got 16 GPU execution units, while for the i5-11300H you got 80, and for the i5-11320H 96.
The i5-11400H had more cache and more watts for CPU tasks, but virtually no GPU.
When you say ‘historically’, this is also not correct. Tiger Lake uses Intel’s current Xe GPU. It’s actually modern in terms of GPU.
Since Alder Lake, Intel mobile chips typically have 80 or 96 EUs.