Back at its Financial Analyst Day in 2020, AMD showed a diagram emphasizing its server CPU design chops and making clear that chiplets weren't the final step in the evolution of its CPUs. AMD drew a line from its first deployment of HBM in 2015 through to the launch of chiplets, along with a future CPU delivering X3D packaging with a mix of 2.5D and 3D technologies.
AMD never announced a specific product that would bring X3D to market, but a new rumor suggests the company is working on a product codenamed "Milan-X." Milan-X would be based on AMD's most recent Epyc processor architecture, but it would deploy far more memory bandwidth than we've seen in an AMD server before.
Milan-X aka Milan-X(3D). Genesis IO-die with stacked chiplets
I love lasagna 😋 https://t.co/O2FrGxyd8P
— ExecutableFix (@ExecuFix) May 25, 2021
AMD's next-gen I/O die is reportedly called Genesis I/O, and the entire combined 2.5D/3D stack sits on top of a large interposer. AMD's official diagram shows a four-high stack of HBM per CPU cluster, with one HBM stack dedicated to each chip.
It's possible that AMD's diagram above is only meant to show the general concept of what the company intends to build, not to convey the exact final design of the product. If the diagram is accurate, it suggests Milan-X will either feature more cores per chiplet (16 would be needed to hit 64 cores across 4 chiplets) or that Milan-X will top out at 32 cores. The diagram also implies AMD's interposer die must sit beneath the cluster of chiplets.
This would definitely qualify as 3D chip stacking, but it also raises questions about how much power the I/O die will draw. It seems likely that AMD would have finally shrunk down to 7nm for I/O, if only to limit overall power consumption.
3D die stacking has always been difficult outside of low-power environments, due to the problem of moving heat from the bottom to the top of the stack without cooking some part of the chip in the process. The Holy Grail of chip stacking is to place multiple high-power chiplets on top of one another rather than laying them out side-by-side, but Intel and AMD have both decided to tackle something a bit easier first: putting a hot chip on top of a cool one.
Intel doesn't use the same X3D technology that AMD is rumored to be shipping for Milan-X, but its Foveros 3D interconnect allowed the company's low-power Lakefield processor to feature one big-core Ice Lake CPU stacked on top of four low-power "Tremont" CPU cores. With Milan-X, AMD would be tackling something considerably more complex, again assuming both that this rumor is true and that the I/O die sits beneath the chip cluster.
Milan-X is said to be a data-center-only chip, and it isn't clear what kind of cooling solution would be required to deal with the CPU's unusual structure. Presumably, AMD will want to stick with forced air, but liquid and immersion cooling are also possible.
The amount of bandwidth Milan-X would offer in this configuration is unparalleled. Our recent TRACBench debut illustrated how much extra memory bandwidth can improve the performance of the eight-channel 3995WX compared with the quad-channel 3990X, even when the latter is running at a higher clock speed. In that comparison, a Threadripper 3995WX has up to 204.8GB/s worth of memory bandwidth to split across 64 cores.
If each Milan-X chiplet is still eight cores and the chip uses mainstream, commercially available HBM2E, we'd be looking at somewhere between 300-500GB/s worth of memory bandwidth per chiplet. Total available memory bandwidth across the entire chip should break 1TB/s and could reach 2TB/s. Whatever other constraints might bind Milan-X at that point, bandwidth wouldn't be among them. The chip presumably supports off-package memory as well, however. Even if we assume a near-term breakthrough allowing for 32GB per HBM2E stack, four stacks would only amount to 128GB of RAM and eight stacks would provide just 256GB. AMD's current servers support 4TB of RAM per socket, so there's no chance of replacing that kind of capacity with an equivalent amount of on-package HBM2E.
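The arithmetic behind those figures is easy to sanity-check. Here's a quick back-of-the-envelope sketch; the per-stack bandwidth range, the 32GB stack capacity, and the eight-core chiplet count are the assumptions discussed above, not confirmed Milan-X specifications.

```python
# Back-of-the-envelope math for the rumored Milan-X memory figures.
# All inputs are assumptions from the ranges above, not confirmed specs.

BW_PER_STACK_GBS = (300, 500)   # assumed HBM2E bandwidth per stack (low, high)
STACK_CAPACITY_GB = 32          # hypothetical 32GB-per-stack HBM2E
CORES_PER_CHIPLET = 8           # if Milan-X keeps Zen 3's 8-core chiplets

for stacks in (4, 8):
    low_tbs = stacks * BW_PER_STACK_GBS[0] / 1000
    high_tbs = stacks * BW_PER_STACK_GBS[1] / 1000
    capacity_gb = stacks * STACK_CAPACITY_GB
    print(f"{stacks} stacks: {low_tbs:.1f}-{high_tbs:.1f}TB/s aggregate, {capacity_gb}GB HBM2E")

# Per-core comparison against the 3995WX's 204.8GB/s split across 64 cores:
per_core_3995wx = 204.8 / 64                                # ~3.2GB/s per core
per_core_milanx = BW_PER_STACK_GBS[0] / CORES_PER_CHIPLET   # 37.5GB/s at the low end
print(f"3995WX: {per_core_3995wx:.1f}GB/s per core; Milan-X: {per_core_milanx:.1f}GB/s per core or more")
```

Even at the low end of the assumed range, a four-stack configuration lands at 1.2TB/s in aggregate, while the capacity side tops out at 256GB with eight stacks, which is why HBM2E can't substitute for off-package DRAM here.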
Milan-X looks like the kind of chip AMD could bring to bear against Sapphire Rapids. That CPU is expected to feature somewhere between 56 and 80 cores (reports have varied), and it also integrates HBM2 on-package. Sapphire Rapids is currently expected in late 2021 or early 2022. No launch date for Milan-X has been reported.