Tom's Hardware: "Intel's next-gen Arrow Lake GPU will have new Xe-LPG Plus Architecture with XMX"

Dakhil@alien.top · 10 months ago

kingwhocares@alien.top · 10 months ago

XMX extensions are designed for matrix multiplication operations in FP64, FP32, FP16, and bfloat16 formats, which are crucial in many AI algorithms, including neural networks

From the article.

AnimeAlt44@alien.top · 10 months ago

bfloat16

I have no idea why this is important or really what it even is but Apple had a pretty video about it in their WWDC catalog so I guess it’s a trend now.

GomaEspumaRegional@alien.top · 10 months ago

Standard float16 uses 1-bit sign + 5-bit exponent + 10-bit fraction.

bfloat16 uses 1-bit sign + 8-bit exponent + 7-bit fraction.

bfloat16 basically gives you the same exponent range as a standard float32, since both use 8 exponent bits. What it gives up is fraction precision, and most neural networks don't need much of that. So with bfloat16 you can execute two FLOPs with an 8-bit exponent in the space where float32 would give you one.
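To make the layout concrete, here is a rough Python sketch. The helper names are made up for illustration, and it assumes the simplest possible conversion: just chopping off the low 16 bits of a float32. Real hardware and libraries usually round to nearest instead of truncating.

```python
import struct

def float32_bits(x: float) -> int:
    """Return the raw 32-bit pattern of x stored as an IEEE 754 float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def to_bfloat16_bits(x: float) -> int:
    """Hypothetical helper: keep only the top 16 bits of the float32 pattern.

    This truncates (rounds toward zero); it is an assumption made for
    simplicity, not how any particular chip does the conversion.
    """
    return float32_bits(x) >> 16

def from_bfloat16_bits(b: int) -> float:
    """Widen a 16-bit bfloat16 pattern back to a float by zero-filling the low bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]

def split_bfloat16(b: int) -> tuple:
    """Split a bfloat16 pattern into (sign, exponent, fraction): 1 + 8 + 7 bits."""
    sign = (b >> 15) & 0x1
    exponent = (b >> 7) & 0xFF
    fraction = b & 0x7F
    return sign, exponent, fraction

if __name__ == "__main__":
    for value in (1.0, 3.14159, 1e38):
        bits = to_bfloat16_bits(value)
        print(value, hex(bits), split_bfloat16(bits), from_bfloat16_bits(bits))
```

Note that 1e38 survives the round trip with only a small relative error, whereas standard float16 tops out at 65504 because of its 5-bit exponent.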

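Having the ALU support this format lets the scheduler pack 4x bfloat16 values into a standard 64-bit ALU and execute them in parallel. So basically you double the 8-bit-exponent FLOPs you would get from float32, which only fits 2x per 64 bits, or quadruple them if the ALU would otherwise handle one value at a time. Standard float16 packs just as densely, but its 5-bit exponent gives it a much smaller range.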
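The packing idea can be sketched in Python too. This only shows the storage side (four 16-bit lanes in one 64-bit word). The function names and the lane order are assumptions for illustration, and it says nothing about how Intel's scheduler or XMX units actually issue the work.

```python
import struct

def bf16_bits(x: float) -> int:
    """Top 16 bits of the float32 pattern (truncating conversion, an assumption)."""
    return struct.unpack("<I", struct.pack("<f", x))[0] >> 16

def pack4(values) -> int:
    """Pack four values as bfloat16 lanes into one 64-bit word, lane 0 in the low bits."""
    if len(values) != 4:
        raise ValueError("expected exactly four values")
    word = 0
    for lane, v in enumerate(values):
        word |= bf16_bits(v) << (16 * lane)
    return word

def unpack4(word: int) -> list:
    """Recover the four bfloat16 lanes of a 64-bit word as Python floats."""
    out = []
    for lane in range(4):
        bits = (word >> (16 * lane)) & 0xFFFF
        out.append(struct.unpack("<f", struct.pack("<I", bits << 16))[0])
    return out

if __name__ == "__main__":
    word = pack4([1.0, 2.0, -0.5, 0.0])
    print(hex(word))
    print(unpack4(word))
```

The same 64 bits would hold only two float32 values, which is where the 2x comes from.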

AnimeAlt44@alien.top · 10 months ago

Makes sense. Thanks for that!