Alongside today’s announcement of the new Cortex-A77 CPU microarchitecture, the arguably bigger announcement is Arm’s unveiling of the new Valhall GPU architecture and the new Mali-G77 GPU. It’s been three years since the unveiling of the Bifrost architecture, and as the industry and workloads continue to evolve, so must the company’s GPUs.

Valhall and the new Mali-G77 follow up on the last three generations of Mali GPUs with some significant improvements in performance, density and efficiency. While last year’s G76 introduced some large changes to the compute architecture of the execution engines, the G77 goes a lot further and departs from Arm’s relatively unusual compute core design.

A look back at Bifrost – third time’s the charm

It’s no great secret that the last few years haven’t been very kind to Arm’s GPU IP offerings. When the first Bifrost GPU, the Mali-G71, was announced back in 2016 and productised later that year in the Kirin 960 and Exynos 8895, we had expected good performance and efficiency gains.

Bifrost was Arm’s first scalar GPU architecture, departing from the previous generation’s (Midgard: T-600, 700 & 800 series) vector instruction design. The change was fundamental and akin to what we saw desktop GPU vendors like AMD and Nvidia introduce with their new GCN and Tesla architectures last decade.

Unfortunately, the first two generations of Bifrost, the Mali-G71 and the subsequent G72, weren’t very good GPUs. Arm’s two leading licensees, HiSilicon and Samsung, both came out with quite disappointing SoCs when it came to their GPUs in these two generations. The Kirin 960 and 970 in particular were extremely bad in this regard, and I’d argue it had quite a lot of impact on Huawei and Honor’s product planning and marketing.

[Chart: GFXBench Aztec Ruins - Normal - Vulkan/Metal - Off-screen]

GFXBench Manhattan 3.1 Offscreen Power Efficiency (System Active Power)

Device (SoC)                      Mfc. Process   FPS      Avg. Power (W)   Perf/W Efficiency
iPhone XS (A12) Warm              7FF            76.51    3.79             20.18 fps/W
iPhone XS (A12) Cold / Peak       7FF            103.83   5.98             17.36 fps/W
Galaxy S10+ (Snapdragon 855)      7FF            70.67    4.88             14.46 fps/W
Galaxy S10+ (Exynos 9820)         8LPP           68.87    5.10             13.48 fps/W
Galaxy S9+ (Snapdragon 845)       10LPP          61.16    5.01             11.99 fps/W
Huawei Mate 20 Pro (Kirin 980)    7FF            54.54    4.57             11.93 fps/W
Galaxy S9 (Exynos 9810)           10LPP          46.04    4.08             11.28 fps/W
Galaxy S8 (Snapdragon 835)        10LPE          38.90    3.79             10.26 fps/W
LeEco Le Pro3 (Snapdragon 821)    14LPP          33.04    4.18             7.90 fps/W
Galaxy S7 (Snapdragon 820)        14LPP          30.98    3.98             7.78 fps/W
Huawei Mate 10 (Kirin 970)        10FF           37.66    6.33             5.94 fps/W
Galaxy S8 (Exynos 8895)           10LPE          42.49    7.35             5.78 fps/W
Galaxy S7 (Exynos 8890)           14LPP          29.41    5.95             4.94 fps/W
Meizu PRO 5 (Exynos 7420)         14LPE          14.45    3.47             4.16 fps/W
Nexus 6P (Snapdragon 810 v2.1)    20SoC          21.94    5.44             4.03 fps/W
Huawei Mate 8 (Kirin 950)         16FF+          10.37    2.75             3.77 fps/W
Huawei Mate 9 (Kirin 960)         16FFC          32.49    8.63             3.77 fps/W
Huawei P9 (Kirin 955)             16FF+          10.59    2.98             3.55 fps/W
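
The efficiency column in the table above is simply frame rate divided by average power. The short Python sketch below recomputes it for three rows copied from the table; the helper name `perf_per_watt` is only illustrative. Because the published FPS and power figures are already rounded, recomputed values can differ from the listed column by a few hundredths.

```python
# Rows copied from the Manhattan 3.1 table: (device, fps, average power in watts)
rows = [
    ("iPhone XS (A12) Cold / Peak", 103.83, 5.98),
    ("Huawei Mate 20 Pro (Kirin 980)", 54.54, 4.57),
    ("Huawei Mate 9 (Kirin 960)", 32.49, 8.63),
]


def perf_per_watt(fps, watts):
    """Return efficiency in frames per second per watt."""
    return fps / watts


for device, fps, watts in rows:
    print(f"{device}: {perf_per_watt(fps, watts):.2f} fps/W")
```

Read this way, the Kirin 960 delivers roughly a third of the frames per watt of the Kirin 980, which is the gap the text describes between early Bifrost parts and the G76 generation.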

The last iteration of the Bifrost architecture, the Mali-G76, was a more significant jump for Arm, and the IP was largely able to resolve some of the critical issues of its predecessors, resulting in relatively good results for the Exynos 9820 and Kirin 980 chipsets.

Unfortunately, while Arm was catching up and fixing Bifrost’s issues, the competition didn’t stand still and kept pushing the envelope. Qualcomm’s Adreno GPU architecture has been leading the mobile landscape for several years now, and even though the Adreno 640 didn’t post quite as impressive improvements this year, it’s still clearly ahead of Arm in terms of performance, efficiency and density. More worrisome is the fact that Apple’s GPU in the A12 was a major jump in terms of performance and efficiency, performing massively better than even Qualcomm’s best, not to speak of Arm’s own Mali GPUs.

Introducing Valhall – A major revamp

Today we’ll be covering Arm’s brand-new GPU architecture: Valhall (an anglicized version of the Old Norse Valhöll, a.k.a. Valhalla). The new architecture brings a brand-new ISA and compute core design that tries to address the major shortcomings of the Bifrost architecture, and looks to be a lot more similar to the design approaches we’ve seen adopted by other GPU vendors.

The first iteration of the Valhall GPU is the new Mali-G77, which will implement all of the architectural and micro-architectural improvements we’ll be discussing today.

What’s being promised is a 30% gain in both energy efficiency and area density (at ISO-performance & process), as well as a 60% increase in performance of machine learning inferencing workloads on the GPU.
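
One way to read the 30% figure, assuming efficiency means performance per watt and density means performance per unit of area, is that a G77 matching a given performance level on the same process would need about 1/1.3 of the power and area. The Python sketch below works through that arithmetic; the function name `iso_performance_scale` is hypothetical and the interpretation is an assumption, not something Arm has stated in these terms.

```python
def iso_performance_scale(gain):
    """Relative power (or area) needed at unchanged performance.

    `gain` is the fractional improvement in perf/W or perf/mm^2,
    e.g. 0.30 for the quoted 30%. Assumes the gain applies uniformly.
    """
    return 1.0 / (1.0 + gain)


scale = iso_performance_scale(0.30)
print(f"relative power or area at equal performance: {scale:.3f}")
print(f"reduction: {(1.0 - scale) * 100:.1f}%")
```

Under this reading, a 30% efficiency gain translates into a power reduction of a bit under a quarter at equal performance, not a full 30%.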

More interestingly, upcoming end-of-2019 and 2020 SoCs are projected to see a 40% increase in performance over 2019 devices. Next-generation SoCs are projected to have only minor process node improvements, so most of the gains quoted here are due to the architectural and microarchitectural leaps made by the new Mali-G77 GPU.

42 Comments

  • eastcoast_pete - Wednesday, May 29, 2019

    Addition: Unless MS did reach out to ARM and ARM said no. If that's the case, that would be worth an article or two!
  • Andrei Frumusanu - Friday, May 31, 2019

    Arm is open to Windows drivers but their official stance right now is to have Qualcomm take the lead. Demand is mostly based on the other chip-makers going into that market (HiSilicon, Samsung).
  • darkich - Tuesday, May 28, 2019

    .. except that Intel's low power chips actually are that bad in graphics processing that they get easily trounced even by current gen phone GPU's, including the Mali ones. Get some clue
  • jackthepumpkinking6sic6 - Monday, June 3, 2019

    And their latest in laptop chips... Well at least up to most 8th gen I think... Barely had any power or efficiency gains even over the 4000 series. Like i7 4800. In fact had more power than even most 7000 series ones. Well better bench scores.
  • jjj - Monday, May 27, 2019

    They really need to push on the GPU side as foldables somewhat double pixel count and in theory, double fold or rollable can go even further. Larger screens do make mobile gaming more appealing so it's good for everybody.
  • PeeCee - Thursday, May 30, 2019

    Exynos 9830 should incorporate Mali G77 MP18, which theoretically trumps Adreno 640 by, say, 30%-40% in performance and 20%-40% in efficiency. Likewise, Kirin 990 must incorporate Mali G77 MP14 (at least), which should also be competitive enough with Adreno 650 (next generation Snapdragon GPU)
  • jackthepumpkinking6sic6 - Thursday, May 30, 2019

    That will also depend on their insane profit greed and how lazy they are with their next generation of custom CPU cores. I understand there will be issues since they ventured into wide designs but so many things could have been implemented way better. And with those cores being so power hungry and large they have to account for that with heat/power envelope when considering how much gpu
  • ZolaIII - Sunday, June 2, 2019

    Next year it's time for a new (seventh) generation of Adrenos (by QC's usual schedule). The G77 is only competitive with the current one.
  • jackthepumpkinking6sic6 - Sunday, June 2, 2019

    It could be released on current upcoming devices so yeah.
    And it will still have a good chance of staying competitive with the next adreno.
    You really are a special kind of fanboy aren't you...
  • jackthepumpkinking6sic6 - Monday, June 3, 2019

    Because it's so much more powerful and efficient on top of Samsung's current custom CPU core issues they might not go for mp18. Though their 7nm Euv process and a better M4 custom core setup might change that.
