SPEC CPU - Multi-Threaded Performance

Moving on to the multi-threaded SPEC CPU 2017 results: these are the same workloads as in the single-threaded test (we purposefully avoid the Speed variants of the workloads in ST tests). The key to performance here is not only microarchitecture or core count, but also the overall power efficiency of the system and the level of performance we can fit into the thermal envelope of the device we’re testing.

Note that among the four chips I put into the graph, the i9-11980HK is the only one at a 45W TDP, while the AMD competition lands at 35W and the i7-1185G7 comes in at a lower 28W. The test takes several hours of runtime (6 hours for this TGL-H SKU) under constant full load, so short-duration boost mechanisms don’t come into play here.

SPECint2017 Rate-N Estimated Scores

Generally as expected, the 8-core TGL-H chip leaves its 4-core TGL-U sibling in the dust, in many cases showcasing well over double the performance. The i9-11980HK also fares extremely well against the AMD competition in workloads which are more DRAM or cache heavy, but falls behind in workloads which are more core-local and execution-throughput bound. That would generally make for a fairly even battle between the designs, if it weren’t for the fact that the AMD systems are running at roughly 22% lower TDPs (35W versus 45W).

SPECfp2017 Rate-N Estimated Scores

In the floating-point multi-threaded suite, we again see a similar competitive scenario where the TGL-H system battles against the best Cezanne and Renoir chips.

What’s rather odd in these results is that 503.bwaves_r and 549.fotonik_r perform far below the numbers we were able to measure on the TGL-U system. I think what’s happening here is that we’re hitting DRAM memory-level parallelism limits, with the smaller TGL-U system and its 8x16b LPDDR4 channel memory configuration allowing for more parallel transactions than the 2x64b DDR4 channels on the TGL-H system.
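To make the channel argument concrete, here is a minimal Python sketch that compares the two memory configurations quoted above. Both add up to the same total bus width, so the difference lies in how many independent channels are available. The `MemoryConfig` class and its field names are made up for illustration, and channel count is only a rough proxy for memory-level parallelism, not a model of the actual memory controllers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryConfig:
    """Hypothetical container for a DRAM channel layout (illustration only)."""
    name: str
    channels: int
    bits_per_channel: int

    @property
    def total_bus_width_bits(self) -> int:
        # Total data bus width is simply channels times per-channel width.
        return self.channels * self.bits_per_channel


# Channel layouts as quoted in the text: 8x16b LPDDR4 and 2x64b DDR4.
tgl_u = MemoryConfig("TGL-U (LPDDR4)", channels=8, bits_per_channel=16)
tgl_h = MemoryConfig("TGL-H (DDR4)", channels=2, bits_per_channel=64)

# Ratio of independent channels, a crude stand-in for parallel transactions.
channel_ratio = tgl_u.channels / tgl_h.channels

for cfg in (tgl_u, tgl_h):
    print(f"{cfg.name}: {cfg.channels} x {cfg.bits_per_channel}b "
          f"= {cfg.total_bus_width_bits}b total")
print(f"Independent channel ratio (TGL-U / TGL-H): {channel_ratio:.0f}x")
```

Both layouts come out at 128 bits in total, while the TGL-U system has four times as many independent channels to spread requests across.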

SPEC2017 Rate-N Estimated Total

In terms of overall performance, the 45W 11980HK actually ends up losing to AMD’s Ryzen 9 5980HS even with 10W more TDP headroom, at least in the integer suite.

We had also initially run the suite in 65W mode, and the results there aren’t very good at all, especially when compared to the 45W results. For a 40-44% higher TDP, the i9-11980HK in Intel’s reference laptop only performs 9.4% better. This is likely due to the aforementioned heavy thermal throttling the system falls back to, with long periods spent in a 35W state, which pulls performance well below the expected figures. I have to be explicit here that these 65W results are not representative of the full 65W performance capabilities of the 11980HK, only of this particular thermal solution within this Intel reference design.
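As a rough sanity check on these figures, the following Python sketch works through the TDP arithmetic. It uses only the nominal numbers quoted in the text (35W, 45W, 65W and the +9.4% uplift); the function names are made up for illustration. Nominal TDP is not measured power draw, and since the system spent long periods in a 35W state, the perf-per-TDP ratio below should be read as an indication of how poorly the 65W mode scaled in this chassis, not as an efficiency measurement.

```python
def relative_change(new: float, old: float) -> float:
    """Fractional change from old to new (0.10 means +10%)."""
    return new / old - 1.0


def perf_per_tdp_ratio(perf_uplift: float, tdp_new: float, tdp_old: float) -> float:
    """Performance per nominal TDP watt in the new mode relative to the old one."""
    return (1.0 + perf_uplift) / (tdp_new / tdp_old)


# Nominal figures quoted in the text.
INTEL_TDP_45W = 45.0
INTEL_TDP_65W = 65.0
AMD_TDP = 35.0
PERF_UPLIFT_65W = 0.094  # +9.4% going from 45W mode to 65W mode

tdp_increase = relative_change(INTEL_TDP_65W, INTEL_TDP_45W)
amd_tdp_deficit = -relative_change(AMD_TDP, INTEL_TDP_45W)
scaling = perf_per_tdp_ratio(PERF_UPLIFT_65W, INTEL_TDP_65W, INTEL_TDP_45W)

print(f"65W vs 45W nominal TDP: +{tdp_increase:.1%}")
print(f"AMD 35W vs Intel 45W:   -{amd_tdp_deficit:.1%}")
print(f"Perf per nominal watt in 65W mode vs 45W mode: {scaling:.2f}x")
```

On nominal numbers, the 65W mode delivers only about three quarters of the performance per TDP watt of the 45W mode, which is consistent with the throttling explanation.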


229 Comments

  • ozzuneoj86 - Monday, May 17, 2021 - link

    While it is nice that it supports gen 4, realistically you're just getting SSDs that put out more heat, with more power draw, while gaining performance benefits that are only measurable in benchmarks or very specific situations.

    I'm sure file copy performance is much higher, but how fast do you need that to be? Assuming you're copying to the drive itself or maybe to a Thunderbolt 4 external drive, it is the difference between copying 1TB of data in 2 minutes versus 6 minutes. You can (theoretically) completely fill a $400 2TB SSD in 4 minutes with gen4 vs maybe 12 minutes with Gen 3. If someone needs to do that all the time, then sure there's a difference... but that has to be pretty uncommon.

    For smaller amounts of data, any decent nvme drive is fast enough to make the difference between models almost unnoticeable. For the vast majority of users, even a SATA drive is plenty fast enough to provide a smooth and nearly wait-free experience.
  • mode_13h - Monday, May 17, 2021 - link

    > realistically you're just getting SSDs that put out more heat, with more power draw,
    > while gaining performance benefits that are only measurable in benchmarks
    > or very specific situations.

    Exactly. Thank you.
  • mode_13h - Monday, May 17, 2021 - link

    > Assuming you're copying to the drive itself or maybe to a Thunderbolt 4 external drive

    Oops! TB 4 is limited to PCIe 3.0 x4 speeds! So, it'd be little-to-no help there!
  • Calin - Tuesday, May 18, 2021 - link

    Well, you could copy full blast to an external drive and have plenty of remaining performance to do other storage intensive things - that's assuming your external drive is fast enough to saturate PCIe 3.0 x4, and your internal drive is faster still.
  • mode_13h - Thursday, May 20, 2021 - link

    > Well, you could copy full blast to an external drive and have plenty of remaining performance

    I'm not one to turn down "free" performance, but PCIe 4 uses significantly more power. In a laptop, that's not a minor point.
  • inighthawki - Monday, May 17, 2021 - link

    Sequential read and write speeds are basically just flexing. Very few people actually ever make significant use of such speeds in a way that saves more than a second or two here or there. Most laptop users are not sitting there copying a terabyte of sequential data over and over again.
  • The_Assimilator - Monday, May 17, 2021 - link

    There is no laptop chassis on the market that can adequately handle the excess of 8W of heat that a PCIe 4.0 NVMe SSD can dissipate.
  • Cooe - Monday, May 17, 2021 - link

    You're not getting those kinds of speeds sustained in a laptop without RIDICULOUS thermal throttling. PCIe 4.0 in mobile atm is just a marketing checkmark & nothing more.
  • Calin - Tuesday, May 18, 2021 - link

    It allows faster "races to sleep" for the processor. And, since the Core2 architecture, the winning move was "fast and power hungry processor that does what it must and then goes to a very low power state". This gives you very good burst speed and low average power - as soon as you finish, you can throttle everything down (CPU, caches, SSDs, ...)
  • mode_13h - Thursday, May 20, 2021 - link

    > It allows faster "races to sleep" for the processor.

    Are we still talking about PCIe 4? I don't think it works like that.

    > since the Core2 architecture, the winning move was "fast and power hungry processor that does what it must and then goes to a very low power state".

    No, it's more energy-efficient to run at a slower clock speed. There's a huge difference between the amount of energy used in turbo and non-turbo modes. As it's far bigger than the performance difference, there's no way that going to idle a little sooner is going to make up for it.
