Qualcomm Shows Off APQ8064: Quad-core Krait and Adreno 320
by Jason Inofuentes on July 16, 2012 11:11 PM EST
Qualcomm's chips have been finding their way into more and more handsets, as they remain alone with a Cortex-A15 class core in the fight. But the dual-core 28nm, 1.5 GHz MSM8960 isn't the only horse in their stables. So at this year's Uplinq, Qualcomm's developer conference, Raj Talluri took the stage to introduce us to their latest addition, the quad-core APQ8064. Raj Talluri joined Qualcomm just 3 years ago, and is responsible for the teams that develop the Snapdragon family of chipsets. Prior to joining Qualcomm, Raj was with Texas Instruments and was closely involved with the development of the OMAP 3 and OMAP 4 chipsets that were in some of the most popular handsets of the last two years. At Uplinq, Raj had one key role: evangelize the future.
Raj Talluri - Image Courtesy of Qualcomm
During the keynote, Raj got to demo an APQ8064-powered development tablet. To review, APQ products forgo basebands altogether, while MSM products include basebands alongside application processors and other necessary components. The APQ8064 slots into the S4 Pro family of Qualcomm’s marketing and features four Krait cores running at up to 1.5 GHz when all four cores are taxed, or up to 1.7 GHz when only one core is active. Adreno 320 makes its premiere with this SoC, later to be followed by an S4 Pro version of the MSM8960 featuring the same GPU.
In the demo, Raj is showing off a fortress scene from Qualcomm’s in-house game development group. You can see in the video how the demo features cloth rendering, water, fire, multiple light sources and particle effects. The demo does more than just show off the GPU; it serves to demonstrate how developing a game that leverages all four cores can yield visible improvements in both image quality and performance. We haven’t had as much time to explore the architecture of the Adreno 3xx family as we did with Adreno 2xx, but there’s an expectation that while it will have vast improvements over its predecessor, it may have trouble keeping up with NVIDIA and Imagination’s wares next year. One indicator is the DirectX feature level that has so far been revealed. Imagination’s Rogue GPU will premiere at 10_0 but be offered as high as 11_1, bringing them to feature parity with desktop GPUs currently on the market. Adreno 3xx is so far only being announced to run at 9_3, leaving it in the Geforce 200 space, and at feature parity with NVIDIA’s Kal-El Geforce GPU. If serious performance gains are realized with the Adreno 3xx, then it may be able to fend off competition from Kal-El/+, but any slips and NVIDIA may have a GPU advantage that could be difficult to overcome. With no other major updates expected in the mobile SoC space until 2013, Qualcomm will have a timing lead over the competition, but we won’t know whether it will be enough until we see shipping products of each.
There’s more happening in this space, though, than just GPU advances. Qualcomm’s recent advantage has in part been due to integration of radios and other components onto the SoC. There's still additional advantage to pursue in integrating components, though. Raj was non-committal on what we might see integrated into a potential S5 in the future. We can glean some insight from the S4 Prime. That line, which includes the MPQ8064, integrates some A/V components to accelerate video decoding and improve picture quality. Chips in the S4 Prime line are designed for set-top boxes and smart TVs; think ‘prime time.’ The IP used for the A/V components in S4 Prime will be third-party; however, it’s always possible that Qualcomm will put their capable DSP and silicon engineers to work on their own proprietary A/V blocks. Moving to integrate more video components could be a useful gamble, as more and more our mobile devices also serve as our video sources. Another wise gamble would be to integrate wireless display technology, and even digital TV tuners, though no one would go further on what might be integrated next.
Aggressive integration paired with an aggressive transition to 28nm has led to a product that is competitive in performance with other current offerings, while also offering the fastest air interfaces available, in a very power efficient package. There have been difficulties with the transition to 28nm, though. During a Q&A after the keynote, Dr. Jacobs fielded questions about a possible yield issue that has affected supplies of Snapdragon S4 products. He insisted that their yields were adequate, but that TSMC simply did not have enough 28nm capacity to meet demand. He also revealed that Qualcomm had invested money in another fab partner to bring them on-line at 28nm, but products from that fab wouldn’t be rolling out till Q4 at the earliest. Recent reports have suggested that the partner mentioned is UMC, a Taiwanese foundry that has actually been around for quite some time. Not as well known as TSMC or GlobalFoundries, UMC is expecting to reach 700,000+ wafers per quarter of 28nm products within the next few years. That's far short of TSMC’s estimates of over 1 million wafers by the end of this year. But with a little investment UMC’s timeline could be adjusted.
Another rumor that has cropped up is that Samsung has joined in as a partner. Silicon vendors aren’t ever too forthcoming with the current state of products in development, less so when there’s an issue. Samsung’s next generation Exynos processors have been slow in trickling out since the Exynos 4210 took the world by storm last year. The quad-core variant, the 4412, is shipping internationally in the Galaxy S III, and is by all accounts a scorcher. Being Cortex-A9 based, it retains the IPC deficit against Qualcomm’s Krait cores. With no clear timeline of when they might have Cortex-A15 products to produce, it’s possible that Samsung Semiconductor could have some spare fab capacity. As Dr. Jacobs touched on when asked whether Qualcomm would ever consider becoming a fabricator, the hardest part about having fab capacity is making sure as much of it as possible is in use at any given time. If Samsung Semiconductor found itself with spare capacity, an eager customer (Qualcomm), and a handset manufacturer (Samsung Mobile) interested in using a 28nm product with limited availability, it could be reasoned that they’d partner with Qualcomm to deliver SoCs for the US Galaxy S IIIs. Regardless, for all this talk of limited availability of S4s, you can’t swing a stick in the mobile landscape right now without hitting a Snapdragon S4 based handset.
We'll be taking a look at the broader SoC space in the next few weeks, so stay tuned. The short sheet is this, though: Cortex-A15 class or not, the competition isn't waiting to fight back.
33 Comments
twotwotwo - Tuesday, July 17, 2012 - link
I'm surprised by benchmarks that favor the international (Exynos-based) Samsung SGS3s over the US (S4-based) ones. Some numbers pulled from Anandtech tests (international first, then US):
SunSpider: 1424 vs. 1751 (Exynos wins)
BrowserMark: 161710 vs. 114812 (Exynos wins)
Vellamo: 2072 vs. 2290 (S4 wins--note this is Qualcomm's benchmark)
GLBenchmark offscreen 720p: 103 fps vs. 54 (Exynos wins)
The Exynos has a better GPU (Mali-something), and will win on any highly multithreaded benchmarks, but I'm confused at how it gets that better SunSpider score, since I thought SunSpider was CPU-only and single-threaded. Does the A9 design inherently beat Krait at SunSpider, and if so does that mean anything for day-to-day performance? Is SunSpider more multithreaded than it seems? Does the Exynos' turbo-boost-type dynamic clocking favor it in benchmarks (with battery life suffering)? Are the two flavors of SGS3 not actually fair comparison platforms (different software or clocks or something?)? You guys know if anyone does. :)
For the record, here's an Engadget page about this and AnandTech sources for numbers:
http://www.engadget.com/2012/06/25/samsung-pegs-lt...
http://www.anandtech.com/show/5810/samsung-galaxy-...
http://www.anandtech.com/show/6022/samsung-galaxy-...
JasonInofuentes - Tuesday, July 17, 2012 - link
You'll get more of this in our State of the SoC discussion but I'll drop a preview here.
Let's look at the Windows PC model real quick. You can go to the store, pick out your mobo, your CPU, your GPU and your memory and storage. Grab a wireless card and some HIDs, and a nice monitor and now you've got a lot of stuff that does nothing. So, you go grab your OS of choice and load it up, but let's say you don't load any device specific drivers. It'll work, it'll boot. But the display driver won't go over 1024x768 and is really sluggish. The WiFi doesn't work, at all. And your USB3 ports are running at USB2. Your CPU is working fine, except maybe it doesn't use the media acceleration components. So, you go find drivers.
That's pretty much the same thing that happens in the SoC space. When an OEM orders chips from Qualcomm, or any of the chip makers, they get Android builds that have been optimized to leverage all the silicon components on that SoC. It's the showcase build, and it's what is found in Qualcomm's Mobile Development Platform. Those builds are for the OEM to look at, and to select parts of and implement in their own software. But, they don't have to, and often they don't want to because they've already put in tons of work building their own version of Android and it would take more time and resources to integrate the new code when it probably works well enough.
In this case, Samsung's Android has always included its own version of the Android browser that uses SOME GPU acceleration, and a few other silicon specific optimizations. When they use S4, they aren't likely to tear down and rebuild their software to leverage S4's strengths, knowing most likely that to the average user the difference will be invisible. And so, Sunspider on optimized software vs. unoptimized, we know who will win.
Now, that said, the best results from Qualcomm's Android build still don't beat Exynos in Sunspider (1532 vs. 1424). There are other optimizations that can be done that do things like recognize opportunities for parallelism in a program and implement them; that's a possibility. It's also possible that some of the Sunspider tests do lend themselves to multiple cores (the crypto ones come to mind). Regardless, you have to look at the broader case to make your choices, not SoC alone.
We love Krait's performance and power characteristics, but we hate it that they're alone. Competition is good. Even where physics poses limitations on how fast one can innovate, competition ensures that everyone's on their toes. I think it's a shame that the carriers were more interested in featuring their network speeds than offering the customer a chance to own a top notch quad-core phone. Hopefully that won't continue when Cortex-A15 designs start to make their way onto the market.
tipoo - Tuesday, July 17, 2012 - link
Just wanted to say I've been wanting an up to date SoC article, looking forward to that.
Also to the OP my impression was that Samsung does a lot of their own optimization for the browser/SoC combo which leads to higher scores than you would expect.
twotwotwo - Wednesday, July 18, 2012 - link
Thanks--very helpful. And agree about competition--the folk wisdom seems to be nobody but Qualcomm has a next-gen CPU + LTE baseband with low enough wattage, which is frustrating when there are other great CPU/GPU designs out there. Interested to read the update.
dagamer34 - Tuesday, July 24, 2012 - link
We also know from Microsoft's 1200 SunSpider score from its Windows Phone 8 Dev Summit that there's still a bit more performance to be extracted from these SoCs. Heck, the iPhone 4S is competitive with most Android phones and it has a Cortex A9 CPU clocked at ~800 MHz. So it really is all about optimizations.
ltcommanderdata - Tuesday, July 17, 2012 - link
"Adreno 3xx is so far only being announced to run at 9_3, leaving it in the Geforce 200 space, and at feature parity with NVIDIA’s Kal-El Geforce GPU."
With the GeForce 200 series being DX Level 10_0 parts, the Adreno 3xx would actually be comparable to the ATI X700/X800. I believe DX Level 9_3 is defined against SM2.0b, so feature-wise the Adreno 3xx should be below DX9.0c/SM3.0 GPUs like the ATI X1000 and nVidia 6000/7000 series.
"If Samsung Semiconductor found itself with spare capacity, an eager customer (Qualcomm), and a handset manufacturer (Samsung Mobile) interested in using a 28nm product with limited availability, it could be reasoned that they’d partner with Qualcomm to deliver SoCs for the US Galaxy S IIIs"
Recent Samsung fab coverage has been about their 32nm process. Are there any details on their 28nm process? With the 32nm process still ramping, was the 28nm process developed in parallel or is its ramp a few months behind? How does Samsung's 28nm process compare to TSMC's and UMC's?
rd_nest - Tuesday, July 17, 2012 - link
Some marketing info here:
http://www.samsung.com/us/business/oem-solutions/m...
http://www.samsung.com/us/business/semiconductor/n...
ltcommanderdata - Tuesday, July 17, 2012 - link
Thanks for the links. I wonder how Samsung's 28nm process compares to their 32nm process since all their comparisons are relative to the 45nm process? Half nodes have historically been focused on cost-optimization so the 28nm process's major benefit is probably just making smaller dies rather than major power consumption improvements relative to the 32nm process.
esteinbr - Tuesday, July 17, 2012 - link
The DirectX level really doesn't have any correlation to performance of a part. It actually tells you the functional capabilities. Generally, higher DirectX levels do correspond to newer cards, which are naturally faster, but it doesn't have to be that way. A bottom of the line budget DirectX 10 card isn't necessarily going to be faster than a top of the line DirectX 9.3 card from the previous generation.
ltcommanderdata - Tuesday, July 17, 2012 - link
I never made any association to performance at all. I'm merely saying that a feature level 9_3 GPU like the Adreno 3xx is not at feature parity with the GeForce 200 series; rather, its feature set is in line with SM2.0b GPUs like the X700/X800.