Tuesday, October 27, 2015

IT News Head Lines (AnandTech) 28/10/2015


ARM Announces New CCI-550 and DMC-500 System IPs
Today ARM announces two new additions to its CoreLink system IP design portfolio, the CCI-550 interconnect and DMC-500 memory controller. Starting off with the CCI announcement, we find the third iteration of the Cache Coherent Interconnect. The CCI is the cornerstone of ARM’s big.LITTLE strategy as it provides the required cache-coherent system interconnect between CPU clusters and other SoC blocks such as the main memory controllers and thus enabling heterogeneous multiprocessing between all the IP blocks.

The CCI-550 is an improvement to the CCI-500 which ARM announced back in February among other IPs such as the new Cortex A72 core design. Both the CCI-500 and the new CCI-550 are generational successors to the CCI-400 that is found in all currently released big.LITTLE SoCs such as Samsung’s Exynos, MediaTek’s Helio or Qualcomm’s Snapdragon designs. Back in February I was pretty excited to see ARM improve this part of their IP portfolio as it seemed that there was a lot of optimization that could be done in terms of performance and power.

As a reminder, the primary characteristics of the new CCI-5X0 designs is the addition of a snoop filter within the interconnect that is able to maintain a directory of all cache contents among its coherent agents. On previous IP such as the CCI-400, all coherency messages needed to be broadcasted among all agents, causing them to have to wake up and respond. This not only impacted performance due to the increased latency but also had a power impact caused by the processing overhead. For the new CCI family, ARM explains that in heavy use-cases the new snoop filter can save up to “100’s” of milliwatts of power which is a quite significant figure.

Due the broadcast nature of how the CCI-400 was operated, it meant that adding another coherent agent would have incurred a quadratical increase in the amount of messages such as snoop lookups. The CCI-500 on the other hand is able to take advantage of the new filter to increase the number of ACE (AXI Coherency Extension) master ports from 2 to 4 without increased overhead. This for example enabled the implementation of up to 4 CPU clusters if a vendor wished to do so. The new CCI-550 again improves this configuration option by raising the maximum number of ACE master ports to up to 6.

In the example SoC layout diagram that ARM provides, we see the CCI-550 configured with two CPU clusters such as the Cortex A53 and a Cortex A72. The remaining four ACE master ports could be then dedicated to a fully coherent GPU.

ARM explains that its still to-be-announced next-generation Mali IP codenamed "Mimir" will be fully cache-coherent and would be a perfect fit to take advantage of such a configuration (Current generation Midgard-based GPUs such as the T6-/7-/800 series are only I/O coherent). Fully coherent GPUs will be able to take advantage of shared virtual memory and new simplified programmers models provided by APIs such as OpenCL 2.0 and HSA.

While the amount of ACE master ports increases from 4 to 6, the amount of possible memory interfaces has also gone up from a maximum of 4 to up to 6. This allows an increase of up to 60% in the total peak interconnect bandwidth (total aggregate bandwidth). This improvement not only comes from the two additional memory interfaces, but also an additional increase which can be credited to micro-architectural improvements on the interconnect itself. For example, we're told the CCI-550 is able to reduce CPU-to-memory latency by 20% when compared to the CCI-500.

ARM explains that its CCI IP is highly customizable and thus each vendor can configure it to their needs. The IP will be able to scale in terms of physical implementation based on the number of desired interfaces and ports.

As an IP vendor, ARM is aiming to provide highly optimized integrated solutions, and memory controllers are consequently part of such designs. ARM previously offered the DMC-520 with DDR4 support but this memory controller was aimed at more complex enterprise designs employing AMBA 5 system IP such as ARM’s CCN (Cache Coherent Network). The DMC-500 announced today on the other hand is ARM’s first mobile-targeted memory controller with support for the new LPDDR4 memory standard. Aimed for AMBA 4 system IPs such as the CCI family, this is the memory controller IP we’ll most likely see adopted by vendors in consumer devices such as smartphones.

The DMC-500 promises support for LPDDR4 up to 2133MHz while still maintaining LPDDR3 compatibility. This is an important differentiation factor as in doing so ARM is able to offer maximum flexibility in terms of choice of implementation for vendors. Performance wise, ARM promises up to 27% increase in memory bandwidth utilization in a low power design.

All in all today’s announcements provide some solid improvements in ARM’s IP portfolio. On the memory controller side I’m not certain what the rate of adoption ARM’s DMC’s is; as far as I know the main "heavyweight" SoC vendors currently chose to employ their own memory controller IP. Those who don’t have their own IP and instead use ARM's designs are often hard to single out as many times the choice of memory controller is completely invisible to the system.

On the interconnect side I predict that we’ll be seeing a lot more discussions and developments from third-party vendors. Even among today’s higher-profile big.LITTLE SoCs I’m only aware of LG’s Odin to use ARM’s CCI as a “center-piece” in their SoC fabric while other vendors chose to implement it alongside their own interconnect fabric. Vendors who have the resources and design talent may also chose to implement cache coherency into their own interconnect IP. They would thus be able deploy big.LITTLE systems or other similar fully coherent SoCs without ARM’s CCI IP. For example, MediaTek is among the first to do exactly this in the Helio X20 with help of the in-house designed MCSI. Next year we should be seeing new big.LITTLE SoCs equipped with both ARM’s IP such as the CCI-500 or 550 alongside third-party IP, creating a new differentiation point for SoC vendors that will undoubtedly make competitive landscape much more interesting.

Read More ...

GeForce + Radeon: Previewing DirectX 12 Multi-Adapter with Ashes of the Singularity
Today we'll be taking an initial look into the unorthodox multi-GPU capabilities of DirectX 12. Developer Oxide Games, who is responsible for the popular Star Swarm demo we looked at earlier this year, has taken the underlying Nitrous engine and are ramping up for the 2016 release of the first retail game using the engine, Ashes of the Singularity. As part of their ongoing efforts to Nitrous as a testbed for DirectX 12 technologies and in conjunction with last week’s Steam Early Access release of the game, Oxide has sent over a very special build of Ashes.

Read More ...

Lenovo Whoa: Motorola Droid MAXX 2 and Turbo 2 Break Cover in Leaks
Droid MAXX 2 will target the mid-range while Turbo 2 will be the flagship variant

Read More ...

Leak: Apple Preps for First Real Android App Foray With New Apple Music App
App is expected to drop within the next month, appears to support all the same features that iOS users enjoy

Read More ...

Available Tags:GeForce , Lenovo , Motorola , Apple , Android , Apple

No comments: