AMD Southern Islands GPU Features
It’s always exciting to see AMD or NVIDIA come out with a completely new GPU architecture. The “Southern Islands” GPUs are AMD’s implementation of its “Graphics Core Next” architecture, and comprise three different families of GPUs:
- Tahiti for high-end cards like the Radeon HD 7970 and 7950
- Pitcairn for more mainstream users
- Cape Verde for the low end
Right now, the only Southern Islands cards available are the Tahiti-based Radeon HD 7970 and 7950, but cards based on the Pitcairn and Cape Verde variants will appear in the coming months.
Graphics Core Next
AMD had several goals in mind for Graphics Core Next, and one of the main ones was to catch up with NVIDIA in the “GPU compute” arena. Right now, NVIDIA’s “CUDA” (Compute Unified Device Architecture) dominates GPU computing, with a robust set of developer tools and years of track record behind it. The “DirectCompute” alternative that AMD backs (Microsoft’s GPU compute API, part of DirectX 11) has been around almost as long but has failed to catch on with developers to the degree that CUDA has. AMD is making a real push for DirectCompute with these new GPUs, and claims that over 200 applications already benefit from DirectCompute technology.
For Southern Islands, AMD has grouped simple ALUs (arithmetic logic units) into SIMD (Single Instruction Multiple Data) units. A number of SIMD units, along with instruction decoders and schedulers, branch units, vector processors, and other items, make up a compute unit, and a number of these compute units (along with memory controllers and whatnot) make up a Southern Islands GPU chip. Each compute unit contains 64 shaders; the 7970 has 32 compute units (and thus 2,048 shaders), while the Radeon 7950 gets by with only 28 (and 1,792 shaders). That’s a decrease of only 12.5%, which doesn’t seem like much, but the standard 7950 clock speed is also lower: 800MHz as compared to the 7970’s 925MHz.
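The arithmetic above is easy to check, and it also shows how the smaller shader count and the lower clock compound. The sketch below is a rough estimate only: the `peak_gflops` helper and its assumption of one fused multiply-add (two floating-point operations) per shader per clock are ours, not something stated in AMD's material quoted here.

```python
# Figures quoted in the article: compute units and reference clock (MHz).
CARDS = {
    "Radeon HD 7970": {"compute_units": 32, "clock_mhz": 925},
    "Radeon HD 7950": {"compute_units": 28, "clock_mhz": 800},
}
SHADERS_PER_CU = 64  # from the article


def shader_count(compute_units):
    """Total shaders for a given number of compute units."""
    return compute_units * SHADERS_PER_CU


def peak_gflops(compute_units, clock_mhz, flops_per_shader_per_clock=2):
    """Theoretical single-precision peak in GFLOPS.

    Assumption (not from the article): each shader retires one fused
    multiply-add, counted as 2 FLOPs, every clock.
    """
    return shader_count(compute_units) * flops_per_shader_per_clock * clock_mhz / 1000.0


def percent_drop(high, low):
    """Percentage decrease going from `high` to `low`."""
    return (high - low) / high * 100.0


hd7970 = CARDS["Radeon HD 7970"]
hd7950 = CARDS["Radeon HD 7950"]

shader_drop = percent_drop(shader_count(hd7970["compute_units"]),
                           shader_count(hd7950["compute_units"]))
clock_drop = percent_drop(hd7970["clock_mhz"], hd7950["clock_mhz"])
peak_drop = percent_drop(peak_gflops(**hd7970), peak_gflops(**hd7950))

print(f"Shader count drop:     {shader_drop:.1f}%")  # 12.5%
print(f"Clock speed drop:      {clock_drop:.1f}%")   # 13.5%
print(f"Theoretical peak drop: {peak_drop:.1f}%")    # 24.3%
```

Under that assumption the two modest-looking reductions add up to roughly a quarter less theoretical throughput, although real game performance rarely tracks peak numbers this closely.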
AMD has moved away from the VLIW (very long instruction word) design of its earlier GPUs to provide more consistent performance. Previous generations of AMD GPUs often left many stream processors idle, because dependencies in the code and data being worked on meant that not all of the ALUs could be used at once. The Southern Islands architecture provides a greater degree of parallelism (it’s that SIMD stuff, really, being used effectively) and can keep most compute units working all the time, leading to more consistent (and higher) performance. This has obvious advantages in both graphics processing and general GPU-compute operations.
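A toy model can make the dependency problem concrete. The sketch below is not a simulation of any real AMD hardware: the 4-slot bundle width, the 16-lane SIMD width, and both function names are illustrative assumptions. It packs a stream of operations into VLIW-style bundles, where an operation cannot share a bundle with something it depends on, and compares that with a SIMD arrangement where each lane simply runs one work-item's operations in order.

```python
def vliw_slot_utilization(deps, width=4):
    """Fraction of VLIW slots filled by greedy in-order packing.

    deps[i] is the set of earlier operation indices that operation i
    depends on. An operation cannot share a bundle with any of its
    dependencies, so it forces a new bundle when one is in the current one.
    """
    if not deps:
        return 0.0
    bundle_of = []          # bundle index assigned to each operation
    current, used = 0, 0    # current bundle index and slots used in it
    for op_deps in deps:
        blocked = any(bundle_of[d] == current for d in op_deps)
        if used == width or blocked:
            current += 1
            used = 0
        bundle_of.append(current)
        used += 1
    bundles = current + 1
    return len(deps) / (bundles * width)


def simd_lane_utilization(work_items, lanes=16):
    """Fraction of SIMD lanes kept busy when each lane handles one work-item.

    Dependencies inside a work-item are resolved over time within its own
    lane, so they do not leave neighbouring lanes idle.
    """
    passes = -(-work_items // lanes)  # ceiling division
    return work_items / (passes * lanes)


# Eight operations where each one depends on the previous result.
dependent_chain = [set()] + [{i} for i in range(7)]
# Eight operations with no dependencies at all.
independent_ops = [set() for _ in range(8)]

print(vliw_slot_utilization(dependent_chain))  # 0.25
print(vliw_slot_utilization(independent_ops))  # 1.0
print(simd_lane_utilization(64))               # 1.0
```

In this model the VLIW packer only fills every slot when the code happens to contain independent operations, while the per-lane arrangement stays full as long as there are enough work-items to go around.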
Other enhancements include:
Partially Resident Textures: As games increasingly use very large textures, loading and manipulating the texture data takes more time. A Southern Islands GPU can load only the part of the texture that will actually be visible in a frame, reducing the memory bandwidth and workload.
Error-correcting code support: There’s not much detail on this feature yet, but it looks as if AMD will be able to offer optional ECC support (important for industrial applications) without having to use ECC memory. This will detect and correct memory errors, although AMD’s tech white paper doesn’t go into specifics such as how many bits can be detected/corrected.
PowerTune and ZeroPower: These features dynamically clock the card’s GPU and memory down (PowerTune) when high performance isn’t needed, and can shut off entire sections of the GPU (ZeroPower) when the card is idle. For example, the second card in a CrossFireX system can be idled down to less than 5 watts if you’re just browsing the Windows desktop; a single-card system will power down if your display goes to sleep. Combined with the inherent efficiency of the 28nm fabrication process, this results in significant power savings. Side benefits you’ll notice include less heat and noise emanating from your rig, especially when you’re not gaming.
Eyefinity 2.0: New support for 5×1 monitor layouts, improved bezel correction, and support for custom resolutions enhance AMD’s existing Eyefinity feature. I saw a 5×1 system demonstrated at an AMD press event a few months ago and it was quite impressive.
28nm fabrication process: If you make ’em smaller, you can fit more of ’em in. The 7970 GPU has a staggering 4.3 billion transistors. The original Intel 4004 microprocessor had about 2,300.
PCI Express 3.0 support: This has roughly twice the bandwidth of PCI-E 2.0, but I’m not sure what real-world effect this will have, especially on x16 slots. Even the beefiest current video cards aren’t hobbled by x8 PCI-E 2.0 bandwidth.
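To put rough numbers on the PCI Express comparison, the sketch below computes theoretical one-direction link bandwidth from the published per-lane transfer rates and line encodings (5 GT/s with 8b/10b for PCI-E 2.0, 8 GT/s with 128b/130b for PCI-E 3.0). The function name is ours, and the figures ignore packet and protocol overhead, so real transfers will come in lower.

```python
# Per-lane signalling rate and line encoding for each generation.
PCIE_GENERATIONS = {
    "2.0": {"gt_per_s": 5.0, "payload_bits": 8, "encoded_bits": 10},
    "3.0": {"gt_per_s": 8.0, "payload_bits": 128, "encoded_bits": 130},
}


def bandwidth_gb_per_s(generation, lanes):
    """Theoretical bandwidth in GB/s, one direction, before protocol overhead."""
    gen = PCIE_GENERATIONS[generation]
    efficiency = gen["payload_bits"] / gen["encoded_bits"]
    # GT/s times encoding efficiency gives usable Gbit/s per lane;
    # divide by 8 to convert bits to bytes, then scale by lane count.
    return gen["gt_per_s"] * efficiency / 8.0 * lanes


for generation, lanes in [("2.0", 8), ("2.0", 16), ("3.0", 16)]:
    gbps = bandwidth_gb_per_s(generation, lanes)
    print(f"PCI-E {generation} x{lanes}: {gbps:.2f} GB/s")
```

By this estimate an x8 PCI-E 2.0 link offers about 4 GB/s and an x16 PCI-E 3.0 link a little under 16 GB/s, which is why a card that is not limited by the former has a lot of headroom on the latter.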
R7950 Black Edition Features
- GPU Edging: Exclusive XFX Black Edition selection process to identify the top 1% of GPUs capable of reaching maximum overclock speeds
- Double Dissipation
- Ghost Thermal Technology: Floating cover design maximizes airflow
- HydroCell Thermal Solution: HydroCell vapor chamber technology
- Duratec: Solid Capacitors
- Duratec: Ferrite Core
- Duratec: 2oz Copper PCB
- Duratec: IP-5X Dust Free Fan
- Duratec: XFX Bracket
In the next section, we detail our test methodology and give specifications for all of the benchmarks and equipment used in our testing process…