Investigating the GTX 970: Does Nvidia’s GPU have a memory problem?
Late last week, we covered claims that the GTX 970 had a major memory flaw that didn’t affect Nvidia’s top-end GPUs, like the GTX 980. According to memory bandwidth tests, the GTX 970’s performance drops above 3.2GB of memory use and craters above 3.5GB. Meanwhile, many users have published claims that the GTX 970 fights to keep RAM usage at or slightly below 3.5GB of total VRAM whereas the GTX 980 will fill the entire 4GB framebuffer.
There are three separate questions in play here, though they’ve often been conflated in the back-and-forth in various forum threads. First, does the small memory bandwidth benchmark by Nai actually test anything, or is it simply badly coded?
Second, does the GTX 970 actually hold memory use to the 3.5GB limit, and if it does, is this the result of a hardware bug or other flaw? Third, does this 3.5GB limit (if it exists) result in unexpected performance degradation relative to the GTX 980?
Memory bandwidth and allocation on the GTX 970 vs. the GTX 980
The GTX 970, like a number of other GPUs from Nvidia (and, historically, a few from AMD) uses an asymmetric memory layout. What this means, in practice, is that the GPU has a faster access path to some of its main memory than others. We reached out to Bryan Del Rizzo at Nvidia, who described the configuration as follows:
“[T]he 970 has a different configuration of SMs than the 980, and fewer crossbar resources to the memory system. To optimally manage memory traffic in this configuration, we segment graphics memory into a 3.5GB section and a 0.5GB section. The GPU has higher priority access to the 3.5GB section. When a game needs less than 3.5GB of video memory per draw command then it will only access the first partition, and 3rd party applications that measure memory usage will report 3.5GB of memory in use on GTX 970, but may report more for GTX 980 if there is more memory used by other commands. When a game requires more than 3.5GB of memory then we use both segments.”
In other words, the answer to the first question of “Does this memory benchmark test something accurately?” is that yes, it does. But does this limit actually impact game performance? Nvidia says that the difference in real-world applications is minimal, even at 4K with maximum details turned on.
Nvidia’s response also confirms that gamers who saw a gap between the 3.5GB of utilization on the GTX 970 and the 4GB on the GTX 980 were seeing a real difference. We can confirm that this gap indeed exists. It’s not an illusion or a configuration problem — the GTX 970 is designed to split its memory buffer in a way that minimizes the performance impact of using an asymmetric design.
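To make the segmentation concrete, here is a toy model of the policy Nvidia describes: requests that fit in the 3.5GB section stay there, and only larger requests spill into the 0.5GB section. It is a sketch under stated assumptions, not driver code. The function name `place_allocation`, the dictionary keys, and the `overflow_gb` field (what would not fit in 4GB at all, which Nvidia's statement does not address) are all hypothetical names chosen for illustration.

```python
# Toy model of the GTX 970's segmented memory as described in Nvidia's statement.
# This is an illustration only; the real driver's placement logic is not public here.

FAST_SEGMENT_GB = 3.5  # higher-priority section, per Nvidia's statement
SLOW_SEGMENT_GB = 0.5  # lower-priority section, per Nvidia's statement


def place_allocation(required_gb):
    """Split a memory requirement across the two segments (hypothetical helper)."""
    if required_gb < 0:
        raise ValueError("required_gb must be non-negative")
    total_gb = FAST_SEGMENT_GB + SLOW_SEGMENT_GB
    # Only what fits in the 4GB of VRAM can be resident on the card.
    resident_gb = min(required_gb, total_gb)
    # Fill the fast segment first; the slow segment is used only for the remainder.
    fast_gb = min(resident_gb, FAST_SEGMENT_GB)
    slow_gb = resident_gb - fast_gb
    # Assumption: anything beyond 4GB is simply reported as overflow.
    overflow_gb = required_gb - resident_gb
    return {"fast_gb": fast_gb, "slow_gb": slow_gb, "overflow_gb": overflow_gb}


if __name__ == "__main__":
    for need in (3.0, 3.5, 3.8, 4.0):
        print(need, place_allocation(need))
```

In this model a 3.0GB workload never touches the slow section, which matches the observation that monitoring tools report usage at or below 3.5GB on the GTX 970 until a game genuinely needs more.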
We went looking for a problem with the GTX 970 vs. the 980 in two ways. First, we reconsidered our own data sets from the GTX 970 review, as well as reviews published on other sites. Even in 4K, and with all detail levels cranked, our original review shows no significant problems. The GTX 970 may take a slightly larger hit in certain circumstances (Nvidia’s information suggests that the impact can be on the order of around 4%), but we don’t see a larger problem in terms of frame rates.
The next step was to benchmark a few additional titles. We tested the MSI program Kombustor and its RAM burner program, as well as the games Dragon Age: Inquisition and Shadow of Mordor. Both games were tested in 1080p and 4K at maximum detail, with every feature and setting maxed out.