ED#101 : NVIDIA Refutes Claims Of Bad GPUs
Charlie Demerjian of the Inquirer reported last month that NVIDIA G84 and G86 GPUs were failing left and right. Most affected were the mobile versions of these GPUs because of the on-off usage pattern of notebook users, as well as the constant thermal throttling (for power management). The more frequent hot-cold heat cycling resulted in many of these GPUs dying far, far earlier than they were supposed to.
NVIDIA flatly denied it, saying that only a small batch of parts delivered to HP was affected by the problem. Of course, now other notebook manufacturers like Dell and ASUS are also affected. In fact, practically all notebooks using these GPUs are affected by the problem.
Their solution, though, was not to replace the GPUs. Rather, they chose a simpler and certainly cheaper "solution". To avoid a costly recall, the affected notebook manufacturers sent out BIOS updates that set the internal fan to run at a higher speed and/or all the time, instead of only kicking in at a certain temperature threshold.
However, this does not actually solve the problem because we are not talking about overheating GPUs here. The problem really is about the thermal cycling that the GPUs go through. To reduce the change in temperature, it is very likely that the BIOS update will also turn off or reduce the GPUs' power management features so that the GPUs will run at a more stable, albeit higher, temperature.
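To make the distinction between running hot and thermal cycling concrete, here is a minimal Python sketch. The temperature readings are invented purely for illustration (they are not measurements from any real notebook), and the `summarize` helper is a hypothetical name. It simply compares the average temperature and the hot-cold swing of two traces: one with aggressive power management, and one where power management has been reduced.

```python
from statistics import mean


def summarize(trace_c):
    """Return (mean temperature, peak-to-peak swing) for readings in degrees C."""
    return mean(trace_c), max(trace_c) - min(trace_c)


# Hypothetical readings, invented for illustration only.
# Power management on: the GPU keeps bouncing between cool and hot.
throttled = [45, 85, 50, 82, 46, 84, 48, 83]
# Power management reduced: hotter on average, but far more stable.
steady = [72, 76, 74, 77, 73, 76, 75, 77]

for name, trace in (("throttled", throttled), ("steady", steady)):
    avg, swing = summarize(trace)
    print(f"{name}: mean {avg:.1f} C, swing {swing} C")
```

With these made-up numbers, the "steady" trace runs hotter on average but swings far less, which is the trade-off described above: a higher but more stable temperature means smaller thermal cycles.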
Even so, such measures can only buy these companies time. Perhaps that's all they really care about. As long as the GPU lasts until your warranty expires, they are in the clear. Sad to say, it's the consumer who gets it in the ass... AGAIN.
The G92 & G94 Too?
Unfortunately, the G84 and G86 are not the only parts affected. According to the Inquirer, G92 and G94 parts are failing as well. It looks like NVIDIA "cut too many corners" when they developed these GPUs, resulting in GPUs that fail far earlier than they should. If true, it is really very bad news, as the G92 and G94 are very popular parts, powering the GeForce 9800 GTX, the GeForce 9600 GT, the GeForce 8800 GTS 512MB and the GeForce 8800 GT.
Thanks to some spin-doctoring of the issue, some people think that it's related to the manufacturing process. Just today, a friend asked me if the G98 part used a different fab process from the G84 and G86 GPUs. Apparently, he was of the opinion that NVIDIA GPUs using a different fabrication process would not be affected by the problem. Unfortunately, that is completely untrue.
As much as NVIDIA would like to lay the blame on TSMC (the guys who actually fabricated the GPUs for them), the fabrication process is NOT the cause of the problem. After all, TSMC makes a lot of other chips for other companies using the same process. You do not see their other clients facing the same problem. Obviously, the problem lies closer to home.