The Age Of Nvidia
Well, it appears that the GPU era of computing is finally here! Intel is in deep trouble. For those of you who haven’t read my blog extensively over the years, I started the original DirectX team at Microsoft way back in 1994, created the Direct3D API with the other early DirectX co-creators (Craig Eisler and Eric Engstrom), and promoted its adoption to the video game industry and to graphics chip makers. There are lots of stories about that here on this blog, but the one that is relevant to this post is this blog article I wrote back in 2013.
“I think that Nvidia’s vision for the future of gaming is the right one and I’m very excited to be alive in an era when I can work with such amazing computing power. I feel like I lived to see the era when I could walk on the bridge of the Enterprise and play with warp cores… literally… because that’s what Nvidia calls the smallest unit of parallel threads you can run on a GPU.”
For those of you who follow the stock market, you may have noticed that Nvidia’s stock price has soared recently after many years of slow, creeping progress. I’m going to observe that this sudden surge in their share price heralds a revolutionary shift in computing that represents the culmination of many years of progress in GPGPU development. Up until now Intel has held a monopoly over enterprise computing for many years, successfully fending off all challengers to their supremacy in that space. This dominance is ending this year and the market sees it coming. To understand what is going on and why it is happening now, I’m going to take a jump back in time to my earlier years at Microsoft.
In the 1990’s Bill Gates coined the term “Cooperatition” to describe Microsoft’s strained competitive partnerships with other leading industry tech giants of the era. The term came up a lot in reference to Intel. While Microsoft and Intel’s fates and success had become deeply intertwined, the two companies constantly struggled for dominance over each other. Both Microsoft and Intel had teams of people “specialized” in trying to achieve a position of dominance over the other. At the time Microsoft executive Paul Maritz (then Executive VP of Platforms) was very concerned that Intel might attempt to “virtualize” Windows, thus enabling many competing operating systems to enter the market and co-exist on the PC desktop with Windows. (Paul Maritz later went on to become CEO of VMware… just saying…) Indeed, Intel was heavily invested in just such an effort. One of their strategies was to attempt to emulate in software all of the conventional “special” hardware functionality that PC OEM’s typically included with a PC, such as video cards, modems, sound cards, networking, etc. By moving all external computing onto the Intel processor, Intel could eliminate the sales and growth of any alternative computing platform that might grow to threaten the value of an Intel CPU. It was specifically Intel’s announcement of 3DR in early 1994 that spurred Microsoft to create DirectX.
I worked for DRG (Developer Relations Group), the team at Microsoft that was responsible for “positioning” Microsoft strategically against competitive threats in the market. Intel had requested that Microsoft send a “representative” to speak at their launch event for 3DR. As DRG’s resident graphics and 3D expert, I was sent on Microsoft’s behalf with the specific mission of evaluating the threat that Intel’s new initiative represented to Microsoft and formulating an effective counter-strategy. My assessment was that Intel was indeed attempting to virtualize Windows by emulating all possible competitive external processing in software. I wrote a proposal called “Taking Fun Seriously” that suggested that the way to prevent Intel from making Windows “dispensable” was to create a competitive consumer marketplace for new hardware capabilities. The idea was to create a new suite of Windows drivers that enabled massive competition in the hardware market for new audio, input, video, networking and other media capabilities, all of which would depend on proprietary Windows drivers to work across a new market we would create for PC based video games. Intel would not be able to keep up with the free market competition we created among consumer hardware companies and would therefore never be able to create a CPU that could effectively virtualize all of the functionality consumers demanded. Thus DirectX was born.
There are many stories on this blog about the events that surrounded the creation of DirectX, but in short our “evil scheme” was wildly successful. Microsoft realized that the way to dominate the consumer market and keep Intel at bay was by focusing on video games, and dozens of 3D video chip makers were born. Twenty-some years later Nvidia is among the handful of survivors, along with ATI (since acquired by AMD), that came to dominate first the consumer graphics market and increasingly the enterprise computing market.
This brings us to today, 2017, the year GPU’s finally begin to permanently displace the venerated x86 based CPU. Why now and why GPU’s? The secret to the x86 hegemony has been Windows and the backwards compatibility of the x86 instruction set all the way to the 1970’s. Intel has been able to maintain and grow its enterprise monopoly because the cost of porting applications to any other CPU instruction set with no market share is prohibitive. The phenomenal body of functionality enabled by the Windows OS and tied to the x86 platform has further entrenched Intel’s market position. The beginning of the end for Intel came when Microsoft AND Intel both failed to make the leap to dominating the emerging mobile computing market. For the first time in decades a major crack in the x86 CPU market opened, ARM based CPU’s filled it, and new alternative OS’s to Windows from Apple and Google were able to capture the newly opened market. Why did Microsoft and Intel fail to make the leap? There are a lot of interesting reasons, but for the purpose of this article the one I would like to highlight is the baggage of x86 backwards compatibility. For the first time, power efficiency became more important to the success of a CPU than speed. All of the transistors and all of the millions of lines of x86 code that Intel and Microsoft had invested in the PC became an obstacle to power efficiency. The most important aspect of Microsoft and Intel’s market hegemony became a liability overnight.
Intel’s need to constantly achieve increased performance while maintaining backwards compatibility forced Intel to waste more and more power hungry transistors to achieve diminishing performance returns in each new generation of x86 CPU. Backwards compatibility also severely impeded Intel’s ability to make their chips more parallel. While the first GPU’s were highly parallel out of the gate in the 1990’s, the first DUAL core CPU was not released by Intel until 2005. Even today in 2017, Intel’s most powerful CPU’s only manage 24 processing cores compared to the thousands found in most modern video cards. GPU’s, which were intrinsically parallel, did not have any legacy compatibility baggage to carry and, enabled by architecture independent API’s like Direct3D and OpenGL, were free to innovate and increase their parallelism without compromising compatibility or transistor efficiency. By 2005 GPU’s had even become GENERAL PURPOSE computing platforms supporting heterogeneous generalized parallel computing. (Heterogeneous in this case is a reference to the fact that an AMD and an NVIDIA chip can run the same compiled programs despite having entirely different low level architectures and instruction sets.) While Intel chips were achieving diminishing performance returns, GPU’s were doubling in performance while halving their power requirements every 12 months! Extreme parallelism enabled very efficient transistor utilization, ensuring that each transistor added to a GPU could be effectively deployed for speed, while a growing percentage of new x86 transistors were going to waste.
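To put a number on that compounding, here is a small arithmetic sketch in Python. It takes the figures above at face value (performance doubling and power halving every 12 months), which are claims from this post rather than measured data, and the function name `perf_per_watt_gain` is made up for illustration.

```python
def perf_per_watt_gain(years, perf_factor=2.0, power_factor=0.5):
    """Relative performance-per-watt after `years` yearly generations.

    perf_factor and power_factor default to the yearly changes claimed
    in the post (2x performance, 0.5x power). They are assumptions,
    not benchmark results.
    """
    perf = perf_factor ** years    # how much faster the chip is
    power = power_factor ** years  # how much power it draws, relative to year 0
    return perf / power


if __name__ == "__main__":
    for years in (1, 3, 5):
        print(years, "year(s):", perf_per_watt_gain(years), "x performance per watt")
```

If both trends held, performance per watt would quadruple every year, so five generations would compound to roughly a thousandfold gain. That is the kind of curve x86 was up against.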
Although GPU’s were increasingly making inroads into enterprise super computing, media production and VDI solutions, the major market turning point came when Google began using GPU’s effectively to train neural networks to do really useful things. The market realized that artificial intelligence would be the future of big-data processing and would open vast new automation markets. GPU’s were ideally suited to run neural networking applications. Until this point Intel had successfully relied on two approaches to suppress the growing influence of GPU’s in enterprise computing.
- Keep the PCIe bus slow and limit the number of IO lanes that an Intel CPU supports, thus ensuring that GPU’s are always dependent on an Intel CPU to serve their workload and remain separated from many valuable real-time and HPC applications by latency and PCIe bandwidth constraints. As long as Intel’s CPU’s could throttle application access to GPU performance, Nvidia remained safely marooned on the other side of the PCIe bus from many practical enterprise workloads.
- Provide cheap but minimally functional GPU’s on consumer CPU’s to confine Nvidia and AMD to the premium gaming market and keep them out of mainstream adoption.
The growing threat from Nvidia and Intel’s own failed attempts to create x86 compatible super-computing accelerators caused Intel to try another new tactic. They acquired Altera and plan to include programmable FPGA’s with next generation Intel CPU’s. This is a very clever way of enabling Intel CPU’s to support dramatically better IO capabilities than their PCIe constrained counterparts while preventing GPU’s from benefiting from those enhancements. Backing FPGA’s also gave Intel a way to move towards supporting greater parallelism on their chips without benefiting the growing GPU based application market. It also enabled enterprise hardware vendors to create highly specialized custom hardware solutions that were still x86 dependent. The move was tactically brilliant on Intel’s part because it acted to exclude GPU’s from penetrating the enterprise market on several axes at once. Brilliant but probably doomed to fail.
Now in five easy news clips, the reason I believe that the x86 party ends in 2017…
- SoftBank raises $93B from companies with a common interest in displacing Intel
- SoftBank buys ARM
- SoftBank “buys” Nvidia
- Nvidia launches an ARM/Hybrid mobile chip… the X2 with…
- …GPU accelerated ARM cores on it
Why is this sequence of events important? Because this is the year that the first generation of self-hosting GPU’s are widely available on the market, able to run their own OS with no PCIe obstacles. NVIDIA does not need an x86 CPU anymore. ARM has a vast body of consumer and enterprise OS’s and applications ported to it. All enterprise and cloud hardware makers adopt ARM chips as controllers for a vast array of their current market solutions. ARM chips are already integrated with all leading FPGA solutions. ARM chips are low power at the cost of performance, but GPU’s are extremely fast and also power efficient, so the GPU can provide the processing muscle while the ARM cores handle the mundane IO and UI management tasks that don’t demand a lot of compute power. The growing body of big-data, HPC, and especially machine learning applications doesn’t need Windows and doesn’t perform well on x86. So 2017 is the year Nvidia slips its leash and breaks free to become a genuinely viable competitive alternative to x86 based enterprise computing in valuable new markets that are unsuited to x86 based solutions.
If an ARM processor isn’t beefy enough for your big-data computing needs, IBM has also partnered with Nvidia to produce a generation of monster number crunching CPU’s with the Power9 sporting 160 PCIe lanes.
AMD has also launched their new Ryzen CPU’s and, unlike Intel, AMD has no strategic interest in choking off PCIe performance. Their consumer grade chips sport 64 PCIe 3.0 lanes and their pro chips will support 128. AMD is also launching a new HIP cross compiler that makes CUDA applications designed for Nvidia GPU’s compatible with AMD GPU’s. Despite being competitors, both companies will benefit from flanking Intel in the enterprise market with alternative approaches to GPU based computing.
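For a rough sense of why lane counts matter, here is a back-of-the-envelope sketch in Python. It uses the PCIe 3.0 signalling rate (8 GT/s per lane with 128b/130b encoding) to estimate theoretical one-direction bandwidth. The 64 and 128 lane counts are the ones quoted above, x16 is a typical single GPU slot, and the helper name `pcie3_gb_per_s` is invented for this example. Real-world throughput is lower once protocol overhead is counted.

```python
# PCIe 3.0 signalling: 8 gigatransfers per second per lane,
# with 128 payload bits carried in every 130 bits on the wire.
TRANSFERS_PER_S = 8e9
ENCODING_EFFICIENCY = 128 / 130


def pcie3_gb_per_s(lanes):
    """Theoretical one-direction bandwidth in GB/s for `lanes` PCIe 3.0 lanes.

    Ignores packet/protocol overhead, so treat the result as an upper bound.
    """
    bits_per_s = TRANSFERS_PER_S * ENCODING_EFFICIENCY * lanes
    return bits_per_s / 8 / 1e9  # bits -> bytes -> gigabytes


if __name__ == "__main__":
    # 16 = one typical GPU slot; 64 and 128 = lane counts quoted in the post.
    for lanes in (16, 64, 128):
        print(f"x{lanes}: {pcie3_gb_per_s(lanes):.2f} GB/s")
```

A single x16 slot tops out a little under 16 GB/s each way, so the number of lanes a CPU exposes directly caps how many GPU’s can be fed at full speed.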
All of this means that GPU based solutions will sweep enterprise computing at an accelerating rate in coming years, with the world of desktop UI driven computing increasingly relegated to virtualization in the cloud or to mobile ARM processors. Even Microsoft has announced Windows support for ARM.
Put it all together and I predict that within a few years all we will hear about is the battle between GPU’s and FPGA’s for enterprise computing supremacy as the CPU era fades into slow decline.
*…and Quantum computing will prove to be irrelevant the entire time…