
 20 September 2009
 Dr. Adrian Wong
3D Gaming Advances In Microsoft Windows 7 Rev. 2.0

Direct3D 11

Windows 7 introduces the next-generation Direct3D 11 API, which is a strict superset of Direct3D 10. The Direct3D 11 API enables Windows 7 to take advantage of DirectX 11 hardware. It adds features to the existing DirectX 10 (and 10.1) pipeline to improve 3D performance and to support data-parallel computing on the GPU.

The Direct3D 11 API significantly advances graphics technology in two important areas:

  • Compute Shader for data-parallel computing
  • 3D graphics

In addition, it includes numerous other improvements that are based on extensive feedback from third-party hardware and software vendors. Let's take a look at the details of the advances in Direct3D 11.


Compute Shader

As the processing power of the GPU has increased, it has become a viable processor not only for games but also for general computing applications. DirectX 11 introduces a new Compute Shader API that enables the GPU to be used for data-parallel computing.

Data-parallel programming is a way to target parallel processors with code that scales to any number of processor cores. Direct3D is based on this programming model, but until Direct3D 11, various restrictions limited developer options and flexibility. The Compute Shader in DirectX 11 enables a broader set of algorithms to make use of the graphics processor's parallel computing power.
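
To make the idea of code that "scales to any number of processor cores" concrete, here is a minimal CPU-side sketch in Python. It is not Direct3D code; the names `brighten` and `run_data_parallel` are made up for illustration, and a thread pool stands in for the GPU's cores. The point is only that the per-element kernel stays the same while the number of workers changes.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(pixel: int) -> int:
    """Per-element kernel: the result depends only on its own input element."""
    return min(pixel + 40, 255)


def run_data_parallel(kernel, data, workers):
    """Apply `kernel` to every element of `data` using `workers` workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, whatever the scheduling was.
        return list(pool.map(kernel, data))


pixels = [0, 100, 200, 250]
one_worker = run_data_parallel(brighten, pixels, workers=1)
four_workers = run_data_parallel(brighten, pixels, workers=4)
```

Because no element depends on any other, both runs produce the same list. Only the amount of work done at the same time differs.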

A high-end GPU today can deliver about 4 teraflops of processing power for graphics applications, and one of those teraflops provides full 32-bit single-precision floating-point math that is generally useful. In addition, graphics processors often include a dedicated memory subsystem that has roughly 10 times the bandwidth of the CPU’s interface to main memory. Many applications beyond graphics have already achieved substantial performance improvements by using the higher processing power and the greater memory bandwidth of the GPU.

The field of general-purpose computation on GPUs (GPGPU) has shown that algorithms for fast Fourier transforms (FFT), sorting, and linear algebra, to name a few, can all attain higher performance levels by using Direct3D than by running on the CPU. Additional graphics-related applications that are not 3D, such as image and media processing and composition, are common uses for Direct3D 9 and Direct3D 10.

The DirectX 11 Compute Shader provides a general mechanism for applications to access the computing power and bandwidth of SIMD cores such as those used in graphics processors. By using this flexible technology, many more applications can use the data-parallel processor capability of the GPU.

The Compute Shader was designed to meet the following requirements:

  • A simpler API than Direct3D. The compute shader does not require an application to set up all the parameters and state that are required for 3D rendering. To send work to the GPU in previous DirectX versions, even algorithms that were totally unrelated to graphics had to draw rectangles. With the compute shader, an application can launch threads by explicit request.

  • Explicit separation between data-parallel code and serial code. Data-parallel code is usually run on SIMD processors and is typically used for substantially different algorithms than serial code. The application developer can completely control the programming model that each code segment uses.

  • A single consistent programming model that spans hardware implementations.

  • Automatic conversion of data types for read and write operations. Media applications often use data types that are smaller than 32-bit floating-point values. Converting at read and write time means that applications can keep data in these smaller formats and avoid wasting bandwidth on 32-bit values.

  • Interactive display of computation results, ideally updated at the monitor refresh rate. The computational work must therefore be tightly integrated with the graphics tasks so that the total time budget of 10 to 20 ms per frame is met (resulting in 50 to 100 frames per second). The DirectX compute shader is tightly integrated with Direct3D for this reason. This integration includes both the ability for compute shaders to read and write the array and surface objects typically used by DirectX, and the ability for the graphics pipeline to use scattered writes to update the more general data structures that compute-oriented algorithms rely on.
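
The first requirement above, launching threads by explicit request instead of drawing rectangles, can be pictured with a small CPU emulation in Python. This is a sketch under stated assumptions and not the Direct3D API: `dispatch`, `invert_kernel` and `GROUP_SIZE` are hypothetical names, and the threads run one after another here, whereas a GPU would run them in parallel. It shows the application asking for a grid of thread groups, with each thread working out which element it owns from its group ID and its ID within the group.

```python
from itertools import product

GROUP_SIZE = (4, 2)  # threads per group as (x, y); an illustrative value


def dispatch(kernel, groups_x, groups_y, *args):
    """Run `kernel` once per thread for a grid of thread groups.

    Emulated serially; on a GPU these threads would execute in parallel.
    """
    for gy, gx in product(range(groups_y), range(groups_x)):
        for ty, tx in product(range(GROUP_SIZE[1]), range(GROUP_SIZE[0])):
            # Global thread ID = group ID * group size + ID within the group.
            x = gx * GROUP_SIZE[0] + tx
            y = gy * GROUP_SIZE[1] + ty
            kernel(x, y, *args)


def invert_kernel(x, y, src, dst):
    """Each thread handles exactly one element, identified by (x, y)."""
    dst[y][x] = 255 - src[y][x]


width, height = 8, 4
src = [[(x + y * width) % 256 for x in range(width)] for y in range(height)]
dst = [[0] * width for _ in range(height)]

# 2 x 2 groups of 4 x 2 threads cover the 8 x 4 image exactly.
dispatch(invert_kernel, width // GROUP_SIZE[0], height // GROUP_SIZE[1], src, dst)
```

No geometry, render target or rasterizer state appears anywhere in the sketch. The work is described purely as "this many threads running this kernel".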

The compute shader supports runtime compilation and runtime data binding to further increase flexibility, polymorphism, and opportunities for optimization by specialization.
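
As a loose analogy for runtime compilation and "optimization by specialization", the Python sketch below builds kernel source text at run time, bakes a constant into it, and compiles it on the spot. It does not use any shader compiler. `KERNEL_TEMPLATE`, `compile_specialized` and `scale_kernel` are names invented for this illustration, and Python's built-in `compile` and `exec` stand in for whatever the graphics runtime would do.

```python
KERNEL_TEMPLATE = """
def scale_kernel(value):
    # The gain below is a literal that was substituted before compilation.
    return value * {gain}
"""


def compile_specialized(gain: float):
    """Compile a kernel variant at run time with `gain` baked in as a constant."""
    namespace = {}
    source = KERNEL_TEMPLATE.format(gain=repr(gain))
    exec(compile(source, "<runtime kernel>", "exec"), namespace)
    return namespace["scale_kernel"]


double = compile_specialized(2.0)
half = compile_specialized(0.5)
results = [double(3.0), half(3.0)]
```

Each call yields a separate compiled function for one specific gain. That is the general shape of specialization: the application decides at run time which variant it needs and compiles only that one.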

Some algorithms are difficult to implement in earlier versions of Direct3D because its programming model is purely data-parallel, with no shared memory constructs or atomic operations. Because the compute shader is more general and contains the core data-parallel constructs for shared memory access with atomic operations, Direct3D 11 can execute a broader set of algorithms than previous versions, and some of these might be faster than their pure data-parallel equivalents.
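
A histogram is a common example of an algorithm that benefits from shared memory and atomic operations, because many threads may need to increment the same bin. The following Python sketch emulates that pattern on the CPU. It is an illustration only: `atomic_add` and `histogram_kernel` are hypothetical names, and a lock stands in for a hardware atomic instruction.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

NUM_BINS = 4
histogram = [0] * NUM_BINS   # memory shared by all threads
_lock = threading.Lock()     # stands in for a hardware atomic operation


def atomic_add(array, index, amount):
    """Emulate an atomic add: the read-modify-write cannot be interleaved."""
    with _lock:
        array[index] += amount


def histogram_kernel(value):
    """One thread per input value; several threads may hit the same bin."""
    atomic_add(histogram, value * NUM_BINS // 256, 1)


values = [0, 10, 70, 130, 135, 200, 255, 255]
with ThreadPoolExecutor(max_workers=4) as pool:
    # Consume the iterator so that every kernel invocation completes.
    list(pool.map(histogram_kernel, values))
```

In a pure data-parallel model, where each output element is written by exactly one thread, the same result usually needs a more roundabout formulation, for example one pass over all the inputs per bin.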






Windows 7 & Direct3D 10


Desktop Windows Manager (DWM)
DirectX 10-Level-9


Remote Rendering
Direct2D API


Direct3D 11 Introduction
Compute Shader


New DirectX 11 Features For The Compute Shader
   - Explicit Thread Dispatch
   - Random Access I/O (Scatter)
   - Interthread Communications That Use Locally Shared Registers
   - Ability To Read And Sample DirectX Data Objects
   - Atomic Operators On Shared Memory Locations


Supported Configurations For The Compute Shader
Target Applications For The Compute Shader


3D Graphics Improvements In Direct3D 11
   - Use of multiple CPU cores
   - Tessellation
   - High-level shading language
   - Cross-platform development with the XBox 360 platform


Additional Direct3D 11 Features
   - Improved Texture Compression
   - Shader Model 5.0
   - Stream Output Flexibility
   - Depth Buffer Capabilities


Multi-GPU Support
   - Homogenous Configurations
   - Heterogenous Configurations


Linked Display Adapters
Multi-GPU & Aero Glass


Copyright © Tech All rights reserved.