Supported Configurations For The Compute Shader
The compute shader ships as part of DirectX 11. It will be available on Windows Vista and on later Microsoft operating systems.
A core subset of compute shader features is offered on recent DirectX 10-class hardware, so developers can start creating DirectX 11 compute shader applications now on a hardware-accelerated development platform. The following table lists the compute shader features that DirectX 10 and DirectX 11 hardware support. Note that a “group” is a collection of threads operating in the same instance of the compute shader.
Features | DirectX 10 Hardware | DirectX 11 Hardware
Number of 32-bit shared registers | 4K | 8K
Shared register access | Private write / shared read | Fully indexed
Maximum group dimensions | (768, 768, 1) | (1024, 1024, 64)
Maximum group threads | 768 | 1024
Atomic operators | None | Full set
Double precision | None | Check feature
DispatchIndirect() | None | Supported
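The group limits in the table can be turned into a simple pre-flight check. The following is a minimal C++ sketch, not part of the Direct3D API: the `GroupLimits` struct, the two constants, and the `groupFits` function are hypothetical names introduced here for illustration, and the numbers are copied directly from the table above. It assumes that a group is valid when each dimension is within the per-axis maximum, the total thread count is within the per-group maximum, and the requested shared registers fit.

```cpp
#include <cstdint>

// Hypothetical helper types; the values come from the feature table above.
struct GroupLimits {
    std::uint32_t maxX, maxY, maxZ;    // maximum group dimensions
    std::uint32_t maxThreads;          // maximum threads per group
    std::uint32_t sharedRegisters;     // number of 32-bit shared registers
};

const GroupLimits kDirectX10Limits = {768, 768, 1, 768, 4 * 1024};
const GroupLimits kDirectX11Limits = {1024, 1024, 64, 1024, 8 * 1024};

// Returns true when a group of x * y * z threads that uses `sharedRegs`
// 32-bit shared registers stays within `limits`.
inline bool groupFits(const GroupLimits& limits,
                      std::uint32_t x, std::uint32_t y, std::uint32_t z,
                      std::uint32_t sharedRegs) {
    if (x == 0 || y == 0 || z == 0) {
        return false;  // an empty group is never valid
    }
    if (x > limits.maxX || y > limits.maxY || z > limits.maxZ) {
        return false;  // one dimension exceeds its per-axis maximum
    }
    // Use a 64-bit product so large dimensions cannot overflow.
    const std::uint64_t threads = static_cast<std::uint64_t>(x) * y * z;
    return threads <= limits.maxThreads &&
           sharedRegs <= limits.sharedRegisters;
}
```

For example, `groupFits(kDirectX10Limits, 32, 32, 1, 0)` returns false because 32 × 32 = 1024 threads exceeds the 768-thread limit, while the same group fits within the DirectX 11 limits.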
Target Applications For The Compute Shader
Many applications can benefit from compute shader technology. The following list describes several of them, ordered by how closely they resemble current Direct3D applications.
- Photo and imaging
  Although most imaging tasks are similar enough to graphics that Direct3D is a good solution, some operations, such as FFT, benefit substantially from the random-access I/O capabilities of the compute shader.
- Video
  Video compression and encoding are difficult to accomplish without random-access I/O. The compute shader enables many key video encoding algorithms that were not feasible previously. Currently, applications typically use fixed-function cores to decode video. The compute shader adds a teraflop of flexible computing power, which enables applications to perform many additional tasks in real time, such as super-resolution scaling, fast noise removal, and de-ringing.
- Advanced rendering
  A-buffer order-independent transparency (OIT) and real-time radiosity/global illumination (GI).
- Search, sort, and query
  Databases often handle large numbers of small records that data-parallel programming can target efficiently. Many search, sort, and query algorithms can easily be parallelized at a fine-grained level. Some of them benefit from processing models that do not enforce ordering of results, which is the case for datasets that will be sorted later.
- Technical and scientific
  Academic researchers who required increased performance were the first to program graphics processors for applications beyond 3-D rendering. Scientific and technical computing can benefit greatly from the compute shader, especially for workloads that rely mostly on single-precision math.
- Cryptography
  Since the release of Direct3D 10, GPUs have supported internal computations on integer data types. Combined with the new flexibility of the compute shader, integer arithmetic enables client PCs to treat their GPU as an encryption/decryption accelerator and enables server-clustered PCs to process large data sets.
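To make the "fine-grained" parallelism mentioned under search, sort, and query more concrete, here is a minimal sketch of bitonic sort, a well-known sorting network that is often used to illustrate data-parallel sorting. It is an illustration chosen for this example, not a method the text above prescribes, and it is plain CPU-side C++ rather than compute shader code. The function name `bitonicSort` is hypothetical, and the sketch assumes the input length is a power of two. The point to notice is the innermost loop: within one stage, every compare-exchange touches a distinct pair of elements, so each iteration could in principle be handled by its own thread.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts `data` in ascending order, in place.
// Assumption: data.size() is a power of two (0 and 1 are also handled).
inline void bitonicSort(std::vector<unsigned>& data) {
    const std::size_t n = data.size();
    // k is the length of the bitonic sequences being merged in this pass.
    for (std::size_t k = 2; k <= n; k <<= 1) {
        // j is the distance between the two elements of each pair.
        for (std::size_t j = k >> 1; j > 0; j >>= 1) {
            // Iterations of this loop are independent of each other:
            // each pair (i, partner) is visited once and pairs never overlap.
            for (std::size_t i = 0; i < n; ++i) {
                const std::size_t partner = i ^ j;
                if (partner <= i) {
                    continue;  // the pair is handled from its lower index
                }
                const bool ascending = (i & k) == 0;
                const bool outOfOrder = ascending
                    ? data[i] > data[partner]
                    : data[i] < data[partner];
                if (outOfOrder) {
                    std::swap(data[i], data[partner]);
                }
            }
        }
    }
}
```

Calling `bitonicSort` on a vector of eight or sixteen unsigned values leaves it sorted in ascending order; a real compute shader version would map the innermost loop onto threads and synchronize between stages.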