RealTimeDXTCompression
The compressors available in the NVIDIA Texture Tools are not designed for real-time compression. If you need to compress textures in real time, I'd recommend looking at the following sources:
J.M.P. van Waveren was the first to describe a real-time DXT compressor. This compressor is part of the real-time texture streaming pipeline used in some id Software games. He reports the following results with his SSE2-optimized implementation:
CPU | DXT1 | DXT5 |
---|---|---|
Intel 2.8 GHz Dual Core Xeon | 112.05 MP/s | 66.43 MP/s |
Intel 2.9 GHz Core 2 Extreme | 200.62 MP/s | 127.55 MP/s |
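To give a feel for the work a compressor of this kind does per 4x4 block, here is a minimal Python sketch of a bounding-box DXT1 encoder. It is an illustration under assumptions, not van Waveren's SSE2 code: the function names (`to_565`, `from_565`, `compress_block_dxt1`) are made up for this page, the endpoints are simply the corners of the block's RGB bounding box, and refinements such as insetting the box are left out.

```python
import struct

def to_565(r, g, b):
    """Pack 8-bit RGB into a 16-bit RGB565 value (low bits are truncated)."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def from_565(c):
    """Expand RGB565 back to 8-bit RGB by bit replication."""
    r, g, b = (c >> 11) & 31, (c >> 5) & 63, c & 31
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))

def compress_block_dxt1(pixels):
    """Encode 16 (r, g, b) tuples (a row-major 4x4 block) into 8 bytes of DXT1."""
    assert len(pixels) == 16
    # Endpoints: the max and min corners of the block's RGB bounding box.
    hi = tuple(max(p[i] for p in pixels) for i in range(3))
    lo = tuple(min(p[i] for p in pixels) for i in range(3))
    # hi >= lo per channel, so c0 >= c1. c0 > c1 selects four-color mode;
    # if they are equal, every pixel gets index 0, which still decodes to c0.
    c0, c1 = to_565(*hi), to_565(*lo)
    e0, e1 = from_565(c0), from_565(c1)
    palette = [
        e0,
        e1,
        tuple((2 * a + b) // 3 for a, b in zip(e0, e1)),
        tuple((a + 2 * b) // 3 for a, b in zip(e0, e1)),
    ]
    bits = 0
    for i, p in enumerate(pixels):
        # Pick the palette entry with the smallest squared RGB distance.
        best = min(range(4),
                   key=lambda k: sum((x - y) ** 2 for x, y in zip(p, palette[k])))
        bits |= best << (2 * i)  # pixel 0 lands in the least significant bits
    # Two little-endian 16-bit endpoints followed by 32 bits of 2-bit indices.
    return struct.pack("<HHI", c0, c1, bits)
```

A solid white block encodes to the bytes `ff ff ff ff 00 00 00 00`: both endpoints are white and every index is 0. A real-time implementation does the same steps with SIMD and without the per-pixel distance search.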
The algorithm described by van Waveren can also be adapted easily to the GPU. This is what Simon Green did in this OpenGL SDK example. Throughput on the GPU is considerably higher:
GPU | DXT1 | DXT5 |
---|---|---|
GeForce 8800 GTX | 1,547 MP/s | - |
GeForce 8600 GTS | 461 MP/s | - |
A later whitepaper by van Waveren and Ignacio Castaño reports even higher throughput on the GPU:
GPU | DXT1 | YCoCg-DXT5 |
---|---|---|
GeForce 8800 GTX | 1,690 MP/s | 939 MP/s |
GeForce 8600 GTS | 520 MP/s | 279 MP/s |
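The YCoCg-DXT5 column refers to converting colors from RGB to YCoCg before DXT5 encoding. The sketch below shows only the standard YCoCg transform and its inverse in floating point; the function names are hypothetical, and the way the channels are packed into a DXT5 block (as far as I understand, luma in the alpha channel and chroma in the color channels) is not shown.

```python
def rgb_to_ycocg(r, g, b):
    """Standard RGB -> YCoCg transform (no offset or quantization applied)."""
    y = 0.25 * r + 0.5 * g + 0.25 * b
    co = 0.5 * r - 0.5 * b
    cg = -0.25 * r + 0.5 * g - 0.25 * b
    return y, co, cg

def ycocg_to_rgb(y, co, cg):
    """Inverse transform: only additions and subtractions are needed."""
    return y + co - cg, y + cg, y - co - cg

# Round trip on one sample color.
print(ycocg_to_rgb(*rgb_to_ycocg(200, 100, 50)))
```

The inverse uses only additions and subtractions, which keeps decoding in a shader cheap. For 8-bit storage, Co and Cg would additionally need an offset so that they are non-negative; that step is omitted here.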
Peter Uliciansky optimized van Waveren's algorithm further and published his results in this PDF: Extreme DXT Compression, New algorithm for real-time DXT compression.
CPU | DXT1 | DXT5 |
---|---|---|
Intel Pentium 4 2.8 GHz | 241.2 MP/s | 206.5 MP/s |
Intel Core 2 3.0 GHz | 910.0 MP/s | 620.4 MP/s |
These numbers are very impressive and start to approach the results of the GPU implementation. However, as pointed out by Charles Bloom, this implementation has some errors.
Moreover, the latest GPU implementation also offers higher performance:
- http://developer.download.nvidia.com/SDK/10/opengl/samples.html#compress_YCoCgDXT
- http://developer.download.nvidia.com/SDK/10/opengl/samples.html#compress_NormalDXT
GPU | DXT1 | YCoCg-DXT5 | BC5 | DXT5n |
---|---|---|---|---|
GeForce 8800 GTX | 3,450 MP/s | 1,880 MP/s | 5,665 MP/s | 5,804 MP/s |
GeForce GTX 280 | 9,920 MP/s | 7,900 MP/s | 12,850 MP/s | 13,150 MP/s |
A more recent implementation is available from Intel, but I haven't benchmarked it:
https://software.intel.com/en-us/articles/fast-cpu-dxt-compression
Other links: