NVIDIA has released CUDA Toolkit 4.1. Of note is a significant enhancement to the NVIDIA Performance Primitives (NPP) library, a collection of GPU-accelerated image, video, and signal processing functions that, according to NVIDIA, run 5x to 10x faster than comparable CPU-only implementations.
With NPP, developers can draw on over 2,000 image and signal processing primitives to achieve significant improvements in application performance. Whether you are simply replacing CPU primitives with GPU-accelerated versions or integrating NPP primitives into an existing GPU-accelerated pipeline, NPP delivers high performance while reducing development time. We have already seen speedups in PCL trunk after compiling our Kinect Fusion implementation and other GPU code with CUDA 4.1.
Other new features in CUDA 4.1 include:
- Up to 10% faster performance with the new LLVM-based CUDA compiler.
- A redesigned Visual Profiler that guides you step by step through performance optimization.
Find out more at http://bit.ly/xRYur6.