I am watching the GP-GPU arena in my professional life as a software developer. I'm starting this thread to gauge interest and to see if anyone already has experience in these matters. Juan suggested making this thread sticky, but I don't have permission to do so. One thing that is not quite clear to me is how OpenCL allows you to use the CPU and GPU as 'similar' resources: the copy operations, while fast (PCI-E DMA, after all), are still an extra step that can be skipped entirely if the data is processed by the host CPU. Clearly it won't make sense to push all data through the GPU this way.

It has a lot of potential, in particular if your algorithm is CPU-bound and has a fairly simple control flow (such as many image processing algorithms). Memory-bound algorithms can benefit as well, provided they have simple access patterns (basically linear access). An important bottleneck at the moment is the data transfer between the graphics card and main memory; if you only have short problems to compute on the GPU, this can kill any speedup. Algorithms that work well on normal CPUs (often tuned to optimize L2-cache access) sometimes don't work well on GPUs, and in some cases it is necessary to design your implementation specifically for GPUs. If you want to experiment with GPUs, I would stick to CUDA for the moment: the first OpenCL implementations lack maturity, and at the current stage OpenCL does not even make double-precision floating point support mandatory (see the spec). Also, CUDA is sufficiently similar to OpenCL that you should not have much trouble moving to OpenCL later.

Brainstorming in random order, before the first coffee of the day: (1) the CUDA and OpenCL versions do not actually perform the same calculations. There may be redundant computation performed in the CUDA code, or the computed results may not actually match completely, undiscovered due to holes in the automated tests.
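To make the transfer-bottleneck point concrete, here is a back-of-envelope model of when shipping data to the card pays off. All numbers are illustrative assumptions, not measurements: roughly 6 GB/s of effective PCI-E bandwidth and a GPU that runs the kernel 20x faster than the CPU.

```python
def gpu_worthwhile(n_bytes, cpu_time_s, pcie_bw_bytes_s=6e9, gpu_speedup=20.0):
    """Crude model: the GPU wins only if the kernel time plus the
    round trip over PCI-E is still shorter than computing on the host CPU."""
    transfer_s = 2 * n_bytes / pcie_bw_bytes_s   # host->device and device->host
    return cpu_time_s / gpu_speedup + transfer_s < cpu_time_s

# A long-running computation amortizes the copies ...
print(gpu_worthwhile(n_bytes=100e6, cpu_time_s=1.0))   # True
# ... but a short problem on the same buffer loses to the host CPU.
print(gpu_worthwhile(n_bytes=100e6, cpu_time_s=0.02))  # False
```

The exact crossover point depends entirely on the hardware, but the shape of the trade-off is why short problems rarely see any speedup.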
You may have read about OpenCL here and here. Nvidia has already published the first version, and ATI is on board with OpenCL even though they don't have a driver yet; CUDA is Nvidia-specific and therefore less interesting for us. OpenCL promises to allow thread creation on both the CPU and the GPU, which would let PCL spread the processing load across all available processing resources. I've reviewed some of the documentation, and it looks pretty straightforward to initialize buffers on the GPU, fill them, upload code to process the data, and then retrieve the result.