CUDA 7.5 (from synaptic package manager ("nvidia-cuda-toolkit"))
When trying to compile OpenCV (Ubuntu 16.04) I get the following error:
[ 9%] Building NVCC (Device) object modules/core/CMakeFiles/cuda_compile.dir/src/cuda/cuda_compile_generated_gpu_mat.cu.o
/usr/include/string.h: In function ‘void* __mempcpy_inline(void*, const void*, size_t)’:
/usr/include/string.h:652:42: error: ‘memcpy’ was not declared in this scope
return (char *) memcpy (__dest, __src, __n) + __n;
^
CMake Error at cuda_compile_generated_gpu_mat.cu.o.cmake:264 (message):
Error generating file
/home/mag/opencv/build_opencv_master/modules/core/CMakeFiles/cuda_compile.dir/src/cuda/./cuda_compile_generated_gpu_mat.cu.o
The problem is known, but you need a workaround to build with CUDA support. Here is a link to the same problem in the Caffe repository, but unfortunately the solution doesn't work for me:
https://github.com/BVLC/caffe/issues/4046
Actually I get this error if I try to build all CUDA architectures. Instead I just change two CMake variables, CUDA_ARCH_BIN and CUDA_ARCH_PTX, to only contain the CUDA compute capability supported by my graphics card, which can be found here.
I have tried it on two PCs but it doesn't work for me.
Edit:
I have tried it with a GeForce GTX 980 and a GeForce GTX 780, so the generation is Kepler (and Maxwell), with architectures 3.0 and 3.5.
If necessary I can try it on another PC too. For the next two weeks I can leave one PC in this configuration, if you need more testing.
One fast workaround is to set CUDA_HOST_COMPILER to clang-3.5.
@StevenEWright Hey, can you give an example of what exactly you changed in -DCUDA_ARCH_BIN=xx and -DCUDA_ARCH_PTX=xx? I am using an Nvidia 940M and I can't find in the link you posted how to deduce from the product page which version I should include.
The GeForce 940M is 5.0 compatible.
@jamesapollo2016 I understand it uses CUDA Compute 5.0, but what should I insert in -DCUDA_ARCH_BIN=xx and -DCUDA_ARCH_PTX=xx? The options seem to be 20 21 30 35, not 5.
This is what I get if I make no changes:
-- NVIDIA CUDA
-- Use CUFFT: YES
-- Use CUBLAS: NO
-- USE NVCUVID: NO
-- NVIDIA GPU arch: 20 21 30 35
-- NVIDIA PTX archs: 30
-- Use fast math: NO
I tried all combinations of
cmake -DOPENCV_EXTRA_MODULES_PATH=../opencv_contrib/modules/ -DBUILD_opencv_legacy=OFF -DCUDA_ARCH_BIN=2.0 -DCUDA_ARCH_PTX=2.0 .
where I changed the two 2.0 values in the example to 2.0, 2.1, 3.0 and 3.5, with no luck.
Compiled with 5 just fine for me using cmake-gui. Compiling CUDA for a single version is MUCH FASTER.
@jamesapollo2016 I tried adding 50. I get the exact same memcpy error. It might not even be in those variables.
On May 15, 2016 20:57, "Philip" [email protected] wrote:
Philip's message seems to have been deleted; it's available here:
> @guysoft <https://github.com/guysoft> to your question of what exactly you
> have to insert here
>
> -DCUDA_ARCH_PTX=xx
> -DCUDA_ARCH_BIN=xx
>
> you can insert 50 for compute capability 5.0 in both options. The defaults
> 20 21 30 and 35 are for the versions 2.0, 2.1, 3.0 and 3.5, which are included
> in most of the GPUs where these lines were added
Since my assumption turned out to be wrong after testing, I deleted the message again: you have to state the version with a dot.
cmake-gui output:
Commandline options:
-DCUDA_ARCH_BIN:STRING="5.0" -DCUDA_ARCH_PTX:STRING="5.0"
Cache file:
CUDA_ARCH_BIN:STRING=5.0
CUDA_ARCH_PTX:STRING=5.0
So this wouldn't change a thing. It seems to be a different issue for you.
So did you already try the option "5.0", since this has to be compatible with your 940M?
@Dikay900 I did it with the dots. I am using the command line; this is my command:
cmake -DBUILD_opencv_legacy=OFF -DCUDA_ARCH_BIN=5.0 -DCUDA_ARCH_PTX=5.0 -DWITH_CUDA=ON .
Can you try making a build folder for building the source and then build again using the command-line options my GUI is showing? Something like this:
mkdir build
cd build
cmake -DBUILD_opencv_legacy=OFF -DCUDA_ARCH_BIN:STRING="5.0" -DCUDA_ARCH_PTX:STRING="5.0" -DWITH_CUDA=ON ..
make
@Dikay900 Does not work, I get the same error. I should stress that OpenCV compiles fine if I use -DWITH_CUDA=OFF. It's quite likely something to do with the CUDA settings/code.
I have also tried it on 3 clean installations of Ubuntu 16.04 with "nvidia-cuda-toolkit" installed from Synaptic. All failed. Are you all sure you have the same system?
@guysoft As I wrote above, a workaround was to compile with CUDA_HOST_COMPILER=clang-3.5 (you need to install clang 3.5 first and, if I remember right, you also need to disable the tests).
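A rough sketch of what that configure step might look like. The function name and the default path /usr/bin/clang-3.5 are assumptions for illustration, and switching off BUILD_TESTS and BUILD_PERF_TESTS is my guess at what "disable the tests" means here:

```shell
# Hypothetical wrapper; run it from an empty build directory inside the OpenCV checkout.
configure_with_clang() {
    # Assumed install location of clang 3.5; pass another path as the first argument.
    host_compiler="${1:-/usr/bin/clang-3.5}"
    cmake -DWITH_CUDA=ON \
          -DCUDA_HOST_COMPILER="$host_compiler" \
          -DBUILD_TESTS=OFF \
          -DBUILD_PERF_TESTS=OFF \
          ..
}

# Usage (not executed here):
#   mkdir build && cd build && configure_with_clang
```

Only nvcc's host compiler is switched here; the rest of OpenCV is still built with the default system compiler.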
@tommy87 It works if you disable NVIDIA CUDA with -DWITH_CUDA=OFF. But then you won't have GPU acceleration. Also, you must run make clean and delete all the CMake cache files. After I extracted OpenCV I created a git repo and added all the extracted files to it. That way I can run git clean -df and delete all the CMake files; they seem to remember settings otherwise.
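If you don't want to set up a git repo just for this, a minimal sketch of clearing the cached settings by hand could look like the following. The function name is made up, and it assumes the remembered settings live in CMakeCache.txt and the CMakeFiles directory of whatever directory you ran cmake in:

```shell
# Hypothetical helper: remove CMake's cached configuration from a build directory.
clean_cmake_cache() {
    # Default to the current directory if no build directory is given.
    build_dir="${1:-.}"
    rm -f  "$build_dir/CMakeCache.txt"
    rm -rf "$build_dir/CMakeFiles"
}

# Usage (not executed here):
#   clean_cmake_cache build
```

After that, the next cmake run starts from the defaults plus whatever -D options you pass, instead of reusing the old values.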
The reason I don't want to use clang is that I am using the OpenCV installation to link against Caffe. Correct me if I am wrong, but using different compilers should result in a linking error.
I'm not sure if it leads to a linking error... I want to use it with Caffe as well, but I haven't had the time to test it yet. Maybe I will try this today, and then I can tell you whether it works or not.
@guysoft I have tried to compile Caffe and I didn't get linking errors.
So I have compiled OpenCV with CUDA and clang 3.5, and the NVIDIA version of Caffe 0.15 with the cc compiler. To get the cc compiler running you need to follow this: https://github.com/BVLC/caffe/issues/4046
@all The solution for Caffe is to add set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_FORCE_INLINES") to CMakeLists.txt. I have tried it with OpenCV too, but I was unable to find the right place to insert it. Maybe someone knows better where the line must be inserted (if you tell me, I can try it myself).
Simply replace, in opencv/cmake/OpenCVDetectCUDA.cmake,
set(NVCC_FLAGS_EXTRA ${NVCC_FLAGS_EXTRA} -gencode arch=compute_${CMAKE_MATCH_2},code=sm_${CMAKE_MATCH_1})
to
set(NVCC_FLAGS_EXTRA ${NVCC_FLAGS_EXTRA} -D_FORCE_INLINES -gencode arch=compute_${CMAKE_MATCH_2},code=sm_${CMAKE_MATCH_1})
It works for me!
@chapaev28's solution works. Added a pull request.
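For anyone who would rather script the edit than do it by hand, here is a hedged sketch using sed. The function name is made up, and note that it adds -D_FORCE_INLINES to every line containing "${NVCC_FLAGS_EXTRA} -gencode", which may be more lines than the single one quoted above:

```shell
# Hypothetical helper: add -D_FORCE_INLINES in front of -gencode in the given CMake file.
add_force_inlines() {
    file="$1"
    # Do nothing if the flag is already present, so the patch can be re-run safely.
    if grep -q -- '-D_FORCE_INLINES' "$file"; then
        return 0
    fi
    # Write to a temporary file first instead of relying on sed's in-place option.
    sed 's/\${NVCC_FLAGS_EXTRA} -gencode/${NVCC_FLAGS_EXTRA} -D_FORCE_INLINES -gencode/' \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Usage (not executed here), from the directory that contains the opencv checkout:
#   add_force_inlines opencv/cmake/OpenCVDetectCUDA.cmake
```

Remember to delete the CMake cache and re-run cmake afterwards, otherwise the old NVCC flags may still be used.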
:+1:
Similar answer here https://github.com/BVLC/caffe/wiki/Ubuntu-16.04-or-15.10-Installation-Guide