dlib version: 19.24

We are successfully using the mmod_human_face_detector in two modes, building dlib with DLIB_USE_CUDA set to ON for the CUDA version and OFF for the CPU version. In the CUDA version, the detector returns consistent results as long as GPU resources are stable. When GPU resources are constrained (e.g., SM usage approaches 100%), the returned rect coordinates and confidence values occasionally differ from the usual results. Interestingly, these "different" rect coordinates match exactly those returned by the CPU version of the mmod_human_face_detector. (In such cases, the confidence values from the CUDA version still differ slightly from those of the CPU version.)

Given this behavior, does dlib, even when built with DLIB_USE_CUDA=ON, have an automatic mechanism that falls back to the CPU at runtime when GPU resources are insufficient?
Replies: 1 comment 7 replies
Currently, if you compile dlib with |
It's because changing the order in which you add floating-point numbers slightly changes the resulting value, and in CUDA there are a ton of threads doing just that. So when something changes the way those threads get scheduled, you will sometimes get slightly different results.
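
To make the ordering effect concrete, here is a minimal sketch in plain Python. It is not dlib or CUDA code: `reduce_in_order` and the two "schedules" are hypothetical names used only for illustration, and the magnitudes are deliberately exaggerated so the difference is easy to see. Real GPU reductions work on 32-bit floats and typically differ only in the last few bits.

```python
def reduce_in_order(values, order):
    """Sum `values` sequentially, visiting indices in the given order.

    `order` stands in for whatever order the GPU threads happen to
    contribute their partial results (a hypothetical illustration).
    """
    total = 0.0
    for i in order:
        total += values[i]
    return total


# The same four numbers, accumulated in two different orders.
values = [1e16, 1.0, -1e16, 1.0]
schedule_a = [0, 1, 2, 3]  # the small 1.0 is absorbed by 1e16 and lost
schedule_b = [0, 2, 1, 3]  # the large terms cancel first, nothing is lost

result_a = reduce_in_order(values, schedule_a)
result_b = reduce_in_order(values, schedule_b)

# A smaller-scale case: addition is not associative even for ordinary values.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(result_a, result_b)  # the two schedules disagree
print(left == right)       # False
```

Both schedules add exactly the same inputs, yet the totals differ, which is the same kind of run-to-run variation that a change in thread scheduling can cause.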