"illegal memory access" and "numel: integer multiplication overflow" #3832

Open
jhyuuu opened this issue Dec 29, 2024 · 0 comments


When I use mmseg to train DeepLabV3 on datasets such as Cityscapes and ADE20K, the following error occurs:

site-packages/mmseg/models/losses/accuracy.py", line 85, in accuracy
total_num = target[target != ignore_index].numel() + eps
RuntimeError: numel: integer multiplication overflow
terminate called after throwing an instance of 'c10::Error'
what(): CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:31 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x2b371a00d457 in /HOME/usr/.conda/envs/env/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x2b3719fd73ec in /HOME/usr/.conda/envs/env/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #2: c10::cuda::c10_cuda_check_implementation(std::string const&, std::string const&, int, bool) + 0xb4 (0x2b36eda3ec64 in /HOME/usr/.conda/envs/env/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #3: <unknown function> + 0x1e0dc (0x2b36eda160dc in /HOME/usr/.conda/envs/env/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #4: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x244 (0x2b36eda19054 in /HOME/usr/.conda/envs/env/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #5: <unknown function> + 0x4f6823 (0x2b36c1686823 in /HOME/usr/.conda/envs/env/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #6: c10::TensorImpl::~TensorImpl() + 0x1a0 (0x2b3719fed9e0 in /HOME/usr/.conda/envs/env/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #7: c10::TensorImpl::~TensorImpl() + 0x9 (0x2b3719fedaf9 in /HOME/usr/.conda/envs/env/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #8: <unknown function> + 0x767108 (0x2b36c18f7108 in /HOME/usr/.conda/envs/env/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #9: THPVariable_subclass_dealloc(_object*) + 0x2f8 (0x2b36c18f7508 in /HOME/usr/.conda/envs/env/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #10: /HOME/usr/.conda/envs/env/bin/python() [0x4a0a87]
frame #11: /HOME/usr/.conda/envs/env/bin/python() [0x4a0c9d]
frame #12: /HOME/usr/.conda/envs/env/bin/python() [0x4cb2f4]
frame #13: /HOME/usr/.conda/envs/env/bin/python() [0x49a222]
frame #14: _PyGC_CollectNoFail + 0x2b (0x570aab in /HOME/usr/.conda/envs/env/bin/python)
frame #15: PyImport_Cleanup + 0x24e (0x56f8de in /HOME/usr/.conda/envs/env/bin/python)
frame #16: Py_FinalizeEx + 0x67 (0x56b1a7 in /HOME/usr/.conda/envs/env/bin/python)
frame #17: /HOME/usr/.conda/envs/env/bin/python() [0x53fc79]
frame #18: _Py_UnixMain + 0x3c (0x53fb3c in /HOME/usr/.conda/envs/env/bin/python)
frame #19: __libc_start_main + 0xf5 (0x2b368125d555 in /lib64/libc.so.6)
frame #20: /HOME/usr/.conda/envs/env/bin/python() [0x53f9ee]
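
The traceback does not show the root cause, because the CUDA error is reported asynchronously (hence the hint about CUDA_LAUNCH_BLOCKING=1). One possibility worth ruling out, which is an assumption and not something confirmed by the log above, is that the ground-truth maps contain label values that are neither a valid class index nor the ignore_index. A minimal sketch of such a check on the CPU follows; the function name find_invalid_labels and the sample values are hypothetical, and 19 classes with ignore_index 255 is assumed as a Cityscapes-style setup.

```python
from collections import Counter


def find_invalid_labels(labels, num_classes, ignore_index=255):
    """Count label values outside [0, num_classes) that are not ignore_index.

    `labels` is any flat iterable of integers. For a torch tensor,
    pass target.flatten().tolist().
    """
    counts = Counter(int(v) for v in labels)
    return {
        value: count
        for value, count in sorted(counts.items())
        if value != ignore_index and not (0 <= value < num_classes)
    }


# Hypothetical flattened label map: 19 classes, 255 marks ignored pixels.
labels = [0, 18, 255, 19, 3, -1, 19]
print(find_invalid_labels(labels, num_classes=19))  # {-1: 1, 19: 2}
```

An empty result means every label is either a valid class or the ignore value, in which case the labels are probably not the cause and rerunning with CUDA_LAUNCH_BLOCKING=1 should give a more reliable stack trace.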
