Hello, I am a beginner trying to use your repository to generate caption descriptions for my own videos.
I believe that to do this I have to:
1. Download the MSVD dataset, extract features, and then run prepare_captions.py and train.py to get the model checkpoint.
2. After that, to use my own videos as input, replace the MSVD dataset with my own folder of videos, extract features, and run eval.py to print the captions. At that point I no longer need to run prepare_captions.py or train.py, since those steps are optional. Is this correct?
The problem I am facing now is that I cannot run extract_features.py. Running the command from your instructions, "python extract_features.py --video_path ./data/YouTubeClips --feat_path ./data/feats/msvd_vgg16_bn --model vgg16_bn", returns the error "extract_features.py: error: argument --model: invalid choice: 'vgg16_bn' (choose from 'vgg16', 'resnet152', 'inception_v4')". If I change the model to "vgg16", it then reports "extract_features.py: error: the following arguments are required: --mode". I have tried both "fix" and "free" for --mode, and each one fails with a different error:
python extract_features.py --video_path ./data/YouTubeClips --feat_path ./data/feats/msvd_vgg16_bn --model vgg16 --mode fix
Extract Features with vgg16, device: cuda
Extracting~: 0%| | 0/3 [00:00<?, ?it/s] cleanup: _frames_out/
Extracting~: 0%| | 0/3 [00:07<?, ?it/s]
Traceback (most recent call last):
  File "extract_features.py", line 175, in <module>
    mode=args['mode'],
  File "extract_features.py", line 140, in extract
    fix_frame_extract(temp_path, feats_path, frames_num, model, video.stem)
  File "extract_features.py", line 99, in fix_frame_extract
    img_list = [img_list[i] for i in samples_ix]
  File "extract_features.py", line 99, in <listcomp>
    img_list = [img_list[i] for i in samples_ix]
IndexError: list index out of range
or
python extract_features.py --video_path ./data/YouTubeClips --feat_path ./data/feats/msvd_vgg16_bn --model vgg16 --mode free
Extract Features with vgg16, device: cuda
Extracting~: 0%| | 0/3 [00:00<?, ?it/s] cleanup: _frames_out/
Extracting~: 0%| | 0/3 [00:09<?, ?it/s]
Traceback (most recent call last):
  File "extract_features.py", line 175, in <module>
    mode=args['mode'],
  File "extract_features.py", line 142, in extract
    extract_feats(temp_path, feats_path, interval, model, video.stem)
  File "extract_features.py", line 74, in extract_feats
    feats = model(imgs)
  File "/home/user/miniconda3/envs/s2vt/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/miniconda3/envs/s2vt/lib/python3.6/site-packages/pretrainedmodels/models/torchvision_models.py", line 487, in forward
    x = self.features(input)
  File "/home/user/miniconda3/envs/s2vt/lib/python3.6/site-packages/pretrainedmodels/models/torchvision_models.py", line 473, in features
    x = x.view(x.size(0), -1)
RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1] because the unspecified dimension size -1 can be any value and is ambiguous
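Both errors look consistent with the frame list being empty before the model is called. Indexing an empty img_list would raise the IndexError, and a batch of zero images would give a tensor of shape [0, -1]. The sketch below is only a possible diagnostic, not code from the repository. The helper names (ffmpeg_available, list_frames, sample_indices) and the "*.jpg" frame pattern are hypothetical, and it is an assumption that extract_features.py relies on ffmpeg to write frames into a temporary directory such as _frames_out/.

```python
import shutil
from pathlib import Path


def ffmpeg_available():
    """Return True if an ffmpeg executable is found on PATH.

    Assumption: the extraction script shells out to ffmpeg to dump frames.
    """
    return shutil.which("ffmpeg") is not None


def list_frames(frame_dir, pattern="*.jpg"):
    """Return the sorted frame files in frame_dir, failing loudly if none exist.

    The "*.jpg" default is a guess at the frame naming, not taken from the repo.
    """
    frames = sorted(Path(frame_dir).glob(pattern))
    if not frames:
        raise RuntimeError(
            f"No frames matching {pattern!r} in {frame_dir}; "
            "frame extraction probably failed for this video."
        )
    return frames


def sample_indices(num_frames, num_samples):
    """Pick num_samples evenly spaced indices that are always within range."""
    if num_frames <= 0:
        raise ValueError("Cannot sample from zero frames.")
    if num_samples == 1:
        return [0]
    step = (num_frames - 1) / (num_samples - 1)
    return [round(i * step) for i in range(num_samples)]
```

Calling list_frames on the temporary frame directory right after the extraction step for one video would show whether any frames were written at all. If it raises, the cause is upstream of the model (for example ffmpeg missing from PATH, or a video format it cannot decode), not the choice of --mode.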