Unable to extract features #2

CalvinLeow · 2024-06-21T14:37:12Z

Hello, I am a beginner trying to use your repository to generate caption descriptions for my own input videos.

I believe that to do this I have to:

1. Download the MSVD dataset, extract features, and then run prepare_captions.py and train.py to get the checkpoint model.
2. To use my own videos as input, replace the MSVD dataset with my own folder of videos, extract features, and run eval.py to print the captions. At that point running prepare_captions.py or train.py again is optional.

Is this correct?

The problem I am facing now is that I am unable to run extract_features.py. Running the command from your instructions, "python extract_features.py --video_path ./data/YouTubeClips --feat_path ./data/feats/msvd_vgg16_bn --model vgg16_bn", returns the error "extract_features.py: error: argument --model: invalid choice: 'vgg16_bn' (choose from 'vgg16', 'resnet152', 'inception_v4')". If I change the model to "vgg16", it then tells me "extract_features.py: error: the following arguments are required: --mode". I have tried both "fix" and "free" for --mode, but they return:
python extract_features.py --video_path ./data/YouTubeClips --feat_path ./data/feats/msvd_vgg16_bn --model vgg16 --mode fix
Extract Features with vgg16, device: cuda
Extracting~: 0%| | 0/3 [00:00<?, ?it/s] cleanup: _frames_out/
Extracting~: 0%| | 0/3 [00:07<?, ?it/s]
Traceback (most recent call last):
  File "extract_features.py", line 175, in <module>
    mode=args['mode'],
  File "extract_features.py", line 140, in extract
    fix_frame_extract(temp_path, feats_path, frames_num, model, video.stem)
  File "extract_features.py", line 99, in fix_frame_extract
    img_list = [img_list[i] for i in samples_ix]
  File "extract_features.py", line 99, in <listcomp>
    img_list = [img_list[i] for i in samples_ix]
IndexError: list index out of range

or

python extract_features.py --video_path ./data/YouTubeClips --feat_path ./data/feats/msvd_vgg16_bn --model vgg16 --mode free
Extract Features with vgg16, device: cuda
Extracting~: 0%| | 0/3 [00:00<?, ?it/s] cleanup: _frames_out/
Extracting~: 0%| | 0/3 [00:09<?, ?it/s]
Traceback (most recent call last):
  File "extract_features.py", line 175, in <module>
    mode=args['mode'],
  File "extract_features.py", line 142, in extract
    extract_feats(temp_path, feats_path, interval, model, video.stem)
  File "extract_features.py", line 74, in extract_feats
    feats = model(imgs)
  File "/home/user/miniconda3/envs/s2vt/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/miniconda3/envs/s2vt/lib/python3.6/site-packages/pretrainedmodels/models/torchvision_models.py", line 487, in forward
    x = self.features(input)
  File "/home/user/miniconda3/envs/s2vt/lib/python3.6/site-packages/pretrainedmodels/models/torchvision_models.py", line 473, in features
    x = x.view(x.size(0), -1)
RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1] because the unspecified dimension size -1 can be any value and is ambiguous
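
Both tracebacks look consistent with an empty frame list for the first video: indexing an empty img_list raises the IndexError in "fix" mode, and a batch of zero images would explain the "tensor of 0 elements" error in "free" mode. This is a guess from the output above and has not been checked against the script. Below is a minimal sketch of a guard that could be run on the temporary frame folder before sampling. The names list_frames and sample_indices and the "*.jpg" pattern are hypothetical and do not come from the repository.

```python
from pathlib import Path
from typing import List


def list_frames(frame_dir: str, pattern: str = "*.jpg") -> List[Path]:
    """Return the sorted frame files in frame_dir (empty list if none exist).

    The "*.jpg" pattern is an assumption about how frames are saved.
    """
    return sorted(Path(frame_dir).glob(pattern))


def sample_indices(num_frames: int, frames_num: int) -> List[int]:
    """Pick frames_num evenly spaced indices in the range [0, num_frames - 1].

    Raises a clear error instead of an IndexError when no frames exist.
    """
    if num_frames <= 0:
        raise ValueError("no frames found; check the frame-extraction step")
    if frames_num <= 0:
        raise ValueError("frames_num must be positive")
    if frames_num == 1:
        return [0]
    # Spread the indices evenly; short videos simply repeat some frames.
    step = (num_frames - 1) / (frames_num - 1)
    return [round(i * step) for i in range(frames_num)]


if __name__ == "__main__":
    # Hypothetical usage against the temporary frame folder from the log.
    frames = list_frames("_frames_out")
    print(f"found {len(frames)} frame(s)")
    if frames:
        print(sample_indices(len(frames), 4))
```

If the folder turns out to be empty, the failure would lie in the frame-extraction step or the input video rather than in the model or the --mode choice.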
