Unable to extract features #2

Open
CalvinLeow opened this issue Jun 21, 2024 · 0 comments

Hello, I am a beginner trying to use your repository to generate caption descriptions for my own input videos.

I believe that to do this I have to:

  1. Download the MSVD dataset, extract features, and then run prepare_captions.py and train.py to get the model checkpoint.
  2. Then, to use my own videos as input, replace the MSVD dataset with my own folder of videos, extract features, and run eval.py to print the captions. At that point I no longer need to run prepare_captions.py or train.py, as they are optional.

Is this correct?

The problem I am facing now is that I am unable to run extract_features.py. Running the command from your instructions,

python extract_features.py --video_path ./data/YouTubeClips --feat_path ./data/feats/msvd_vgg16_bn --model vgg16_bn

returns the error:

extract_features.py: error: argument --model: invalid choice: 'vgg16_bn' (choose from 'vgg16', 'resnet152', 'inception_v4')

If I change the model to "vgg16", it then tells me:

extract_features.py: error: the following arguments are required: --mode

I have tried both "fix" and "free" for --mode, but the script then returns:

python extract_features.py --video_path ./data/YouTubeClips --feat_path ./data/feats/msvd_vgg16_bn --model vgg16 --mode fix
Extract Features with vgg16, device: cuda
Extracting~: 0%| | 0/3 [00:00<?, ?it/s] cleanup: _frames_out/
Extracting~: 0%| | 0/3 [00:07<?, ?it/s]
Traceback (most recent call last):
  File "extract_features.py", line 175, in <module>
    mode=args['mode'],
  File "extract_features.py", line 140, in extract
    fix_frame_extract(temp_path, feats_path, frames_num, model, video.stem)
  File "extract_features.py", line 99, in fix_frame_extract
    img_list = [img_list[i] for i in samples_ix]
  File "extract_features.py", line 99, in <listcomp>
    img_list = [img_list[i] for i in samples_ix]
IndexError: list index out of range

or

python extract_features.py --video_path ./data/YouTubeClips --feat_path ./data/feats/msvd_vgg16_bn --model vgg16 --mode free
Extract Features with vgg16, device: cuda
Extracting~: 0%| | 0/3 [00:00<?, ?it/s] cleanup: _frames_out/
Extracting~: 0%| | 0/3 [00:09<?, ?it/s]
Traceback (most recent call last):
  File "extract_features.py", line 175, in <module>
    mode=args['mode'],
  File "extract_features.py", line 142, in extract
    extract_feats(temp_path, feats_path, interval, model, video.stem)
  File "extract_features.py", line 74, in extract_feats
    feats = model(imgs)
  File "/home/user/miniconda3/envs/s2vt/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/miniconda3/envs/s2vt/lib/python3.6/site-packages/pretrainedmodels/models/torchvision_models.py", line 487, in forward
    x = self.features(input)
  File "/home/user/miniconda3/envs/s2vt/lib/python3.6/site-packages/pretrainedmodels/models/torchvision_models.py", line 473, in features
    x = x.view(x.size(0), -1)
RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1] because the unspecified dimension size -1 can be any value and is ambiguous
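
Both tracebacks are consistent with the frame list being empty by the time it reaches the model: indexing an empty `img_list` raises `IndexError`, and a batch of zero images produces a tensor of 0 elements. That is only one possible reading, since the frame-extraction code itself is not shown here. The sketch below is not the repository's code; it only illustrates how a sampling step could count the extracted frames and fail early with a clearer message. The helper names `list_frames`, `sample_indices`, and `select_frames`, and the `*.jpg` pattern, are assumptions made for illustration; `_frames_out` is the directory named in the log above.

```python
from pathlib import Path
from typing import List


def list_frames(frame_dir: str, pattern: str = "*.jpg") -> List[Path]:
    """Return extracted frame files in sorted order.

    The "*.jpg" pattern is an assumption; adjust it to whatever the
    extraction step actually writes.
    """
    return sorted(Path(frame_dir).glob(pattern))


def sample_indices(num_available: int, frames_num: int) -> List[int]:
    """Pick frames_num evenly spaced indices from range(num_available)."""
    if num_available <= 0:
        # An empty frame list is what would otherwise surface later as
        # "list index out of range" or a zero-element tensor.
        raise ValueError("no frames were extracted for this video")
    if frames_num <= 0:
        raise ValueError("frames_num must be positive")
    # Integer floor keeps every index strictly below num_available;
    # indices repeat when fewer frames exist than were requested.
    return [(i * num_available) // frames_num for i in range(frames_num)]


def select_frames(frame_dir: str, frames_num: int) -> List[Path]:
    """Sample a fixed number of frames, failing early if none exist."""
    frames = list_frames(frame_dir)
    indices = sample_indices(len(frames), frames_num)
    return [frames[i] for i in indices]
```

With a guard like this, calling `select_frames("_frames_out", n)` on an empty directory raises a `ValueError` that points at the frame-decoding step, rather than failing later inside the list comprehension or the model's forward pass.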
