KIE SER+RE prediction raises ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0 #13589

Closed
3 of 4 tasks
freezehe opened this issue Aug 4, 2024 · 5 comments
Labels
bug Something isn't working

Comments

freezehe commented Aug 4, 2024

Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.

  • I have searched the PaddleOCR Issues and found no similar bug report.

  • I have searched the PaddleOCR Discussions and found no similar bug report.

Bug

I first trained the SER model and then the RE model. Running this command:
python3 ./tools/infer_kie_token_ser_re.py -c configs/kie/vi_layoutxlm/re_vi_layoutxlm_xfund_zh.yml -o Architecture.Backbone.checkpoints=./output/re_vi_layoutxlm_xfund_zh/best_accuracy/ Global.infer_img=./train_data/XCCIC_8020/zh_val/val.json Global.infer_mode=False -c_ser configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh.yml -o_ser Architecture.Backbone.checkpoints=./output/ser_vi_layoutxlm_xfund_zh/best_accuracy/
fails with the following error:
[image: screenshot of the error]
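
Since the command passes a label file as Global.infer_img together with Global.infer_mode=False, one thing worth checking is whether every sample in that file contains at least one entity of each class that the RE stage pairs up. The sketch below is only an illustration under stated assumptions: it assumes each line of the label file is an image name, a tab, and a JSON list whose items carry a "label" field, and it uses "question" and "answer" as hypothetical class names that would need to be replaced with the ones in class_list_xfun.txt. The helper name samples_without_pairs is made up for this example.

```python
import json


def samples_without_pairs(lines, head="question", tail="answer"):
    """Return image names whose annotations lack a head or a tail entity.

    Assumed line format: <image name><TAB><JSON list of annotation dicts>.
    `head` and `tail` are hypothetical class names, not taken from the issue.
    """
    suspects = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        name, _, payload = line.partition("\t")
        # Collect the lower-cased labels present in this sample.
        labels = {str(item.get("label", "")).lower() for item in json.loads(payload)}
        if head not in labels or tail not in labels:
            suspects.append(name)
    return suspects
```

To use it, open the label file with encoding="utf-8" and pass the file object to samples_without_pairs; any image name it returns is a sample for which no head/tail pair can be formed from the ground-truth labels.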

Environment

I trained on a custom dataset with a GPU on Baidu PaddlePaddle AI Studio. The environment is as follows:

aiofiles==23.2.1
aiohttp==3.9.5
aiosignal==1.3.1
aistudio-sdk @ file:///home/aistudio/aistudio_sdk-0.2.4-py3-none-any.whl#sha256=d93411cc8764e465860cbf2f97f787dddd1548595d4776c97ddf0ea787dedd81
albucore==0.0.13
albumentations==1.4.10
altair==4.2.2
annotated-types==0.7.0
anyio==4.4.0
astor==0.8.1
asttokens==2.4.1
async-timeout==4.0.3
attrdict==2.0.1
attrdict3==2.0.2
attrs==23.2.0
Babel==2.15.0
bce-python-sdk==0.9.17
beautifulsoup4==4.12.3
blinker==1.8.2
cachetools==5.3.3
certifi==2024.7.4
charset-normalizer==3.3.2
click==8.1.7
colorama==0.4.6
coloredlogs==15.0.1
colorlog==6.8.2
comm==0.2.2
contourpy==1.2.1
cssselect==1.2.0
cssutils==2.11.1
cycler==0.12.1
Cython==3.0.10
datasets==2.20.0
debugpy==1.8.2
decorator==5.1.1
dill==0.3.4
dnspython==2.6.1
easydict==1.13
email_validator==2.2.0
entrypoints==0.4
et-xmlfile==1.1.0
exceptiongroup==1.2.1
executing==2.0.1
fastapi==0.111.0
fastapi-cli==0.0.4
ffmpy==0.3.2
filelock==3.15.4
fire==0.6.0
Flask==3.0.3
Flask-Babel==2.0.0
flatbuffers==24.3.25
fonttools==4.53.0
frozenlist==1.4.1
fsspec==2024.5.0
future==1.0.0
gitdb==4.0.11
GitPython==3.1.43
gradio==3.40.0
gradio_client==1.0.2
gunicorn==22.0.0
h11==0.14.0
httpcore==1.0.5
httptools==0.6.1
httpx==0.27.0
huggingface-hub==0.23.4
humanfriendly==10.0
idna==3.7
imageio==2.34.2
imgaug==0.4.0
importlib_metadata==8.0.0
importlib_resources==6.4.0
ipykernel==6.29.5
ipython==8.26.0
itsdangerous==2.2.0
jedi==0.19.1
jieba==0.42.1
Jinja2==3.1.4
joblib==1.4.2
jsonschema==4.22.0
jsonschema-specifications==2023.12.1
jupyter_client==8.6.2
jupyter_core==5.7.2
kiwisolver==1.4.5
lanms_neo==1.0.2
lazy_loader==0.4
linkify-it-py==2.0.3
lmdb==1.5.1
lxml==5.2.2
markdown-it-py==2.2.0
MarkupSafe==2.1.5
matplotlib==3.9.1
matplotlib-inline==0.1.7
mdit-py-plugins==0.3.3
mdurl==0.1.2
more-itertools==10.3.0
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.12.2
nest-asyncio==1.6.0
networkx==3.3
numpy==1.26.4
nvidia-cublas-cu11==11.11.3.6
nvidia-cuda-cupti-cu11==11.8.87
nvidia-cuda-nvrtc-cu11==11.8.89
nvidia-cuda-runtime-cu11==11.8.89
nvidia-cudnn-cu11==8.7.0.84
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.3.0.86
nvidia-cusolver-cu11==11.4.1.48
nvidia-cusparse-cu11==11.7.5.86
nvidia-nccl-cu11==2.19.3
nvidia-nvtx-cu11==11.8.86
onnx==1.16.1
onnxruntime==1.18.1
opencv-contrib-python==4.10.0.84
opencv-python==4.10.0.84
opencv-python-headless==4.10.0.84
openpyxl==3.1.5
opt-einsum==3.3.0
orjson==3.10.6
packaging==24.1
paddle-bfloat==0.1.7
paddle2onnx==1.2.4
paddlefsl==1.1.0
paddlehub==2.4.0
paddlenlp==2.5.2
paddleocr==2.6.1.0
paddlepaddle-gpu==2.5.1
pandas==2.2.2
parso==0.8.4
pdf2docx==0.5.8
pexpect==4.9.0
pickleshare==0.7.5
pillow==10.4.0
platformdirs==4.2.2
Polygon3==3.0.9.1
premailer==3.10.0
prettytable==3.10.0
prompt_toolkit==3.0.47
protobuf==3.20.3
psutil==6.0.0
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==16.1.0
pyarrow-hotfix==0.6
pybind11==2.13.1
pyclipper==1.3.0.post5
pycryptodome==3.20.0
pydantic==2.8.2
pydantic_core==2.20.1
pydeck==0.9.1
pydub==0.25.1
Pygments==2.18.0
Pympler==1.1
PyMuPDF==1.19.0
pypandoc==1.13
pyparsing==3.1.2
python-dateutil==2.9.0.post0
python-docx==1.1.2
python-dotenv==1.0.1
python-multipart==0.0.9
pytz==2024.1
PyYAML==6.0.1
pyzmq==26.0.3
rapidfuzz==3.9.5
rarfile==4.2
referencing==0.35.1
regex==2024.5.15
requests==2.32.3
rich==13.7.1
rpds-py==0.18.1
ruff==0.5.0
safetensors==0.4.3
scikit-image==0.24.0
scikit-learn==1.5.1
scipy==1.14.0
semantic-version==2.10.0
semver==3.0.2
sentencepiece==0.2.0
seqeval==1.2.2
shapely==2.0.5
shellingham==1.5.4
six==1.16.0
smmap==5.0.1
sniffio==1.3.1
soupsieve==2.5
stack-data==0.6.3
starlette==0.37.2
streamlit==1.13.0
streamlit-image-comparison==0.0.4
sympy==1.12.1
termcolor==2.4.0
threadpoolctl==3.5.0
tifffile==2024.7.24
toml==0.10.2
tomli==2.0.1
tomlkit==0.12.0
tool-helpers==0.1.1
toolz==0.12.1
tornado==6.4.1
tqdm==4.66.4
traitlets==5.14.3
typer==0.12.3
typing_extensions==4.12.2
tzdata==2024.1
tzlocal==5.2
uc-micro-py==1.0.3
ujson==5.10.0
urllib3==2.2.2
uvicorn==0.30.1
uvloop==0.19.0
validators==0.30.0
visualdl==2.4.2
watchdog==4.0.1
watchfiles==0.22.0
wcwidth==0.2.13
websockets==11.0.3
Werkzeug==3.0.3
xxhash==3.4.1
yacs==0.1.8
yarl==1.9.4
zipp==3.19.2

Minimal Reproducible Example

[2024-08-04 17:44:30,395] [   ERROR] check_version.py:39 - Error fetching version info
Traceback (most recent call last):
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/albumentations/check_version.py", line 29, in fetch_version_info
    with opener.open(url, timeout=2) as response:
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/urllib/request.py", line 1352, in do_open
    r = h.getresponse()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/http/client.py", line 1374, in getresponse
    response.begin()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/http/client.py", line 337, in begin
    self.headers = self.msg = parse_headers(self.fp)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/http/client.py", line 234, in parse_headers
    headers = _read_headers(fp)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/http/client.py", line 214, in _read_headers
    line = fp.readline(_MAXLINE + 1)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/socket.py", line 705, in readinto
    return self._sock.recv_into(b)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/ssl.py", line 1274, in recv_into
    return self.read(nbytes, buffer)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/ssl.py", line 1130, in read
    return self._sslobj.read(len, buffer)
TimeoutError: The read operation timed out
[2024/08/04 17:44:30] ppocr INFO: ********** re config **********
[2024/08/04 17:44:30] ppocr INFO: Architecture : 
[2024/08/04 17:44:30] ppocr INFO:     Backbone : 
[2024/08/04 17:44:30] ppocr INFO:         checkpoints : ./output/re_vi_layoutxlm_xfund_zh/best_accuracy/
[2024/08/04 17:44:30] ppocr INFO:         mode : vi
[2024/08/04 17:44:30] ppocr INFO:         name : LayoutXLMForRe
[2024/08/04 17:44:30] ppocr INFO:         pretrained : True
[2024/08/04 17:44:30] ppocr INFO:     Transform : None
[2024/08/04 17:44:30] ppocr INFO:     algorithm : LayoutXLM
[2024/08/04 17:44:30] ppocr INFO:     model_type : kie
[2024/08/04 17:44:30] ppocr INFO: Eval : 
[2024/08/04 17:44:30] ppocr INFO:     dataset : 
[2024/08/04 17:44:30] ppocr INFO:         data_dir : train_data/XCCIC_8020/zh_val/image
[2024/08/04 17:44:30] ppocr INFO:         label_file_list : ['train_data/XCCIC_8020/zh_val/val.json']
[2024/08/04 17:44:30] ppocr INFO:         name : SimpleDataSet
[2024/08/04 17:44:30] ppocr INFO:         transforms : 
[2024/08/04 17:44:30] ppocr INFO:             DecodeImage : 
[2024/08/04 17:44:30] ppocr INFO:                 channel_first : False
[2024/08/04 17:44:30] ppocr INFO:                 img_mode : RGB
[2024/08/04 17:44:30] ppocr INFO:             VQATokenLabelEncode : 
[2024/08/04 17:44:30] ppocr INFO:                 algorithm : LayoutXLM
[2024/08/04 17:44:30] ppocr INFO:                 class_path : train_data/XCCIC_8020/class_list_xfun.txt
[2024/08/04 17:44:30] ppocr INFO:                 contains_re : True
[2024/08/04 17:44:30] ppocr INFO:                 order_method : tb-yx
[2024/08/04 17:44:30] ppocr INFO:                 use_textline_bbox_info : True
[2024/08/04 17:44:30] ppocr INFO:             VQATokenPad : 
[2024/08/04 17:44:30] ppocr INFO:                 max_seq_len : 512
[2024/08/04 17:44:30] ppocr INFO:                 return_attention_mask : True
[2024/08/04 17:44:30] ppocr INFO:             VQAReTokenRelation : None
[2024/08/04 17:44:30] ppocr INFO:             VQAReTokenChunk : 
[2024/08/04 17:44:30] ppocr INFO:                 max_seq_len : 512
[2024/08/04 17:44:30] ppocr INFO:             TensorizeEntitiesRelations : None
[2024/08/04 17:44:30] ppocr INFO:             Resize : 
[2024/08/04 17:44:30] ppocr INFO:                 size : [224, 224]
[2024/08/04 17:44:30] ppocr INFO:             NormalizeImage : 
[2024/08/04 17:44:30] ppocr INFO:                 mean : [123.675, 116.28, 103.53]
[2024/08/04 17:44:30] ppocr INFO:                 order : hwc
[2024/08/04 17:44:30] ppocr INFO:                 scale : 1
[2024/08/04 17:44:30] ppocr INFO:                 std : [58.395, 57.12, 57.375]
[2024/08/04 17:44:30] ppocr INFO:             ToCHWImage : None
[2024/08/04 17:44:30] ppocr INFO:             KeepKeys : 
[2024/08/04 17:44:30] ppocr INFO:                 keep_keys : ['input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'entities', 'relations']
[2024/08/04 17:44:30] ppocr INFO:     loader : 
[2024/08/04 17:44:30] ppocr INFO:         batch_size_per_card : 8
[2024/08/04 17:44:30] ppocr INFO:         drop_last : False
[2024/08/04 17:44:30] ppocr INFO:         num_workers : 8
[2024/08/04 17:44:30] ppocr INFO:         shuffle : False
[2024/08/04 17:44:30] ppocr INFO: Global : 
[2024/08/04 17:44:30] ppocr INFO:     cal_metric_during_train : False
[2024/08/04 17:44:30] ppocr INFO:     epoch_num : 20
[2024/08/04 17:44:30] ppocr INFO:     eval_batch_step : [0, 19]
[2024/08/04 17:44:30] ppocr INFO:     infer_img : ./train_data/XCCIC_8020/zh_val/val.json
[2024/08/04 17:44:30] ppocr INFO:     infer_mode : False
[2024/08/04 17:44:30] ppocr INFO:     kie_det_model_dir : None
[2024/08/04 17:44:30] ppocr INFO:     kie_rec_model_dir : None
[2024/08/04 17:44:30] ppocr INFO:     log_smooth_window : 10
[2024/08/04 17:44:30] ppocr INFO:     print_batch_step : 10
[2024/08/04 17:44:30] ppocr INFO:     save_epoch_step : 2000
[2024/08/04 17:44:30] ppocr INFO:     save_inference_dir : None
[2024/08/04 17:44:30] ppocr INFO:     save_model_dir : ./output/re_vi_layoutxlm_xfund_zh
[2024/08/04 17:44:30] ppocr INFO:     save_res_path : ./output/ccic/re/xfund_zh/with_gt
[2024/08/04 17:44:30] ppocr INFO:     seed : 2022
[2024/08/04 17:44:30] ppocr INFO:     use_gpu : True
[2024/08/04 17:44:30] ppocr INFO:     use_visualdl : False
[2024/08/04 17:44:30] ppocr INFO: Loss : 
[2024/08/04 17:44:30] ppocr INFO:     key : loss
[2024/08/04 17:44:30] ppocr INFO:     name : LossFromOutput
[2024/08/04 17:44:30] ppocr INFO:     reduction : mean
[2024/08/04 17:44:30] ppocr INFO: Metric : 
[2024/08/04 17:44:30] ppocr INFO:     main_indicator : hmean
[2024/08/04 17:44:30] ppocr INFO:     name : VQAReTokenMetric
[2024/08/04 17:44:30] ppocr INFO: Optimizer : 
[2024/08/04 17:44:30] ppocr INFO:     beta1 : 0.9
[2024/08/04 17:44:30] ppocr INFO:     beta2 : 0.999
[2024/08/04 17:44:30] ppocr INFO:     clip_norm : 10
[2024/08/04 17:44:30] ppocr INFO:     lr : 
[2024/08/04 17:44:30] ppocr INFO:         learning_rate : 1e-05
[2024/08/04 17:44:30] ppocr INFO:         warmup_epoch : 10
[2024/08/04 17:44:30] ppocr INFO:     name : AdamW
[2024/08/04 17:44:30] ppocr INFO:     regularizer : 
[2024/08/04 17:44:30] ppocr INFO:         factor : 0.0
[2024/08/04 17:44:30] ppocr INFO:         name : L2
[2024/08/04 17:44:30] ppocr INFO: PostProcess : 
[2024/08/04 17:44:30] ppocr INFO:     name : VQAReTokenLayoutLMPostProcess
[2024/08/04 17:44:30] ppocr INFO: Train : 
[2024/08/04 17:44:30] ppocr INFO:     dataset : 
[2024/08/04 17:44:30] ppocr INFO:         data_dir : train_data/XCCIC_8020/zh_train/image
[2024/08/04 17:44:30] ppocr INFO:         label_file_list : ['train_data/XCCIC_8020/zh_train/train.json']
[2024/08/04 17:44:30] ppocr INFO:         name : SimpleDataSet
[2024/08/04 17:44:30] ppocr INFO:         ratio_list : [1.0]
[2024/08/04 17:44:30] ppocr INFO:         transforms : 
[2024/08/04 17:44:30] ppocr INFO:             DecodeImage : 
[2024/08/04 17:44:30] ppocr INFO:                 channel_first : False
[2024/08/04 17:44:30] ppocr INFO:                 img_mode : RGB
[2024/08/04 17:44:30] ppocr INFO:             VQATokenLabelEncode : 
[2024/08/04 17:44:30] ppocr INFO:                 algorithm : LayoutXLM
[2024/08/04 17:44:30] ppocr INFO:                 class_path : train_data/XCCIC_8020/class_list_xfun.txt
[2024/08/04 17:44:30] ppocr INFO:                 contains_re : True
[2024/08/04 17:44:30] ppocr INFO:                 order_method : tb-yx
[2024/08/04 17:44:30] ppocr INFO:                 use_textline_bbox_info : True
[2024/08/04 17:44:30] ppocr INFO:             VQATokenPad : 
[2024/08/04 17:44:30] ppocr INFO:                 max_seq_len : 512
[2024/08/04 17:44:30] ppocr INFO:                 return_attention_mask : True
[2024/08/04 17:44:30] ppocr INFO:             VQAReTokenRelation : None
[2024/08/04 17:44:30] ppocr INFO:             VQAReTokenChunk : 
[2024/08/04 17:44:30] ppocr INFO:                 max_seq_len : 512
[2024/08/04 17:44:30] ppocr INFO:             TensorizeEntitiesRelations : None
[2024/08/04 17:44:30] ppocr INFO:             Resize : 
[2024/08/04 17:44:30] ppocr INFO:                 size : [224, 224]
[2024/08/04 17:44:30] ppocr INFO:             NormalizeImage : 
[2024/08/04 17:44:30] ppocr INFO:                 mean : [123.675, 116.28, 103.53]
[2024/08/04 17:44:30] ppocr INFO:                 order : hwc
[2024/08/04 17:44:30] ppocr INFO:                 scale : 1
[2024/08/04 17:44:30] ppocr INFO:                 std : [58.395, 57.12, 57.375]
[2024/08/04 17:44:30] ppocr INFO:             ToCHWImage : None
[2024/08/04 17:44:30] ppocr INFO:             KeepKeys : 
[2024/08/04 17:44:30] ppocr INFO:                 keep_keys : ['input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'entities', 'relations']
[2024/08/04 17:44:30] ppocr INFO:     loader : 
[2024/08/04 17:44:30] ppocr INFO:         batch_size_per_card : 2
[2024/08/04 17:44:30] ppocr INFO:         drop_last : False
[2024/08/04 17:44:30] ppocr INFO:         num_workers : 4
[2024/08/04 17:44:30] ppocr INFO:         shuffle : True
[2024/08/04 17:44:30] ppocr INFO: 

[2024/08/04 17:44:30] ppocr INFO: ********** ser config **********
[2024/08/04 17:44:30] ppocr INFO: Architecture : 
[2024/08/04 17:44:30] ppocr INFO:     Backbone : 
[2024/08/04 17:44:30] ppocr INFO:         checkpoints : ./output/ser_vi_layoutxlm_xfund_zh/best_accuracy/
[2024/08/04 17:44:30] ppocr INFO:         mode : vi
[2024/08/04 17:44:30] ppocr INFO:         name : LayoutXLMForSer
[2024/08/04 17:44:30] ppocr INFO:         num_classes : 5
[2024/08/04 17:44:30] ppocr INFO:         pretrained : True
[2024/08/04 17:44:30] ppocr INFO:     Transform : None
[2024/08/04 17:44:30] ppocr INFO:     algorithm : LayoutXLM
[2024/08/04 17:44:30] ppocr INFO:     model_type : kie
[2024/08/04 17:44:30] ppocr INFO: Eval : 
[2024/08/04 17:44:30] ppocr INFO:     dataset : 
[2024/08/04 17:44:30] ppocr INFO:         data_dir : train_data/XCCIC_8020/zh_val/image
[2024/08/04 17:44:30] ppocr INFO:         label_file_list : ['train_data/XCCIC_8020/zh_val/val.json']
[2024/08/04 17:44:30] ppocr INFO:         name : SimpleDataSet
[2024/08/04 17:44:30] ppocr INFO:         transforms : 
[2024/08/04 17:44:30] ppocr INFO:             DecodeImage : 
[2024/08/04 17:44:30] ppocr INFO:                 channel_first : False
[2024/08/04 17:44:30] ppocr INFO:                 img_mode : RGB
[2024/08/04 17:44:30] ppocr INFO:             VQATokenLabelEncode : 
[2024/08/04 17:44:30] ppocr INFO:                 algorithm : LayoutXLM
[2024/08/04 17:44:30] ppocr INFO:                 class_path : train_data/XCCIC_8020/class_list_xfun.txt
[2024/08/04 17:44:30] ppocr INFO:                 contains_re : False
[2024/08/04 17:44:30] ppocr INFO:                 order_method : tb-yx
[2024/08/04 17:44:30] ppocr INFO:                 use_textline_bbox_info : True
[2024/08/04 17:44:30] ppocr INFO:             VQATokenPad : 
[2024/08/04 17:44:30] ppocr INFO:                 max_seq_len : 512
[2024/08/04 17:44:30] ppocr INFO:                 return_attention_mask : True
[2024/08/04 17:44:30] ppocr INFO:             VQASerTokenChunk : 
[2024/08/04 17:44:30] ppocr INFO:                 max_seq_len : 512
[2024/08/04 17:44:30] ppocr INFO:             Resize : 
[2024/08/04 17:44:30] ppocr INFO:                 size : [224, 224]
[2024/08/04 17:44:30] ppocr INFO:             NormalizeImage : 
[2024/08/04 17:44:30] ppocr INFO:                 mean : [123.675, 116.28, 103.53]
[2024/08/04 17:44:30] ppocr INFO:                 order : hwc
[2024/08/04 17:44:30] ppocr INFO:                 scale : 1
[2024/08/04 17:44:30] ppocr INFO:                 std : [58.395, 57.12, 57.375]
[2024/08/04 17:44:30] ppocr INFO:             ToCHWImage : None
[2024/08/04 17:44:30] ppocr INFO:             KeepKeys : 
[2024/08/04 17:44:30] ppocr INFO:                 keep_keys : ['input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels']
[2024/08/04 17:44:30] ppocr INFO:     loader : 
[2024/08/04 17:44:30] ppocr INFO:         batch_size_per_card : 8
[2024/08/04 17:44:30] ppocr INFO:         drop_last : False
[2024/08/04 17:44:30] ppocr INFO:         num_workers : 4
[2024/08/04 17:44:30] ppocr INFO:         shuffle : False
[2024/08/04 17:44:30] ppocr INFO: Global : 
[2024/08/04 17:44:30] ppocr INFO:     amp_custom_white_list : ['scale', 'concat', 'elementwise_add']
[2024/08/04 17:44:30] ppocr INFO:     cal_metric_during_train : False
[2024/08/04 17:44:30] ppocr INFO:     d2s_train_image_shape : [3, 224, 224]
[2024/08/04 17:44:30] ppocr INFO:     epoch_num : 50
[2024/08/04 17:44:30] ppocr INFO:     eval_batch_step : [0, 19]
[2024/08/04 17:44:30] ppocr INFO:     infer_img : train_data/XCCIC_8020/zh_val/val.json
[2024/08/04 17:44:30] ppocr INFO:     infer_mode : False
[2024/08/04 17:44:30] ppocr INFO:     kie_det_model_dir : None
[2024/08/04 17:44:30] ppocr INFO:     kie_rec_model_dir : None
[2024/08/04 17:44:30] ppocr INFO:     log_smooth_window : 10
[2024/08/04 17:44:30] ppocr INFO:     print_batch_step : 10
[2024/08/04 17:44:30] ppocr INFO:     save_epoch_step : 2000
[2024/08/04 17:44:30] ppocr INFO:     save_inference_dir : None
[2024/08/04 17:44:30] ppocr INFO:     save_model_dir : ./output/ser_vi_layoutxlm_xfund_zh
[2024/08/04 17:44:30] ppocr INFO:     save_res_path : ./output/ccic/ser/xfund_zh/res
[2024/08/04 17:44:30] ppocr INFO:     seed : 2022
[2024/08/04 17:44:30] ppocr INFO:     use_gpu : True
[2024/08/04 17:44:30] ppocr INFO:     use_visualdl : False
[2024/08/04 17:44:30] ppocr INFO: Loss : 
[2024/08/04 17:44:30] ppocr INFO:     key : backbone_out
[2024/08/04 17:44:30] ppocr INFO:     name : VQASerTokenLayoutLMLoss
[2024/08/04 17:44:30] ppocr INFO:     num_classes : 5
[2024/08/04 17:44:30] ppocr INFO: Metric : 
[2024/08/04 17:44:30] ppocr INFO:     main_indicator : hmean
[2024/08/04 17:44:30] ppocr INFO:     name : VQASerTokenMetric
[2024/08/04 17:44:30] ppocr INFO: Optimizer : 
[2024/08/04 17:44:30] ppocr INFO:     beta1 : 0.9
[2024/08/04 17:44:30] ppocr INFO:     beta2 : 0.999
[2024/08/04 17:44:30] ppocr INFO:     lr : 
[2024/08/04 17:44:30] ppocr INFO:         epochs : 50
[2024/08/04 17:44:30] ppocr INFO:         learning_rate : 1e-05
[2024/08/04 17:44:30] ppocr INFO:         name : Linear
[2024/08/04 17:44:30] ppocr INFO:         warmup_epoch : 2
[2024/08/04 17:44:30] ppocr INFO:     name : AdamW
[2024/08/04 17:44:30] ppocr INFO:     regularizer : 
[2024/08/04 17:44:30] ppocr INFO:         factor : 0.0
[2024/08/04 17:44:30] ppocr INFO:         name : L2
[2024/08/04 17:44:30] ppocr INFO: PostProcess : 
[2024/08/04 17:44:30] ppocr INFO:     class_path : train_data/XCCIC_8020/class_list_xfun.txt
[2024/08/04 17:44:30] ppocr INFO:     name : VQASerTokenLayoutLMPostProcess
[2024/08/04 17:44:30] ppocr INFO: Train : 
[2024/08/04 17:44:30] ppocr INFO:     dataset : 
[2024/08/04 17:44:30] ppocr INFO:         data_dir : train_data/XCCIC_8020/zh_train/image
[2024/08/04 17:44:30] ppocr INFO:         label_file_list : ['train_data/XCCIC_8020/zh_train/train.json']
[2024/08/04 17:44:30] ppocr INFO:         name : SimpleDataSet
[2024/08/04 17:44:30] ppocr INFO:         ratio_list : [1.0]
[2024/08/04 17:44:30] ppocr INFO:         transforms : 
[2024/08/04 17:44:30] ppocr INFO:             DecodeImage : 
[2024/08/04 17:44:30] ppocr INFO:                 channel_first : False
[2024/08/04 17:44:30] ppocr INFO:                 img_mode : RGB
[2024/08/04 17:44:30] ppocr INFO:             VQATokenLabelEncode : 
[2024/08/04 17:44:30] ppocr INFO:                 algorithm : LayoutXLM
[2024/08/04 17:44:30] ppocr INFO:                 class_path : train_data/XCCIC_8020/class_list_xfun.txt
[2024/08/04 17:44:30] ppocr INFO:                 contains_re : False
[2024/08/04 17:44:30] ppocr INFO:                 order_method : tb-yx
[2024/08/04 17:44:30] ppocr INFO:                 use_textline_bbox_info : True
[2024/08/04 17:44:30] ppocr INFO:             VQATokenPad : 
[2024/08/04 17:44:30] ppocr INFO:                 max_seq_len : 512
[2024/08/04 17:44:30] ppocr INFO:                 return_attention_mask : True
[2024/08/04 17:44:30] ppocr INFO:             VQASerTokenChunk : 
[2024/08/04 17:44:30] ppocr INFO:                 max_seq_len : 512
[2024/08/04 17:44:30] ppocr INFO:             Resize : 
[2024/08/04 17:44:30] ppocr INFO:                 size : [224, 224]
[2024/08/04 17:44:30] ppocr INFO:             NormalizeImage : 
[2024/08/04 17:44:30] ppocr INFO:                 mean : [123.675, 116.28, 103.53]
[2024/08/04 17:44:30] ppocr INFO:                 order : hwc
[2024/08/04 17:44:30] ppocr INFO:                 scale : 1
[2024/08/04 17:44:30] ppocr INFO:                 std : [58.395, 57.12, 57.375]
[2024/08/04 17:44:30] ppocr INFO:             ToCHWImage : None
[2024/08/04 17:44:30] ppocr INFO:             KeepKeys : 
[2024/08/04 17:44:30] ppocr INFO:                 keep_keys : ['input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels']
[2024/08/04 17:44:30] ppocr INFO:     loader : 
[2024/08/04 17:44:30] ppocr INFO:         batch_size_per_card : 8
[2024/08/04 17:44:30] ppocr INFO:         drop_last : False
[2024/08/04 17:44:30] ppocr INFO:         num_workers : 4
[2024/08/04 17:44:30] ppocr INFO:         shuffle : True
[2024/08/04 17:44:30] ppocr INFO: train with paddle 2.5.1 and device Place(gpu:0)
W0804 17:44:31.987720 758615 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.0, Runtime API Version: 11.8
W0804 17:44:31.989112 758615 gpu_resources.cc:149] device: 0, cuDNN Version: 8.9.
[2024/08/04 17:44:36] ppocr INFO: resume from ./output/ser_vi_layoutxlm_xfund_zh/best_accuracy/
[2024/08/04 17:44:36] ppocr WARNING: The first GPU is used for inference by default, GPU ID: 0
[2024/08/04 17:44:37] ppocr WARNING: The first GPU is used for inference by default, GPU ID: 0
[2024-08-04 17:44:38,120] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/layoutxlm-base-uncased/sentencepiece.bpe.model
[2024-08-04 17:44:38,726] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/layoutxlm-base-uncased/tokenizer_config.json
[2024-08-04 17:44:38,726] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/layoutxlm-base-uncased/special_tokens_map.json
[2024/08/04 17:44:40] ppocr INFO: resume from ./output/re_vi_layoutxlm_xfund_zh/best_accuracy/
Traceback (most recent call last):
  File "/home/aistudio/PaddleOCR/./tools/infer_kie_token_ser_re.py", line 216, in <module>
    result = ser_re_engine(data)
  File "/home/aistudio/PaddleOCR/./tools/infer_kie_token_ser_re.py", line 151, in __call__
    preds = self.model(re_input)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleOCR/ppocr/modeling/architectures/base_model.py", line 85, in forward
    x = self.backbone(x)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleOCR/ppocr/modeling/backbones/vqa_layoutlm.py", line 248, in forward
    x = self.model(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1412, in forward
    loss, pred_relations = self.extractor(sequence_output, entities, relations)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1304, in forward
    relations, entities = self.build_relation(relations, entities)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1248, in build_relation
    all_possible_relations = paddle.stack(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/tensor/manipulation.py", line 1842, in stack
    return _C_ops.stack(x, axis)
ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0
  [Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180)
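
The traceback ends in paddle.stack inside build_relation, on a variable named all_possible_relations. One plausible reading, which is an assumption and not confirmed anywhere in this thread, is that for at least one sample no candidate (head, tail) entity pair exists, so the tensors being stacked are empty. The pure-Python sketch below does not reproduce the paddlenlp code; it only illustrates how such a candidate set becomes empty. The label ids 1 and 2 and the function name candidate_pairs are hypothetical.

```python
from itertools import product

# Hypothetical label ids, chosen for illustration only.
HEAD_LABEL = 1  # e.g. a "question"-like entity
TAIL_LABEL = 2  # e.g. an "answer"-like entity


def candidate_pairs(entity_labels):
    """Return every (head_index, tail_index) pair a relation model could score."""
    heads = [i for i, lab in enumerate(entity_labels) if lab == HEAD_LABEL]
    tails = [i for i, lab in enumerate(entity_labels) if lab == TAIL_LABEL]
    # The Cartesian product is empty as soon as either side is empty.
    return list(product(heads, tails))
```

For example, candidate_pairs([1, 2, 2, 0]) yields two pairs, while candidate_pairs([0, 0, 2]) yields an empty list because there is no head entity. Under the assumption above, a sample of the second kind is the sort of input that would leave nothing to stack.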

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!

freezehe added the bug (Something isn't working) label Aug 4, 2024

SWHL (Collaborator) commented Aug 5, 2024

Could you share the model? It is hard to tell what is wrong just from this.

freezehe (Author) commented Aug 5, 2024

I am running this on your platform. If I give you my project ID, can you view my project from the backend? It is on Baidu AI Studio: https://aistudio.baidu.com/projectdetail/8221656

SWHL (Collaborator) commented Aug 5, 2024

PaddleOCR is now maintained by community members rather than officially by Baidu. Most of our project maintainers are not Baidu employees, so we cannot see your project.

freezehe (Author) commented Aug 5, 2024

Then how should I send you the model? It is large, several GB.

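freezehe (Author) commented Aug 5, 2024

SWHL wrote: "PaddleOCR is now maintained by community members rather than officially by Baidu. Most of our project maintainers are not Baidu employees, so we cannot see your project."

I searched the issues, and quite a few people have run into this problem. Has issue #11261 been resolved?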

GreatV closed this as completed Sep 12, 2024
github-actions bot locked as resolved and limited conversation to collaborators Nov 11, 2024
3 participants