I first trained the SER model and then the RE model. When I run the command below, it fails with an error (the full log follows the environment listing):
! python3 ./tools/infer_kie_token_ser_re.py -c configs/kie/vi_layoutxlm/re_vi_layoutxlm_xfund_zh.yml -o Architecture.Backbone.checkpoints=./output/re_vi_layoutxlm_xfund_zh/best_accuracy/ Global.infer_img=./train_data/XCCIC_8020/zh_val/val.json Global.infer_mode=False -c_ser configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh.yml -o_ser Architecture.Backbone.checkpoints=./output/ser_vi_layoutxlm_xfund_zh/best_accuracy/
I am training on a custom dataset with a GPU on Baidu PaddlePaddle AI Studio. The environment is as follows:
aiofiles==23.2.1 aiohttp==3.9.5 aiosignal==1.3.1 aistudio-sdk @ file:///home/aistudio/aistudio_sdk-0.2.4-py3-none-any.whl#sha256=d93411cc8764e465860cbf2f97f787dddd1548595d4776c97ddf0ea787dedd81 albucore==0.0.13 albumentations==1.4.10 altair==4.2.2 annotated-types==0.7.0 anyio==4.4.0 astor==0.8.1 asttokens==2.4.1 async-timeout==4.0.3 attrdict==2.0.1 attrdict3==2.0.2 attrs==23.2.0 Babel==2.15.0 bce-python-sdk==0.9.17 beautifulsoup4==4.12.3 blinker==1.8.2 cachetools==5.3.3 certifi==2024.7.4 charset-normalizer==3.3.2 click==8.1.7 colorama==0.4.6 coloredlogs==15.0.1 colorlog==6.8.2 comm==0.2.2 contourpy==1.2.1 cssselect==1.2.0 cssutils==2.11.1 cycler==0.12.1 Cython==3.0.10 datasets==2.20.0 debugpy==1.8.2 decorator==5.1.1 dill==0.3.4 dnspython==2.6.1 easydict==1.13 email_validator==2.2.0 entrypoints==0.4 et-xmlfile==1.1.0 exceptiongroup==1.2.1 executing==2.0.1 fastapi==0.111.0 fastapi-cli==0.0.4 ffmpy==0.3.2 filelock==3.15.4 fire==0.6.0 Flask==3.0.3 Flask-Babel==2.0.0 flatbuffers==24.3.25 fonttools==4.53.0 frozenlist==1.4.1 fsspec==2024.5.0 future==1.0.0 gitdb==4.0.11 GitPython==3.1.43 gradio==3.40.0 gradio_client==1.0.2 gunicorn==22.0.0 h11==0.14.0 httpcore==1.0.5 httptools==0.6.1 httpx==0.27.0 huggingface-hub==0.23.4 humanfriendly==10.0 idna==3.7 imageio==2.34.2 imgaug==0.4.0 importlib_metadata==8.0.0 importlib_resources==6.4.0 ipykernel==6.29.5 ipython==8.26.0 itsdangerous==2.2.0 jedi==0.19.1 jieba==0.42.1 Jinja2==3.1.4 joblib==1.4.2 jsonschema==4.22.0 jsonschema-specifications==2023.12.1 jupyter_client==8.6.2 jupyter_core==5.7.2 kiwisolver==1.4.5 lanms_neo==1.0.2 lazy_loader==0.4 linkify-it-py==2.0.3 lmdb==1.5.1 lxml==5.2.2 markdown-it-py==2.2.0 MarkupSafe==2.1.5 matplotlib==3.9.1 matplotlib-inline==0.1.7 mdit-py-plugins==0.3.3 mdurl==0.1.2 more-itertools==10.3.0 mpmath==1.3.0 multidict==6.0.5 multiprocess== nest-asyncio==1.6.0 networkx==3.3 numpy==1.26.4 nvidia-cublas-cu11== nvidia-cuda-cupti-cu11==11.8.87 nvidia-cuda-nvrtc-cu11==11.8.89 
nvidia-cuda-runtime-cu11==11.8.89 nvidia-cudnn-cu11== nvidia-cufft-cu11== nvidia-curand-cu11== nvidia-cusolver-cu11== nvidia-cusparse-cu11== nvidia-nccl-cu11==2.19.3 nvidia-nvtx-cu11==11.8.86 onnx==1.16.1 onnxruntime==1.18.1 opencv-contrib-python== opencv-python== opencv-python-headless== openpyxl==3.1.5 opt-einsum==3.3.0 orjson==3.10.6 packaging==24.1 paddle-bfloat==0.1.7 paddle2onnx==1.2.4 paddlefsl==1.1.0 paddlehub==2.4.0 paddlenlp==2.5.2 paddleocr== paddlepaddle-gpu==2.5.1 pandas==2.2.2 parso==0.8.4 pdf2docx==0.5.8 pexpect==4.9.0 pickleshare==0.7.5 pillow==10.4.0 platformdirs==4.2.2 Polygon3== premailer==3.10.0 prettytable==3.10.0 prompt_toolkit==3.0.47 protobuf==3.20.3 psutil==6.0.0 ptyprocess==0.7.0 pure-eval==0.2.2 pyarrow==16.1.0 pyarrow-hotfix==0.6 pybind11==2.13.1 pyclipper==1.3.0.post5 pycryptodome==3.20.0 pydantic==2.8.2 pydantic_core==2.20.1 pydeck==0.9.1 pydub==0.25.1 Pygments==2.18.0 Pympler==1.1 PyMuPDF==1.19.0 pypandoc==1.13 pyparsing==3.1.2 python-dateutil==2.9.0.post0 python-docx==1.1.2 python-dotenv==1.0.1 python-multipart==0.0.9 pytz==2024.1 PyYAML==6.0.1 pyzmq==26.0.3 rapidfuzz==3.9.5 rarfile==4.2 referencing==0.35.1 regex==2024.5.15 requests==2.32.3 rich==13.7.1 rpds-py==0.18.1 ruff==0.5.0 safetensors==0.4.3 scikit-image==0.24.0 scikit-learn==1.5.1 scipy==1.14.0 semantic-version==2.10.0 semver==3.0.2 sentencepiece==0.2.0 seqeval==1.2.2 shapely==2.0.5 shellingham==1.5.4 six==1.16.0 smmap==5.0.1 sniffio==1.3.1 soupsieve==2.5 stack-data==0.6.3 starlette==0.37.2 streamlit==1.13.0 streamlit-image-comparison==0.0.4 sympy==1.12.1 termcolor==2.4.0 threadpoolctl==3.5.0 tifffile==2024.7.24 toml==0.10.2 tomli==2.0.1 tomlkit==0.12.0 tool-helpers==0.1.1 toolz==0.12.1 tornado==6.4.1 tqdm==4.66.4 traitlets==5.14.3 typer==0.12.3 typing_extensions==4.12.2 tzdata==2024.1 tzlocal==5.2 uc-micro-py==1.0.3 ujson==5.10.0 urllib3==2.2.2 uvicorn==0.30.1 uvloop==0.19.0 validators==0.30.0 visualdl==2.4.2 watchdog==4.0.1 watchfiles==0.22.0 wcwidth==0.2.13 
websockets==11.0.3 Werkzeug==3.0.3 xxhash==3.4.1 yacs==0.1.8 yarl==1.9.4 zipp==3.19.2
[2024-08-04 17:44:30,395] [ ERROR] check_version.py:39 - Error fetching version info Traceback (most recent call last): File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/albumentations/check_version.py", line 29, in fetch_version_info with opener.open(url, timeout=2) as response: File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/urllib/request.py", line 519, in open response = self._open(req, data) File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/urllib/request.py", line 536, in _open result = self._call_chain(self.handle_open, protocol, protocol + File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/urllib/request.py", line 496, in _call_chain result = func(*args) File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/urllib/request.py", line 1391, in https_open return self.do_open(http.client.HTTPSConnection, req, File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/urllib/request.py", line 1352, in do_open r = h.getresponse() File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/http/client.py", line 1374, in getresponse response.begin() File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/http/client.py", line 337, in begin self.headers = self.msg = parse_headers(self.fp) File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/http/client.py", line 234, in parse_headers headers = _read_headers(fp) File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/http/client.py", line 214, in _read_headers line = fp.readline(_MAXLINE + 1) File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/socket.py", line 705, in readinto return self._sock.recv_into(b) File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/ssl.py", line 1274, in recv_into return self.read(nbytes, buffer) File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/ssl.py", line 1130, in read return self._sslobj.read(len, buffer) TimeoutError: The read operation timed out [2024/08/04 17:44:30] ppocr INFO: 
********** re config ********** [2024/08/04 17:44:30] ppocr INFO: Architecture : [2024/08/04 17:44:30] ppocr INFO: Backbone : [2024/08/04 17:44:30] ppocr INFO: checkpoints : ./output/re_vi_layoutxlm_xfund_zh/best_accuracy/ [2024/08/04 17:44:30] ppocr INFO: mode : vi [2024/08/04 17:44:30] ppocr INFO: name : LayoutXLMForRe [2024/08/04 17:44:30] ppocr INFO: pretrained : True [2024/08/04 17:44:30] ppocr INFO: Transform : None [2024/08/04 17:44:30] ppocr INFO: algorithm : LayoutXLM [2024/08/04 17:44:30] ppocr INFO: model_type : kie [2024/08/04 17:44:30] ppocr INFO: Eval : [2024/08/04 17:44:30] ppocr INFO: dataset : [2024/08/04 17:44:30] ppocr INFO: data_dir : train_data/XCCIC_8020/zh_val/image [2024/08/04 17:44:30] ppocr INFO: label_file_list : ['train_data/XCCIC_8020/zh_val/val.json'] [2024/08/04 17:44:30] ppocr INFO: name : SimpleDataSet [2024/08/04 17:44:30] ppocr INFO: transforms : [2024/08/04 17:44:30] ppocr INFO: DecodeImage : [2024/08/04 17:44:30] ppocr INFO: channel_first : False [2024/08/04 17:44:30] ppocr INFO: img_mode : RGB [2024/08/04 17:44:30] ppocr INFO: VQATokenLabelEncode : [2024/08/04 17:44:30] ppocr INFO: algorithm : LayoutXLM [2024/08/04 17:44:30] ppocr INFO: class_path : train_data/XCCIC_8020/class_list_xfun.txt [2024/08/04 17:44:30] ppocr INFO: contains_re : True [2024/08/04 17:44:30] ppocr INFO: order_method : tb-yx [2024/08/04 17:44:30] ppocr INFO: use_textline_bbox_info : True [2024/08/04 17:44:30] ppocr INFO: VQATokenPad : [2024/08/04 17:44:30] ppocr INFO: max_seq_len : 512 [2024/08/04 17:44:30] ppocr INFO: return_attention_mask : True [2024/08/04 17:44:30] ppocr INFO: VQAReTokenRelation : None [2024/08/04 17:44:30] ppocr INFO: VQAReTokenChunk : [2024/08/04 17:44:30] ppocr INFO: max_seq_len : 512 [2024/08/04 17:44:30] ppocr INFO: TensorizeEntitiesRelations : None [2024/08/04 17:44:30] ppocr INFO: Resize : [2024/08/04 17:44:30] ppocr INFO: size : [224, 224] [2024/08/04 17:44:30] ppocr INFO: NormalizeImage : [2024/08/04 17:44:30] ppocr INFO: mean 
: [123.675, 116.28, 103.53] [2024/08/04 17:44:30] ppocr INFO: order : hwc [2024/08/04 17:44:30] ppocr INFO: scale : 1 [2024/08/04 17:44:30] ppocr INFO: std : [58.395, 57.12, 57.375] [2024/08/04 17:44:30] ppocr INFO: ToCHWImage : None [2024/08/04 17:44:30] ppocr INFO: KeepKeys : [2024/08/04 17:44:30] ppocr INFO: keep_keys : ['input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'entities', 'relations'] [2024/08/04 17:44:30] ppocr INFO: loader : [2024/08/04 17:44:30] ppocr INFO: batch_size_per_card : 8 [2024/08/04 17:44:30] ppocr INFO: drop_last : False [2024/08/04 17:44:30] ppocr INFO: num_workers : 8 [2024/08/04 17:44:30] ppocr INFO: shuffle : False [2024/08/04 17:44:30] ppocr INFO: Global : [2024/08/04 17:44:30] ppocr INFO: cal_metric_during_train : False [2024/08/04 17:44:30] ppocr INFO: epoch_num : 20 [2024/08/04 17:44:30] ppocr INFO: eval_batch_step : [0, 19] [2024/08/04 17:44:30] ppocr INFO: infer_img : ./train_data/XCCIC_8020/zh_val/val.json [2024/08/04 17:44:30] ppocr INFO: infer_mode : False [2024/08/04 17:44:30] ppocr INFO: kie_det_model_dir : None [2024/08/04 17:44:30] ppocr INFO: kie_rec_model_dir : None [2024/08/04 17:44:30] ppocr INFO: log_smooth_window : 10 [2024/08/04 17:44:30] ppocr INFO: print_batch_step : 10 [2024/08/04 17:44:30] ppocr INFO: save_epoch_step : 2000 [2024/08/04 17:44:30] ppocr INFO: save_inference_dir : None [2024/08/04 17:44:30] ppocr INFO: save_model_dir : ./output/re_vi_layoutxlm_xfund_zh [2024/08/04 17:44:30] ppocr INFO: save_res_path : ./output/ccic/re/xfund_zh/with_gt [2024/08/04 17:44:30] ppocr INFO: seed : 2022 [2024/08/04 17:44:30] ppocr INFO: use_gpu : True [2024/08/04 17:44:30] ppocr INFO: use_visualdl : False [2024/08/04 17:44:30] ppocr INFO: Loss : [2024/08/04 17:44:30] ppocr INFO: key : loss [2024/08/04 17:44:30] ppocr INFO: name : LossFromOutput [2024/08/04 17:44:30] ppocr INFO: reduction : mean [2024/08/04 17:44:30] ppocr INFO: Metric : [2024/08/04 17:44:30] ppocr INFO: main_indicator : hmean [2024/08/04 17:44:30] 
ppocr INFO: name : VQAReTokenMetric [2024/08/04 17:44:30] ppocr INFO: Optimizer : [2024/08/04 17:44:30] ppocr INFO: beta1 : 0.9 [2024/08/04 17:44:30] ppocr INFO: beta2 : 0.999 [2024/08/04 17:44:30] ppocr INFO: clip_norm : 10 [2024/08/04 17:44:30] ppocr INFO: lr : [2024/08/04 17:44:30] ppocr INFO: learning_rate : 1e-05 [2024/08/04 17:44:30] ppocr INFO: warmup_epoch : 10 [2024/08/04 17:44:30] ppocr INFO: name : AdamW [2024/08/04 17:44:30] ppocr INFO: regularizer : [2024/08/04 17:44:30] ppocr INFO: factor : 0.0 [2024/08/04 17:44:30] ppocr INFO: name : L2 [2024/08/04 17:44:30] ppocr INFO: PostProcess : [2024/08/04 17:44:30] ppocr INFO: name : VQAReTokenLayoutLMPostProcess [2024/08/04 17:44:30] ppocr INFO: Train : [2024/08/04 17:44:30] ppocr INFO: dataset : [2024/08/04 17:44:30] ppocr INFO: data_dir : train_data/XCCIC_8020/zh_train/image [2024/08/04 17:44:30] ppocr INFO: label_file_list : ['train_data/XCCIC_8020/zh_train/train.json'] [2024/08/04 17:44:30] ppocr INFO: name : SimpleDataSet [2024/08/04 17:44:30] ppocr INFO: ratio_list : [1.0] [2024/08/04 17:44:30] ppocr INFO: transforms : [2024/08/04 17:44:30] ppocr INFO: DecodeImage : [2024/08/04 17:44:30] ppocr INFO: channel_first : False [2024/08/04 17:44:30] ppocr INFO: img_mode : RGB [2024/08/04 17:44:30] ppocr INFO: VQATokenLabelEncode : [2024/08/04 17:44:30] ppocr INFO: algorithm : LayoutXLM [2024/08/04 17:44:30] ppocr INFO: class_path : train_data/XCCIC_8020/class_list_xfun.txt [2024/08/04 17:44:30] ppocr INFO: contains_re : True [2024/08/04 17:44:30] ppocr INFO: order_method : tb-yx [2024/08/04 17:44:30] ppocr INFO: use_textline_bbox_info : True [2024/08/04 17:44:30] ppocr INFO: VQATokenPad : [2024/08/04 17:44:30] ppocr INFO: max_seq_len : 512 [2024/08/04 17:44:30] ppocr INFO: return_attention_mask : True [2024/08/04 17:44:30] ppocr INFO: VQAReTokenRelation : None [2024/08/04 17:44:30] ppocr INFO: VQAReTokenChunk : [2024/08/04 17:44:30] ppocr INFO: max_seq_len : 512 [2024/08/04 17:44:30] ppocr INFO: 
TensorizeEntitiesRelations : None [2024/08/04 17:44:30] ppocr INFO: Resize : [2024/08/04 17:44:30] ppocr INFO: size : [224, 224] [2024/08/04 17:44:30] ppocr INFO: NormalizeImage : [2024/08/04 17:44:30] ppocr INFO: mean : [123.675, 116.28, 103.53] [2024/08/04 17:44:30] ppocr INFO: order : hwc [2024/08/04 17:44:30] ppocr INFO: scale : 1 [2024/08/04 17:44:30] ppocr INFO: std : [58.395, 57.12, 57.375] [2024/08/04 17:44:30] ppocr INFO: ToCHWImage : None [2024/08/04 17:44:30] ppocr INFO: KeepKeys : [2024/08/04 17:44:30] ppocr INFO: keep_keys : ['input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'entities', 'relations'] [2024/08/04 17:44:30] ppocr INFO: loader : [2024/08/04 17:44:30] ppocr INFO: batch_size_per_card : 2 [2024/08/04 17:44:30] ppocr INFO: drop_last : False [2024/08/04 17:44:30] ppocr INFO: num_workers : 4 [2024/08/04 17:44:30] ppocr INFO: shuffle : True [2024/08/04 17:44:30] ppocr INFO: [2024/08/04 17:44:30] ppocr INFO: ********** ser config ********** [2024/08/04 17:44:30] ppocr INFO: Architecture : [2024/08/04 17:44:30] ppocr INFO: Backbone : [2024/08/04 17:44:30] ppocr INFO: checkpoints : ./output/ser_vi_layoutxlm_xfund_zh/best_accuracy/ [2024/08/04 17:44:30] ppocr INFO: mode : vi [2024/08/04 17:44:30] ppocr INFO: name : LayoutXLMForSer [2024/08/04 17:44:30] ppocr INFO: num_classes : 5 [2024/08/04 17:44:30] ppocr INFO: pretrained : True [2024/08/04 17:44:30] ppocr INFO: Transform : None [2024/08/04 17:44:30] ppocr INFO: algorithm : LayoutXLM [2024/08/04 17:44:30] ppocr INFO: model_type : kie [2024/08/04 17:44:30] ppocr INFO: Eval : [2024/08/04 17:44:30] ppocr INFO: dataset : [2024/08/04 17:44:30] ppocr INFO: data_dir : train_data/XCCIC_8020/zh_val/image [2024/08/04 17:44:30] ppocr INFO: label_file_list : ['train_data/XCCIC_8020/zh_val/val.json'] [2024/08/04 17:44:30] ppocr INFO: name : SimpleDataSet [2024/08/04 17:44:30] ppocr INFO: transforms : [2024/08/04 17:44:30] ppocr INFO: DecodeImage : [2024/08/04 17:44:30] ppocr INFO: channel_first : False 
[2024/08/04 17:44:30] ppocr INFO: img_mode : RGB [2024/08/04 17:44:30] ppocr INFO: VQATokenLabelEncode : [2024/08/04 17:44:30] ppocr INFO: algorithm : LayoutXLM [2024/08/04 17:44:30] ppocr INFO: class_path : train_data/XCCIC_8020/class_list_xfun.txt [2024/08/04 17:44:30] ppocr INFO: contains_re : False [2024/08/04 17:44:30] ppocr INFO: order_method : tb-yx [2024/08/04 17:44:30] ppocr INFO: use_textline_bbox_info : True [2024/08/04 17:44:30] ppocr INFO: VQATokenPad : [2024/08/04 17:44:30] ppocr INFO: max_seq_len : 512 [2024/08/04 17:44:30] ppocr INFO: return_attention_mask : True [2024/08/04 17:44:30] ppocr INFO: VQASerTokenChunk : [2024/08/04 17:44:30] ppocr INFO: max_seq_len : 512 [2024/08/04 17:44:30] ppocr INFO: Resize : [2024/08/04 17:44:30] ppocr INFO: size : [224, 224] [2024/08/04 17:44:30] ppocr INFO: NormalizeImage : [2024/08/04 17:44:30] ppocr INFO: mean : [123.675, 116.28, 103.53] [2024/08/04 17:44:30] ppocr INFO: order : hwc [2024/08/04 17:44:30] ppocr INFO: scale : 1 [2024/08/04 17:44:30] ppocr INFO: std : [58.395, 57.12, 57.375] [2024/08/04 17:44:30] ppocr INFO: ToCHWImage : None [2024/08/04 17:44:30] ppocr INFO: KeepKeys : [2024/08/04 17:44:30] ppocr INFO: keep_keys : ['input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels'] [2024/08/04 17:44:30] ppocr INFO: loader : [2024/08/04 17:44:30] ppocr INFO: batch_size_per_card : 8 [2024/08/04 17:44:30] ppocr INFO: drop_last : False [2024/08/04 17:44:30] ppocr INFO: num_workers : 4 [2024/08/04 17:44:30] ppocr INFO: shuffle : False [2024/08/04 17:44:30] ppocr INFO: Global : [2024/08/04 17:44:30] ppocr INFO: amp_custom_white_list : ['scale', 'concat', 'elementwise_add'] [2024/08/04 17:44:30] ppocr INFO: cal_metric_during_train : False [2024/08/04 17:44:30] ppocr INFO: d2s_train_image_shape : [3, 224, 224] [2024/08/04 17:44:30] ppocr INFO: epoch_num : 50 [2024/08/04 17:44:30] ppocr INFO: eval_batch_step : [0, 19] [2024/08/04 17:44:30] ppocr INFO: infer_img : 
train_data/XCCIC_8020/zh_val/val.json [2024/08/04 17:44:30] ppocr INFO: infer_mode : False [2024/08/04 17:44:30] ppocr INFO: kie_det_model_dir : None [2024/08/04 17:44:30] ppocr INFO: kie_rec_model_dir : None [2024/08/04 17:44:30] ppocr INFO: log_smooth_window : 10 [2024/08/04 17:44:30] ppocr INFO: print_batch_step : 10 [2024/08/04 17:44:30] ppocr INFO: save_epoch_step : 2000 [2024/08/04 17:44:30] ppocr INFO: save_inference_dir : None [2024/08/04 17:44:30] ppocr INFO: save_model_dir : ./output/ser_vi_layoutxlm_xfund_zh [2024/08/04 17:44:30] ppocr INFO: save_res_path : ./output/ccic/ser/xfund_zh/res [2024/08/04 17:44:30] ppocr INFO: seed : 2022 [2024/08/04 17:44:30] ppocr INFO: use_gpu : True [2024/08/04 17:44:30] ppocr INFO: use_visualdl : False [2024/08/04 17:44:30] ppocr INFO: Loss : [2024/08/04 17:44:30] ppocr INFO: key : backbone_out [2024/08/04 17:44:30] ppocr INFO: name : VQASerTokenLayoutLMLoss [2024/08/04 17:44:30] ppocr INFO: num_classes : 5 [2024/08/04 17:44:30] ppocr INFO: Metric : [2024/08/04 17:44:30] ppocr INFO: main_indicator : hmean [2024/08/04 17:44:30] ppocr INFO: name : VQASerTokenMetric [2024/08/04 17:44:30] ppocr INFO: Optimizer : [2024/08/04 17:44:30] ppocr INFO: beta1 : 0.9 [2024/08/04 17:44:30] ppocr INFO: beta2 : 0.999 [2024/08/04 17:44:30] ppocr INFO: lr : [2024/08/04 17:44:30] ppocr INFO: epochs : 50 [2024/08/04 17:44:30] ppocr INFO: learning_rate : 1e-05 [2024/08/04 17:44:30] ppocr INFO: name : Linear [2024/08/04 17:44:30] ppocr INFO: warmup_epoch : 2 [2024/08/04 17:44:30] ppocr INFO: name : AdamW [2024/08/04 17:44:30] ppocr INFO: regularizer : [2024/08/04 17:44:30] ppocr INFO: factor : 0.0 [2024/08/04 17:44:30] ppocr INFO: name : L2 [2024/08/04 17:44:30] ppocr INFO: PostProcess : [2024/08/04 17:44:30] ppocr INFO: class_path : train_data/XCCIC_8020/class_list_xfun.txt [2024/08/04 17:44:30] ppocr INFO: name : VQASerTokenLayoutLMPostProcess [2024/08/04 17:44:30] ppocr INFO: Train : [2024/08/04 17:44:30] ppocr INFO: dataset : [2024/08/04 
17:44:30] ppocr INFO: data_dir : train_data/XCCIC_8020/zh_train/image [2024/08/04 17:44:30] ppocr INFO: label_file_list : ['train_data/XCCIC_8020/zh_train/train.json'] [2024/08/04 17:44:30] ppocr INFO: name : SimpleDataSet [2024/08/04 17:44:30] ppocr INFO: ratio_list : [1.0] [2024/08/04 17:44:30] ppocr INFO: transforms : [2024/08/04 17:44:30] ppocr INFO: DecodeImage : [2024/08/04 17:44:30] ppocr INFO: channel_first : False [2024/08/04 17:44:30] ppocr INFO: img_mode : RGB [2024/08/04 17:44:30] ppocr INFO: VQATokenLabelEncode : [2024/08/04 17:44:30] ppocr INFO: algorithm : LayoutXLM [2024/08/04 17:44:30] ppocr INFO: class_path : train_data/XCCIC_8020/class_list_xfun.txt [2024/08/04 17:44:30] ppocr INFO: contains_re : False [2024/08/04 17:44:30] ppocr INFO: order_method : tb-yx [2024/08/04 17:44:30] ppocr INFO: use_textline_bbox_info : True [2024/08/04 17:44:30] ppocr INFO: VQATokenPad : [2024/08/04 17:44:30] ppocr INFO: max_seq_len : 512 [2024/08/04 17:44:30] ppocr INFO: return_attention_mask : True [2024/08/04 17:44:30] ppocr INFO: VQASerTokenChunk : [2024/08/04 17:44:30] ppocr INFO: max_seq_len : 512 [2024/08/04 17:44:30] ppocr INFO: Resize : [2024/08/04 17:44:30] ppocr INFO: size : [224, 224] [2024/08/04 17:44:30] ppocr INFO: NormalizeImage : [2024/08/04 17:44:30] ppocr INFO: mean : [123.675, 116.28, 103.53] [2024/08/04 17:44:30] ppocr INFO: order : hwc [2024/08/04 17:44:30] ppocr INFO: scale : 1 [2024/08/04 17:44:30] ppocr INFO: std : [58.395, 57.12, 57.375] [2024/08/04 17:44:30] ppocr INFO: ToCHWImage : None [2024/08/04 17:44:30] ppocr INFO: KeepKeys : [2024/08/04 17:44:30] ppocr INFO: keep_keys : ['input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels'] [2024/08/04 17:44:30] ppocr INFO: loader : [2024/08/04 17:44:30] ppocr INFO: batch_size_per_card : 8 [2024/08/04 17:44:30] ppocr INFO: drop_last : False [2024/08/04 17:44:30] ppocr INFO: num_workers : 4 [2024/08/04 17:44:30] ppocr INFO: shuffle : True [2024/08/04 17:44:30] ppocr INFO: train 
with paddle 2.5.1 and device Place(gpu:0)
W0804 17:44:31.987720 758615 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.0, Runtime API Version: 11.8
W0804 17:44:31.989112 758615 gpu_resources.cc:149] device: 0, cuDNN Version: 8.9.
[2024/08/04 17:44:36] ppocr INFO: resume from ./output/ser_vi_layoutxlm_xfund_zh/best_accuracy/
[2024/08/04 17:44:36] ppocr WARNING: The first GPU is used for inference by default, GPU ID: 0
[2024/08/04 17:44:37] ppocr WARNING: The first GPU is used for inference by default, GPU ID: 0
[2024-08-04 17:44:38,120] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/layoutxlm-base-uncased/sentencepiece.bpe.model
[2024-08-04 17:44:38,726] [ INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/layoutxlm-base-uncased/tokenizer_config.json
[2024-08-04 17:44:38,726] [ INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/layoutxlm-base-uncased/special_tokens_map.json
[2024/08/04 17:44:40] ppocr INFO: resume from ./output/re_vi_layoutxlm_xfund_zh/best_accuracy/
Traceback (most recent call last):
  File "/home/aistudio/PaddleOCR/./tools/infer_kie_token_ser_re.py", line 216, in <module>
    result = ser_re_engine(data)
  File "/home/aistudio/PaddleOCR/./tools/infer_kie_token_ser_re.py", line 151, in __call__
    preds = self.model(re_input)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleOCR/ppocr/modeling/architectures/base_model.py", line 85, in forward
    x = self.backbone(x)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleOCR/ppocr/modeling/backbones/vqa_layoutlm.py", line 248, in forward
    x = self.model(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1412, in forward
    loss, pred_relations = self.extractor(sequence_output, entities, relations)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1304, in forward
    relations, entities = self.build_relation(relations, entities)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1248, in build_relation
    all_possible_relations = paddle.stack(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/tensor/manipulation.py", line 1842, in stack
    return _C_ops.stack(x, axis)
ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0
  [Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180)
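PaddleOCR is now maintained by community members rather than officially by Baidu. Most of us project maintainers are not Baidu employees, so we cannot see your AI Studio project.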
I searched the issues, and quite a few people have run into this problem. Has issue #11261 been resolved?
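For anyone debugging the same traceback: it ends in `paddle.stack` inside `build_relation`, which is where the RE head assembles candidate entity pairs. One possible cause, which is an assumption and has not been confirmed in this thread, is that for some image the SER stage produced no entities of one of the two types the RE stage pairs up, so there is nothing to stack. The sketch below is plain Python, not PaddleOCR or PaddleNLP code; the function names and the label ids `1` and `2` are hypothetical and only show how one might check a sample's entity labels before passing it to the RE model.

```python
from itertools import product

# Hypothetical label ids, chosen for illustration only; check them against
# your own class list before relying on them.
HEAD_LABEL = 1  # e.g. key/"question" entities
TAIL_LABEL = 2  # e.g. value/"answer" entities


def candidate_pairs(entity_labels, head_label=HEAD_LABEL, tail_label=TAIL_LABEL):
    """Return every (head_index, tail_index) pair an RE stage could score."""
    heads = [i for i, label in enumerate(entity_labels) if label == head_label]
    tails = [i for i, label in enumerate(entity_labels) if label == tail_label]
    return list(product(heads, tails))


def can_run_re(entity_labels):
    """True only if at least one head/tail pair exists for this sample."""
    return len(candidate_pairs(entity_labels)) > 0


if __name__ == "__main__":
    print(candidate_pairs([1, 2, 2, 0]))  # [(0, 1), (0, 2)]
    print(can_run_re([0, 0, 2]))          # False: no head entity, so no pairs
```

If a check like this reports no pairs for a particular validation image, that image is a reasonable place to start looking, for example by inspecting its SER predictions and its annotations in `val.json`.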
