Execution error #16

Open
orbxball opened this issue Jul 12, 2018 · 3 comments

Comments

@orbxball

Hi,

I'm using this tool to parse Chinese sentences.
However, I have run into two problems that prevent me from doing the parsing work.

First, when I run
python 文章/模型訓練/臺華新聞做語料.py
it fails with this error:

Traceback (most recent call last):
  File "文章/模型訓練/臺華新聞做語料.py", line 2, in <module>
    from 文章.對語料庫網站掠資料落來 import 對語料庫網站掠資料落來
ModuleNotFoundError: No module named '文章'

I guess it might be an encoding error. This is the smaller of the two problems.
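
A possible cause other than encoding, not confirmed anywhere in this thread: when a script is started by file path, Python puts the script's own directory at the front of `sys.path`, not the current directory, so a top-level package that sits beside it in the repository root is not importable. This would also be consistent with the IPython run below getting past the import. The sketch below reproduces the effect with a hypothetical ASCII layout (`demo_pkg/sub/script.py`) standing in for 文章/模型訓練/; all names in it are made up for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways():
    """Run one script by file path and with -m.

    Returns (path_returncode, module_returncode, path_stderr).
    """
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout standing in for 文章/模型訓練/<script>.py.
        sub = Path(root, "demo_pkg", "sub")
        sub.mkdir(parents=True)
        Path(root, "demo_pkg", "__init__.py").write_text("")
        Path(sub, "__init__.py").write_text("")
        Path(root, "demo_pkg", "helper.py").write_text("VALUE = 1\n")
        Path(sub, "script.py").write_text("from demo_pkg.helper import VALUE\n")

        def run(*args):
            return subprocess.run(
                [sys.executable, *args], cwd=root,
                capture_output=True, text=True,
            )

        # By path: sys.path[0] is demo_pkg/sub, so `demo_pkg` cannot be found.
        by_path = run(str(Path("demo_pkg", "sub", "script.py")))
        # With -m: the current directory is on sys.path, so the import resolves.
        as_module = run("-m", "demo_pkg.sub.script")
        return by_path.returncode, as_module.returncode, by_path.stderr


if __name__ == "__main__":
    path_rc, module_rc, err = run_both_ways()
    print("by path:", path_rc, "| with -m:", module_rc)
    print(err.strip().splitlines()[-1] if err.strip() else "")
```

If the same thing is happening here, running `python -m 文章.模型訓練.臺華新聞做語料` from the repository root, or setting `PYTHONPATH=.` before the original command, may avoid the import error. Neither has been tested against this repository.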

Second, I ran the script through IPython instead.
That run also fails, this time with a different error:

In [1]: %run 文章/模型訓練/臺華新聞做語料.py
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/tmp2/b03502040/data/icorpus/文章/模型訓練/臺華新聞做語料.py in <module>()
     32
     33 if __name__ == '__main__':
---> 34         臺華新聞做語料().做()

/tmp2/b03502040/data/icorpus/文章/模型訓練/臺華新聞做語料.py in 做(self)
     21                                 句物件=分析器.轉做句物件(華語句)
     22                                 華.append(譀鏡.看分詞(句物件))
---> 23                         閩.extend(資料['閩南語'].split('\n'))
     24                         if len(華)!=len(閩):
     25                                 print(華[-100:],閩[-100:])

KeyError: '閩南語'

If I cannot generate the corpus, I cannot train the model or go on to translate Chinese sentences into 台羅拼音. Thank you for any guidance on this.
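
The `KeyError` above means that at least one record reaching line 23 has no `'閩南語'` key. A minimal diagnostic sketch, assuming the records are plain dicts whose values are newline-separated strings (the traceback suggests this but does not show the full data): `collect_lines` is a hypothetical helper, not part of the project, and the sample data is made up.

```python
def collect_lines(records, key="閩南語"):
    """Gather newline-separated text under `key`; report records lacking it."""
    lines, missing = [], []
    for index, record in enumerate(records):
        text = record.get(key)
        if text is None:
            missing.append(index)  # remember which record is incomplete
            continue
        lines.extend(text.split("\n"))
    return lines, missing


if __name__ == "__main__":
    # Made-up sample data; the real records come from the corpus site.
    sample = [{"閩南語": "a\nb"}, {"華語": "c"}]
    lines, missing = collect_lines(sample)
    print(lines, missing)
```

Skipping only one side of a record would break the `len(華) != len(閩)` check in the script, so a real fix would probably need to skip the whole record on both sides.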

@sih4sing5hong5
Owner

If you only want the corpus, use the YAML file directly:
https://github.com/sih4sing5hong5/icorpus/blob/master/icorpus.yaml
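
A minimal sketch for fetching that file from a script, assuming the file is still present on the `master` branch. GitHub serves the HTML view at the `blob` URL and the file contents at `raw.githubusercontent.com`; `to_raw_url` and `download_text` are hypothetical helper names, and the simple split below does not handle branch names that contain a slash.

```python
import urllib.request
from urllib.parse import urlsplit


def to_raw_url(blob_url):
    """Map a github.com .../blob/<ref>/<path> URL to raw.githubusercontent.com."""
    parts = urlsplit(blob_url).path.strip("/").split("/")
    if len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {blob_url}")
    owner, repo, _, ref, *path = parts
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, ref, *path])


def download_text(url):
    """Fetch a URL and decode the body as UTF-8 (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    blob = "https://github.com/sih4sing5hong5/icorpus/blob/master/icorpus.yaml"
    print(to_raw_url(blob))
```

Parsing the downloaded text needs a YAML parser, which the standard library does not include; the structure of the file is not described in this thread.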

This project runs on Django.
We recommend using virtualenv:

virtualenv --python=python3 venv
source venv/bin/activate
pip install -r requirements.txt
python manage.py runserver
# then open http://localhost:8000

@orbxball
Author

Actually, what I want to do is translate our sentences into 台羅拼音 to use as input for our model. However, it seems that we first have to run the training step given in the instructions, and only then can we translate our sentences with python 文章/摩西服務.py.

The error occurred inside the virtualenv. I followed the instructions in the README and completed every step before this one; the error happened at the step python 文章/模型訓練/臺華新聞做語料.py. I hope you can help us get this tool running. Thank you very much!

@sih4sing5hong5
Owner

The translation script in this project is out of date.
We have modularized these scripts with Docker in another project, although it is still a work in progress.

We have four modules that deal with these processes:

  • 臺灣言語工具 is a package for low-level string/object manipulation.
  • 臺灣言語服務 is a Django app that provides the database format, the service API, and command-based function blocks built on 臺灣言語工具.
  • Huē-ji̍p is a set of Django apps that import corpora into the 臺灣言語服務 format.
  • Ho̍k-bū contains application-specific Dockerfiles built from the packages above.
