Feature/expand corpus model #1226

Merged
merged 47 commits into from
Sep 11, 2023

lukavdplas
Contributor

This is a partial implementation of the steps I suggest here. The major changes are:

  • The database model for corpora is significantly expanded
  • Corpus definitions are loaded into the database when starting the server, rather than with every /api/corpus/ request.
  • When possible, interactions with corpora now rely on the database model rather than importing the Python class.

The intention here is that the expanded database model can also be filled through another method (e.g. a yaml/json file or a form). As a side benefit, it also makes it easier to do validation.
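As a rough sketch of that idea (not the code in this PR): if every source is first reduced to a plain mapping, one validated constructor can serve both Python definitions and JSON files. `CorpusRecord`, `ExampleCorpus`, the field names, and the language code pattern are all hypothetical stand-ins; the real model is a Django model whose fields are not listed in this conversation.

```python
import json
import re
from dataclasses import dataclass, field


@dataclass
class CorpusRecord:
    """Hypothetical stand-in for the expanded database model."""
    name: str
    title: str
    # A callable default, so records never share one list object.
    languages: list = field(default_factory=list)


# Assumption for this sketch: only bare two- or three-letter codes are valid.
LANGUAGE_CODE = re.compile(r'[a-z]{2,3}')


def validate_language_code(code):
    if not LANGUAGE_CODE.fullmatch(code):
        raise ValueError(f'invalid language code: {code!r}')


def record_from_mapping(data):
    """Build a validated record from any source that yields a mapping."""
    languages = list(data.get('languages', []))
    for code in languages:
        validate_language_code(code)
    return CorpusRecord(name=data['name'], title=data['title'], languages=languages)


class ExampleCorpus:
    """Hypothetical Python corpus definition."""
    name = 'example'
    title = 'Example corpus'
    languages = ['en', 'nl']


def record_from_class(corpus_class):
    """Reduce a Python definition to a mapping, then reuse the same path."""
    return record_from_mapping({
        'name': corpus_class.name,
        'title': corpus_class.title,
        'languages': corpus_class.languages,
    })


from_python = record_from_class(ExampleCorpus)
from_json = record_from_mapping(json.loads(
    '{"name": "example", "title": "Example corpus", "languages": ["en", "nl"]}'
))
```

Because validation lives in `record_from_mapping`, a bad value is rejected in the same way regardless of where the definition came from.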

This change affects both the deployment and the development workflow. In deployment, the loadcorpora command needs to be added to the startup script. In development, the command is already included in yarn start-back. However, when you change a Python corpus definition, the database copy is not updated automatically: you will need to restart the server or run loadcorpora manually to see the changes in the interface.
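The consequence for development can be illustrated with in-memory stand-ins. This is only a sketch of the behaviour described above, not the actual command or view: `definitions` plays the role of the Python corpus classes, `database` the role of the corpus table, and both function names are hypothetical.

```python
# Hypothetical stand-ins for the Python definitions and the corpus table.
definitions = {'example': {'title': 'Example corpus'}}
database = {}


def load_corpora():
    """Copy every definition into the database (stand-in for loadcorpora)."""
    database.clear()
    for name, definition in definitions.items():
        database[name] = dict(definition)


def api_corpus_list():
    """Stand-in for the /api/corpus/ view: it reads the database only."""
    return [{'name': name, **fields} for name, fields in sorted(database.items())]


load_corpora()                     # runs once, when the server starts
before_edit = api_corpus_list()

definitions['example']['title'] = 'Renamed corpus'   # edit the Python definition
stale = api_corpus_list()          # the database still holds the old title

load_corpora()                     # restart the server or rerun the command
fresh = api_corpus_list()          # the new title is now visible
```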

The PR includes a documentation file which may be a good place to start review.

add function to save all fields
load corpora with yarn start-back
add verbose option to save_corpus
add language code validator
add fieldsets to model
add test for corpus saving output
use callable for default empty list
include serialisers for languages and category fields
flatten Corpus and CorpusConfiguration into a single JSON

remove old serialisation code
lukavdplas added the code quality, corpus, and affects-deployment labels on Aug 9, 2023
Contributor

BeritJanssen left a comment

Nice move towards administration of corpora in the admin interface. The only small request I have is renaming the functions starting with _try to something that is more explicit about conditions and handling of failure of these attempts.
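One possible reading of that request, as a sketch only: the name and docstring state what happens when the attempt fails. The actual _try functions in save_corpus.py are not shown in this conversation, so the function below, its name, and its failure condition are hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def save_corpus_or_log_error(definition, store):
    """Save `definition` under its name.

    A definition without a 'name' key is not saved: the problem is logged
    and False is returned, so the caller can carry on with other corpora.
    """
    try:
        name = definition['name']
    except KeyError:
        logger.error('corpus definition has no name, skipping: %r', definition)
        return False
    store[name] = definition
    return True


store = {}
saved = save_corpus_or_log_error({'name': 'example'}, store)
skipped = save_corpus_or_log_error({'title': 'No name'}, store)
```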

backend/addcorpus/save_corpus.py (outdated, resolved)
Labels

  • affects-deployment: changes that require an update in the deployment module
  • code quality: code & performance improvements that do not affect user functionality
  • corpus: changes to corpus definitions or new corpora
2 participants