Index version management from interface #1007

Open
lukavdplas opened this issue Nov 28, 2022 · 1 comment
Labels
corpus (changes to corpus definitions or new corpora), enhancement (improvements to user functionality), needs-mockup (this suggestion could use a picture before it is implemented)

Comments

@lukavdplas
Contributor

#1004 made me realise that we have versioned index names in production, so indexing corpora from the interface (#985) should account for version management.

The minimal implementation would be that curators never see the different versions, and the application just quietly updates the alias. It would be good to have an environment setting that distinguishes production (versioned index names and aliases) from development (no versioning; the index is cleared and overwritten with every update). I don't think this ever needs to be set per corpus or, as we do now, per indexing action.
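To make the production/development distinction concrete, here is a minimal sketch. The `versioned` flag stands in for the proposed environment setting, and the `<corpus>-<number>` naming scheme is an assumption for illustration, not necessarily what we use in production. The action dictionaries follow the shape of the `actions` list in Elasticsearch's update-aliases API.

```python
import re


def next_index_name(corpus: str, existing: list[str], versioned: bool) -> str:
    """Pick the index name for a new indexing run.

    Assumes versioned indices are named "<corpus>-<number>" (hypothetical scheme).
    """
    if not versioned:
        # Development: always reuse the same index, which is cleared and overwritten.
        return corpus
    pattern = re.compile(rf"^{re.escape(corpus)}-(\d+)$")
    versions = [int(m.group(1)) for name in existing if (m := pattern.match(name))]
    return f"{corpus}-{max(versions, default=0) + 1}"


def alias_actions(corpus: str, new_index: str, current: list[str]) -> list[dict]:
    """Build the actions that move the corpus alias to new_index in one request."""
    actions = [{"remove": {"index": name, "alias": corpus}} for name in current]
    actions.append({"add": {"index": new_index, "alias": corpus}})
    return actions
```

In production mode the curator would only ever deal with the alias; the version number stays an internal detail.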

In this case, the older index versions could be useful when a curator contacts us about an issue. Still, it would be nice if they could restore older versions themselves. This would save unnecessary duplicates.

The indexing menu (step 3 in #982) could show a list of all indices matching the corpus name, with the option to delete inactive indices, or switch which version is currently active.
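A rough sketch of the data such a menu could be built from. It assumes the same hypothetical `<corpus>-<number>` naming scheme, and that the caller has already fetched the index names and the set of indices the alias points to from Elasticsearch; `IndexVersion`, `list_versions` and `deletable` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexVersion:
    name: str
    active: bool  # True if the corpus alias currently points at this index


def list_versions(corpus: str, index_names: list[str], aliased: set[str]) -> list[IndexVersion]:
    """Return the versions of a corpus, oldest first, with their active status."""
    prefix = f"{corpus}-"
    versions = [
        IndexVersion(name, name in aliased)
        for name in index_names
        # Only accept names of the form "<corpus>-<number>".
        if name.startswith(prefix) and name[len(prefix):].isdigit()
    ]
    return sorted(versions, key=lambda v: int(v.name[len(prefix):]))


def deletable(versions: list[IndexVersion]) -> list[str]:
    """Only inactive versions are offered for deletion."""
    return [v.name for v in versions if not v.active]
```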

However, old indices may not be compatible with the current corpus definition. Ideally, the application will save a "snapshot" of the corpus at the time of indexing (doable with the export option #981), and restoring an old index also means restoring the (relevant) corpus settings.
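One way the snapshot idea could look, as a sketch only. `CorpusRecord` and its attributes are hypothetical and do not correspond to existing models; the corpus definition is represented as a plain dictionary, on the assumption that the export option yields something JSON-like.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CorpusRecord:
    definition: dict
    active_index: Optional[str] = None
    # Maps index name to the corpus definition as it was when that index was built.
    snapshots: dict = field(default_factory=dict)


def record_indexing(corpus: CorpusRecord, index_name: str) -> None:
    """Store a copy of the current definition alongside the new index."""
    corpus.snapshots[index_name] = copy.deepcopy(corpus.definition)
    corpus.active_index = index_name


def restore(corpus: CorpusRecord, index_name: str) -> None:
    """Reactivate an old index together with the definition it was built from."""
    corpus.definition = copy.deepcopy(corpus.snapshots[index_name])
    corpus.active_index = index_name
```

This restores the whole definition; restoring only the index-relevant settings would need a narrower copy.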

@lukavdplas added the enhancement and corpus labels Nov 28, 2022
@lukavdplas changed the title from "Index version management" to "Index version management from interface" Nov 28, 2022
@lukavdplas added the needs-mockup label Feb 16, 2024
@lukavdplas
Contributor Author

These issues cover preparations to database models, etc.

When these are done, we can create an API and an interface in the frontend. This should be much more streamlined than the original proposal, since proper index management requires admin privileges.

@JeltevanBoheemen and I discussed this and made a rough outline. We're envisioning these functions for the user:

  • Upload additional CSV files. The user has already uploaded one CSV file, but this may only be a sample. (Also, they may add more data later, break up a dataset into multiple files, etc.)
  • Create a new index for the corpus and activate it. The user will be asked to select one or more files from their CSV uploads. These will be the source files. The backend should then, in order:
    • Create a new index in Elasticsearch
    • Extract data from the selected files and populate the index
    • Link the index to the corpus
    • If the corpus had existing indices, delete them
    • Send the user an email that their index is ready
  • If the corpus is already indexed: delete the index.
  • Toggle the corpus as private/public.
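The ordering of the "create and activate" steps above can be sketched as follows. This is not the planned implementation: `InMemorySearch` is a stand-in for Elasticsearch, the corpus is a plain dictionary, CSV uploads are passed as strings, and `notify` replaces the email. The point is only the order, in particular that old indices are deleted after the new one is populated and linked.

```python
import csv
import io


class InMemorySearch:
    """Stand-in for Elasticsearch, only for this sketch."""

    def __init__(self):
        self.indices = {}

    def create(self, name):
        self.indices[name] = []

    def add(self, name, document):
        self.indices[name].append(document)

    def delete(self, name):
        del self.indices[name]


def create_and_activate(corpus, new_index, csv_texts, search, notify):
    old_indices = list(corpus["indices"])
    # 1. Create a new index.
    search.create(new_index)
    # 2. Extract data from the selected files and populate the index.
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            search.add(new_index, row)
    # 3. Link the index to the corpus.
    corpus["indices"] = [new_index]
    # 4. Delete the indices the corpus had before.
    for name in old_indices:
        search.delete(name)
    # 5. Tell the user that the index is ready.
    notify(f"Index {new_index} is ready")
```

If step 2 fails, the corpus still points at its old index, which is the main reason to delete last.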

If a corpus has an index, other steps in the form will still be available, but fields that affect the index will be marked with a warning sign. (Or something like that.) When the user changes those fields and hits save, they'll get a confirmation window. The existing index will become invalid, and they'll have to create a new index, or undo their changes, before they can make the corpus public again.
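A sketch of the save logic this implies. The set of index-affecting fields, the dictionary keys, and the `confirmed` parameter are all assumptions for illustration; it also assumes that invalidating the index makes the corpus private until it is reindexed.

```python
# Hypothetical: which parts of the corpus definition change what ends up in the index.
INDEX_AFFECTING = frozenset({"fields", "language"})


def index_affecting_changes(old: dict, new: dict) -> list[str]:
    """Names of index-affecting settings that differ between two definitions."""
    return sorted(key for key in INDEX_AFFECTING if old.get(key) != new.get(key))


def save(corpus: dict, new_definition: dict, confirmed: bool = False) -> bool:
    """Save the definition; return False if the user must confirm first."""
    changed = index_affecting_changes(corpus["definition"], new_definition)
    if changed and corpus["index_valid"] and not confirmed:
        # The frontend would show the confirmation window at this point.
        return False
    corpus["definition"] = new_definition
    if changed:
        corpus["index_valid"] = False
        corpus["public"] = False  # stays private until a new index is created
    return True
```

The same `index_affecting_changes` check could drive the warning signs in the form.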

Notes:

  • Users cannot manage multiple index versions; they'll always use the latest.
  • Deleting indices is a good idea for many reasons, but we should watch out that this won't result in situations where heavyweights like times get deleted by accident 🙃
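One possible guard against accidental deletion, sketched under the assumption that we keep a record of which indices were created through the interface, plus an explicit list of protected ones. All names here are hypothetical.

```python
class ProtectedIndexError(Exception):
    """Raised when the interface tries to delete an index it should not touch."""


def check_deletable(index_name: str, created_by_interface: set, protected: set) -> None:
    """Raise unless the index was created through the interface and is not protected."""
    if index_name in protected:
        raise ProtectedIndexError(f"{index_name} is protected")
    if index_name not in created_by_interface:
        raise ProtectedIndexError(f"{index_name} was not created through the interface")
```

Calling this before every delete means an index that was built by hand is never removed by a curator's action.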
