Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Provide high-level overview of Thoth storage model #2661

Closed

Conversation

VannTen
Copy link
Member

@VannTen VannTen commented Jul 19, 2022

Related Issues and Dependencies

closes #2658

Description

This intend to improve the thoth-storages modules docs to give a high-level
overview of how the data is organized, and differents items relates together.

The intended public would be :

  • Thoth developers, who need to understand the storage model in order to work on
    the storage implementation
  • Thoth operators, to understand which data is valuable and the relations
    between data items: what can be recreated, at what cost, etc. + what must be
    backed up atomically (avoid inconsistency)

Ideally, it should stays generic but points to part of the code / generated
documentation which can list all the concerned objects and in what "class" they
fall.

Current state

This is intended as a continuation of the conversation in #2658, so the
information might not be accurate yet, it reflects my current comprehension /
hypotheses

Goals

  • Developer documentation
  • Operational documentation

@sesheta sesheta added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jul 19, 2022
@sesheta
Copy link
Member

sesheta commented Jul 19, 2022

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@sesheta
Copy link
Member

sesheta commented Jul 19, 2022

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign mayacostantini after the PR has been reviewed.
You can assign the PR to them by writing /assign @mayacostantini in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sesheta sesheta requested review from harshad16 and KPostOffice July 19, 2022 11:53
@VannTen VannTen marked this pull request as ready for review July 26, 2022 13:07
@VannTen
Copy link
Member Author

VannTen commented Jul 26, 2022

As noted in the linked issue, maybe operationnal knowledge should not be
part of storages, but in a central places regardless of the part of Thoth.

So the current diff (only dev / internal storage logic) will be a start.

@VannTen VannTen changed the title WIP: Provide high-level overview of Thoth storage model Provide high-level overview of Thoth storage model Jul 26, 2022
@sesheta sesheta removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 26, 2022
- The aforementioned results documents are referenced.
- Reference to packages sources (ex: Python packages indexes).

These two combined form the knowledge graph.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we can summarize the content of Thoth's postgres database in such a simple way. In fact, the DB contains all kind of information not directly related to document results and package indexes, see the database model for an overview of the content available.

@VannTen
Copy link
Member Author

VannTen commented Aug 17, 2022 via email

@mayaCostantini
Copy link
Contributor

I can see there is some common patterns between the differents models, but I'm not sure what those patterns are. Maybe we should have more parents class regrouping related models, which would also serve as self-documenting the code ? (by being "category" of things stored in the DB)

I don't think it would be a good idea to group models under more generic objects, as declaring one class per table seems like a clearer implementation wrt SQLAlchemy ORM conventions. Do you have any examples of classes you think should be more generic to help me understand your point?

@VannTen
Copy link
Member Author

VannTen commented Aug 29, 2022

Sure. For example, SoftwareEnvironment and ExternalSoftwareEnvironment

EDIT: Also the various *Run (id, datetime)

I don't think it would be a good idea to group models under more
generic objects, as declaring one class per table seems like a clearer
implementation wrt SQLAlchemy ORM conventions.

Theirs docs at least suggest that inheritance is a valid way to work with the
ORM (using either joined table or single table
inheritance
)

@VannTen VannTen changed the title Provide high-level overview of Thoth storage model WIP: Provide high-level overview of Thoth storage model Aug 30, 2022
@sesheta sesheta added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 30, 2022
@goern goern closed this May 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Developper documentation relation postgres <--> Ceph
4 participants