Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

437 implement api to manage indexing queues #444

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

michalkrzem
Copy link
Collaborator

@michalkrzem michalkrzem commented Dec 31, 2024

Your checklist for this pull request

  • I've read the contributing guideline.
  • I've tested my changes by building and running mquery, and testing changed functionality (if applicable)
  • I've added automated tests for my change (if applicable, optional)
  • I've updated documentation to reflect my change (if applicable)

What is the current behaviour?
We manually add files to the queue.

What is the new behaviour?
We have an API endpoint for adding files to QueuedFile in the database

Test plan
Endpoint api "api/queue/ursadb_id"

Closing issues

fixes #437

@michalkrzem michalkrzem linked an issue Dec 31, 2024 that may be closed by this pull request
src/db.py Outdated
Comment on lines 483 to 491
for file in file_paths:
obj = QueuedFile(
ursadb_id=ursadb_id,
path=file.path,
index_types=file.index_types,
tags=file.tags,
)
session.add(obj)
session.commit()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ostatnio, gdy próbowałem robić inne zadanie i miałem do stworzenia wiele elementów, opłacało się skorzystać z bulk_insert_mappings zamiast dodawać iteracyjnie obiekt po obiekcie: https://stackoverflow.com/questions/34057006/using-bulk-insert-mappings, ewentualnie jakakolwiek inna operacja do wstawiania wielu modeli na raz.

Plus wydaje mi się, że obecnie po prostu iteracyjnie nadpisujesz tylko zmienną obj, po czym dodajesz i commitujesz tylko ostatnią instancję.

Copy link
Collaborator Author

@michalkrzem michalkrzem Jan 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

o, faktycznie - jeszcze jedna tabulacja

src/app.py Outdated
@@ -551,7 +556,7 @@ def job_statuses(user: User = Depends(current_user)) -> JobsSchema:
@app.delete(
"/api/query/{job_id}",
response_model=StatusSchema,
dependencies=[Depends(can_manage_queries)],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accidental change?

src/db.py Outdated
@@ -474,3 +477,17 @@ def alembic_upgrade(self) -> None:
config_file = Path(__file__).parent / "alembic.ini"
alembic_cfg = Config(str(config_file))
command.upgrade(alembic_cfg, "head")

def set_queued_file(self, ursadb_id, file_paths):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add_queued_file maybe, or enqueue_file

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(also ursadb: str, file_paths: list[str] guessing by variable names, but I didn't check the types)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or List[FilePathInputSchema] actually I guess

src/schema.py Outdated

class FilePathInputSchema(BaseModel):
path: str
index_types: List[str]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should validate the index_types somehow. I thought there's an enum for that, but apparently not. I think the easiest solution is to add a validator checking that every element is in ["gram3", "text4", "hash4", "wide8"]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can i use List[Literal["gram3", "text4", "hash4", "wide8"]] from typing?

Copy link
Contributor

@msm-code msm-code left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I'll just finish testing it and merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement API to manage indexing queues
4 participants