[META] Indexing and agent management UI #410
I'll just use screenshots from draw.io, sorry about that. First of all, we need a new kind of agent, because regular agents are not guaranteed to be on the same machine as ursadb. Running mquery workers on a different machine than ursadb has been supported from the beginning (and, AFAIK, that is how it is used in production in most cases). So the architecture will look like this (grey rectangles represent separate machines): To achieve this we need to:
Technically we don't need to support "unmanaged" ursadb instances after this release, but it shouldn't be hard, so I don't see why not.
So now the interface. I've sketched some rough ideas in draw.io. I'm not overly attached to them, but they contain most of the things I think are important, divided into subpages in a logical way, while also making sure mquery stays flexible (see above for the various weird configurations we support). This main status page contains the
While designing this I've realised that we also need to implement queues (otherwise automatic reindexing may create an extremely inefficient database). This is also necessary to support s3 indexing. So yeah, indexing queue feature:
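To make the queue idea a bit more concrete, here is a minimal sketch of a batching queue. Everything in it is an assumption for illustration: the `IndexingQueue` name, the `batch_size` parameter and the in-memory storage are hypothetical, and the reasoning that batching helps because many tiny index requests produce a fragmented database is my reading of the comment above, not something confirmed by the design.

```python
from collections import deque
from typing import Deque, List


class IndexingQueue:
    """Hypothetical in-memory queue that releases files in batches.

    The idea is to avoid sending one index request per file: files wait
    here until a full batch is available (or a flush is forced).
    """

    def __init__(self, batch_size: int) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be positive")
        self.batch_size = batch_size
        self._pending: Deque[str] = deque()

    def push(self, path: str) -> None:
        """Remember a file that should be indexed later."""
        self._pending.append(path)

    def pop_batch(self, force: bool = False) -> List[str]:
        """Return a full batch, or whatever is pending when force is set."""
        if len(self._pending) < self.batch_size and not force:
            return []
        count = min(self.batch_size, len(self._pending))
        return [self._pending.popleft() for _ in range(count)]
```

A scheduled job could call `pop_batch(force=True)` to flush leftovers, while the regular path only indexes once a full batch has accumulated. A real implementation would most likely keep the queue in persistent storage rather than in process memory.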
The agent page. Exposes most of the things from the old status page, and:
The "unmanaged agent group" page should be the same, since "managed agent groups" are just agent groups with one (and exactly one) manager active. But it's entirely possible that a group has some sources configured while its manager agent is temporarily down.
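The managed/unmanaged distinction can be sketched as a derived property rather than a stored flag. This is only an illustration: `Agent`, `AgentGroup` and all field names are hypothetical and do not come from the mquery codebase.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    name: str
    is_manager: bool = False
    online: bool = True


@dataclass
class AgentGroup:
    name: str
    agents: List[Agent] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)  # configured storage sources

    def active_managers(self) -> List[Agent]:
        """Manager agents that are currently reachable."""
        return [a for a in self.agents if a.is_manager and a.online]

    @property
    def is_managed(self) -> bool:
        """A group is managed when exactly one manager is active."""
        return len(self.active_managers()) == 1
```

With this shape, a group keeps its configured `sources` even while `is_managed` is temporarily `False` because the manager went offline, which matches the case described above.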
Finally, the storage page. I've prepared two sketches when thinking about this. First, for s3 storage it would look like this: Here, indexing files with S3Plugin goes very roughly like this:
Sadly we also need a sub-UI to add indexing rules and plugins. The process is divided into collecting and indexing:
Indexing sends an ursadb index request, and happens in one of three cases:
It's unclear how to implement the scheduling functionality, but I think the work can be handled by the manager agent, and the scheduling by rq-scheduler.
Finally, this is how this UI could look for a local directory: In this case we have two indexing rules and two plugins, because we want to index small and medium files separately. This will cause a small performance regression in comparison to the current index.py script (because the file tree needs to be traversed twice), but the slowest operation is indexing anyway.
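As a rough illustration of why two rules mean two traversals, here is a sketch of a size-based collecting step. `SizeRule`, `collect` and the byte thresholds are made up for this example; the actual rule format is not defined in this thread.

```python
import os
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(frozen=True)
class SizeRule:
    """Hypothetical indexing rule that selects files by size."""

    name: str
    min_bytes: int
    max_bytes: Optional[int]  # None means no upper bound

    def matches(self, size: int) -> bool:
        below_max = self.max_bytes is None or size < self.max_bytes
        return size >= self.min_bytes and below_max


def collect(root: str, rule: SizeRule) -> Iterator[str]:
    """Walk the whole tree under root and yield files matching one rule."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if rule.matches(os.path.getsize(path)):
                yield path


# Illustrative thresholds only: each rule triggers its own os.walk pass.
SMALL = SizeRule("small", 0, 1024 * 1024)
MEDIUM = SizeRule("medium", 1024 * 1024, 50 * 1024 * 1024)
```

Calling `collect(root, SMALL)` and then `collect(root, MEDIUM)` walks the directory twice, which is the regression mentioned above. A single pass that dispatches each file to the first matching rule would avoid it, at the cost of coupling the rules together.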
This is a very ambitious project, planned as the main feature of the v1.5 release (and the last mquery feature).
The idea is to make it possible to configure and manage every (well, almost every) aspect of mquery from the web UI. Especially indexing, which has been the most frequently requested feature since mquery was created.
To do this we'll have to heavily redesign... a lot of things in the UI. In return we'll get a more user-friendly interface overall, I hope.
Another thing we have to do is add a mquery "manager" agent to ursadb. This is unfortunately unavoidable, for reasons described below. This comment is a meta-comment for tracking the index management UI. I'll add more detailed tasks later.