grok items not in the docs #5605
Comments
Hi @chrismo. Thanks for pointing these out. Indeed, our

Before I dive into your specific questions, it might help us to understand how it comes across as a possible improvement for you over

As for the specific questions, something else we touch on in the docs is that there's no formal specification of Grok, so in situations like this we basically just see what Logstash's does (since it's effectively the reference implementation) and then see if we have a defensive reason for doing something different. Here's what that shows me regarding your two questions, and I'll go over these with the lead feature developer to confirm.
tl;dr - Yeah, that comes across as a bug. Here's the Logstash comparison. I'm testing with Logstash v8.17.1 and will use an example based on the one from their Grok Basics docs.
To start from the "happy path", if I paste their example log line into that window so it's processed by their stdin plugin, I can see the parsing was successful because they show what was parsed into the named fields such as
Likewise, if I type in junk that doesn't parse, in addition to the named fields not being populated, they communicate this by adding a
With that established, I'll then pivot to using this alternate config where I've dropped all the named fields, hence much like your example with just
As you can see, it did not show the In terms of what we should probably do by comparison, an error like
tl;dr - Now that the "what happens when there's zero named fields" effect can be explained as a likely bug, the inline regexps doing what you showed seems like it's pretty much working as designed. The parts of Grok patterns that aren't patterns and field destinations of the
But likewise I can put
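To illustrate the point about inline regexps, here is a minimal Python sketch of how a Grok-style pattern can be expanded into an ordinary regular expression. It is not this project's implementation or Logstash's; the two-entry `PATTERNS` table and the `grok_to_regex` helper are hypothetical names made up for this example. The idea it shows is that only the `%{NAME:field}` references get substituted, while everything else in the pattern string passes through unchanged as regexp syntax.

```python
import re

# Hypothetical, tiny pattern library for illustration only.
# Real Grok implementations ship a much larger set.
PATTERNS = {
    "WORD": r"\b\w+\b",
    "INT": r"[+-]?\d+",
}

# Matches %{NAME} or %{NAME:field}.
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def grok_to_regex(pattern: str) -> str:
    """Expand %{NAME:field} references; leave all other text as-is (it is regex)."""
    def expand(m: re.Match) -> str:
        name, field = m.group(1), m.group(2)
        body = PATTERNS[name]
        if field:
            # Named destination becomes a named capture group.
            return f"(?P<{field}>{body})"
        # No destination: match but do not capture.
        return f"(?:{body})"

    return GROK_REF.sub(expand, pattern)


# The \s+ and the literal "took"/"ms" are plain inline regex, not Grok patterns.
regex = grok_to_regex(r"%{WORD:method}\s+took\s+%{INT:ms}ms")
match = re.search(regex, "GET took 42ms")
print(match.groupdict())  # {'method': 'GET', 'ms': '42'}
```

Because the non-pattern text is handed to the regex engine untouched, things like `\s+`, alternation, or character classes work between the `%{...}` references without any special support.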
As for why this wasn't already called out explicitly in the docs, I guess that's my fault. I've been supporting Grok in one place or another for so many years that I've gotten accustomed to the fact that most users who look for it are already familiar with one of the existing implementations such as Logstash's, or even if not, they're likely to do Google searches and learn a lot of the basics that way, including stuff like this. So in my last update of the docs I put that Comparison to Other Implementations stuff up top for precisely that reason. But your experience illustrates that our doc would benefit from calling that out explicitly for users that don't have much prior Grok experience. Since I imagine a doc update will be necessary after we make some change related to your first question, I'll plan to attack both docs updates at the same time. If you have other feedback to share on what I've written here thus far, I'd also take that into account when doing those updates.
The only use-case for But I don't have any permanent use-case for only a nameless capture - then I'd just use
Thanks for the additional info @chrismo. As you may have noticed, @mattnibs already has a PR #5608 up for the proposed "empty record" approach, so here it is doing the right thing on that branch when faced with your example:
As for your proposal of the warning, I see what you're getting at and will share that thought with the team, but don't be surprised if we don't pursue that idea. It's kind of an ongoing philosophical debate about how verbose we want to be in situations like this vs. letting the silence do the talking. When breaking ties in such debates someone will often bring up examples like how
Nah, I'm fine with that. We've discussed warnings in other cases, where, like nothing at all is returned ... but here an empty record being returned is great. Thx!
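For readers skimming the thread, the "empty record" behavior discussed above can be sketched in Python as follows. This is only an analogy using Python's `re` module, not the code from the PR; the `grok_match` helper and its error message are hypothetical. It distinguishes the two outcomes: a pattern with zero named fields that matches yields an empty record, while a pattern that does not match yields an error.

```python
import re


def grok_match(regex: str, line: str) -> dict:
    """Return named captures as a record; a match with no named fields gives {}."""
    m = re.search(regex, line)
    if m is None:
        # Hypothetical error wording, chosen for this sketch only.
        raise ValueError("value does not match pattern")
    return m.groupdict()


# Pattern matches but names nothing: empty record rather than an error.
print(grok_match(r"\d+", "abc 123"))          # {}

# Pattern matches and names one field.
print(grok_match(r"(?P<n>\d+)", "abc 123"))   # {'n': '123'}

# Pattern does not match at all: this is the error case.
try:
    grok_match(r"\d+", "no digits here")
except ValueError as err:
    print("error:", err)
```

Returning an empty record on success keeps "matched but captured nothing" distinguishable from "did not match", without needing a separate warning.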
but at least one must be named:
I presume these aren't glitches of the function - if not, then could these items be added to the docs? These two things make it nicer, IMO, to use grok in more scenarios where I might otherwise try awk or sed.