support hooks in analysis pipeline #1887
base: master
const (
TokensAnalyzerType = "token"
HookTokensAnalyzerType = "hook_token"
VectorAnalyzerType = "vector"
Move the vector-related code to a separate file guarded by the "vector" build tag.
Err error
}

func AnalyzeForTokens(analyzer Analyzer, input []byte) (TokenStream, error) {
add comment:
// AnalyzeForTokens is a utility function that analyzes an input and returns a TokenStream (and an error, if any).
Previously, the Analyze() method of an analyzer returned a TokenStream.
With the change in this PR, Analyze() now returns a value of type interface{}.
(That value can be validated and used based on analyzer.Type().)
Thus, for users of the old Analyzer interface, this utility will come in handy when migrating to the new Analyzer interface.
analyzerType := analyzer.Type()
if analyzerType != TokensAnalyzerType &&
analyzerType != HookTokensAnalyzerType {
return nil, fmt.Errorf("cannot analyze text with analyzer of type: %s",
alternate error msg: "given analyzer is not compatible to be used as a token analyzer"
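For reference, here is a minimal sketch of how the helper under review might behave end to end. Only the analyzer type constants, the type check, and the error message come from the diff above. The `Token` struct, the two toy analyzers (`whitespaceAnalyzer`, `vectorAnalyzer`), and the type switch on the result of `Analyze()` are assumptions made for illustration, not the PR's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	TokensAnalyzerType     = "token"
	HookTokensAnalyzerType = "hook_token"
	VectorAnalyzerType     = "vector"
)

// Token and TokenStream are simplified stand-ins (assumption) for the real types.
type Token struct {
	Term []byte
}

type TokenStream []*Token

// Analyzer mirrors the new interface described in the review comment:
// Analyze returns interface{} and callers inspect Type() to interpret it.
type Analyzer interface {
	Analyze(input []byte) interface{}
	Type() string
}

// AnalyzeForTokens rejects analyzers that do not produce tokens, then
// unwraps the untyped result. The unwrapping logic is an assumption.
func AnalyzeForTokens(analyzer Analyzer, input []byte) (TokenStream, error) {
	analyzerType := analyzer.Type()
	if analyzerType != TokensAnalyzerType &&
		analyzerType != HookTokensAnalyzerType {
		return nil, fmt.Errorf("cannot analyze text with analyzer of type: %s",
			analyzerType)
	}
	switch v := analyzer.Analyze(input).(type) {
	case TokenStream:
		return v, nil
	case error:
		return nil, v
	default:
		return nil, fmt.Errorf("unexpected analysis result of type %T", v)
	}
}

// whitespaceAnalyzer is a hypothetical token analyzer that splits on whitespace.
type whitespaceAnalyzer struct{}

func (whitespaceAnalyzer) Type() string { return TokensAnalyzerType }

func (whitespaceAnalyzer) Analyze(input []byte) interface{} {
	var ts TokenStream
	for _, field := range strings.Fields(string(input)) {
		ts = append(ts, &Token{Term: []byte(field)})
	}
	return ts
}

// vectorAnalyzer is a hypothetical non-token analyzer used to show the rejection path.
type vectorAnalyzer struct{}

func (vectorAnalyzer) Type() string { return VectorAnalyzerType }

func (vectorAnalyzer) Analyze(input []byte) interface{} { return []float32{} }
```

Callers written against the old interface would swap `analyzer.Analyze(input)` for `AnalyzeForTokens(analyzer, input)` and handle the extra error return.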
- While analyzing a doc, analysis of a few fields can fail.
- We want to index the part of the doc for which analysis succeeded.
39ff270 to 1cc5a32
Description
The aim is to let an embedder register analyzers in bleve at run time.
These registered analyzers can then be specified in the index mapping as analyzers for fields.
Change log
New Analyzer interface
Updates to the Field interface
New registry to store embedder-submitted analysis hooks
Updated the analyzer registry to also hold analyzers created using hooks
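To make the registry item above more concrete, here is a rough sketch of what a run-time store for embedder-submitted hooks could look like. Every name in it (`AnalyzeHook`, `HookRegistry`, `Register`, `Lookup`) is hypothetical and chosen for illustration; the PR's actual types and signatures may differ.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// AnalyzeHook is a hypothetical signature for an embedder-supplied callback.
type AnalyzeHook func(input []byte) (interface{}, error)

// HookRegistry is a hypothetical concurrency-safe store of named hooks.
type HookRegistry struct {
	mu    sync.RWMutex
	hooks map[string]AnalyzeHook
}

func NewHookRegistry() *HookRegistry {
	return &HookRegistry{hooks: make(map[string]AnalyzeHook)}
}

// Register adds a hook under name and refuses nil hooks and duplicates,
// so an index mapping that refers to the name resolves unambiguously.
func (r *HookRegistry) Register(name string, hook AnalyzeHook) error {
	if hook == nil {
		return errors.New("analysis hook must not be nil")
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.hooks[name]; exists {
		return fmt.Errorf("analysis hook %q is already registered", name)
	}
	r.hooks[name] = hook
	return nil
}

// Lookup returns the hook registered under name, if any.
func (r *HookRegistry) Lookup(name string) (AnalyzeHook, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	hook, ok := r.hooks[name]
	return hook, ok
}
```

In this sketch an embedder would call `Register` once at startup, and the analyzer registry would call `Lookup` when a field mapping names a hook-backed analyzer.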
Related changes: