Optimize search start and search end index computation while finding … #134
base: main
…token. Earlier, the code would run through all MentionSpans in the text to determine the span index closest to the cursor on either side. This is rather wasteful and, in the case of getSearchEndIndex(), even unnecessary. The idea is to use text.nextSpanTransition to iterate sequentially over batches of spans and break out of the loop as soon as the closest index is reached.
// iterate over all spans before the cursor
// we do this by finding the next span transition and looking back to find the closest span to the cursor
int nextSpanStart;
for (int searchStartIndex = 0; searchStartIndex < cursor; searchStartIndex = nextSpanStart) {

    // find the next span transition
    nextSpanStart = text.nextSpanTransition(searchStartIndex, text.length(), MentionSpan.class);

    // get the spans from searchStartIndex to nextSpanStart
    // of these, we find the closest span to the cursor
    MentionSpan[] closestLastSpans = text.getSpans(searchStartIndex, nextSpanStart, MentionSpan.class);
    for (MentionSpan span : closestLastSpans) {
        int end = text.getSpanEnd(span);
        if (end > closestLastSpanEnd && end <= cursor) {
            closestLastSpanEnd = end;
        }
    }
Not sure that this is actually faster given that the new code has more calls to get the spans for different substrings and to compute next span transitions (which internally will loop through all the spans, e.g. see this).
It is also worth noting that the new code is a fair bit longer/more complicated.
I'd like to understand the context for changing this. What is the reason for these changes? Are you running into measurable performance issues in your app? Have you benchmarked these methods? More info would be helpful. If we're going to make the code more complicated, we need to understand what we're getting in return.
MentionSpan[] spans = text.getSpans(0, text.length(), MentionSpan.class);
int closestAfterCursor = text.length();
for (MentionSpan span : spans) {
    int start = text.getSpanStart(span);
    if (start < closestAfterCursor && start >= cursor) {
        closestAfterCursor = start;
    }
}
This is nifty, I didn't know about the nextSpanTransition(..) method when I wrote this many years ago, very nice! 👍
While finding the current token, WordTokenizer would run through all MentionSpans in the text to determine the span index closest to the cursor on either side. This is rather wasteful and, in the case of getSearchEndIndex(), even unnecessary.
The idea is to use nextSpanTransition() in getSearchStartIndex to step through the text one batch of spans at a time, stopping at the cursor, and to keep the largest span end index that does not exceed the cursor.
For getSearchEndIndex, nextSpanTransition() does exactly what it needs without ever looking at any MentionSpans.
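To make the getSearchStartIndex idea easier to check outside Android, here is a rough, self-contained sketch that compares the full scan with the batched loop. It does not use the real Spanned API: spans are modelled as {start, end} int pairs, and the nextSpanTransition and getSpans helpers are hypothetical stand-ins that only approximate the documented behaviour of the Android methods. The class name, the method names, and the default of 0 when no span precedes the cursor are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SpanSearchSketch {

    // Hypothetical stand-in for Spanned.nextSpanTransition(start, limit, type):
    // first offset greater than start where a span begins or ends, else limit.
    static int nextSpanTransition(int[][] spans, int start, int limit) {
        int next = limit;
        for (int[] s : spans) {
            if (s[0] > start && s[0] < next) next = s[0];
            if (s[1] > start && s[1] < next) next = s[1];
        }
        return next;
    }

    // Hypothetical stand-in for Spanned.getSpans(start, end, type):
    // spans that overlap or touch the range [start, end] (an approximation).
    static List<int[]> getSpans(int[][] spans, int start, int end) {
        List<int[]> result = new ArrayList<>();
        for (int[] s : spans) {
            if (s[0] <= end && s[1] >= start) result.add(s);
        }
        return result;
    }

    // Baseline: look at every span in the text.
    static int searchStartFullScan(int[][] spans, int cursor) {
        int closest = 0; // assumed default when no span ends before the cursor
        for (int[] s : spans) {
            if (s[1] > closest && s[1] <= cursor) closest = s[1];
        }
        return closest;
    }

    // Batched variant: hop from transition to transition and stop at the cursor.
    static int searchStartBatched(int[][] spans, int length, int cursor) {
        int closest = 0; // assumed default, as above
        int next;
        for (int i = 0; i < cursor; i = next) {
            next = nextSpanTransition(spans, i, length);
            for (int[] s : getSpans(spans, i, next)) {
                if (s[1] > closest && s[1] <= cursor) closest = s[1];
            }
        }
        return closest;
    }

    public static void main(String[] args) {
        int[][] spans = {{2, 7}, {10, 15}};
        int length = 20;
        for (int cursor : new int[] {1, 9, 16}) {
            System.out.println(cursor + ": full=" + searchStartFullScan(spans, cursor)
                    + " batched=" + searchStartBatched(spans, length, cursor));
        }
    }
}
```

In this toy model both variants return the same index. Note that each stand-in call still walks the whole span array, so the sketch only checks that the two approaches agree; it says nothing about which is faster on a real Spanned, which is the question raised in review and would need a benchmark.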