Optimize search start and search end index computation while finding … #134

amansaryal · 2023-11-29T20:42:55Z

While finding the current token, WordTokenizer would run through all MentionSpans in the text to determine the span index closest to the cursor on either side. This is rather wasteful and, in the case of getSearchEndIndex(), even unnecessary.

The idea in getSearchStartIndex() is to use nextSpanTransition() to walk the text sequentially, one batch of spans at a time, up to the cursor, and to track the closest span end index that precedes the cursor.
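
The batching idea for getSearchStartIndex can be sketched in plain Java without the Android classes. This is only an illustrative model, not the code in this PR. Spans are assumed to be {start, end} int pairs, and the nextSpanTransition helper is a hand-written stand-in that follows the documented contract of Spanned.nextSpanTransition (first offset greater than start where a span begins or ends, otherwise limit). The overlap test only approximates getSpans, and the initial value 0 for the closest end is an assumption.

```java
final class SearchStartIndexSketch {

    // Stand-in for Spanned.nextSpanTransition (NOT the Android implementation):
    // first offset greater than start where any span begins or ends, else limit.
    static int nextSpanTransition(int[][] spans, int start, int limit) {
        int next = limit;
        for (int[] s : spans) {
            if (s[0] > start && s[0] < next) next = s[0];
            if (s[1] > start && s[1] < next) next = s[1];
        }
        return next;
    }

    // Batched approach: hop from transition to transition until the cursor.
    // Assumes 0 <= cursor <= length.
    static int batchedClosestSpanEnd(int[][] spans, int cursor, int length) {
        int closestEnd = 0; // assumed default when no span ends before the cursor
        int next;
        for (int from = 0; from < cursor; from = next) {
            next = nextSpanTransition(spans, from, length);
            for (int[] s : spans) {
                // Rough approximation of getSpans(from, next, ...): spans that
                // overlap or touch the range [from, next].
                boolean overlaps = s[0] <= next && s[1] >= from;
                if (overlaps && s[1] > closestEnd && s[1] <= cursor) {
                    closestEnd = s[1];
                }
            }
        }
        return closestEnd;
    }

    // Full scan over every span, for comparison.
    static int fullScanClosestSpanEnd(int[][] spans, int cursor) {
        int closestEnd = 0;
        for (int[] s : spans) {
            if (s[1] > closestEnd && s[1] <= cursor) closestEnd = s[1];
        }
        return closestEnd;
    }

    public static void main(String[] args) {
        int[][] spans = {{2, 5}, {12, 16}}; // hypothetical mention ranges
        System.out.println(batchedClosestSpanEnd(spans, 8, 20));
        System.out.println(fullScanClosestSpanEnd(spans, 8));
    }
}
```

In this model both approaches agree on the sample data. Because the stand-in helpers loop over every span, the sketch shows the control flow only and says nothing about relative performance on a real Spanned.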

For getSearchEndIndex, nextSpanTransition() does exactly what it needs without ever looking at any MentionSpans.
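
A similar plain-Java sketch for getSearchEndIndex, again a model under stated assumptions and not the PR's code: spans are hypothetical {start, end} int pairs, and nextSpanTransition is a stand-in written from the documented contract of Spanned.nextSpanTransition. It is compared with a full scan for the closest span start at or after the cursor.

```java
final class SearchEndIndexSketch {

    // Stand-in for Spanned.nextSpanTransition (NOT the Android implementation):
    // first offset greater than start where any span begins or ends, else limit.
    static int nextSpanTransition(int[][] spans, int start, int limit) {
        int next = limit;
        for (int[] s : spans) {
            if (s[0] > start && s[0] < next) next = s[0];
            if (s[1] > start && s[1] < next) next = s[1];
        }
        return next;
    }

    // Full scan: closest span start at or after the cursor, else text length.
    static int fullScanClosestStart(int[][] spans, int cursor, int length) {
        int closest = length;
        for (int[] s : spans) {
            if (s[0] < closest && s[0] >= cursor) closest = s[0];
        }
        return closest;
    }

    public static void main(String[] args) {
        int[][] spans = {{2, 5}, {12, 16}}; // hypothetical mention ranges
        int length = 20;
        int cursor = 8; // cursor in plain text between the two spans
        System.out.println(nextSpanTransition(spans, cursor, length));
        System.out.println(fullScanClosestStart(spans, cursor, length));
    }
}
```

With the cursor in plain text between spans, both calls return the same index in this model. The two can differ at boundaries: since the contract is "first offset greater than start", a span that begins exactly at the cursor is reported by the full scan (start >= cursor) but not by a single nextSpanTransition call, so that case may be worth covering in a test.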

…token. Earlier, the code would run through all MentionSpans in the text to determine the span index closest to the cursor on either side. This is rather wasteful and, in the case of getSearchEndIndex(), even unnecessary. The idea is to use text.nextSpanTransition to iterate sequentially over batches of spans and break out of the loop as soon as the closest index is reached.

nhibner · 2023-12-02T23:33:42Z

spyglass/src/main/java/com/linkedin/android/spyglass/tokenization/impl/WordTokenizer.java

+        // iterate over all spans before the cursor
+        // we do this by finding the next span transition and looking back to find the closest span to the cursor
+        int nextSpanStart;
+        for (int searchStartIndex = 0; searchStartIndex < cursor; searchStartIndex = nextSpanStart) {
+
+            // find the next span transition
+            nextSpanStart = text.nextSpanTransition(searchStartIndex, text.length(), MentionSpan.class);
+
+            // get the spans from searchStartIndex to nextSpanStart
+            // of these, we find the closest span to the cursor
+            MentionSpan[] closestLastSpans = text.getSpans(searchStartIndex, nextSpanStart, MentionSpan.class);
+            for (MentionSpan span : closestLastSpans) {
+                int end = text.getSpanEnd(span);
+                if (end > closestLastSpanEnd && end <= cursor) {
+                    closestLastSpanEnd = end;
+                }
            }


Not sure that this is actually faster given that the new code has more calls to get the spans for different substrings and to compute next span transitions (which internally will loop through all the spans, e.g. see this).

It is also worth noting that the new code is a fair bit longer/more complicated.

I'd like to understand the context for changing this. What is the reason for these changes? Are you running into measurable performance issues in your app? Have you benchmarked these methods? More info would be helpful. If we're going to make the code more complicated, we need to understand what we're getting in return.

nhibner · 2023-12-02T23:34:57Z

spyglass/src/main/java/com/linkedin/android/spyglass/tokenization/impl/WordTokenizer.java

-        MentionSpan[] spans = text.getSpans(0, text.length(), MentionSpan.class);
-        int closestAfterCursor = text.length();
-        for (MentionSpan span : spans) {
-            int start = text.getSpanStart(span);
-            if (start < closestAfterCursor && start >= cursor) {
-                closestAfterCursor = start;
-            }
-        }


This is nifty, I didn't know about the nextSpanTransition(..) method when I wrote this many years ago, very nice! 👍
