Feature/nav 91 create raptor cache to prevent building raptor everytime for same date queries #52

munterfi · 2024-06-05T15:54:50Z

Cache raptor instances per day using LRU strategy.
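For readers unfamiliar with the strategy, here is a minimal sketch of an LRU cache guarded by a reentrant lock (a later commit in this PR mentions the lock). It is not the `EvictionCache` class added by this PR: the class name `LruCacheSketch`, its methods, and the string keys and values are assumptions made purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical sketch; NOT the EvictionCache implementation from this PR.
class LruCacheSketch<K, V> {

    private final ReentrantLock lock = new ReentrantLock();
    private final Map<K, V> entries;

    LruCacheSketch(int capacity) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once the capacity is exceeded.
                return size() > capacity;
            }
        };
    }

    V computeIfAbsent(K key, Supplier<V> supplier) {
        lock.lock();
        try {
            V value = entries.get(key); // a hit also marks the entry as recently used
            if (value == null) {
                value = supplier.get(); // e.g. an expensive raptor build
                entries.put(key, value);
            }
            return value;
        } finally {
            lock.unlock();
        }
    }

    boolean contains(K key) {
        lock.lock();
        try {
            return entries.containsKey(key); // does not change the access order
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> cache = new LruCacheSketch<>(2);
        cache.computeIfAbsent("2024-06-05", () -> "raptor for 5 June");
        cache.computeIfAbsent("2024-06-06", () -> "raptor for 6 June");
        cache.computeIfAbsent("2024-06-05", () -> "never built, value is cached");
        cache.computeIfAbsent("2024-06-07", () -> "raptor for 7 June");
        // 6 June was the least recently used entry, so it has been evicted.
        System.out.println(cache.contains("2024-06-06"));
    }
}
```

Holding the lock while the supplier runs keeps the sketch short, but it also blocks other callers during a build; a real implementation may want finer-grained locking.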

@clukas1 considering your description in the Jira ticket:

Store built raptors so that they can be re-used again if a second query for the same date is requested.

Ideally, add logic so that the raptor only gets rebuilt if the service calendar type is different, i.e. if the Wednesday and Thursday schedules are identical, use the same raptor instance for both.

The second point won't work, since there are special calendar dates in the GTFS, e.g. Christmas, 1 May, ... So we have to cache raptor instances per date, not per weekday.

But: I think I have a solution for masking trips inside Raptor. This would also remove the need for the stop validations (except departure time), since the raptor would contain all stops from the GTFS, and some stops would simply have no active trips.
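To make the masking idea concrete, here is a rough sketch under heavy assumptions: trips live in one list sorted by departure time, and a `BitSet` marks which of them are active on the queried date. `TripMaskSketch`, `Trip`, `buildMask` and `earliestActiveTrip` are hypothetical names; the actual Raptor data layout may look quite different.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of trip masking; not the Raptor implementation.
class TripMaskSketch {

    // Assumed minimal trip model: id, GTFS service id, departure in seconds after midnight.
    record Trip(String id, String serviceId, int departureSeconds) {}

    // Sets bit i if trip i belongs to a service that is active on the queried date.
    static BitSet buildMask(List<Trip> trips, Set<String> activeServiceIds) {
        BitSet mask = new BitSet(trips.size());
        for (int i = 0; i < trips.size(); i++) {
            if (activeServiceIds.contains(trips.get(i).serviceId())) {
                mask.set(i);
            }
        }
        return mask;
    }

    // Returns the index of the first active trip departing at or after the given time,
    // or -1 if there is none. Assumes trips are sorted by departure time.
    static int earliestActiveTrip(List<Trip> trips, BitSet mask, int earliestDeparture) {
        for (int i = mask.nextSetBit(0); i >= 0; i = mask.nextSetBit(i + 1)) {
            if (trips.get(i).departureSeconds() >= earliestDeparture) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Trip> trips = List.of(
                new Trip("t1", "weekday", 28_800),
                new Trip("t2", "holiday", 30_600),
                new Trip("t3", "weekday", 32_400));
        BitSet mask = buildMask(trips, Set.of("weekday"));
        // The holiday trip t2 is masked out, so the scan skips it.
        System.out.println(earliestActiveTrip(trips, mask, 29_000));
    }
}
```

With such a mask, the stop and route structures could stay the same for every date, and only the mask would have to be computed (and cached) per day.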

I can prepare everything around it, but it would be nice to build this into the mega loop of the Raptor together, maybe Friday evening or at the weekend?

Unfortunately we still have some strange errors when stops are not present in the Raptor:

clukas1

Minor comments, suggestions to improve tests.

src/main/java/ch/naviqore/service/impl/PublicTransitServiceImpl.java

src/test/java/ch/naviqore/utils/cache/EvictionCacheTest.java

clukas1 · 2024-06-05T21:38:17Z

Regarding point two in the Jira issue: I'm aware that not every Thursday etc. is the same. But I think if we combine all active service ids, we can generate a meaningful cache key, because I expect most regular weekdays to have the same service ids. This might require adding a function to create this key.

Regarding the mask: sure, we can huddle up and integrate it into the raptor algorithm. I'm busy on Friday, but should have some time on Saturday.
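The cache-key idea (combining all active service ids) could look roughly like the sketch below. It relies on simplifying assumptions: `ServiceCalendar` is a hypothetical stand-in for the GTFS calendar model that only knows weekdays plus added and removed dates, and calendar start and end dates are ignored.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.Collection;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch; the names and the calendar model are assumptions.
class ServiceKeySketch {

    record ServiceCalendar(String serviceId, Set<DayOfWeek> weekdays,
                           Set<LocalDate> added, Set<LocalDate> removed) {

        boolean isActiveOn(LocalDate date) {
            if (added.contains(date)) return true;    // exception: service added
            if (removed.contains(date)) return false; // exception: service removed
            return weekdays.contains(date.getDayOfWeek());
        }
    }

    // The set of active service ids serves as cache key: two dates with the
    // same active services yield equal sets and therefore share one entry.
    static Set<String> cacheKey(Collection<ServiceCalendar> calendars, LocalDate date) {
        return calendars.stream()
                .filter(calendar -> calendar.isActiveOn(date))
                .map(ServiceCalendar::serviceId)
                .collect(Collectors.toUnmodifiableSet());
    }

    public static void main(String[] args) {
        LocalDate firstOfMay = LocalDate.of(2024, 5, 1);
        List<ServiceCalendar> calendars = List.of(
                new ServiceCalendar("weekday", EnumSet.range(DayOfWeek.MONDAY, DayOfWeek.FRIDAY),
                        Set.of(), Set.of(firstOfMay)),
                new ServiceCalendar("holiday", EnumSet.of(DayOfWeek.SUNDAY),
                        Set.of(firstOfMay), Set.of()));

        Set<String> wednesday = cacheKey(calendars, LocalDate.of(2024, 6, 5));
        Set<String> thursday = cacheKey(calendars, LocalDate.of(2024, 6, 6));
        System.out.println(wednesday.equals(thursday));                        // same key
        System.out.println(wednesday.equals(cacheKey(calendars, firstOfMay))); // different key
    }
}
```

Whether this pays off depends on the feed: the more distinct service combinations a dataset has, the fewer dates share a key.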

clukas1

Again some small comments

clukas1 · 2024-06-06T16:34:05Z

pom.xml

Weird that only the beta works, but good that it works.

I think we somehow have a conflict between Spring's logging and the log4j that we pull in directly in the pom.xml.
Logging works for testing and production, but in production the formatting is no longer consistent, so it uses a mix of the Spring configuration and log4j2.properties:

I guess we should open a new ticket in Jira to fix this. At the moment at least all the logs we want are shown, although not in the format we want. For now I can live with that...

src/main/java/ch/naviqore/service/impl/PublicTransitServiceImpl.java

munterfi · 2024-06-06T16:54:43Z

Yes, you are right, in theory this should work, and it is now implemented in abacfbe. On the integration test data sample this works well, but in the GTFS of Switzerland almost every day has its own active trip set. So even if we query normal weekdays, a new raptor will be created most of the time.

Here is also an example of how equals on Sets works in Java. I was not sure whether the set instance itself matters or not. It does not; only its content is considered in equals:

package ch.naviqore.example;

import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.RequiredArgsConstructor;

import java.util.HashSet;
import java.util.Set;

public class Test {

    @RequiredArgsConstructor
    @Getter
    @EqualsAndHashCode
    static class Instance {
        private final String id;
    }

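    public static void main(String[] args) {
        // instanceA and instanceA2 are distinct objects, but equal according to
        // the Lombok-generated equals/hashCode (same id).
        Instance instanceA = new Instance("a");
        Instance instanceA2 = new Instance("a");
        Instance instanceB = new Instance("b");
        Instance instanceC = new Instance("c");

        Set<Instance> set1 = new HashSet<>();
        Set<Instance> set2 = new HashSet<>();

        set1.add(instanceA);
        set1.add(instanceA2); // no effect: an equal element is already present
        set1.add(instanceB);
        set1.add(instanceC);

        set2.add(instanceA);
        set2.add(instanceB);
        set2.add(instanceC);

        // Different set instances with equal content are equal.
        System.out.println(set1.equals(set2));
        // true

        // Adding an element that is already present changes nothing.
        set2.add(instanceA);
        System.out.println(set1.equals(set2));
        // true

        // After the removal, the contents differ.
        set1.remove(instanceA);
        System.out.println(set1.equals(set2));
        // false
    }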

}

clukas1 · 2024-06-06T16:58:14Z

Review already done :) See my comments from before you requested the review. The only thing I would consider, to reduce the set size, is using sets of calendars instead of trips to identify similar days.

clukas1 · 2024-06-06T16:58:49Z

But sadly, you're right: for a large dataset (like Switzerland), every day will be a special case.

clukas1

sorry, have to leave another message

munterfi added 5 commits June 5, 2024 16:48

ENH: NAV-91 - Add eviction cache with MRU and LRU strategy

d54b2d9

STYLE: NAV-91 - Add logging to eviction cache

4a0f16a

ENH: NAV-91 - Use cache with LRU strategy for raptor in service

4be580b

STYLE: NAV-91 - Set log level of cache to debug

c587509

ENH: NAV-91 - Improve concurrent cache access

6d059e2

- Use reentrant lock instead of synchronized for critical sections in the cache.

munterfi requested a review from clukas1 June 5, 2024 15:54

munterfi self-assigned this Jun 5, 2024

clukas1 requested changes Jun 5, 2024

munterfi added 4 commits June 6, 2024 10:28

FIX: NAV-91 - Log tests on DEBUG level

b358d85

- The only compatible version of log4j with our project, seems to be 3.0.0-beta1.
- Versions below 3.0.0 and also 3.0.0-beta2 are ignoring the log4j2-test.properties file, which sets the default log level for tests to INFO.

ENH: NAV-91 - Use active trip set as cache key for raptor instead of …

abacfbe

…date

- Since many days have the same active trip set, this prevents from recomputing the raptor for each date.

ENH: NAV-91 - Cache also the active trips per day.

2385b27

- This prevents from recomputing the active trips for a day, that has recently been queried.

TEST: NAV-91 - Improve tests of eviction cache

6633abc

- Ensure values are not recomputed when cached.

clukas1 requested changes Jun 6, 2024

munterfi requested a review from clukas1 June 6, 2024 16:54

clukas1 requested changes Jun 6, 2024

munterfi added 2 commits June 7, 2024 09:47

TEST: NAV-91 - Use active calendars as cache key for raptor instances…

95c44b1

… instead of active trips

- Needs less memory in the cache and is more efficiently computed.

REFACTOR: NAV-91 - Extract inner class for raptor caching

2247888

munterfi requested a review from clukas1 June 7, 2024 08:14

clukas1 approved these changes Jun 7, 2024

clukas1 merged commit 168591c into main Jun 7, 2024
2 checks passed

clukas1 deleted the feature/NAV-91-create-raptor-cache-to-prevent-building-raptor-everytime-for-same-date-queries branch June 7, 2024 14:45
