-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/nav 91 create raptor cache to prevent building raptor everytime for same date queries #52
Conversation
- Use reentrant lock instead of synchronized for critical sections in the cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments, suggestions to improve tests.
src/main/java/ch/naviqore/service/impl/PublicTransitServiceImpl.java
Outdated
Show resolved
Hide resolved
Regarding point two in the Jira Issue, I'm aware not every Thursday etc. is the same. But I think if we combine all active service ids, we can generate a meaningful cache key. Because I expect most of the regular weekdays etc will have the same service ids. Might require adding a function to create this key. Regarding mask, sure we can huddle up and integrate it into the raptor algorithm. I'm busy on Friday, but should have some time on Saturday. |
- The only compatible version of log4j with our project, seems to be 3.0.0-beta1. - Versions below 3.0.0 and also 3.0.0-beta2 are ignoring the log4j2-test.properties file, which sets the default log level for tests to INFO.
…date - Since many days have the same active trip set, this prevents from recomputing the raptor for each date.
- This prevents from recomputing the active trips for a day, that has recently been queried.
- Ensure values are not recomputed when cached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again some small comments
pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weird that only beta works, but good if that works
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have somehow a conflict between the logging of spring and the log4j which we pull directly in the pom.xml
.
Logging works for testing and production, but in production the formatting is not consistent anymore, so it uses a mix of spring and log4j2.properties
:
I gues we should open a new ticket in JIRA to fix this. At the moment at least all logs we want are shown, although not in the format we want. For now I can live with that...
src/main/java/ch/naviqore/service/impl/PublicTransitServiceImpl.java
Outdated
Show resolved
Hide resolved
src/main/java/ch/naviqore/service/impl/PublicTransitServiceImpl.java
Outdated
Show resolved
Hide resolved
Yes you are right, in theorie this should work, and it is now implemented in abacfbe. On the integration test data sample this works well, but on the GTFS of Switzerland, almost every day has its own active trip set. So even if we query normal weekdays, most of the time a new raptor will be created. Here also an example how the equals on Sets works in Java. I was not sure, if the set instance itself matters or not: It does not, just its content is considered in equals: package ch.naviqore.example;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.RequiredArgsConstructor;
import java.util.HashSet;
import java.util.Set;
public class Test {
@RequiredArgsConstructor
@Getter
@EqualsAndHashCode
static class Instance {
private final String id;
}
public static void main(String[] args) {
Instance instanceA = new Instance("a");
Instance instanceA2 = new Instance("a");
Instance instanceB = new Instance("b");
Instance instanceC = new Instance("c");
Set<Instance> set1 = new HashSet<>();
Set<Instance> set2 = new HashSet<>();
set1.add(instanceA);
set1.add(instanceA2);
set1.add(instanceB);
set1.add(instanceC);
set2.add(instanceA);
set2.add(instanceB);
set2.add(instanceC);
System.out.println(set1.equals(set2));
// true
set2.add(instanceA);
System.out.println(set1.equals(set2));
// true
set1.remove(instanceA);
System.out.println(set1.equals(set2));
// false
}
} |
Review already done :) See my comments before you requested the review. Only thing I would consider to reduce the set size, is using sets of calendars instead of trips to identify similar days. |
But sadly, you're right for a large dataset (like switzerland) every day will be a special case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, have to leave another message
… instead of active trips - Needs less memory in the cache and is more efficiently computed.
Cache raptor instances per day using LRU strategy.
@clukas1 considering your description in the Jira ticket:
The second point won't work, since there are special calendar date in the GTFS, e.g. Christmas 1. May, ... So we have to cache raptor instances per date not weekday.
But: I think I got a solution to mask trips inside Raptor. This will also eradicate the need for the stop validations (except departure time), since we have all stops from the GTFS in the raptor, but some stops will just have no active trips.
I can prepare everything around it, but would be nice to build this together into the mega loop of the Raptor, maybe Friday evening or at the weekend?
Unfortunately we still have some strange errors when stops are not present in the Raptor: