-
Notifications
You must be signed in to change notification settings - Fork 795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PoC: Compiler caches #18190
base: main
Are you sure you want to change the base?
PoC: Compiler caches #18190
Conversation
❗ Release notes requiredCaution No release notes found for the changed paths (see table below). Please make sure to add an entry with an informative description of the change as well as link to this pull request, issue and language suggestion if applicable. Release notes for this repository are based on Keep A Changelog format. The following format is recommended for this repository:
If you believe that release notes are not necessary for this PR, please add NO_RELEASE_NOTES label to the pull request. You can open this PR in browser to add release notes: open in github.dev
|
I have added a CWT-based cache as well (optional, off by default), but it really needs more testing to make sure nothing is leaking. |
src/Compiler/Utilities/Caches.fs
Outdated
while not cts.Token.IsCancellationRequested do | ||
if this.GetEvictCount() > 0 then | ||
this.TryEvictItems () | ||
// do! Task.Delay(100, cts.Token) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exponential backoff based on number of evicted items?
Did I evict 0 - can afford a longer delay.
A lot of stuff evicted - try again shortly after.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that was my idea as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I made it based on the cache utilization. It will fluctuate from 0 to 1 seconds. Worst case scenario would be if cache is populated in bursts (and has max size way off), then it will resize. Wondering if it's good enough for now? @0101 @T-Gro @KevinRansom thoughts?
FYI, I have removed it for time being, since making it part of one Cache object shown to be problematic, since in many places |
let inline mkDelayedSeq (f: unit -> IEnumerable<'T>) = | ||
mkSeq (fun () -> f().GetEnumerator()) | ||
|
||
let inline sortBy ([<InlineIfLambda>] projection) (source: ConcurrentDictionary<_, _>) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really sure about InlineIfLambda
here.
// Default Seq.* function have one issue - when doing `Seq.sortBy`, it will call a `ToArray` on the collection, | ||
// which is *not* calling `ConcurrentDictionary.ToArray`, but uses a custom one instead (treating it as `ICollection`) | ||
// this leads to and exception when trying to evict without locking (The index is equal to or greater than the length of the array, | ||
// or the number of elements in the dictionary is greater than the available space from index to the end of the destination array.) | ||
// this is casuedby insertions happened between reading the `Count` and doing the `CopyTo`. | ||
// This solution introduces a custom `ConcurrentDictionary.sortBy` which will be calling a proper `CopyTo`, the one on the ConcurrentDictionary itself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat important to note. There are almost no changes comparing to how Seq.sortBy
works, it just calls into ConcurrentDictionary.ToArray
, instead of internal toArray
(which results in calling Enumerable.ToArray
, which is not thread-safe).
| true, _ -> eviction.Trigger(key) | ||
| _ -> () // TODO: We probably want to count eviction misses as well? | ||
|
||
// TODO: Shall this be a safer task, wrapping everything in try .. with, so it's not crashing silently? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another question here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have some solid tests to be pretty confident it won't crash and will keep doing what it should, then it's probably fine. Also since it's executed with Task.Run
, does task
vs. backgroundTask
make any difference?
src/Compiler/Utilities/Caches.fs
Outdated
|
||
} | ||
|
||
// TODO: Explore an eviction shortcut, some sort of list of keys to evict first, based on the strategy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be quite easy to do - maintain some structure as shortcut (like linked list), and move pointers to either end of it based on strategy (most accessed to the end, for example), and chop beginning, since we don't care about ordering.
The question is if it's worth to allocate, maintain ans synchronise it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say it's worth it, especially if we want to put more items in the cache. Since eviction is super cheap then it can be done on each access. Also we then wouldn't need the CachedEntity
type. At least for the eviction strategies based on recency.
I guess the only downside is that we'd have to maintain the lock to ensure thread safety. Probably wouldn't need a ConcurrentDictionary then also.
Anyway, this can always be done as a future improvement. Ideally if we have some benchmarks for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm not a huge fan of dealing with concurrency myself tbh, when CD exists and is done well. I'll see what can be done with backlog.
This is more or less ready (baselines need updating, I will do it this week). This, in my opinion, is good to replace Approach here is pretty straightforward, please refer to comments. cc @KevinRansom, @0101, @T-Gro PTAL, i would love to have it under preview for tooling in 9.0.300. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Just needs some basic tests and plugging it into the place where it's needed.
src/Compiler/Utilities/Caches.fs
Outdated
type Cache<'Key, 'Value> (options: CacheOptions) as this = | ||
|
||
let capacity = options.MaximumCapacity + (options.MaximumCapacity * options.PercentageToEvict / 100) | ||
let cts = new CancellationTokenSource() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed when it never gets cancelled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it in case if there's ever a need for it to get cancelled (via dispose).
This is a very first naive draft of universal cache for compiler internals.
Currently it's very
dumbstraightforward - have an underlying concurrent dictionary, evict on every insert.Things it lacks:
It currently solves the only issue versus using
ConcurrentDictionarry
- eviction. It can evict things, though not in the most efficient way possible.Personally, I think, that first it has to be at least as good as
ConcurrentDictionary
+eviction, so we can enable some caching for tooling as well as for compiler.