-
Notifications
You must be signed in to change notification settings - Fork 258
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Question: Parallelization of AST Parsing (Plus a few other things) #590
Comments
Hi @fforres thanks for taking the time for that write-up! Learning how dependency-cruiser is used is very interesting to me - and helps a lot in making the tool better. I've tried to respond to the questions you raise one by one, but it'll be in several instalments :-)
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
Took me a bit to get back to this questions @sverweij Thanks a lot for that very very detailed explanation 🙏 So, started looking into all you mentioned and tried profiling our main frontend codebase (last run: For a bit of context,
The biggest part being, as you mentioned, reading files: visiting dependencies, so I went and tried diving a bit deeper into profiling/measurements and sum-up executions of internal functions, starting from The TLDR is that the The actual runtimes: (all in milliseconds)
Looked into some tracing node's --prof flag and looked to some flamegraphs via
So, I Started thinking on a few things, and had more questions for you 😅
Sorry for all the questions, this is a very interesting project, thanks a lot for your time 🙏 |
Hi @fforres thanks for the detailed analysis.
Thanks 😊. The next step will be to only run on the diff between this and the previous run (it now only serves from cache when nothing has changed).
It's still necessary and useful to keep looking for optimisations; the ecosystem moves on and might have new opportunities. And there might be inefficiencies in the code that weren't detected yet.
We use enhanced-resolve to resolve module names to files on disk. Different from other programming languages there's a myriad of ways this can be done (automatic expansions, implicit and explicit rules, aliases etc). Given a module name a resolver will try a list of options. E.g. when you import something from './some-module' it'll try whether We've chosen enhanced-resolve over other options (e.g. browserify's resolve, or node's own require resolver) because it gives us the flexibility to influence the module resolution a.o. with its plugin system. It also has decent caching - and it's a separate module we can actually integrate (as far as I know esbuild and swc don't have that). The fact webpack uses it for module resolution as well was only a secondary consideration.
If that exists (and it's just as feature rich as enhanced-resolve) it'd be interesting to have it in dependency-cruiser. Last time I checked swc and esbuild didn't have a separately callable resolver. My hypothesis is that enhanced-resolve (and/ or tsconfig-paths(/-plugin) in case of typescript) takes most most time in accessing files.
yes
It's definitely blocking. That would be a serious problem when nodejs would have something useful to do (e.g. when it's serving web requests). The other thing nodejs would be able to do at the same time is accessing other file(s). As far as I know that'll bump into physical restrictions of the disk in any case. In the past I've done some simple tests using sync vs async code accessing disk and to my surprise the async code was measurably slower. It's some time ago (> 1y), though, and I might've done something wrong in that test, so it might be worth a retry. Async code is more complicated to handle though, and is kind of viral requiring all calling code to be async as well (in the case of resolve.js about 18 modules): |
Heyooo @sverweij necromancing this old issue to bring something to your attention. So, recently I was navigating on the twitters, and I came across this project:
More than the compiler itself, I noticed they are composed of a couple of packaged tools and one of them is a resolver with the interesting claim of It is written in Rust, and seems to be a port of Docs: https://oxc-project.github.io/docs/guide/usage/resolver.html I know there's overhead with node/rust interop, (and probably that |
Hi @fforres thanks! That looks like a cool project - and 28x faster module resolution sounds very tempting. From a quick glance it looked like the (still experimental) nodejs binding has an interface that is quite close to enhanced-resolve, so it shouldn't be too hard to try out. |
Summary
Have a few questions
Context
First of all, want to start saying that dependency-cruiser is a great tool. We've been using it for a while at BrexHQ, and has made a few things a lot easier. So thanks for all the work you have done with it.
We are mainly using it to enforce architecture patterns across our codebase, structure migrations and detect what sections of our codebase have the most "violation" in order to prioritize work (Side-note, really helpful to structure long and gradual migrations 🔥)
I do have a question (/suggestion/me asking for pointers). Right now for all our main frontend codebase, (which is 100% typescript), it takes the full dependency analysis around 2 minutes to finalize. We are then running a few html-reports over it on CI and reporting the artifacts in a few ways.
2 minutes is not a long time in the grand scheme of things, however in our specific case, we are projecting a lot of growth on that codebase specifically, and we are also looking into a mono-repo structure for a lot of JS code (Which...yeah, that'll be huge 😅 )
So enough context, to my question. Been looking into the AST parsing sections of this cod and had a couple of questions:
comment
field?allowedPaths
/rulePathText
).Related to this, had another thoughts/questions, (maybe I can put it on another thread to better discuss? LMK)
One thing I find myself often reaching for, is a way to analyze what a module that I'm depending on is exporting. Like, if we could expose import-specific information, we could be able to do more granular rules 🤔
(Started thinking of if after #588)
For example, let's say we a file
performanceHelpers.ts
that exports maaaany things (bunch of types, default export, several named exports) and amongst all of them, a functionmeasureControllerPerformance
.If I want to enforce that
fileThatNeedsMeasurement.ts
importsperformanceHelpers.ts
, I can do it... However, i cannot enforce thatmeasureControllerPerformance
.(We can argue that we cannot enforce
measureControllerPerformance
is being used... so what's the point 😆 but it gives us users a bit control over their rules)We could also expose a module's "export information" and be able to create rules that enforce that a
xxxController.tsx
for example, has to have a specifically named export. etcSorry for the wall of text 😅
I had a few thoughts after using this tool for a bit, and wanted to pick your brain, maybe you have thought of all of these already and decided for-against them, or not.
Been getting familiarized with the code, and I'd be more than happy to either help on any of this or tbh, just discuss about it. Didn't want to come across as the "fix your opensource stuff for me" kind of reporter 😅
Either way, thanks a lot for the work you've done here 🙏
Environment
The text was updated successfully, but these errors were encountered: