From 661b6045aacf660b204432bdae3887468ea14de4 Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Tue, 28 Mar 2023 14:50:01 -0700 Subject: [PATCH 001/108] Start list of OOBs --- .../2023/net8.0-polyfills/net8.0-polyfills.md | 29 +++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100644 accepted/2023/net8.0-polyfills/net8.0-polyfills.md diff --git a/accepted/2023/net8.0-polyfills/net8.0-polyfills.md b/accepted/2023/net8.0-polyfills/net8.0-polyfills.md new file mode 100644 index 000000000..c4d6fa5ce --- /dev/null +++ b/accepted/2023/net8.0-polyfills/net8.0-polyfills.md @@ -0,0 +1,29 @@ +# .NET 8.0 Polyfill + +**Owner** [Immo Landwerth](https://github.com/terrajobst) + +This document covers which APIs we intend to ship for older versions of .NET, +which includes .NET Standard and .NET Framework. + +## Polyfills + +| Polyfill | Assembly | Package | Existing? | API | Contacts | +| ------------ | -------------------------- | -------------------------- | --------- | ---------------------- | -------------------------- | +| TimeProvider | Microsoft.Bcl.TimeProvider | Microsoft.Bcl.TimeProvider | No | [dotnet/runtime#36617] | [@tarekgh] [@geeknoid] | +| IPNetwork | Microsoft.Bcl.IPNetwork | Microsoft.Bcl.IPNetwork | No | [dotnet/runtime#79946] | [@antonfirsov] [@geeknoid] | + +* `TimeProvider`. This type is an abstraction for the current time and time + zone. In order to be useful, it's an exchange type that needs to be plumbed + through several layers, which includes framework code (such as `Task.Delay`) + and user code. + +* `IPNetwork`. It's a utilitarian type that is used across several parties. Not + necessarily a critical exchange type but if we don't ship it downlevel, + parties (such as [@geeknoid]'s team) will end up shipping their own copy and + wouldn't use the framework provided type, even on .NET 8.0. 
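To illustrate why `TimeProvider` needs to ship downlevel as an exchange type, here is a rough sketch of how user code consumes the abstraction. The `CacheEntry` type is made up for illustration, and the `Task.Delay` overload that accepts a `TimeProvider` is the .NET 8.0 shape; downlevel callers would go through whatever helpers the polyfill package exposes.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Illustrative only: the class depends on the abstraction rather than on
// DateTimeOffset.UtcNow, so the same code runs against the system clock in
// production and against a fake clock in tests.
public sealed class CacheEntry
{
    private readonly TimeProvider _timeProvider;

    public CacheEntry(TimeProvider timeProvider)
    {
        _timeProvider = timeProvider;
        CreatedAt = timeProvider.GetUtcNow();
    }

    public DateTimeOffset CreatedAt { get; }

    public async Task ExpireAfterAsync(TimeSpan delay, CancellationToken cancellationToken)
    {
        // Task.Delay is one of the framework layers the abstraction has to be
        // plumbed through.
        await Task.Delay(delay, _timeProvider, cancellationToken);
    }
}

// Production code passes TimeProvider.System; tests pass a fake implementation.
```

If the type only existed on .NET 8.0, libraries that also target .NET Standard or .NET Framework could not accept it in their public APIs, which is what makes the downlevel package valuable.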
+ +[@tarekgh]: https://github.com/tarekgh +[@geeknoid]: https://github.com/geeknoid +[@antonfirsov]: https://github.com/antonfirsov +[dotnet/runtime#36617]: https://github.com/dotnet/runtime/issues/36617 +[dotnet/runtime#79946]: https://github.com/dotnet/runtime/issues/79946 From 43883b281336e420aad61a43a6d1ea142941509b Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Tue, 28 Mar 2023 14:51:02 -0700 Subject: [PATCH 002/108] Add .NET 8.0 Polyfills to index --- INDEX.md | 1 + 1 file changed, 1 insertion(+) diff --git a/INDEX.md b/INDEX.md index 5086c5d94..9a7c3f915 100644 --- a/INDEX.md +++ b/INDEX.md @@ -81,6 +81,7 @@ Use update-index to regenerate it: | 2021 | [TFM for .NET nanoFramework](accepted/2021/nano-framework-tfm/nano-framework-tfm.md) | [Immo Landwerth](https://github.com/terrajobst), [Laurent Ellerbach](https://github.com/Ellerbach), [José Simões](https://github.com/josesimoes) | | 2021 | [Tracking Platform Dependencies](accepted/2021/platform-dependencies/platform-dependencies.md) | [Matt Thalman](https://github.com/mthalman) | | 2022 | [.NET 7 Version Selection Improvements](accepted/2022/version-selection.md) | [Rich Lander](https://github.com/richlander) | +| 2023 | [.NET 8.0 Polyfill](accepted/2023/net8.0-polyfills/net8.0-polyfills.md) | [Immo Landwerth](https://github.com/terrajobst) | | 2023 | [Experimental APIs](accepted/2023/preview-apis/preview-apis.md) | [Immo Landwerth](https://github.com/terrjobst) | | 2023 | [net8.0-browser TFM for applications running in the browser](accepted/2023/net8.0-browser-tfm.md) | [Javier Calvarro](https://github.com/javiercn) | From 6f92b6c361496a6b8de9680499640dad348f5056 Mon Sep 17 00:00:00 2001 From: Rich Lander Date: Sat, 19 Aug 2023 11:10:35 -0700 Subject: [PATCH 003/108] Proposal for vector instruction default (#173) * Add proposal for vector instruction default * Add owner marker --------- Co-authored-by: Immo Landwerth --- proposed/vector-instruction-set-default.md | 102 +++++++++++++++++++++ 1 file changed, 102 insertions(+) create mode 100644 proposed/vector-instruction-set-default.md diff --git a/proposed/vector-instruction-set-default.md b/proposed/vector-instruction-set-default.md new file mode 100644 index 000000000..a50e8eb5b --- /dev/null +++ b/proposed/vector-instruction-set-default.md @@ -0,0 +1,102 @@ +# Target AVX2 in R2R images + +**Owner** [Richard Lander](https://github.com/richlander) + +[Vector instructions](https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-5/#intrinsics) have become one of the key pillars of performance in the .NET libraries. Because of that, .NET apps light up a greater percentage of the transistors in a CPU/SOC than they previously did. That's progress. However, we've left performance opportunities on the table that we should consider picking up to make .NET apps run faster and more cheaply (in hosted environments). + +Related: [Initial JIT support for SIMD](https://devblogs.microsoft.com/dotnet/the-jit-finally-proposed-jit-and-simd-are-getting-married/) + +The minimum supported [SIMD instruction set](https://en.wikipedia.org/wiki/SIMD) for and required by .NET is [SSE2](https://en.wikipedia.org/wiki/SSE2). SSE2 predates the x64 processor by a couple years. SSE2 is old! However, since it is the minimum supported SIMD instruction set, the native code present and executed in ready-to-run images is (you guessed it) SSE2-based. 
That means your machine uses slower SSE2 instructions even if it (and it almost certainly does) supports the newer [AVX2](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#Advanced_Vector_Extensions_2) instructions.We rely on the JIT to take advantage of newer (and wider) SIMD instructions than SSE2, such as AVX2. The way the JIT does this is good but not aggressive enough to matter for startup or for apps with shorter life spans (measured in minutes). + +SSE4 was introduced with the [Intel Core 2](https://en.wikipedia.org/wiki/Intel_Core_(microarchitecture)) launch in 2006, along with x64. That means that all x64 chips are SSE4 capable, and that SSE2 is really just a holdover from our 32-bit computer heritage. That would make SSE2 a poor baseline for x64 software, but a good one for 32-bit. + +AVX2 was released in late 2013 with [Intel Haswell](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX2) processors. After more than seven years in market, we should be able to take advantage of AVX2 more actively, to the end of improved performance, and lighting up even more CPU transistors. + +The advantage of AVX2 is not tens of percentage points. It is important not to oversell the importance of using later SIMD instructions. In the general case, using AVX2 might deliver low double-digit wins. For server apps that run all day, this could be important, and might allow for using a lower-cost tier of cloud machine and/or delivering higher performance. We also expect a significant startup improvement. + +You might be wondering about [AVX-512](https://en.wikipedia.org/wiki/AVX-512). Hardware intrinsics have not yet been defined (for .NET) for AVX-512. AVX-512 is also known to have more [nuanced performance](https://lemire.me/blog/2018/08/25/avx-512-throttling-heavy-instructions-are-maybe-not-so-dangerous/). For now, our vectorized code tops out at AVX2 for the x86-x64 [ISA](https://en.wikipedia.org/wiki/Instruction_set_architecture). Also, too few machines support AVX-512 to consider making it a default choice. + +You might be wondering about Arm processors. Assuming we adopted this plan for x86-x64, we'd do something similar for Arm64. The Arm ISA defines [NEON](https://en.wikipedia.org/wiki/ARM_architecture#Advanced_SIMD_(Neon)) as part of [Armv7](https://en.wikipedia.org/wiki/ARM_architecture#32-bit_architecture), NEON (enhanced) in [Armv8A](https://en.wikipedia.org/wiki/AArch64#AArch64_features) and [SVE](https://en.wikipedia.org/wiki/AArch64#Scalable_Vector_Extension_(SVE)) as part of [Armv8.2A](https://en.wikipedia.org/wiki/AArch64#ARMv8.2-A). [SVE2](https://en.wikipedia.org/wiki/AArch64#Scalable_Vector_Extension_2_(SVE2)) will appear sometime in the future. The .NET code base isn't vectorized much or at all for Arm32, such that this proposal doesn't apply for 32-bit Arm. On Arm64, we'd need to ensure that we make choices that take Raspberry Pi, laptops and any other form factors into consideration. + +This proposal is primarily oriented around the Intel ISA because it is the first order of business to determine a solution for Intel, and more people are familiar with the Intel SIMD instructions. + +There are possible downsides to this proposal. If we target AVX2 by default, people with non-AVX2 machines will have a significantly worse experience. The goal of this document is to propose a solution that delivers most of the expected benefit of AVX2 without much performance regression pain. 
It is easy to fall into the trap that we only need to worry about developers and think that developers only use the very latest machines. That's not true. There are also developers building client apps for end-users. We also continue to support Windows Server 2012 R2. Some of those machines are likely very old. There is also likely significant differences by country, possibly correlating to GDP. Needless to say, people using .NET as a developer or an end-user use a great variety of machines. + +At the same time, we need to make choices that move .NET metrics and capabilities forward. Let's explore this topic. + +Supporting ecosystem data: + +- [Steam hardware survey](https://store.steampowered.com/hwsurvey) (see "Other Settings") +- [Azure VMs](https://azure.microsoft.com/en-us/pricing/details/virtual-machines/series/) -- these all support AVX2; we expect same for other clouds + +Note: "SSEx" is used to refer to any of the SSE versions, like SSE2 and SSE4. + +## .NET 5.0 behavior + +Let's start with the SIMD characteristics for .NET 5.0: + +- Ready-to-run images target SSE2 instructions for vectorized code (not just by default; it's the only option). That means ready-to-run images include processor-specific machine code with SSE2 instructions and never any AVX/AVX2 instructions. +- The JIT (via tiered compilation) can re-jit methods so that they take advantage of a newer SIMD instruction set (if present on a given machine), like SSE4, AVX, or AVX2. The JIT doesn't do this proactively, but will only rejit methods that are used a lot, per its normal behavior. It doesn't re-jit methods solely to use better SIMD instructions, but will use the best SIMD instructions available (on a given machine) when it does rejit methods, again, per its normal behavior. +- As a result, applications rarely run at peak performance initially (since most machines support newer SIMD instructions than SSE2) and it takes a long time for an app to get to peak performance (via tiering to SSE4, AVX, or AVX2-enabled code). +- This behavior is fine for machines with only SSE2, and is leaving performance on the table for newer machines. + +## Context, characteristics and constraints + +The most obvious solution is to target AVX2 by default, and this approach would be the clear winner for anyone using a machine that was bought after 2015. The challenge is that this change would be significant regression for SSEx machines, even more significant than the win it would provide for AVX2-capable machines, in the general case. + +The primary issue is that ready to run code for (what is likely to be) performance sensitive methods would no longer be usable on SSEx machines, but would require JITing IL methods instead. The resultant native code would be the same (actually, it would be better; we'll leave that topic for another day), but would require time to produce via JIT compilation. As a result, startup time for apps on SSEx machines would be noticeably degraded. We guess the startup regression would be unacceptable, but need to measure it. + +Making AVX2 the default would be the same as dropping support for SSEx. If we do that, we'd need to announce it, and reason about what an SSEx offering looks like, if any. + +This discussion on [AVX and AVX2 requirements from Microsoft Teams](https://medium.com/365uc/why-cant-i-use-background-blur-or-custom-backgrounds-in-microsoft-teams-a514e9da3921) provides a sense of how hardware requirements can affect users. 
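To make the hardware discussion concrete, here is a small illustrative program (not part of the proposal) that prints which SIMD instruction sets the JIT can use on the current machine and how wide `Vector<T>` ends up being; the vectorized paths in the libraries branch on these same `IsSupported` checks.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics.X86;

class SimdSupport
{
    static void Main()
    {
        // What the running machine (and therefore the JIT) can use.
        Console.WriteLine($"SSE2:   {Sse2.IsSupported}");
        Console.WriteLine($"SSE4.2: {Sse42.IsSupported}");
        Console.WriteLine($"AVX:    {Avx.IsSupported}");
        Console.WriteLine($"AVX2:   {Avx2.IsSupported}");

        // Vector<T> is sized by the JIT for the best instruction set it targets:
        // typically 16 bytes without AVX2 and 32 bytes with it.
        Console.WriteLine($"Vector<byte>.Count: {Vector<byte>.Count}");
    }
}
```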
+ +The [Steam hardware survey](https://store.steampowered.com/hwsurvey) tells us that ~98% of Steam users have SSE4, 94% have AVX, and ~80% have AVX2. We could instead target one of the SSE4 variants. The performance win wouldn't be as high, but the effective regression would be much more limited. Targeting the middle is unappealing when we know almost all developers and all clouds are AVX2-capable. We need a more appealing solution, without as much tension between tradeoffs. + +Note: Stating the obvious, Steam users and .NET users don't fully overlap, however, the Steam hardware survey provides a great view on PC computing in the world. One can guess that .NET users might be using 5 points more or less (that's a 10 point swing) modern hardware than Steam users, but not much more than that (in either direction). Also, if Unity was ever to take a dependency on upstream .NET, they'd need it to work on Steam machines. That's definitely not an announcement, just emphasizing that using the Steam data is reasonable. + +It is important to realize that this proposal is for a product that will be released in November 2021, and not likely used significantly until 2022. Early 2022 should be the primary time frame kept in mind. The Steam survey, for example, will report different numbers at that time. It's possible AVX2 will have reached today's AVX value by next year. To that point, PC sales have had a [surprising resurgence](https://www.idc.com/getdoc.jsp?containerId=prUS47274421) due to the pandemic, resulting in a lot of old machines being removed from the worldwide fleet (at least in countries with high GDP). + +An important degree of freedom is making different choices, per OS and hardware combination. Different OSes have a different hardware dynamic, different use cases, and a different future. Let's look at that. It's the heart of the proposal. + +## Differ behavior by OS + +It is reasonable to vary SIMD defaults, per operating system. There are many aspects of the product that differ by operating system or hardware platform. In fact, the JIT dynamically targeting different SIMD instructions per machine is such an example. For this case, the key consideration is the ready-to-run native code produced for a given operating system + CPU consideration. We could make one decision for Linux x64, another for Windows x64, and another for macOS Arm64. We already have to target different SIMD instructions for x64 vs Arm64, so targeting different SIMD instructions across OSes isn't much of a leap. + +It is useful to look at hardware dates again. Continuing with the Intel focus, we can see the following delivery of SIMD instructions in chips: + +* 2006: SSE4 was introduced with Core 2 x64 chips at the initial [Intel Core architecture](https://en.wikipedia.org/wiki/Intel_Core_(microarchitecture)) launch. +* 2008: SSE4.2 was introduced with [Nehalem](https://en.wikipedia.org/wiki/Nehalem_(microarchitecture)) chips. +* 2011: AVX was introduced with [Sandy Bridge](https://en.wikipedia.org/wiki/Sandy_Bridge) chips. +* 2013: AVX2 was introduced with [Haswell](https://en.wikipedia.org/wiki/Haswell_(microarchitecture)) chips. + +Let's take a look at each operating system, assuming the Intel ISA. + +* On Windows, we need to be conservative. There are a lot of .NET users on Windows, and the majority of .NET desktop apps target Windows. Windows 10 requires AVX, but Win7 requires only SSE2. It is reasonable to expect that we'll continue supporting Windows 7 with .NET 6.0 (and probably not after that). 
+* On [macOS](https://en.wikipedia.org/wiki/MacOS_version_history#Version_10.13:_%22High_Sierra%22), it appears that [Mac machines have had AVX2 since late 2013](https://en.wikipedia.org/wiki/List_of_Macintosh_models_grouped_by_CPU_type#Haswell), when they adopted [Haswell chips](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX2). In terms of macOS, [macOS Big Sur (11.0)](https://support.apple.com/kb/sp833?locale=en_US) appears to be the first version to require Haswell. .NET 5.0, for example, supports [macOS High Sierra (10.13)](https://support.apple.com/kb/SP765?locale=en_US) and later. macOS 10.13 supports hardware significantly before Haswell. For .NET 6.0, we'll likely continue to choose a "10." macOS version as the minimum, possibly [macOS Catalina (10.15)](https://support.apple.com/kb/SP803?locale=en_US), aligning with our past practice of slowly (not aggressively) moving forward OS requirements. Zooming out, macOS x64 is now a legacy platform. We shouldn't rock the boat with a change that could cause a performance regression at this late stage of the macOS x64 lifecycle. +* On Linux, we can assume that there is at least as much diversity of hardware as Windows, however, we can also assume that .NET usage on Linux is more narrow, targeted at developers and (mostly) production deployments. + +The following is a draft plan on how to approach different systems for this proposal: + +* Windows x86 -- Target SSE2 (status-quo; match Windows 7) +* Windows x64 -- Target AVX2 +* Linux x64 -- Target AVX2 +* macOS x64 -- Target SSE2 (status-quo; alternatively, target SSE4 if it is straightforward and has significant value) +* Arm64 (all OSes) -- Target Armv8 NEON +* Linux Arm32 -- N/A (there is very little vectorized code) +* Windows Arm32 -- N/A (unsupported platform) + +Note on [NEON](https://en.wikipedia.org/wiki/ARM_architecture#Advanced_SIMD_(Neon)): The [Raspberry Pi 3 (and later) supports Armv8 NEON](https://en.wikipedia.org/wiki/Raspberry_Pi#Specifications). Apple M1 chips apparently have [great NEON performance](https://lemire.me/blog/2020/12/13/arm-macbook-vs-intel-macbook-a-simd-benchmark/). + +There may be a significant set of users that have very old machines, either developers (all OSes) or end-users (primarily Windows). The Windows 32-bit offering should satisfy developers on very old Windows machines. We have no such offering for Linux. macOS developers should be unaffected. Developers may also have concerns about their end-users. They will be able to generate self-contained apps that are re-compiled (via crossgen2) to target an earlier SIMD instruction set (such as SSE2). + +In conclusion, this plan should have the following outcomes: + +- .NET apps will run at peak performance (as it related to vector instructions) on modern hardware. +- Developers who only have access to very old machines will have a satisfactory option on Windows. +- Developers have a supported path to satisfy end-users on very old machines. + +## Next steps + +- Productize crossgen2 (the new version of crossgen that includes the capability to target higher SIMD instructions than SSE2). +- Determine the performance wins and regressions of various SIMD instruction sets. For example, what is the difference between SSE2 and SSE4, and between AVX and AVX2? How large is the performance regression for SSE2-only machines? +- Determine the distribution of machines in the wild that support various SIMD instruction set. 
It is expected that nearly all developer machines and all cloud VMs (all clouds) support AVX2. Sites like [statcounter.com](https://gs.statcounter.com/os-version-market-share/macos/desktop/worldwide) may prove useful. From 9566772afdee8730160e59fe2cc0736b181da3a4 Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Mon, 21 Aug 2023 15:49:31 -0700 Subject: [PATCH 004/108] Add file to index --- INDEX.md | 1 + 1 file changed, 1 insertion(+) diff --git a/INDEX.md b/INDEX.md index 5086c5d94..62799f597 100644 --- a/INDEX.md +++ b/INDEX.md @@ -98,4 +98,5 @@ Use update-index to regenerate it: | | [Rate limits](proposed/rate-limit.md) | [John Luo](https://github.com/juntaoluo), [Sourabh Shirhatti](https://github.com/shirhatti) | | | [Readonly references in C# and IL verification.](proposed/verifiable-ref-readonly.md) | | | | [Ref returns in C# and IL verification.](proposed/verifiable-ref-returns.md) | | +| | [Target AVX2 in R2R images](proposed/vector-instruction-set-default.md) | [Richard Lander](https://github.com/richlander) | From 41448327432aeff4cdb567dbc558ac7b4674b7e6 Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Mon, 21 Aug 2023 15:58:05 -0700 Subject: [PATCH 005/108] Clarify behavior of severity --- accepted/2023/preview-apis/preview-apis.md | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/accepted/2023/preview-apis/preview-apis.md b/accepted/2023/preview-apis/preview-apis.md index d3bfd9773..ba22a0594 100644 --- a/accepted/2023/preview-apis/preview-apis.md +++ b/accepted/2023/preview-apis/preview-apis.md @@ -225,7 +225,17 @@ both options are being presented to the consumer. ### Compiler Behavior The compiler will raise a diagnostic when an experimental API is used, using the -supplied diagnostic ID. The severity is always error. +supplied diagnostic ID. + +> [!NOTE] +> The severity is warning, because errors cannot be suppressed. +> +> However, there will also be a generic compiler diagnostic ID that applies to +> all warnings raised for using experimental APIs (like nullable). The built-in +> `editor.config` that we ship with the .NET SDK will elevate these warnings to +> errors. From a user's standpoint this will result in these diagnostics to +> appear as errors that they are expected to suppress, which is the UX we +> desire. The semantics are identical to how obsolete is tracked, except there is no special treatment when both caller and callee are in the same assembly -- any From 821bd9e9646117088dbd8b0aaeb2fd1b91d2661e Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Tue, 26 Sep 2023 13:08:58 -0700 Subject: [PATCH 006/108] Update accepted/2023/preview-apis/preview-apis.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Alexander Köplinger --- accepted/2023/preview-apis/preview-apis.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/accepted/2023/preview-apis/preview-apis.md b/accepted/2023/preview-apis/preview-apis.md index ba22a0594..e526cb5ea 100644 --- a/accepted/2023/preview-apis/preview-apis.md +++ b/accepted/2023/preview-apis/preview-apis.md @@ -232,7 +232,7 @@ supplied diagnostic ID. > > However, there will also be a generic compiler diagnostic ID that applies to > all warnings raised for using experimental APIs (like nullable). The built-in -> `editor.config` that we ship with the .NET SDK will elevate these warnings to +> `.editorconfig` that we ship with the .NET SDK will elevate these warnings to > errors. 
From a user's standpoint this will result in these diagnostics to > appear as errors that they are expected to suppress, which is the UX we > desire. From 17cc8781ad48209b5b7a817343b05d708bc9d4b8 Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Tue, 26 Sep 2023 13:18:46 -0700 Subject: [PATCH 007/108] Update document --- accepted/2023/preview-apis/preview-apis.md | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/accepted/2023/preview-apis/preview-apis.md b/accepted/2023/preview-apis/preview-apis.md index e526cb5ea..1bed4a4ee 100644 --- a/accepted/2023/preview-apis/preview-apis.md +++ b/accepted/2023/preview-apis/preview-apis.md @@ -224,18 +224,14 @@ both options are being presented to the consumer. ### Compiler Behavior +*For more details on the compiler behavior, see [C# spec][csharp-spec].* + The compiler will raise a diagnostic when an experimental API is used, using the supplied diagnostic ID. > [!NOTE] -> The severity is warning, because errors cannot be suppressed. -> -> However, there will also be a generic compiler diagnostic ID that applies to -> all warnings raised for using experimental APIs (like nullable). The built-in -> `.editorconfig` that we ship with the .NET SDK will elevate these warnings to -> errors. From a user's standpoint this will result in these diagnostics to -> appear as errors that they are expected to suppress, which is the UX we -> desire. +> The severity is warning, because errors cannot be suppressed. However, the +> warning is promoted to an error for purpose of reporting. The semantics are identical to how obsolete is tracked, except there is no special treatment when both caller and callee are in the same assembly -- any @@ -264,6 +260,7 @@ the usual means (i.e. `#pragma` or project-wide `NoWarn`). - We should point users to this new attribute (`ExperimentalAttribute`) [preview-features]: ../../2021/preview-features/preview-features.md +[csharp-spec]: https://github.com/dotnet/csharplang/blob/main/proposals/csharp-12.0/experimental-attribute.md ## Q & A From 588482b2498d4280979f31ce7d0e952cd8b78d35 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 27 Sep 2023 18:08:43 +0200 Subject: [PATCH 008/108] work in progress --- accepted/2023/wasm-browser-threads.md | 268 ++++++++++++++++++++++++++ 1 file changed, 268 insertions(+) create mode 100644 accepted/2023/wasm-browser-threads.md diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md new file mode 100644 index 000000000..ddf6ff577 --- /dev/null +++ b/accepted/2023/wasm-browser-threads.md @@ -0,0 +1,268 @@ +# Multi-threading on a browser + +## Goals + - CPU intensive workloads on dotnet thread pool + - enable blocking .Wait APIs from C# user code on all threads + - Current public API throws PNSE for it + - This is core part on MT value proposition. + - If people want to use existing MT code-bases, most of the time, the code is full of locks. + - People want to use existing desktop/server multi-threaded code as is. + - allow HTTP and WS C# APIs to be used from any thread + - Underlying JS object have thread affinity + - JSImport/JSExport interop in maximum possible extent + - don't change/break single threaded build. † + +## Lower priority goals + - try to make it debugging friendly + - implement crypto via `subtle` browser API + - allow calls to synchronous JSExport from UI thread (callback) + +† Note: all the text below discusses MT build only, unless explicit about ST build. 
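For readers who have not used the existing interop, the single-threaded shape that the goals above refer to looks roughly like this (the function names are illustrative):

```csharp
using System.Runtime.InteropServices.JavaScript;
using System.Threading.Tasks;

internal static partial class Interop
{
    // C# -> JS: the generated stub marshals the arguments to the JS function.
    [JSImport("globalThis.console.log")]
    internal static partial void Log(string message);

    // A returned JS object is surfaced as a JSObject proxy, which has affinity
    // to the JS thread that created it.
    [JSImport("globalThis.fetch")]
    internal static partial Task<JSObject> FetchAsync(string url);

    // JS -> C#: callable from the JS side of the worker that hosts the runtime.
    [JSExport]
    internal static string Greet(string name) => $"Hello, {name}!";
}
```

The multi-threaded questions below are about which thread such stubs may run on, where the resulting proxies live, and who dispatches calls between threads.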

## Context - Problems
**1)** If you have multithreading, any thread might need to block while waiting for any other to release a lock.
 - locks are in user code, in NuGet packages, and in the Mono VM itself
 - there are managed and unmanaged locks
 - in the single-threaded build of the runtime, all of this is a no-op. That's why it works on the UI thread.

**2)** UI thread in the browser can't synchronously block
 - you can spin-lock, but it's a bad idea.
    - Deadlock: when you spin-block, the JS timer loop and any messages are not pumping. But code in other threads may be waiting for some such event to resolve.
    - It eats your battery
    - Browser will kill your tab at random point (Aw, snap).
    - It's not deterministic and you can't really test your app to prove it harmless.
 - Firefox (still) has synchronous XHR which could be captured by async code in service worker
    - it's deprecated legacy API
    - but other browsers don't and it's unlikely they will implement it
    - there are deployment and security challenges with it
 - all the other threads/workers could synchronously block
 - if we have a managed thread on the UI thread, any `lock` or Mono GC barrier could cause a spin-wait
    - in the case of Mono code, we at least know it's of short duration
    - we should prevent it from blocking in user code

**3)** JavaScript engine APIs and objects have thread affinity.
 - The DOM and a few other browser APIs are only available on the main UI "thread"
    - and so, you need to have C# interop with the UI, but you can't block there.
 - HTTP & WS objects have affinity, but we would like to consume them (via Streams) from any managed thread
 - Any `JSObject`, `JSException` and `Task` have thread affinity
    - they need to be disposed on the correct thread. GC is running on a random thread.

**4)** State management of JS context `self` of the worker.
 - emscripten pre-allocates a pool of web workers to be used as pthreads.
+ - Because they could only be created asynchronously, but `pthread_create` is synchronous call + - Because they are slow to start + - those pthreads have JS context `self`, which is re-used when mapped to C# thread pool + - when we allow JS interop on a managed thread, we need a way how to clean up the JS state + +**5)** Blazor's `renderBatch` is using direct memory access + +## Define terms +- UI thread + - this is the main browser "thread", the one with DOM on it + - it can't block-wait, only spin-wait +- "sidecar" thread - possible design + - is a web worker with emscripten and mono VM started on it + - doing this allows all managed threads to allow blocking wait +- "deputy" thread - possible design + - is a web worker and pthread with C# `Main` entrypoint + - doing this allows all managed threads to allow blocking wait +- "managed thread" + - is a thread with emscripten pthread and Mono VM attached thread and GC barriers +- "main managed thread" + - is a thread with C# `Main` entrypoint running on it + - if this is UI thread, it means that one managed thread is special + - see problems **1,2** +- "managed thread pool thread" + - pthread dedicated to serving Mono thread pool +- `JSSynchronizationContext` +- `JSObject` +- `Promise`/`Task` +- `JSWebWorker` + +## Implementation options (only some combinations are possible) +- how to deal with blocking C# code on UI thread + - **A)** pretend it's not a problem (this we already have) + - **B)** move user C# code to web worker + - **C)** move all Mono to web worker +- how to deal with blocking in synchronous JS calls from UI thread + - **D)** pretend it's not a problem (this we already have) + - **E)** throw PNSE when synchronous JSExport is called on UI thread + - **F)** dispatch calls to synchronous JSExport to web worker and spin-wait on JS side of UI thread. +- how to implement JS interop between managed main thread and UI thread (DOM) + - **G)** put it out of scope for MT, manually implement what Blazor needs + - **H)** pure JS dispatch between threads, [comlink](https://github.com/GoogleChromeLabs/comlink) style + - **I)** C/emscripten dispatch of infrastructure to marshal individual parameters + - **J)** C/emscripten dispatch of method binding and invoke, but marshal parameters on UI thread + - **K)** pure C# dispatch between threads +- how to implement JS interop on non-main web worker + - **L)** disable it for all non-main threads + - **M)** disable it for managed thread pool threads + - **N)** allow it only for threads created as dedicated resource `WebWorker` via new API + - **O)** enables it on all workers (let user deal with JS state) +- how to dispatch calls to the right JS thread context + - **P)** via `SynchronizationContext` before `JSImport` stub, synchronously, stack frames + - this is written by user. Complex, async, MT stuff. + - **Q)** via `SynchronizationContext` inside `JSImport` C# stub + - **R)** via `emscripten_dispatch_to_thread_async` inside C code of `` +- how to implement GC/dispose of `JSObject` proxies + - **S)** per instance: synchronous dispatch the call to correct thread via `SynchronizationContext` + - **T)** per instance: async schedule the cleanup + - at the detach of the thread. We already have `forceDisposeProxies` + - could target managed thread be paused during GC ? 
+- where to instantiate initial user JS modules (like Blazor's) + - **U)** in the UI thread + - **V)** in the deputy/sidecar thread +- where to instantiate `JSHost.ImportAsync` modules + - **W)** in the UI thread + - **X)** in the deputy/sidecar thread + - **Y)** allow it only for dedicated `JSWebWorker` threads + - **Z)** disable it + - same for `JSHost.GlobalThis`, `JSHost.DotnetInstance` +- how to implement Blazor's `renderBatch` + - **a)** keep as is, wrap it with GC pause, use legacy JS interop on UI thread + - **b)** extract some of the legacy JS interop into Blazor codebase + - **c)** switch to Blazor server mode. Web worker create the batch of bytes and UI thread apply it to DOM +- where to create HTTP+WS JS objects + - **d)** in the UI thread + - **e)** in the managed main thread + - **f)** in first calling managed thread + - install `JSSynchronizationContext` even without `JSWebWorker` ? +- how to dispatch calls to HTTP+WS JS objects + - **g)** try to stick to the same thread via `ConfigureAwait(false)`. + - doesn't really work. `Task` migrate too freely + - **h)** via C# `SynchronizationContext` + - **i)** via `emscripten_dispatch_to_thread_async` + - **j)** via `postMessage` + - **k)** same whatever we choose for `JSImport` + - note there are some synchronous calls on WS +- where to create the emscripten instance + - **l)** could be on the UI thread + - **m)** could be on the "sidecar" thread +- where to start the Mono VM + - **n)** could be on the UI thread + - **o)** could be on the "sidecar" thread +- where to run the C# main entrypoint + - **p)** could be on the UI thread + - **q)** could be on the "deputy" or "sidecar" thread +- where to implement subtle crypto + - **r)** out of scope + - **s)** in the UI thread + - **t)** is a dedicated web worker + +# Interesting combinations + +## Minimal support +- **A,D,G,L,P,S,U,Y,a,f,h,l,n,p** +- this is what we [already have today](#Current-state-2023-Sep) +- it could deadlock or die, +- JS interop on threads requires lot of user code attention +- Keeps problems **1,2,3,4** + +## Sidecar + no JS interop + narrow Blazor support +- **C,E,G,L,P,S,U,Z,c,d,h,m,o,q** +- minimal effort, low risk, low capabilities +- move both emscripten and Mono VM sidecar thread +- no user code JS interop on any thread +- internal solutions for Blazor needs +- Ignores problems **1,2,3,4,5** + +## Sidecar + only async just JS proxies UI + JSWebWorker + Blazor WASM server +- **C,E,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q** +- no C or managed code on UI thread +- no support for blocking sync JSExport calls from UI thread (callbacks) + - it will throw PNSE +- this will create double proxy for `Task`, `JSObject`, `Func<>` etc + - difficult to GC, difficult to debug +- double marshaling of parameters +- Avoids **1,2** for JS callback +- Solves **1,2** for managed code. 
+ - emscripten main loop stays responsive +- Solves **3,4,5** + +## Sidecar + async & sync just JS proxies UI + JSWebWorker + Blazor WASM server +- **C,F,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q** +- no C or managed code on UI thread +- support for blocking sync JSExport calls from UI thread (callbacks) + - at blocking the UI is at least well isolated from runtime code + - it makes responsibility for sync call clear +- this will create double proxy for `Task`, `JSObject`, `Func<>` etc + - difficult to GC, difficult to debug +- double marshaling of parameters +- Ignores **1,2** for JS callback +- Solves **1,2** for managed code + - emscripten main loop stays responsive + - unless there is sync `JSImport`->`JSExport` call +- Solves **3,4,5** + +## Deputy + managed UI interop + JSWebWorker +- this uses `JSSynchronizationContext` to dispatch calls to UI thread + - this is problematic because some managed code is actually running on UI thread + - it needs to also use `SynchronizationContext` for `JSExport` and callbacks, to dispatch to deputy. +- blazor render could be both legacy render or Blazor server style + - because we have both memory and mono on the UI thread +- Ignores **1,2** for JS callback +- Solves **1,2** for managed code + - emscripten main loop stays responsive + - unless there is sync `JSImport`->`JSExport` call +- Solves **3,4,5** + +## Deputy + UI bound interop + JSWebWorker +- this uses `emscripten_dispatch_to_thread_async` for `call_entry_point`, `complete_task`, `cwraps.mono_wasm_invoke_method_bound`, `mono_wasm_invoke_bound_function`, `mono_wasm_invoke_import`, `call_delegate_method` to get to the UI thread. +- it uses other `cwraps` locally on UI thread, like `mono_wasm_new_root`, `stringToMonoStringRoot`, `malloc`, `free`, `create_task_callback_method` + - it means that interop related managed runtime code is running on the UI thread, but not the user code. + - it means that parameter marshalling is fast (compared to sidecar) + - it still needs to enter GC barrier and so it could block UI for GC run +- blazor render could be both legacy render or Blazor server style + - because we have both memory and mono on the UI thread + + +## Blazor +- as compared to single threaded runtime, the major difference would be no synchronous callbacks. + - for example from DOM `onClick`. This is one of the reasons people prefer ST WASM over Blazor Server. + - but there is really [no way around it](#problem), because you can't have both MT and sync calls from UI. +- Blazor `renderBatch` + - currently `Blazor._internal.renderBatch` -> `MONO.getI16`, `MONO.getI32`, `MONO.getF32`, `BINDING.js_string_to_mono_string`, `BINDING.conv_string`, `BINDING.unbox_mono_obj` + - some of them need Mono VM and GC barrier, but could be re-written with GC pause and only memory read +- Blazor's [`IJSInProcessRuntime.Invoke`](https://learn.microsoft.com/en-us/dotnet/api/microsoft.jsinterop.ijsinprocessruntime.invoke) - this is C# -> JS direction + - TODO: which implementation keeps this working ? Which worker is the target ? + - we could use Blazor Server style instead +- Blazor's [`IJSUnmarshalledRuntime`](https://learn.microsoft.com/en-us/dotnet/api/microsoft.jsinterop.ijsunmarshalledruntime) + - this is ICall `InvokeJS` + - TODO: which implementation keeps this working ? Which worker is the target ? 
+- `JSImport` used for startup, loading and embedding: `INTERNAL.loadLazyAssembly`, `INTERNAL.loadSatelliteAssemblies`, `Blazor._internal.getApplicationEnvironment`, `receiveHotReloadAsync` + - all of them pass simple data types, no proxies +- `JSImport` used for calling user JS code: `Blazor._internal.endInvokeDotNetFromJS`, `Blazor._internal.invokeJSJson`, `Blazor._internal.receiveByteArray`, `Blazor._internal.getPersistedState` + - TODO: which implementation keeps this working ? Which worker is the target ? +- `JSImport` used for logging: `globalThis.console.debug`, `globalThis.console.error`, `globalThis.console.info`, `globalThis.console.warn`, `Blazor._internal.dotNetCriticalError` + - probably could be any JS context + +# Current state 2023 Sep + - we already ship MT version of the runtime in the wasm-tools workload. + - It's enabled by `true` and it requires COOP HTTP headers. + - It will serve extra file `dotnet.native.worker.js`. + - This will also start in Blazor project, but UI rendering would not work. + - we have pre-allocated pool of browser Web Workers which are mapped to pthread dynamically. + - we can configure pthread to keep running after synchronous thread_main finished. That's necessary to run any async tasks involving JavaScript interop. + - GC is running on UI thread/worker. + - legacy interop has problems with GC boundaries. + - JSImport & JSExport work + - There is private JSSynchronizationContext implementation which is too synchronous + - There is draft of public C# API for creating JSWebWorker with JS interop. It must be dedicated un-managed resource, because we could not cleanup JS state created by user code. + - There is MT version of HTTP & WS clients, which could be called from any thread but it's also too synchronous implementation. + - Many unit tests fail on MT https://github.com/dotnet/runtime/pull/91536 + - there are MT C# ref assemblies, which don't throw PNSE for MT build of the runtime for blocking APIs. + +## Task breakdown +- [ ] rename `WebWorker` API to `JSWebWorker` ? +- [ ] `ToManaged(out Task)` to be called before the actual JS method +- [ ] public API for `JSHost.SynchronizationContext` which could be used by code generator. +- [ ] reimplement `JSSynchronizationContext` to be more async +- [ ] implement Blazor's `WebAssemblyDispatcher` + [feedback](https://github.com/dotnet/aspnetcore/pull/48991) +- [ ] optinal: make underlying emscripten WebWorker pool allocation dynamic, or provide C# API for that. +- [ ] optinal: implement async function/delegate marshaling in JSImport/JSExport parameters. 
+- [ ] optinal: enable blocking HTTP/WS APIs +- [ ] optinal: enable lazy DLL download by blocking the caller +- [ ] optinal: implement crypto +- [ ] measure perf impact + +Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 \ No newline at end of file From cf5202eeb983e135e241c2d5ca18908face6816e Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 27 Sep 2023 18:39:52 +0200 Subject: [PATCH 009/108] wip --- accepted/2023/wasm-browser-threads.md | 43 +++++++++++++++++++++------ 1 file changed, 34 insertions(+), 9 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index ddf6ff577..074e3a485 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -62,6 +62,7 @@ - it can't block-wait, only spin-wait - "sidecar" thread - possible design - is a web worker with emscripten and mono VM started on it + - for Blazor rendering MAUI/BlazorWebView use the same concept - doing this allows all managed threads to allow blocking wait - "deputy" thread - possible design - is a web worker and pthread with C# `Main` entrypoint @@ -74,10 +75,6 @@ - see problems **1,2** - "managed thread pool thread" - pthread dedicated to serving Mono thread pool -- `JSSynchronizationContext` -- `JSObject` -- `Promise`/`Task` -- `JSWebWorker` ## Implementation options (only some combinations are possible) - how to deal with blocking C# code on UI thread @@ -215,6 +212,33 @@ - blazor render could be both legacy render or Blazor server style - because we have both memory and mono on the UI thread +# Details + +## JSImport and marshaled JS functions +- both sync and async could be called on all `JSWebWorker` threads +- both sync and async could be called on main managed thread (even when running on UI) + - unless there is loop back to blocking `JSExport`, it could not deadlock + +## JSExport & C# delegates +- async could be called on all `JSWebWorker` threads +- sync could be called on `JSWebWorker` +- sync could be called on from UI thread is problematic + - with spin-wait in UI in JS it would at least block the UI rendering + - with spin-wait in emscripten, it could deadlock the rest of the app + +## JSWebWorker with JS interop +- is proposed concept to let user to manage JS state of the worker explicitly + - because of problem **4** +- is C# thread created and disposed by new API for it +- could block on synchronization primitives +- could do full JSImport/JSExport to it's own JS `self` context +- there is `JSSynchronizationContext`` installed on it + - so that user code could dispatch back to it, in case that it needs to call `JSObject` proxy (with thread affinity) + +## Debugging +- VS debugger would work as usual +- Chrome dev tools would only see the events coming from `postMessage` or `Atomics.waitAsync` +- Chrome dev tools debugging C# could be bit different, it possibly works already. The C# code would be in different node of the "source" tree view ## Blazor - as compared to single threaded runtime, the major difference would be no synchronous callbacks. @@ -222,6 +246,7 @@ - but there is really [no way around it](#problem), because you can't have both MT and sync calls from UI. 
- Blazor `renderBatch` - currently `Blazor._internal.renderBatch` -> `MONO.getI16`, `MONO.getI32`, `MONO.getF32`, `BINDING.js_string_to_mono_string`, `BINDING.conv_string`, `BINDING.unbox_mono_obj` + - we could also [RenderBatchWriter](https://github.com/dotnet/aspnetcore/blob/045afcd68e6cab65502fa307e306d967a4d28df6/src/Components/Shared/src/RenderBatchWriter.cs) in the WASM - some of them need Mono VM and GC barrier, but could be re-written with GC pause and only memory read - Blazor's [`IJSInProcessRuntime.Invoke`](https://learn.microsoft.com/en-us/dotnet/api/microsoft.jsinterop.ijsinprocessruntime.invoke) - this is C# -> JS direction - TODO: which implementation keeps this working ? Which worker is the target ? @@ -258,11 +283,11 @@ - [ ] public API for `JSHost.SynchronizationContext` which could be used by code generator. - [ ] reimplement `JSSynchronizationContext` to be more async - [ ] implement Blazor's `WebAssemblyDispatcher` + [feedback](https://github.com/dotnet/aspnetcore/pull/48991) -- [ ] optinal: make underlying emscripten WebWorker pool allocation dynamic, or provide C# API for that. -- [ ] optinal: implement async function/delegate marshaling in JSImport/JSExport parameters. -- [ ] optinal: enable blocking HTTP/WS APIs -- [ ] optinal: enable lazy DLL download by blocking the caller -- [ ] optinal: implement crypto +- [ ] optional: make underlying emscripten WebWorker pool allocation dynamic, or provide C# API for that. +- [ ] optional: implement async function/delegate marshaling in JSImport/JSExport parameters. +- [ ] optional: enable blocking HTTP/WS APIs +- [ ] optional: enable lazy DLL download by blocking the caller +- [ ] optional: implement crypto - [ ] measure perf impact Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 \ No newline at end of file From ba2359056def4722fc61890b62d7597a5dfabb06 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 27 Sep 2023 21:02:27 +0200 Subject: [PATCH 010/108] wip --- accepted/2023/wasm-browser-threads.md | 149 ++++++++++++++++++++++++-- 1 file changed, 143 insertions(+), 6 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 074e3a485..a9fc8cd0c 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -75,6 +75,14 @@ - see problems **1,2** - "managed thread pool thread" - pthread dedicated to serving Mono thread pool +- "comlink" + - in this document it stands for the pattern + - dispatch to another worker via pure JS means + - create JS proxies for types which can't be serialized, like `Function` + - actual [comlink](https://github.com/GoogleChromeLabs/comlink) + - doesn't implement spin-wait + - we already have prototype of the similar functionality + - which can spin-wait ## Implementation options (only some combinations are possible) - how to deal with blocking C# code on UI thread @@ -98,7 +106,6 @@ - **O)** enables it on all workers (let user deal with JS state) - how to dispatch calls to the right JS thread context - **P)** via `SynchronizationContext` before `JSImport` stub, synchronously, stack frames - - this is written by user. Complex, async, MT stuff. 
- **Q)** via `SynchronizationContext` inside `JSImport` C# stub - **R)** via `emscripten_dispatch_to_thread_async` inside C code of `` - how to implement GC/dispose of `JSObject` proxies @@ -145,18 +152,23 @@ - **r)** out of scope - **s)** in the UI thread - **t)** is a dedicated web worker +- where to marshal JSImport/JSExport parameters/return/exception + - **u)** could be only values types, proxies out of scope + - **v)** could be on UI thread (with deputy design and Mono there) + - **w)** could be on sidecar (with double proxies of parameters via comlink) + - **x)** could be on sidecar (with comlink calls per parameter) # Interesting combinations ## Minimal support -- **A,D,G,L,P,S,U,Y,a,f,h,l,n,p** +- **A,D,G,L,P,S,U,Y,a,f,h,l,n,p,v** - this is what we [already have today](#Current-state-2023-Sep) - it could deadlock or die, - JS interop on threads requires lot of user code attention - Keeps problems **1,2,3,4** ## Sidecar + no JS interop + narrow Blazor support -- **C,E,G,L,P,S,U,Z,c,d,h,m,o,q** +- **C,E,G,L,P,S,U,Z,c,d,h,m,o,q,u** - minimal effort, low risk, low capabilities - move both emscripten and Mono VM sidecar thread - no user code JS interop on any thread @@ -164,7 +176,7 @@ - Ignores problems **1,2,3,4,5** ## Sidecar + only async just JS proxies UI + JSWebWorker + Blazor WASM server -- **C,E,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q** +- **C,E,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** - no C or managed code on UI thread - no support for blocking sync JSExport calls from UI thread (callbacks) - it will throw PNSE @@ -177,7 +189,7 @@ - Solves **3,4,5** ## Sidecar + async & sync just JS proxies UI + JSWebWorker + Blazor WASM server -- **C,F,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q** +- **C,F,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** - no C or managed code on UI thread - support for blocking sync JSExport calls from UI thread (callbacks) - at blocking the UI is at least well isolated from runtime code @@ -197,7 +209,7 @@ - it needs to also use `SynchronizationContext` for `JSExport` and callbacks, to dispatch to deputy. 
- Blazor render could be either the legacy render or the Blazor server style
  - because we have both memory and mono on the UI thread
- Ignores **1,2** for JS callback
- Solves **1,2** for managed code
  - emscripten main loop stays responsive
  - unless there is sync `JSImport`->`JSExport` call
- Solves **3,4,5**

## Deputy + UI bound interop + JSWebWorker
- this uses `emscripten_dispatch_to_thread_async` for `call_entry_point`, `complete_task`, `cwraps.mono_wasm_invoke_method_bound`, `mono_wasm_invoke_bound_function`, `mono_wasm_invoke_import`, `call_delegate_method` to get to the UI thread.
- it uses other `cwraps` locally on the UI thread, like `mono_wasm_new_root`, `stringToMonoStringRoot`, `malloc`, `free`, `create_task_callback_method`
  - it means that interop-related managed runtime code is running on the UI thread, but not the user code.
  - it means that parameter marshaling is fast (compared to sidecar)
  - it still needs to enter the GC barrier and so it could block the UI for a GC run
- Blazor render could be either the legacy render or the Blazor server style
  - because we have both memory and mono on the UI thread
- Ignores **1,2** for JS callback
- Solves **1,2** for managed code
  - emscripten main loop stays responsive
  - unless there is sync `JSImport`->`JSExport` call
- Solves **3,4,5**

# Details

## JSImport and marshaled JS functions
- both sync and async could be called on all `JSWebWorker` threads
- both sync and async could be called on the main managed thread (even when running on the UI thread)
  - unless there is a loop back to a blocking `JSExport`, it could not deadlock

## JSExport & C# delegates
- async could be called on all `JSWebWorker` threads
- sync could be called on `JSWebWorker`
- sync calls from the UI thread are problematic
  - with spin-wait in JS on the UI thread, it would at least block the UI rendering
  - with spin-wait in emscripten, it could deadlock the rest of the app

## Proxies - thread affinity
- all of them have thread affinity
- all of them need to be used and disposed on the correct thread
  - how to dispatch to the correct thread is one of the questions here
- all of them are registered to 2 GCs
  - maybe `Dispose` could be scheduled asynchronously instead of blocking the Mono GC
- `JSObject`
  - they have a thread ID on them, so we know which thread owns them
- `JSException`
  - they are a proxy because the stack trace is lazy
  - we could evaluate the stack trace eagerly, so they could become a "value type"
  - but it would be expensive
- `Task`
  - continuations need to be dispatched onto the correct JS thread
  - they can't be passed back to the wrong JS thread
  - resolving the `Task` could be async
- `Func`/`Action`/`JSImport`
  - callbacks need to be dispatched onto the correct JS thread
  - they can't be passed back to the wrong JS thread
  - calling functions which return `Task` could be aggressively async
    - unless the synchronous part of the implementation could throw an exception
    - which maybe our HTTP/WS could do?
    - could this difference be ignored?
- `JSExport`/`Function`
  - we are already on the correct thread in JS
  - would anything improve if we tried to be more async?

## HTTP and WS clients
- are implemented in terms of `JSObject` and `Promise` proxies
- they have thread affinity, see above
  - typically to the `JSWebWorker` of the creator
- could C# thread-pool threads create HTTP clients?
  - there is no `JSWebWorker`
  - but this is JS state which the runtime could manage well
  - so the answer should be yes!
- but they are consumed via their C# Streams from any thread
- and so we need to solve the dispatch to the correct thread, as sketched below

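A minimal sketch of the dispatch pattern that the following sections discuss, with illustrative names only (the real `JSSynchronizationContext` is internal to the runtime): capture the `SynchronizationContext` of the thread that owns the JS state, post the work there, and surface the result or exception to the caller through a `TaskCompletionSource`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

internal static class ThreadAffinityDispatch
{
    // Asynchronous flavor: schedule the body on the owner thread and let the
    // caller await the result from any managed thread.
    public static Task<T> RunOnOwnerThreadAsync<T>(SynchronizationContext owner, Func<T> body)
    {
        var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
        owner.Post(_ =>
        {
            try { tcs.SetResult(body()); }
            catch (Exception ex) { tcs.SetException(ex); }
        }, null);
        return tcs.Task;
    }

    // Synchronous flavor: any managed thread except the browser UI thread
    // could block here until the owner thread has produced the result.
    public static T RunOnOwnerThread<T>(SynchronizationContext owner, Func<T> body)
        => RunOnOwnerThreadAsync(owner, body).GetAwaiter().GetResult();
}
```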
+ +# Dispatching call, who is responsible +- User code + - this is difficult and complex task which many will fail to do right + - it can't be user code for HTTP/WS clients because there is no direct call via Streams + - authors of 3rd party components would need to do it to hide complexity from users +- Roslyn generator: JSExport is already on correct thread, no action +- Roslyn generator: JSImport + - it needs to stay backward compatible with Net7, Net8 already generated code + - it needs to do it via public C# API + - possibly new API `JSHost.Post` or `JSHost.Send` + - it needs to re-consider current `stackalloc` + - probably by re-ordering Roslyn generated code of `__arg_return.ToManaged(out __retVal);` before `JSFunctionBinding.InvokeJS` + - it needs to propagate exceptions + +# Dispatching JSImport - what should happen +- is normally bound to JS context of the calling managed thread +- but it could be called with `JSObject` parameters which are bound to another thread + - if targets don't match each other throw `ArgumentException` ? + - if it's called from thread-pool thread + - which is not `JSWebWorker` + - should we dispatch it by affinity of the parameters ? + - if parameters affinity do match each other but not match current `JSWebWorker` + - should we dispatch it by affinity of the parameters ? + - this would solve HTTP/WS scenarios + +# Dispatching call - options +- `JSSynchronizationContext` + - is implementation of `SynchronizationContext` installed to + - managed thread with `JSWebWorker` + - or main managed thread + - it has asynchronous `SynchronizationContext.Post` + - it has synchronous `SynchronizationContext.Send` + - can propagate caller stack frames + - can propagate exceptions from callee thread + - this would not work in sidecar design + - when the method is async + - we can schedule it asynchronously to the `JSWebWorker` or main thread + - propagate exceptions via `TaskCompletionSource.SetException` from any managed thread + - when the method is sync + - create internal `TaskCompletionSource` + - we can schedule it asynchronously to the `JSWebWorker` or main thread + - we could block-wait on `Task.Wait` until it's done. + - return sync result +- `emscripten_dispatch_to_thread_async` - in deputy design + - can dispatch async call to C function on the timer loop of target pthread + - doesn't block and doesn't propagate exceptions + - needs to deal with `stackalloc` in C# generated stub + - probably by re-ordering Roslyn generated code + - when the method is async + - extract GCHandle of the `TaskCompletionSource` + - copy "stack frame" and pass it to + - asynchronously schedule to the target pthread via `emscripten_dispatch_to_thread_async` + - unpack the "stack frame" + - using local Mono `cwraps` for marshaling + - capture JS result/exception + - use stored `TaskCompletionSource` to resolve the `Task` on target thread + - when the method is sync + - inside `JSFunctionBinding.InvokeJS`: + - create internal `TaskCompletionSource` + - use async dispatch above + - block-wait on `Task.Wait` until it's done. 
+ - return sync result + - or when the method is sync + - do something similar in C or JS +- "comlink" - in sidecar design + - when the method is async + - extract GCHandle of the `TaskCompletionSource` + - convert parameters to JS (sidecar context) + - using sidecar Mono `cwraps` for marshaling + - call UI thread via "comlink" + - will create comlink proxies + - capture JS result/exception from "comlink" + - use stored `TaskCompletionSource` to resolve the `Task` on target thread +- `postMessage` + - can send serializable message to WebWorker + - doesn't block and doesn't propagate exceptions + - this is slow + ## JSWebWorker with JS interop - is proposed concept to let user to manage JS state of the worker explicitly - because of problem **4** @@ -261,6 +394,10 @@ - `JSImport` used for logging: `globalThis.console.debug`, `globalThis.console.error`, `globalThis.console.info`, `globalThis.console.warn`, `Blazor._internal.dotNetCriticalError` - probably could be any JS context +# WebPack, Rollup friendly +- it's not clear how to make this single-file +- because web workers need to start separate script(s) + # Current state 2023 Sep - we already ship MT version of the runtime in the wasm-tools workload. - It's enabled by `true` and it requires COOP HTTP headers. From 7cdb8e2e414b1a16acfa4b03f07825a6831ac0bb Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 27 Sep 2023 21:11:32 +0200 Subject: [PATCH 011/108] XHR details --- accepted/2023/wasm-browser-threads.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index a9fc8cd0c..5c9d96fff 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -31,9 +31,9 @@ - It eats your battery - Browser will kill your tab at random point (Aw, snap). - It's not deterministic and you can't really test your app to prove it harmless. 
- - Firefox (still) has synchronous XHR which could be captured by async code in service worker - - it's deprecated legacy API - - but other browsers don't and it's unlikely they will implement it + - Firefox (still) has synchronous `XMLHttpRequest` which could be captured by async code in service worker + - it's [deprecated legacy API](https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/Synchronous_and_Asynchronous_Requests#synchronous_request) + - [but other browsers don't](https://wpt.fyi/results/service-workers/service-worker/fetch-request-xhr-sync.https.html?label=experimental&label=master&aligned) and it's unlikely they will implement it - there are deployment and security challenges with it - all the other threads/workers could synchronously block - if we will have managed thread on the UI thread, any `lock` or Mono GC barrier could cause spin-wait From 8d475d2959437bf32b7d9f5883b3674e42b75f72 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 27 Sep 2023 21:18:45 +0200 Subject: [PATCH 012/108] subtle and lazy DLL download --- accepted/2023/wasm-browser-threads.md | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 5c9d96fff..b8d2065ae 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -16,6 +16,7 @@ - try to make it debugging friendly - implement crypto via `subtle` browser API - allow calls to synchronous JSExport from UI thread (callback) + - allow lazy `[DLLImport]` to download from the server † Note: all the text below discusses MT build only, unless explicit about ST build. @@ -394,10 +395,20 @@ - `JSImport` used for logging: `globalThis.console.debug`, `globalThis.console.error`, `globalThis.console.info`, `globalThis.console.warn`, `Blazor._internal.dotNetCriticalError` - probably could be any JS context -# WebPack, Rollup friendly +## WebPack, Rollup friendly - it's not clear how to make this single-file - because web workers need to start separate script(s) +## Subtle crypto +- once we have have all managed threads outside of the UI thread +- we could synchronously wait for another thread to do async operations +- and use [async API of subtle crypto](https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto) + +## Lazy DLLImport - download +- once we have have all managed threads outside of the UI thread +- we could synchronously wait for another thread to do async operations +- to fetch another DLL which was not pre-downloaded + # Current state 2023 Sep - we already ship MT version of the runtime in the wasm-tools workload. - It's enabled by `true` and it requires COOP HTTP headers. From fb487f97dcf0398491781c8563ee74c0e506e5d5 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 27 Sep 2023 21:25:32 +0200 Subject: [PATCH 013/108] feedback --- accepted/2023/wasm-browser-threads.md | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index b8d2065ae..85988d0af 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -2,21 +2,20 @@ ## Goals - CPU intensive workloads on dotnet thread pool - - enable blocking .Wait APIs from C# user code on all threads + - enable blocking `Task.Wait` and `lock()` like APIs from C# user code on all threads - Current public API throws PNSE for it - This is core part on MT value proposition. 
- If people want to use existing MT code-bases, most of the time, the code is full of locks. - People want to use existing desktop/server multi-threaded code as is. - - allow HTTP and WS C# APIs to be used from any thread - - Underlying JS object have thread affinity + - allow HTTP and WS C# APIs to be used from any thread despite underlying JS object affinity - JSImport/JSExport interop in maximum possible extent - don't change/break single threaded build. † ## Lower priority goals - try to make it debugging friendly - implement crypto via `subtle` browser API - - allow calls to synchronous JSExport from UI thread (callback) - allow lazy `[DLLImport]` to download from the server + - allow calls to synchronous JSExport from UI thread (callback) † Note: all the text below discusses MT build only, unless explicit about ST build. @@ -52,7 +51,7 @@ - emscripten pre-allocates poll of web worker to be used as pthreads. - Because they could only be created asynchronously, but `pthread_create` is synchronous call - Because they are slow to start - - those pthreads have JS context `self`, which is re-used when mapped to C# thread pool + - those pthreads have stateful JS context `self`, which is re-used when mapped to C# thread pool - when we allow JS interop on a managed thread, we need a way how to clean up the JS state **5)** Blazor's `renderBatch` is using direct memory access @@ -90,7 +89,7 @@ - **A)** pretend it's not a problem (this we already have) - **B)** move user C# code to web worker - **C)** move all Mono to web worker -- how to deal with blocking in synchronous JS calls from UI thread +- how to deal with blocking in synchronous JS calls from UI thread (like `onClick` callback) - **D)** pretend it's not a problem (this we already have) - **E)** throw PNSE when synchronous JSExport is called on UI thread - **F)** dispatch calls to synchronous JSExport to web worker and spin-wait on JS side of UI thread. 
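A minimal C# sketch of the kind of synchronous `[JSExport]` entry point that options **D/E/F** above have to deal with when the caller is the UI thread. The class, method and lock below are hypothetical and only illustrate the shape of the problem; they are not part of the proposal.

```csharp
using System.Runtime.InteropServices.JavaScript;

public static partial class ExampleInterop
{
    private static readonly object _gate = new();

    // Synchronous export: harmless when called from a JSWebWorker thread,
    // problematic when the caller is the UI thread, because the managed body
    // may block on the lock while the UI thread waits for the return value.
    [JSExport]
    internal static string FormatOrder(int orderId)
    {
        lock (_gate)
        {
            return $"order-{orderId}";
        }
    }
}
```

With option **F**, the UI thread would spin-wait while a web worker runs this method; option **E** simply rejects such a call with PNSE.
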
From 3104ff41612ca712b3663ba5ac22f1cd6013f9ae Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 27 Sep 2023 21:37:58 +0200 Subject: [PATCH 014/108] wip --- accepted/2023/wasm-browser-threads.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 85988d0af..1b4f422c3 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -318,7 +318,7 @@ - this would not work in sidecar design - when the method is async - we can schedule it asynchronously to the `JSWebWorker` or main thread - - propagate exceptions via `TaskCompletionSource.SetException` from any managed thread + - propagate result/exceptions via `TaskCompletionSource.SetException` from any managed thread - when the method is sync - create internal `TaskCompletionSource` - we can schedule it asynchronously to the `JSWebWorker` or main thread From 5cfcff063d4320c08590c8008ae778c5331608f8 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 27 Sep 2023 21:41:03 +0200 Subject: [PATCH 015/108] wip --- accepted/2023/wasm-browser-threads.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 1b4f422c3..2985a1d2c 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -307,7 +307,7 @@ - this would solve HTTP/WS scenarios # Dispatching call - options -- `JSSynchronizationContext` +- `JSSynchronizationContext` - in deputy design - is implementation of `SynchronizationContext` installed to - managed thread with `JSWebWorker` - or main managed thread @@ -315,7 +315,6 @@ - it has synchronous `SynchronizationContext.Send` - can propagate caller stack frames - can propagate exceptions from callee thread - - this would not work in sidecar design - when the method is async - we can schedule it asynchronously to the `JSWebWorker` or main thread - propagate result/exceptions via `TaskCompletionSource.SetException` from any managed thread @@ -324,6 +323,8 @@ - we can schedule it asynchronously to the `JSWebWorker` or main thread - we could block-wait on `Task.Wait` until it's done. - return sync result + - this would not work in sidecar design + - because UI is not managed thread there - `emscripten_dispatch_to_thread_async` - in deputy design - can dispatch async call to C function on the timer loop of target pthread - doesn't block and doesn't propagate exceptions @@ -345,6 +346,9 @@ - return sync result - or when the method is sync - do something similar in C or JS + - this would not work in sidecar design + - because UI is not managed thread there + - Mono cwraps are not available either - "comlink" - in sidecar design - when the method is async - extract GCHandle of the `TaskCompletionSource` From 8f3807ba90714b8c1eae080d7a0713ec2a468e0c Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Fri, 29 Sep 2023 12:22:18 +0200 Subject: [PATCH 016/108] feedback --- accepted/2023/wasm-browser-threads.md | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 2985a1d2c..2177bd5cc 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -19,6 +19,10 @@ † Note: all the text below discusses MT build only, unless explicit about ST build. 
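The `JSSynchronizationContext` dispatch discussed in the two patches above (asynchronous completion via `TaskCompletionSource`, synchronous completion by block-waiting on the resulting `Task`) boils down to a pattern like this sketch. The helper name is invented for illustration and is not a proposed API.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

internal static class DispatchSketch
{
    // Run `callback` on the thread that owns `target` (for example the
    // JSSynchronizationContext of a JSWebWorker or of the main managed thread)
    // and block the calling thread until the callback has finished.
    internal static TResult SendToJSThread<TResult>(SynchronizationContext target, Func<TResult> callback)
    {
        var tcs = new TaskCompletionSource<TResult>();
        target.Post(_ =>
        {
            try { tcs.SetResult(callback()); }
            catch (Exception ex) { tcs.SetException(ex); }
        }, null);

        // Block-wait for the result; legal on worker threads, never on the UI thread.
        return tcs.Task.GetAwaiter().GetResult();
    }
}
```

The asynchronous variant from the same patches returns `tcs.Task` to the caller instead of blocking, and propagates failures the same way through `SetException`.
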
+## Key idea in this proposal + +Move all managed user code out of UI/DOM thread, so that it becomes consistent with all other threads. + ## Context - Problems **1)** If you have multithreading, any thread might need to block while waiting for any other to release a lock. - locks are in the user code, in nuget packages, in Mono VM itself @@ -26,8 +30,14 @@ - in single-threaded build of the runtime, all of this is NOOP. That's why it works on UI thread. **2)** UI thread in the browser can't synchronously block + - that means, "you can't not block" UI thread, not just usual "you should not block" UI + - `Atomics.wait()` throws `TypeError` on UI thread - you can spin-lock but it's bad idea. - - Deadlock: when you spin-block, the JS timer loop and any messages are not pumping. But code in other threads may be waiting for some such event to resolve. + - Deadlock: when you spin-block, the JS timer loop and any messages are not pumping. + - But code in other threads may be waiting for some such event to resolve. + - all async/await don't work during spin-wait + - all networking doesn't work + - you can't create or join another web worker - It eats your battery - Browser will kill your tab at random point (Aw, snap). - It's not deterministic and you can't really test your app to prove it harmless. @@ -36,8 +46,9 @@ - [but other browsers don't](https://wpt.fyi/results/service-workers/service-worker/fetch-request-xhr-sync.https.html?label=experimental&label=master&aligned) and it's unlikely they will implement it - there are deployment and security challenges with it - all the other threads/workers could synchronously block + - `Atomics.wait()` works as expected - if we will have managed thread on the UI thread, any `lock` or Mono GC barrier could cause spin-wait - - in case of Mono code, we at least know it's short duration + - in case of Mono code, we at least know it's short duration - we should prevent it from blocking in user code **3)** JavaScript engine APIs and objects have thread affinity. @@ -400,7 +411,10 @@ ## WebPack, Rollup friendly - it's not clear how to make this single-file -- because web workers need to start separate script(s) +- because web workers need to start separate script(s) via `new Worker('./dotnet.js', {type: 'module'})` + - we can start a WebWorker with a Blob, but that's not CSP friendly. + - when bundled together with user code, into "my-react-application.js" we don't have way how to `new Worker('./dotnet.js')` anymore. +- emscripten uses `dotnet.native.worker.js` script. At the moment we don't have nice way how to customize what is inside of it. ## Subtle crypto - once we have have all managed threads outside of the UI thread From cd560d8f50a245a56f18426998d240cd2fe92bec Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Fri, 29 Sep 2023 12:33:32 +0200 Subject: [PATCH 017/108] http --- accepted/2023/wasm-browser-threads.md | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 2177bd5cc..f6c3f2a85 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -15,6 +15,7 @@ - try to make it debugging friendly - implement crypto via `subtle` browser API - allow lazy `[DLLImport]` to download from the server + - implement synchronous APIs of the HTTP and WS clients. At the moment they throw PNSE. 
- allow calls to synchronous JSExport from UI thread (callback) † Note: all the text below discusses MT build only, unless explicit about ST build. @@ -287,10 +288,15 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - typically to the `JSWebWorker` of the creator - could C# thread-pool threads create HTTP clients ? - there is no `JSWebWorker` - - but this is JS state which the runtime could manage well - - so the answer should be yes! + - is JS state which the runtime could manage well + - so we could create the HTTP client on the pool worker + - but managed thread pool doesn't know about it and could kill the pthread at any time + - so we could instead create dedicated `JSWebWorker` + - or we could dispatch it to UI thread - but are consumed via their C# Streams from any thread. - And so need to solve the dispatch to correct thread. + - such dispatch will come with overhead + - especially when called with small buffer in tight loop # Dispatching call, who is responsible - User code From 58c44888c0161463dff6268a8afadc02fca30054 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Fri, 29 Sep 2023 12:52:22 +0200 Subject: [PATCH 018/108] http --- accepted/2023/wasm-browser-threads.md | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index f6c3f2a85..57cadb53a 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -286,17 +286,24 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - are implemented in terms of `JSObject` and `Promise` proxies - they have thread affinity, see above - typically to the `JSWebWorker` of the creator +- but are consumed via their C# Streams from any thread. + - therefore need to solve the dispatch to correct thread. + - such dispatch will come with overhead + - especially when called with small buffer in tight loop + - or we could throw PNSE, but it may be difficult for user code to + - know what thread created the client + - have means how to dispatch the call there +- because we could have blocking wait now, we could also implement synchronous APIs of HTTP/WS + - so that existing use code bases would just work without change + - at the moment they throw PNSE + - this would also require separate thread, doing the async job - could C# thread-pool threads create HTTP clients ? - there is no `JSWebWorker` - - is JS state which the runtime could manage well + - this is JS state which the runtime could manage well - so we could create the HTTP client on the pool worker - but managed thread pool doesn't know about it and could kill the pthread at any time - so we could instead create dedicated `JSWebWorker` - or we could dispatch it to UI thread -- but are consumed via their C# Streams from any thread. -- And so need to solve the dispatch to correct thread. 
- - such dispatch will come with overhead - - especially when called with small buffer in tight loop # Dispatching call, who is responsible - User code From 87fb56e5a2c20c2cc85fe2eb73e46649a370b359 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Fri, 29 Sep 2023 13:24:24 +0200 Subject: [PATCH 019/108] wip --- accepted/2023/wasm-browser-threads.md | 76 +++++++++++++++++---------- 1 file changed, 48 insertions(+), 28 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 57cadb53a..76c464968 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -5,7 +5,7 @@ - enable blocking `Task.Wait` and `lock()` like APIs from C# user code on all threads - Current public API throws PNSE for it - This is core part on MT value proposition. - - If people want to use existing MT code-bases, most of the time, the code is full of locks. + - If people want to use existing MT code-bases, most of the time, the code is full of locks. - People want to use existing desktop/server multi-threaded code as is. - allow HTTP and WS C# APIs to be used from any thread despite underlying JS object affinity - JSImport/JSExport interop in maximum possible extent @@ -33,12 +33,13 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w **2)** UI thread in the browser can't synchronously block - that means, "you can't not block" UI thread, not just usual "you should not block" UI - `Atomics.wait()` throws `TypeError` on UI thread - - you can spin-lock but it's bad idea. - - Deadlock: when you spin-block, the JS timer loop and any messages are not pumping. + - you can spin-wait but it's bad idea. + - Deadlock: when you spin-block, the JS timer loop and any messages are not pumping. - But code in other threads may be waiting for some such event to resolve. - - all async/await don't work during spin-wait + - all async/await don't work - all networking doesn't work - you can't create or join another web worker + - browser dev tools UI freeze - It eats your battery - Browser will kill your tab at random point (Aw, snap). - It's not deterministic and you can't really test your app to prove it harmless. @@ -52,7 +53,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - in case of Mono code, we at least know it's short duration - we should prevent it from blocking in user code -**3)** JavaScript engine APIs and objects have thread affinity. +**3)** JavaScript engine APIs and objects have thread affinity. - The DOM and few other browser APIs are only available on the main UI "thread" - and so, you need to have C# interop with UI, but you can't block there. - HTTP & WS objects have affinity, but we would like to consume them (via Streams) from any managed thread @@ -60,7 +61,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - they need to be disposed on correct thread. GC is running on random thread **4)** State management of JS context `self` of the worker. - - emscripten pre-allocates poll of web worker to be used as pthreads. + - emscripten pre-allocates poll of web worker to be used as pthreads. 
- Because they could only be created asynchronously, but `pthread_create` is synchronous call - Because they are slow to start - those pthreads have stateful JS context `self`, which is re-used when mapped to C# thread pool @@ -160,10 +161,11 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - where to run the C# main entrypoint - **p)** could be on the UI thread - **q)** could be on the "deputy" or "sidecar" thread -- where to implement subtle crypto +- where to implement sync-to-async: crypto/DLLImport/HTTP APIs/ - **r)** out of scope - **s)** in the UI thread - **t)** is a dedicated web worker + - **z)** in the sidecar or deputy - where to marshal JSImport/JSExport parameters/return/exception - **u)** could be only values types, proxies out of scope - **v)** could be on UI thread (with deputy design and Mono there) @@ -173,9 +175,9 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w # Interesting combinations ## Minimal support -- **A,D,G,L,P,S,U,Y,a,f,h,l,n,p,v** +- **A,D,G,L,P,S,U,Y,a,f,h,l,n,p,v** - this is what we [already have today](#Current-state-2023-Sep) -- it could deadlock or die, +- it could deadlock or die, - JS interop on threads requires lot of user code attention - Keeps problems **1,2,3,4** @@ -195,7 +197,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - this will create double proxy for `Task`, `JSObject`, `Func<>` etc - difficult to GC, difficult to debug - double marshaling of parameters -- Avoids **1,2** for JS callback +- Avoids **1,2** for JS callback - Solves **1,2** for managed code. - emscripten main loop stays responsive - Solves **3,4,5** @@ -209,7 +211,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - this will create double proxy for `Task`, `JSObject`, `Func<>` etc - difficult to GC, difficult to debug - double marshaling of parameters -- Ignores **1,2** for JS callback +- Ignores **1,2** for JS callback - Solves **1,2** for managed code - emscripten main loop stays responsive - unless there is sync `JSImport`->`JSExport` call @@ -286,15 +288,15 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - are implemented in terms of `JSObject` and `Promise` proxies - they have thread affinity, see above - typically to the `JSWebWorker` of the creator -- but are consumed via their C# Streams from any thread. +- but are consumed via their C# Streams from any thread. - therefore need to solve the dispatch to correct thread. - such dispatch will come with overhead - especially when called with small buffer in tight loop - - or we could throw PNSE, but it may be difficult for user code to - - know what thread created the client + - or we could throw PNSE, but it may be difficult for user code to + - know what thread created the client - have means how to dispatch the call there - because we could have blocking wait now, we could also implement synchronous APIs of HTTP/WS - - so that existing use code bases would just work without change + - so that existing user code bases would just work without change - at the moment they throw PNSE - this would also require separate thread, doing the async job - could C# thread-pool threads create HTTP clients ? 
@@ -302,7 +304,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - this is JS state which the runtime could manage well - so we could create the HTTP client on the pool worker - but managed thread pool doesn't know about it and could kill the pthread at any time - - so we could instead create dedicated `JSWebWorker` + - so we could instead create dedicated `JSWebWorker` managed thread - or we could dispatch it to UI thread # Dispatching call, who is responsible @@ -310,14 +312,14 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - this is difficult and complex task which many will fail to do right - it can't be user code for HTTP/WS clients because there is no direct call via Streams - authors of 3rd party components would need to do it to hide complexity from users -- Roslyn generator: JSExport is already on correct thread, no action -- Roslyn generator: JSImport +- Roslyn generator: JSImport - if we make it responsible for the dispatch - it needs to stay backward compatible with Net7, Net8 already generated code - it needs to do it via public C# API - possibly new API `JSHost.Post` or `JSHost.Send` - it needs to re-consider current `stackalloc` - probably by re-ordering Roslyn generated code of `__arg_return.ToManaged(out __retVal);` before `JSFunctionBinding.InvokeJS` - it needs to propagate exceptions +- Roslyn generator: JSExport - if we make it responsible for the dispatch # Dispatching JSImport - what should happen - is normally bound to JS context of the calling managed thread @@ -330,13 +332,21 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - should we dispatch it by affinity of the parameters ? - this would solve HTTP/WS scenarios +# Dispatching JSExport - what should happen +- when caller is UI, we need to dispatch back to managed thread + - preferably deputy or sidecar thread +- when caller is `JSWebWorker`, + - we are probably on correct thread already + - when caller is callback from HTTP/WS we could dispatch to any managed thread +- caller can't be managed thread pool, because they would not use JS `self` context + # Dispatching call - options - `JSSynchronizationContext` - in deputy design - - is implementation of `SynchronizationContext` installed to + - is implementation of `SynchronizationContext` installed to - managed thread with `JSWebWorker` - or main managed thread - it has asynchronous `SynchronizationContext.Post` - - it has synchronous `SynchronizationContext.Send` + - it has synchronous `SynchronizationContext.Send` - can propagate caller stack frames - can propagate exceptions from callee thread - when the method is async @@ -347,7 +357,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - we can schedule it asynchronously to the `JSWebWorker` or main thread - we could block-wait on `Task.Wait` until it's done. 
- return sync result - - this would not work in sidecar design + - this would not work in sidecar design - because UI is not managed thread there - `emscripten_dispatch_to_thread_async` - in deputy design - can dispatch async call to C function on the timer loop of target pthread @@ -356,7 +366,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - probably by re-ordering Roslyn generated code - when the method is async - extract GCHandle of the `TaskCompletionSource` - - copy "stack frame" and pass it to + - copy "stack frame" and pass it to - asynchronously schedule to the target pthread via `emscripten_dispatch_to_thread_async` - unpack the "stack frame" - using local Mono `cwraps` for marshaling @@ -370,7 +380,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - return sync result - or when the method is sync - do something similar in C or JS - - this would not work in sidecar design + - this would not work in sidecar design - because UI is not managed thread there - Mono cwraps are not available either - "comlink" - in sidecar design @@ -383,15 +393,23 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - capture JS result/exception from "comlink" - use stored `TaskCompletionSource` to resolve the `Task` on target thread - `postMessage` - - can send serializable message to WebWorker + - can send serializable message to web worker - doesn't block and doesn't propagate exceptions - this is slow +## Spin-waiting in JS +- if we want to keep synchronous JS APIs to work on UI thread, we have to spin-wait +- at the moment emscripten implements spin-wait in wasm + - if we want sidecar design we also need pure JS version of it (we have OK prototype) +- if emscripten main is not running in UI thread, it means it could still pump events and would not deadlock in Mono or managed code +- it could still deadlock if there is synchronous JSImport call to UI thread while UI thread is spin-waiting on it. + - this would be clearly user code mistake + ## JSWebWorker with JS interop - is proposed concept to let user to manage JS state of the worker explicitly - because of problem **4** - is C# thread created and disposed by new API for it -- could block on synchronization primitives +- could block on synchronization primitives - could do full JSImport/JSExport to it's own JS `self` context - there is `JSSynchronizationContext`` installed on it - so that user code could dispatch back to it, in case that it needs to call `JSObject` proxy (with thread affinity) @@ -426,17 +444,19 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - it's not clear how to make this single-file - because web workers need to start separate script(s) via `new Worker('./dotnet.js', {type: 'module'})` - we can start a WebWorker with a Blob, but that's not CSP friendly. - - when bundled together with user code, into "my-react-application.js" we don't have way how to `new Worker('./dotnet.js')` anymore. + - when bundled together with user code, into `./my-react-application.js` we don't have way how to `new Worker('./dotnet.js')` anymore. - emscripten uses `dotnet.native.worker.js` script. At the moment we don't have nice way how to customize what is inside of it. +- for ST build we have JS API to replace our dynamic `import()` of our modules with provided instance of a module. 
+ - we would have to have some way how 3rs party code could become responsible for doing it also on web worker (which we start) ## Subtle crypto - once we have have all managed threads outside of the UI thread -- we could synchronously wait for another thread to do async operations +- we could synchronously wait for another thread to do async operations - and use [async API of subtle crypto](https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto) ## Lazy DLLImport - download - once we have have all managed threads outside of the UI thread -- we could synchronously wait for another thread to do async operations +- we could synchronously wait for another thread to do async operations - to fetch another DLL which was not pre-downloaded # Current state 2023 Sep From b7910c5053dbe8f13040bc00ff8fb81969d2064e Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Fri, 29 Sep 2023 13:57:28 +0200 Subject: [PATCH 020/108] wip --- accepted/2023/wasm-browser-threads.md | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 76c464968..bcb90f9a6 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -142,8 +142,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - where to create HTTP+WS JS objects - **d)** in the UI thread - **e)** in the managed main thread - - **f)** in first calling managed thread - - install `JSSynchronizationContext` even without `JSWebWorker` ? + - **f)** in first calling `JSWebWorker` managed thread - how to dispatch calls to HTTP+WS JS objects - **g)** try to stick to the same thread via `ConfigureAwait(false)`. - doesn't really work. `Task` migrate too freely @@ -164,7 +163,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - where to implement sync-to-async: crypto/DLLImport/HTTP APIs/ - **r)** out of scope - **s)** in the UI thread - - **t)** is a dedicated web worker + - **t)** in a dedicated web worker - **z)** in the sidecar or deputy - where to marshal JSImport/JSExport parameters/return/exception - **u)** could be only values types, proxies out of scope @@ -174,14 +173,14 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w # Interesting combinations -## Minimal support +## (8) Minimal support - **A,D,G,L,P,S,U,Y,a,f,h,l,n,p,v** - this is what we [already have today](#Current-state-2023-Sep) - it could deadlock or die, - JS interop on threads requires lot of user code attention - Keeps problems **1,2,3,4** -## Sidecar + no JS interop + narrow Blazor support +## (9) Sidecar + no JS interop + narrow Blazor support - **C,E,G,L,P,S,U,Z,c,d,h,m,o,q,u** - minimal effort, low risk, low capabilities - move both emscripten and Mono VM sidecar thread @@ -189,7 +188,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - internal solutions for Blazor needs - Ignores problems **1,2,3,4,5** -## Sidecar + only async just JS proxies UI + JSWebWorker + Blazor WASM server +## (10) Sidecar + only async just JS proxies UI + JSWebWorker + Blazor WASM server - **C,E,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** - no C or managed code on UI thread - no support for blocking sync JSExport calls from UI thread (callbacks) @@ -202,7 +201,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - emscripten main loop stays responsive - Solves **3,4,5** -## Sidecar + async & sync just JS 
proxies UI + JSWebWorker + Blazor WASM server +## (11) Sidecar + async & sync just JS proxies UI + JSWebWorker + Blazor WASM server - **C,F,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** - no C or managed code on UI thread - support for blocking sync JSExport calls from UI thread (callbacks) @@ -217,9 +216,10 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - unless there is sync `JSImport`->`JSExport` call - Solves **3,4,5** -## Deputy + managed UI interop + JSWebWorker +## (12) Deputy + managed dispatch to UI + JSWebWorker +- **B,F,K,N,Q,S/T,U,W,a/b/c,d+f,h,l,n,s/z,v** - this uses `JSSynchronizationContext` to dispatch calls to UI thread - - this is problematic because some managed code is actually running on UI thread + - this is "dirty" as compared to sidecar because some managed code is actually running on UI thread - it needs to also use `SynchronizationContext` for `JSExport` and callbacks, to dispatch to deputy. - blazor render could be both legacy render or Blazor server style - because we have both memory and mono on the UI thread @@ -229,7 +229,10 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - unless there is sync `JSImport`->`JSExport` call - Solves **3,4,5** -## Deputy + UI bound interop + JSWebWorker +## (13) Deputy + emscripten dispatch to UI + JSWebWorker +- **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** +- is variation of **(12)** + - with emscripten dispatch and marshaling in UI thread - this uses `emscripten_dispatch_to_thread_async` for `call_entry_point`, `complete_task`, `cwraps.mono_wasm_invoke_method_bound`, `mono_wasm_invoke_bound_function`, `mono_wasm_invoke_import`, `call_delegate_method` to get to the UI thread. - it uses other `cwraps` locally on UI thread, like `mono_wasm_new_root`, `stringToMonoStringRoot`, `malloc`, `free`, `create_task_callback_method` - it means that interop related managed runtime code is running on the UI thread, but not the user code. From daf801ade419858a552545c2c28e00e18055380f Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Fri, 29 Sep 2023 14:12:51 +0200 Subject: [PATCH 021/108] wip --- accepted/2023/wasm-browser-threads.md | 34 ++++++++++++++++++--------- 1 file changed, 23 insertions(+), 11 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index bcb90f9a6..cb969770b 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -61,7 +61,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - they need to be disposed on correct thread. GC is running on random thread **4)** State management of JS context `self` of the worker. - - emscripten pre-allocates poll of web worker to be used as pthreads. + - emscripten pre-allocates pool of web worker to be used as pthreads. - Because they could only be created asynchronously, but `pthread_create` is synchronous call - Because they are slow to start - those pthreads have stateful JS context `self`, which is re-used when mapped to C# thread pool @@ -196,8 +196,8 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - this will create double proxy for `Task`, `JSObject`, `Func<>` etc - difficult to GC, difficult to debug - double marshaling of parameters -- Avoids **1,2** for JS callback - Solves **1,2** for managed code. 
+- Avoids **1,2** for JS callback - emscripten main loop stays responsive - Solves **3,4,5** @@ -210,26 +210,26 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - this will create double proxy for `Task`, `JSObject`, `Func<>` etc - difficult to GC, difficult to debug - double marshaling of parameters -- Ignores **1,2** for JS callback - Solves **1,2** for managed code - - emscripten main loop stays responsive - unless there is sync `JSImport`->`JSExport` call +- Ignores **1,2** for JS callback + - emscripten main loop stays responsive - Solves **3,4,5** -## (12) Deputy + managed dispatch to UI + JSWebWorker +## (12) Deputy + managed dispatch to UI + JSWebWorker + with sync JSExport - **B,F,K,N,Q,S/T,U,W,a/b/c,d+f,h,l,n,s/z,v** - this uses `JSSynchronizationContext` to dispatch calls to UI thread - this is "dirty" as compared to sidecar because some managed code is actually running on UI thread - it needs to also use `SynchronizationContext` for `JSExport` and callbacks, to dispatch to deputy. - blazor render could be both legacy render or Blazor server style - because we have both memory and mono on the UI thread -- Ignores **1,2** for JS callback - Solves **1,2** for managed code - - emscripten main loop stays responsive - unless there is sync `JSImport`->`JSExport` call +- Ignores **1,2** for JS callback + - emscripten main loop could deadlock on sync JSExport - Solves **3,4,5** -## (13) Deputy + emscripten dispatch to UI + JSWebWorker +## (13) Deputy + emscripten dispatch to UI + JSWebWorker + with sync JSExport - **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** - is variation of **(12)** - with emscripten dispatch and marshaling in UI thread @@ -237,13 +237,24 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - it uses other `cwraps` locally on UI thread, like `mono_wasm_new_root`, `stringToMonoStringRoot`, `malloc`, `free`, `create_task_callback_method` - it means that interop related managed runtime code is running on the UI thread, but not the user code. 
- it means that parameter marshalling is fast (compared to sidecar) - - it still needs to enter GC barrier and so it could block UI for GC run + - it still needs to enter GC barrier and so it could block UI for GC run shortly - blazor render could be both legacy render or Blazor server style - because we have both memory and mono on the UI thread +- Solves **1,2** for managed code + - unless there is sync `JSImport`->`JSExport` call - Ignores **1,2** for JS callback + - emscripten main loop could deadlock on sync JSExport +- Solves **3,4,5** + +## (14) Deputy + emscripten dispatch to UI + JSWebWorker + no sync JSExport +- **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** +- is variation of **(13)** + - without support for synchronous JSExport - Solves **1,2** for managed code - emscripten main loop stays responsive - unless there is sync `JSImport`->`JSExport` call +- Avoids **2** for JS callback + - by throwing PNSE - Solves **3,4,5** # Details @@ -257,8 +268,9 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - async could be called on all `JSWebWorker` threads - sync could be called on `JSWebWorker` - sync could be called on from UI thread is problematic - - with spin-wait in UI in JS it would at least block the UI rendering - - with spin-wait in emscripten, it could deadlock the rest of the app + - with spin-wait in UI in JS it has **2)** problems + - with spin-wait in UI when emscripten is there could also deadlock the rest of the app + - this means that combination of sync JSExport and deputy design is dangerous ## Proxies - thread affinity - all of them have thread affinity From 686f1a611c5d49a2daf843f48507206a1e78edb2 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Sun, 1 Oct 2023 14:40:15 +0200 Subject: [PATCH 022/108] Aleksey's feedback --- accepted/2023/wasm-browser-threads.md | 89 +++++++++++++++++---------- 1 file changed, 55 insertions(+), 34 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index cb969770b..7192e6297 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -1,22 +1,26 @@ # Multi-threading on a browser ## Goals - - CPU intensive workloads on dotnet thread pool - - enable blocking `Task.Wait` and `lock()` like APIs from C# user code on all threads - - Current public API throws PNSE for it - - This is core part on MT value proposition. - - If people want to use existing MT code-bases, most of the time, the code is full of locks. - - People want to use existing desktop/server multi-threaded code as is. - - allow HTTP and WS C# APIs to be used from any thread despite underlying JS object affinity - - JSImport/JSExport interop in maximum possible extent - - don't change/break single threaded build. † +- CPU intensive workloads on dotnet thread pool +- enable blocking `Task.Wait` and `lock()` like APIs from C# user code on all threads + - Current public API throws PNSE for it + - This is core part on MT value proposition. + - If people want to use existing MT code-bases, most of the time, the code is full of locks. + - People want to use existing desktop/server multi-threaded code as is. +- allow HTTP and WS C# APIs to be used from any thread despite underlying JS object affinity +- JSImport/JSExport interop in maximum possible extent +- don't change/break single threaded build. 
† ## Lower priority goals - - try to make it debugging friendly - - implement crypto via `subtle` browser API - - allow lazy `[DLLImport]` to download from the server - - implement synchronous APIs of the HTTP and WS clients. At the moment they throw PNSE. - - allow calls to synchronous JSExport from UI thread (callback) +- try to make it debugging friendly +- sync C# to async JS + - dynamic creation of new pthread + - implement crypto via `subtle` browser API + - allow lazy `[DLLImport]` to download from the server + - implement synchronous APIs of the HTTP and WS clients. At the moment they throw PNSE. +- sync JS to async JS to sync C# + - allow calls to synchronous JSExport from UI thread (callback) +- don't prevent future marshaling of JS [transferable objects](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Transferable_objects), like streams and canvas. † Note: all the text below discusses MT build only, unless explicit about ST build. @@ -191,6 +195,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w ## (10) Sidecar + only async just JS proxies UI + JSWebWorker + Blazor WASM server - **C,E,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** - no C or managed code on UI thread + - this architectural clarity is major selling point for sidecar design - no support for blocking sync JSExport calls from UI thread (callbacks) - it will throw PNSE - this will create double proxy for `Task`, `JSObject`, `Func<>` etc @@ -237,6 +242,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - it uses other `cwraps` locally on UI thread, like `mono_wasm_new_root`, `stringToMonoStringRoot`, `malloc`, `free`, `create_task_callback_method` - it means that interop related managed runtime code is running on the UI thread, but not the user code. - it means that parameter marshalling is fast (compared to sidecar) + - this deputy design is major selling point #2 - it still needs to enter GC barrier and so it could block UI for GC run shortly - blazor render could be both legacy render or Blazor server style - because we have both memory and mono on the UI thread @@ -299,6 +305,15 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - we already are on correct thread in JS - would anything improve if we tried to be more async ? +## JSWebWorker with JS interop +- is proposed concept to let user to manage JS state of the worker explicitly + - because of problem **4** +- is C# thread created and disposed by new API for it +- could block on synchronization primitives +- could do full JSImport/JSExport to it's own JS `self` context +- there is `JSSynchronizationContext`` installed on it + - so that user code could dispatch back to it, in case that it needs to call `JSObject` proxy (with thread affinity) + ## HTTP and WS clients - are implemented in terms of `JSObject` and `Promise` proxies - they have thread affinity, see above @@ -314,13 +329,16 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - so that existing user code bases would just work without change - at the moment they throw PNSE - this would also require separate thread, doing the async job -- could C# thread-pool threads create HTTP clients ? 
- - there is no `JSWebWorker` - - this is JS state which the runtime could manage well - - so we could create the HTTP client on the pool worker - - but managed thread pool doesn't know about it and could kill the pthread at any time - - so we could instead create dedicated `JSWebWorker` managed thread - - or we could dispatch it to UI thread + +## JSImport calls on threads without JSWebWorker +- what should happen when thread-pool thread uses JSImport directly ? +- what should happen when thread-pool thread uses HTTP/WS clients ? +- we could dispatch it to UI thread + - easy to understand default behavior + - downside is blocking the UI and emscripten loops with CPU intensive activity + - in sidecar design, also extra copy of buffers +- we could instead create dedicated `JSWebWorker` managed thread + - this extra worker could also serve all the sync-to-async jobs # Dispatching call, who is responsible - User code @@ -329,6 +347,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - authors of 3rd party components would need to do it to hide complexity from users - Roslyn generator: JSImport - if we make it responsible for the dispatch - it needs to stay backward compatible with Net7, Net8 already generated code + - how to detect that there is new version of generated code ? - it needs to do it via public C# API - possibly new API `JSHost.Post` or `JSHost.Send` - it needs to re-consider current `stackalloc` @@ -414,21 +433,23 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w ## Spin-waiting in JS - if we want to keep synchronous JS APIs to work on UI thread, we have to spin-wait + - we probably should have opt-in configuration flag for this + - making user responsible for the consequences - at the moment emscripten implements spin-wait in wasm - - if we want sidecar design we also need pure JS version of it (we have OK prototype) -- if emscripten main is not running in UI thread, it means it could still pump events and would not deadlock in Mono or managed code + - See [pthread_cond_timedwait.c](https://github.com/emscripten-core/emscripten/blob/cbf4256d651455abc7b3332f1943d3df0698e990/system/lib/libc/musl/src/thread/pthread_cond_timedwait.c#L117-L118) and [__timedwait.c](https://github.com/emscripten-core/emscripten/blob/cbf4256d651455abc7b3332f1943d3df0698e990/system/lib/libc/musl/src/thread/__timedwait.c#L67-L69) + - I was not able to confirm that they would call `emscripten_check_mailbox` during spin-wait +- in sidecar design - emscripten main is not running in UI thread + - it means it could still pump events and would not deadlock in Mono or managed code + - unless the sidecar thread is blocked, or CPU hogged, which could happen + - we need pure JS version of spin-wait and we have OK enough prototype +- in deputy design - emscripten main is running in UI thread + - but the UI thread is not running managed code + - it means it could still pump events and would not deadlock in Mono or managed code + - this deputy design is major selling point #1 + - unless user code opts-in to call sync JSExport - it could still deadlock if there is synchronous JSImport call to UI thread while UI thread is spin-waiting on it. 
- this would be clearly user code mistake -## JSWebWorker with JS interop -- is proposed concept to let user to manage JS state of the worker explicitly - - because of problem **4** -- is C# thread created and disposed by new API for it -- could block on synchronization primitives -- could do full JSImport/JSExport to it's own JS `self` context -- there is `JSSynchronizationContext`` installed on it - - so that user code could dispatch back to it, in case that it needs to call `JSObject` proxy (with thread affinity) - ## Debugging - VS debugger would work as usual - Chrome dev tools would only see the events coming from `postMessage` or `Atomics.waitAsync` @@ -437,7 +458,6 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w ## Blazor - as compared to single threaded runtime, the major difference would be no synchronous callbacks. - for example from DOM `onClick`. This is one of the reasons people prefer ST WASM over Blazor Server. - - but there is really [no way around it](#problem), because you can't have both MT and sync calls from UI. - Blazor `renderBatch` - currently `Blazor._internal.renderBatch` -> `MONO.getI16`, `MONO.getI32`, `MONO.getF32`, `BINDING.js_string_to_mono_string`, `BINDING.conv_string`, `BINDING.unbox_mono_obj` - we could also [RenderBatchWriter](https://github.com/dotnet/aspnetcore/blob/045afcd68e6cab65502fa307e306d967a4d28df6/src/Components/Shared/src/RenderBatchWriter.cs) in the WASM @@ -462,7 +482,8 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - when bundled together with user code, into `./my-react-application.js` we don't have way how to `new Worker('./dotnet.js')` anymore. - emscripten uses `dotnet.native.worker.js` script. At the moment we don't have nice way how to customize what is inside of it. - for ST build we have JS API to replace our dynamic `import()` of our modules with provided instance of a module. - - we would have to have some way how 3rs party code could become responsible for doing it also on web worker (which we start) + - we would have to have some way how 3rd party code could become responsible for doing it also on web worker (which we start) +- what other JS frameworks do when they want to be webpack friendly and create web workers ? ## Subtle crypto - once we have have all managed threads outside of the UI thread From ea9aab0e8fefbd7821259b508839104633a191a6 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 2 Oct 2023 10:21:21 +0200 Subject: [PATCH 023/108] wip --- accepted/2023/wasm-browser-threads.md | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 7192e6297..ffc36e3bb 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -61,7 +61,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - The DOM and few other browser APIs are only available on the main UI "thread" - and so, you need to have C# interop with UI, but you can't block there. - HTTP & WS objects have affinity, but we would like to consume them (via Streams) from any managed thread - - Any `JSObject`, `JSException` and `Task` have thread affinity + - Any `JSObject`, `JSException` and `Promise`->`Task` have thread affinity - they need to be disposed on correct thread. GC is running on random thread **4)** State management of JS context `self` of the worker. 
@@ -203,7 +203,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - double marshaling of parameters - Solves **1,2** for managed code. - Avoids **1,2** for JS callback - - emscripten main loop stays responsive + - emscripten main loop stays responsive only when main managed thread is idle - Solves **3,4,5** ## (11) Sidecar + async & sync just JS proxies UI + JSWebWorker + Blazor WASM server @@ -218,7 +218,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - Solves **1,2** for managed code - unless there is sync `JSImport`->`JSExport` call - Ignores **1,2** for JS callback - - emscripten main loop stays responsive + - emscripten main loop stays responsive only when main managed thread is idle - Solves **3,4,5** ## (12) Deputy + managed dispatch to UI + JSWebWorker + with sync JSExport @@ -252,7 +252,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - emscripten main loop could deadlock on sync JSExport - Solves **3,4,5** -## (14) Deputy + emscripten dispatch to UI + JSWebWorker + no sync JSExport +## (14) Deputy + emscripten dispatch to UI + JSWebWorker + without sync JSExport - **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** - is variation of **(13)** - without support for synchronous JSExport @@ -302,7 +302,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - which maybe our HTTP/WS could do ? - could this difference we ignored ? - `JSExport`/`Function` - - we already are on correct thread in JS + - we already are on correct thread in JS, unless this is UI thread - would anything improve if we tried to be more async ? ## JSWebWorker with JS interop @@ -331,13 +331,17 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - this would also require separate thread, doing the async job ## JSImport calls on threads without JSWebWorker -- what should happen when thread-pool thread uses JSImport directly ? -- what should happen when thread-pool thread uses HTTP/WS clients ? +- those are + - thread-pool threads + - main managed thread in deputy design +- what should happen when it calls JSImport directly ? +- what should happen when it calls HTTP/WS clients ? 
- we could dispatch it to UI thread - easy to understand default behavior - downside is blocking the UI and emscripten loops with CPU intensive activity - in sidecar design, also extra copy of buffers - we could instead create dedicated `JSWebWorker` managed thread + - more difficult to reason about - this extra worker could also serve all the sync-to-async jobs # Dispatching call, who is responsible @@ -354,6 +358,8 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - probably by re-ordering Roslyn generated code of `__arg_return.ToManaged(out __retVal);` before `JSFunctionBinding.InvokeJS` - it needs to propagate exceptions - Roslyn generator: JSExport - if we make it responsible for the dispatch +- Mono/C/JS internal layer + - see `emscripten_dispatch_to_thread_async` below # Dispatching JSImport - what should happen - is normally bound to JS context of the calling managed thread From 007a076303fbcce4a91ebcfd3e5ef7a3ee6dd0e8 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 2 Oct 2023 13:14:11 +0200 Subject: [PATCH 024/108] wip --- accepted/2023/wasm-browser-threads.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index ffc36e3bb..6b1dc4152 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -437,6 +437,10 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - doesn't block and doesn't propagate exceptions - this is slow +## Performance +- the dispatch between threads (caused by JS object thread affinity) will have negative performance impact on the JS interop +- in case of HTTP/WS clients used via Streams, it could be surprizing + ## Spin-waiting in JS - if we want to keep synchronous JS APIs to work on UI thread, we have to spin-wait - we probably should have opt-in configuration flag for this @@ -444,6 +448,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - at the moment emscripten implements spin-wait in wasm - See [pthread_cond_timedwait.c](https://github.com/emscripten-core/emscripten/blob/cbf4256d651455abc7b3332f1943d3df0698e990/system/lib/libc/musl/src/thread/pthread_cond_timedwait.c#L117-L118) and [__timedwait.c](https://github.com/emscripten-core/emscripten/blob/cbf4256d651455abc7b3332f1943d3df0698e990/system/lib/libc/musl/src/thread/__timedwait.c#L67-L69) - I was not able to confirm that they would call `emscripten_check_mailbox` during spin-wait + - See also https://emscripten.org/docs/porting/pthreads.html - in sidecar design - emscripten main is not running in UI thread - it means it could still pump events and would not deadlock in Mono or managed code - unless the sidecar thread is blocked, or CPU hogged, which could happen From 9d741042fd4e52a33037a446fcfd299526cf46fa Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 4 Oct 2023 13:06:09 +0200 Subject: [PATCH 025/108] wip --- accepted/2023/wasm-browser-threads.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 6b1dc4152..fadd7b7dd 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -304,6 +304,9 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - `JSExport`/`Function` - we already are on correct thread in JS, unless this is UI thread - would anything improve if we tried to be more async ? 
+- `MonoString` + - we have optimization for interned strings, that we marshal them only once by value. Subsequent calls in both directions are just a pinned pointer. + - in deputy design we could create `MonoString` instance on the UI thread, but it involves GC barrier ## JSWebWorker with JS interop - is proposed concept to let user to manage JS state of the worker explicitly @@ -423,6 +426,8 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - this would not work in sidecar design - because UI is not managed thread there - Mono cwraps are not available either +- `emscripten_sync_run_in_main_runtime_thread` - in deputy design + - can run sync method in UI thread - "comlink" - in sidecar design - when the method is async - extract GCHandle of the `TaskCompletionSource` @@ -434,6 +439,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - use stored `TaskCompletionSource` to resolve the `Task` on target thread - `postMessage` - can send serializable message to web worker + - can transmit [transferable objects](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Transferable_objects) - doesn't block and doesn't propagate exceptions - this is slow From b5c31572b9949cf9039ec310f724884c99602c69 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 9 Oct 2023 11:28:37 +0200 Subject: [PATCH 026/108] designs 15, 16 --- accepted/2023/wasm-browser-threads.md | 35 +++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index fadd7b7dd..e95439127 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -263,6 +263,41 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - by throwing PNSE - Solves **3,4,5** +## (15) Deputy + Sidecar + UI thread +- 2 levels of indirection. +- benefit: blocking JSExport from UI thread doesn't block emscripten loop +- downside: complex and more resource intensive + +## (16) Deputy without Mono, no GC barrier breach for interop +- variation on **(13)** or **(14)** where we get rid of per-parameter calls to Mono +- benefit: get closer to purity of sidecar design without loosing perf +- offenders + - `Task`/`Promise` + - `create_task_callback`, `mono_wasm_marshal_promise` + - `JavaScriptImports.MarshalPromise` + - this will need to create something like `GCHandle`/`JSHandle` on the opposite direction and send it instead of creating it with extra call + - which means that we need richer interop stack frame slot, because we need to pack more information + - this is doable by making `MarshalerType` `byte`-based instead of `Int32`-based. This will be also good for better nested generic types if we proceed with it. 
+ - `MonoString` + - `monoStringToString`, `stringToMonoStringRoot` + - `mono_wasm_string_get_data_ref` + - `mono_wasm_string_from_utf16_ref` + - we could start passing just a buffer instead of `MonoString` + - we will the optimization for interned strings + - managed instances in `MonoArray`, like `MonoString`, `JSObject` or `System.Object` + - `mono_wasm_register_root`, `mono_wasm_deregister_root` + - `Interop.Runtime.DeregisterGCRoot`, `Interop.Runtime.RegisterGCRoot` + - this is about GC and Dispose(): `ManagedObject`, `ErrorObject` + - `release_js_owned_object_by_gc_handle`, `setup_managed_proxy`, `teardown_managed_proxy` + - `JavaScriptExports.ReleaseJSOwnedObjectByGCHandle`, `CreateTaskCallback` + - this is about GC and Dispose(): `JSObject`, `JSException` + - `Interop.Runtime.ReleaseCSOwnedObject` + - `mono_wasm_get_assembly_exports` -> `__Register_` + - `mono_wasm_assembly_load`, `mono_wasm_assembly_find_class`, `mono_wasm_assembly_find_method` + - this logic could be moved to deputy or sidecar thread + - not problem for deputy design: `Module.stackAlloc`, `Module.stackSave`, `Module.stackRestore` + - what's overall perf impact for Blazor's `renderBatch` ? + # Details ## JSImport and marshaled JS functions From 812757ce70a91596eef89c25da5a08aad1c3f517 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 9 Oct 2023 11:32:13 +0200 Subject: [PATCH 027/108] wip --- accepted/2023/wasm-browser-threads.md | 1 + 1 file changed, 1 insertion(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index e95439127..49d8fcb45 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -271,6 +271,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w ## (16) Deputy without Mono, no GC barrier breach for interop - variation on **(13)** or **(14)** where we get rid of per-parameter calls to Mono - benefit: get closer to purity of sidecar design without loosing perf + - this could be done later as purity optimization - offenders - `Task`/`Promise` - `create_task_callback`, `mono_wasm_marshal_promise` From e9d4e8a6c9080355541cec3a9d1585416cc3356e Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 9 Oct 2023 11:34:12 +0200 Subject: [PATCH 028/108] wip --- accepted/2023/wasm-browser-threads.md | 1 + 1 file changed, 1 insertion(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 49d8fcb45..e0c2349ca 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -279,6 +279,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - this will need to create something like `GCHandle`/`JSHandle` on the opposite direction and send it instead of creating it with extra call - which means that we need richer interop stack frame slot, because we need to pack more information - this is doable by making `MarshalerType` `byte`-based instead of `Int32`-based. This will be also good for better nested generic types if we proceed with it. + - this problem with "who owns the proxy", I'm still confused about it after implementing 80% prototype. 
- `MonoString` - `monoStringToString`, `stringToMonoStringRoot` - `mono_wasm_string_get_data_ref` From d390d5adb7bca118261f3b740438cc4c1c44d650 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 9 Oct 2023 11:40:42 +0200 Subject: [PATCH 029/108] wip --- accepted/2023/wasm-browser-threads.md | 57 ++++++++++++++------------- 1 file changed, 30 insertions(+), 27 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index e0c2349ca..5d7215620 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -272,33 +272,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - variation on **(13)** or **(14)** where we get rid of per-parameter calls to Mono - benefit: get closer to purity of sidecar design without loosing perf - this could be done later as purity optimization -- offenders - - `Task`/`Promise` - - `create_task_callback`, `mono_wasm_marshal_promise` - - `JavaScriptImports.MarshalPromise` - - this will need to create something like `GCHandle`/`JSHandle` on the opposite direction and send it instead of creating it with extra call - - which means that we need richer interop stack frame slot, because we need to pack more information - - this is doable by making `MarshalerType` `byte`-based instead of `Int32`-based. This will be also good for better nested generic types if we proceed with it. - - this problem with "who owns the proxy", I'm still confused about it after implementing 80% prototype. - - `MonoString` - - `monoStringToString`, `stringToMonoStringRoot` - - `mono_wasm_string_get_data_ref` - - `mono_wasm_string_from_utf16_ref` - - we could start passing just a buffer instead of `MonoString` - - we will the optimization for interned strings - - managed instances in `MonoArray`, like `MonoString`, `JSObject` or `System.Object` - - `mono_wasm_register_root`, `mono_wasm_deregister_root` - - `Interop.Runtime.DeregisterGCRoot`, `Interop.Runtime.RegisterGCRoot` - - this is about GC and Dispose(): `ManagedObject`, `ErrorObject` - - `release_js_owned_object_by_gc_handle`, `setup_managed_proxy`, `teardown_managed_proxy` - - `JavaScriptExports.ReleaseJSOwnedObjectByGCHandle`, `CreateTaskCallback` - - this is about GC and Dispose(): `JSObject`, `JSException` - - `Interop.Runtime.ReleaseCSOwnedObject` - - `mono_wasm_get_assembly_exports` -> `__Register_` - - `mono_wasm_assembly_load`, `mono_wasm_assembly_find_class`, `mono_wasm_assembly_find_method` - - this logic could be moved to deputy or sidecar thread - - not problem for deputy design: `Module.stackAlloc`, `Module.stackSave`, `Module.stackRestore` - - what's overall perf impact for Blazor's `renderBatch` ? +- See [details](#Get_rid_of_Mono_GC_boundary_breach) # Details @@ -480,6 +454,35 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - doesn't block and doesn't propagate exceptions - this is slow +## Get rid of Mono GC boundary breach +- related to design **(16)** +- `Task`/`Promise` + - `create_task_callback`, `mono_wasm_marshal_promise` + - `JavaScriptImports.MarshalPromise` + - this will need to create something like `GCHandle`/`JSHandle` on the opposite direction and send it instead of creating it with extra call + - which means that we need richer interop stack frame slot, because we need to pack more information + - this is doable by making `MarshalerType` `byte`-based instead of `Int32`-based. This will be also good for better nested generic types if we proceed with it. 
+ - this problem with "who owns the proxy", I'm still confused about it after implementing 80% prototype. +- `MonoString` + - `monoStringToString`, `stringToMonoStringRoot` + - `mono_wasm_string_get_data_ref` + - `mono_wasm_string_from_utf16_ref` + - we could start passing just a buffer instead of `MonoString` + - we will lose the optimization for interned strings +- managed instances in `MonoArray`, like `MonoString`, `JSObject` or `System.Object` + - `mono_wasm_register_root`, `mono_wasm_deregister_root` + - `Interop.Runtime.DeregisterGCRoot`, `Interop.Runtime.RegisterGCRoot` +- this is about GC and Dispose(): `ManagedObject`, `ErrorObject` + - `release_js_owned_object_by_gc_handle`, `setup_managed_proxy`, `teardown_managed_proxy` + - `JavaScriptExports.ReleaseJSOwnedObjectByGCHandle`, `CreateTaskCallback` +- this is about GC and Dispose(): `JSObject`, `JSException` + - `Interop.Runtime.ReleaseCSOwnedObject` +- `mono_wasm_get_assembly_exports` -> `__Register_` + - `mono_wasm_assembly_load`, `mono_wasm_assembly_find_class`, `mono_wasm_assembly_find_method` + - this logic could be moved to deputy or sidecar thread +- not problem for deputy design: `Module.stackAlloc`, `Module.stackSave`, `Module.stackRestore` +- what's overall perf impact for Blazor's `renderBatch` ? + ## Performance - the dispatch between threads (caused by JS object thread affinity) will have negative performance impact on the JS interop - in case of HTTP/WS clients used via Streams, it could be surprizing From 77685abbc5e0aff059e55aeb230e89d87c562601 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 9 Oct 2023 11:48:03 +0200 Subject: [PATCH 030/108] more to (16) --- accepted/2023/wasm-browser-threads.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 5d7215620..26eb3bd23 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -467,6 +467,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - `monoStringToString`, `stringToMonoStringRoot` - `mono_wasm_string_get_data_ref` - `mono_wasm_string_from_utf16_ref` + - `get_string_root` -> `mono_wasm_new_external_root` - we could start passing just a buffer instead of `MonoString` - we will lose the optimization for interned strings - managed instances in `MonoArray`, like `MonoString`, `JSObject` or `System.Object` @@ -480,6 +481,10 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - `mono_wasm_get_assembly_exports` -> `__Register_` - `mono_wasm_assembly_load`, `mono_wasm_assembly_find_class`, `mono_wasm_assembly_find_method` - this logic could be moved to deputy or sidecar thread +- `mono_wasm_bind_js_function`, `mono_wasm_bind_cs_function` + - `mono_wasm_new_external_root` +- `invoke_method_and_handle_exception` + - `mono_wasm_new_root` - not problem for deputy design: `Module.stackAlloc`, `Module.stackSave`, `Module.stackRestore` - what's overall perf impact for Blazor's `renderBatch` ? 
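The cross-thread dispatch that keeps recurring above ("comlink", `postMessage`, `emscripten_dispatch_to_thread_async`) boils down to: capture a `TaskCompletionSource`, post the call to the worker that owns the JS state, and complete the task back for the caller. A minimal C# sketch of that idea, assuming the owning thread exposes a `SynchronizationContext`; the helper name is illustrative, not the actual runtime implementation:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class CrossThreadDispatch
{
    // Post `invokeJs` to the thread that owns the JS object and surface the
    // result (or exception) to the caller as a Task.
    public static Task<T> InvokeOnOwnerThread<T>(SynchronizationContext ownerContext, Func<T> invokeJs)
    {
        var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
        ownerContext.Post(_ =>
        {
            try { tcs.SetResult(invokeJs()); }
            catch (Exception ex) { tcs.SetException(ex); }
        }, null);
        return tcs.Task;
    }
}

// A synchronous caller on a non-UI thread could then block-wait on the returned Task,
// which is the `Task.Wait` pattern discussed above.
```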
From 433c104587131adc4bd25c3655ad4503b9ed6a40 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 9 Oct 2023 12:14:46 +0200 Subject: [PATCH 031/108] more --- accepted/2023/wasm-browser-threads.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 26eb3bd23..2eab76335 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -272,6 +272,8 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - variation on **(13)** or **(14)** where we get rid of per-parameter calls to Mono - benefit: get closer to purity of sidecar design without loosing perf - this could be done later as purity optimization +- in this design the mono could be started on deputy thread +- UI would not be mono attached thread. - See [details](#Get_rid_of_Mono_GC_boundary_breach) # Details From 874a5ec34bae70d3d6b518103a38666d62ca0410 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Fri, 13 Oct 2023 12:05:57 +0200 Subject: [PATCH 032/108] whitespace --- accepted/2023/wasm-browser-threads.md | 70 +++++++++++++-------------- 1 file changed, 35 insertions(+), 35 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 2eab76335..6d809397a 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -30,46 +30,46 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w ## Context - Problems **1)** If you have multithreading, any thread might need to block while waiting for any other to release a lock. - - locks are in the user code, in nuget packages, in Mono VM itself - - there are managed and un-managed locks - - in single-threaded build of the runtime, all of this is NOOP. That's why it works on UI thread. +- locks are in the user code, in nuget packages, in Mono VM itself +- there are managed and un-managed locks +- in single-threaded build of the runtime, all of this is NOOP. That's why it works on UI thread. **2)** UI thread in the browser can't synchronously block - - that means, "you can't not block" UI thread, not just usual "you should not block" UI - - `Atomics.wait()` throws `TypeError` on UI thread - - you can spin-wait but it's bad idea. - - Deadlock: when you spin-block, the JS timer loop and any messages are not pumping. - - But code in other threads may be waiting for some such event to resolve. - - all async/await don't work - - all networking doesn't work - - you can't create or join another web worker - - browser dev tools UI freeze - - It eats your battery - - Browser will kill your tab at random point (Aw, snap). - - It's not deterministic and you can't really test your app to prove it harmless. 
- - Firefox (still) has synchronous `XMLHttpRequest` which could be captured by async code in service worker - - it's [deprecated legacy API](https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/Synchronous_and_Asynchronous_Requests#synchronous_request) - - [but other browsers don't](https://wpt.fyi/results/service-workers/service-worker/fetch-request-xhr-sync.https.html?label=experimental&label=master&aligned) and it's unlikely they will implement it - - there are deployment and security challenges with it - - all the other threads/workers could synchronously block - - `Atomics.wait()` works as expected - - if we will have managed thread on the UI thread, any `lock` or Mono GC barrier could cause spin-wait - - in case of Mono code, we at least know it's short duration - - we should prevent it from blocking in user code +- that means, "you can't not block" UI thread, not just usual "you should not block" UI + - `Atomics.wait()` throws `TypeError` on UI thread +- you can spin-wait but it's bad idea. + - Deadlock: when you spin-block, the JS timer loop and any messages are not pumping. + - But code in other threads may be waiting for some such event to resolve. + - all async/await don't work + - all networking doesn't work + - you can't create or join another web worker + - browser dev tools UI freeze + - It eats your battery + - Browser will kill your tab at random point (Aw, snap). + - It's not deterministic and you can't really test your app to prove it harmless. +- Firefox (still) has synchronous `XMLHttpRequest` which could be captured by async code in service worker + - it's [deprecated legacy API](https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/Synchronous_and_Asynchronous_Requests#synchronous_request) + - [but other browsers don't](https://wpt.fyi/results/service-workers/service-worker/fetch-request-xhr-sync.https.html?label=experimental&label=master&aligned) and it's unlikely they will implement it + - there are deployment and security challenges with it +- all the other threads/workers could synchronously block + - `Atomics.wait()` works as expected +- if we will have managed thread on the UI thread, any `lock` or Mono GC barrier could cause spin-wait + - in case of Mono code, we at least know it's short duration + - we should prevent it from blocking in user code **3)** JavaScript engine APIs and objects have thread affinity. - - The DOM and few other browser APIs are only available on the main UI "thread" - - and so, you need to have C# interop with UI, but you can't block there. - - HTTP & WS objects have affinity, but we would like to consume them (via Streams) from any managed thread - - Any `JSObject`, `JSException` and `Promise`->`Task` have thread affinity - - they need to be disposed on correct thread. GC is running on random thread +- The DOM and few other browser APIs are only available on the main UI "thread" + - and so, you need to have C# interop with UI, but you can't block there. +- HTTP & WS objects have affinity, but we would like to consume them (via Streams) from any managed thread +- Any `JSObject`, `JSException` and `Promise`->`Task` have thread affinity + - they need to be disposed on correct thread. GC is running on random thread **4)** State management of JS context `self` of the worker. - - emscripten pre-allocates pool of web worker to be used as pthreads. 
- - Because they could only be created asynchronously, but `pthread_create` is synchronous call - - Because they are slow to start - - those pthreads have stateful JS context `self`, which is re-used when mapped to C# thread pool - - when we allow JS interop on a managed thread, we need a way how to clean up the JS state +- emscripten pre-allocates pool of web worker to be used as pthreads. + - Because they could only be created asynchronously, but `pthread_create` is synchronous call + - Because they are slow to start +- those pthreads have stateful JS context `self`, which is re-used when mapped to C# thread pool +- when we allow JS interop on a managed thread, we need a way how to clean up the JS state **5)** Blazor's `renderBatch` is using direct memory access @@ -272,7 +272,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - variation on **(13)** or **(14)** where we get rid of per-parameter calls to Mono - benefit: get closer to purity of sidecar design without loosing perf - this could be done later as purity optimization -- in this design the mono could be started on deputy thread +- in this design the mono could be started on deputy thread - UI would not be mono attached thread. - See [details](#Get_rid_of_Mono_GC_boundary_breach) From eb386094c785f7dacdc4425d3f2bf86e49073beb Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 18 Oct 2023 17:24:02 +0200 Subject: [PATCH 033/108] wip --- accepted/2023/wasm-browser-threads.md | 77 +++++++++++++++++---------- 1 file changed, 49 insertions(+), 28 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 6d809397a..e06b4d091 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -313,7 +313,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - calling functions which return `Task` could be aggressively async - unless the synchronous part of the implementation could throw exception - which maybe our HTTP/WS could do ? - - could this difference we ignored ? + - could this difference be ignored ? - `JSExport`/`Function` - we already are on correct thread in JS, unless this is UI thread - would anything improve if we tried to be more async ? @@ -417,28 +417,54 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - because UI is not managed thread there - `emscripten_dispatch_to_thread_async` - in deputy design - can dispatch async call to C function on the timer loop of target pthread - - doesn't block and doesn't propagate exceptions - - needs to deal with `stackalloc` in C# generated stub - - probably by re-ordering Roslyn generated code - - when the method is async - - extract GCHandle of the `TaskCompletionSource` - - copy "stack frame" and pass it to - - asynchronously schedule to the target pthread via `emscripten_dispatch_to_thread_async` - - unpack the "stack frame" - - using local Mono `cwraps` for marshaling - - capture JS result/exception - - use stored `TaskCompletionSource` to resolve the `Task` on target thread - - when the method is sync - - inside `JSFunctionBinding.InvokeJS`: - - create internal `TaskCompletionSource` - - use async dispatch above - - block-wait on `Task.Wait` until it's done. 
- - return sync result - - or when the method is sync - - do something similar in C or JS + - doesn't block and doesn't propagate results and exceptions - this would not work in sidecar design - - because UI is not managed thread there - - Mono cwraps are not available either + - because UI is not pthread there + - from JS (UI) to C# managed main + - only necessary for deputy/sidecar, not for HTTP + - async + - `malloc` stack frame and do JS side of marshaling + - re-order `marshal_task_to_js` before `invoke_method_and_handle_exception` + - pre-create `JSHandle` of a `promise_controller` + - pass `JSHandle` instead of receiving it + - send the message via `emscripten_dispatch_to_thread_async` + - return the promise immediately + - await until `mono_wasm_resolve_or_reject_promise` is sent back + - this need to be also dispatched + - how could we make that dispatch same for HTTP cross-thread by `JSObject` affinity ? + - any errors in messaging will `abort()` + - sync + - dispatch C function + - which will lift Atomic semaphore at the end + - spin-wait for semaphore + - stack-frame could stay on stack + - synchronously returning `null` `Task?` + - pass `slot.ElementType = MarshalerType.Discard;` ? + - `abort()` ? + - `resolve(null)` ? + - `reject(null)` ? + - from C# to JS (UI) + - async + - needs to deal with `stackalloc` in C# generated stub, by copying the buffer + - sync + - inside `JSFunctionBinding.InvokeJS`: + - create internal `TaskCompletionSource` + - use async dispatch above + - block-wait on `Task.Wait` until it's done. + - !! this would not keep receiving JS loop events !! + - return sync result + - implementation calls + - `BindJSFunction`, `mono_wasm_bind_js_function` - many out params, need to be sync call to UI + - `BindCSFunction`, `mono_wasm_bind_cs_function` - many out params, need to be sync call to UI + - `ReleaseCSOwnedObject`, `mono_wasm_release_cs_owned_object` - async message to UI + - `ResolveOrRejectPromise`, `mono_wasm_resolve_or_reject_promise` - async message to UI + - `InvokeJSFunction`, `mono_wasm_invoke_bound_function` - depending on signature, via FuncJS.ResMarshaler + - `InvokeImport`, `mono_wasm_invoke_import` - depending on signature, could be sync or async message to UI + - `InstallWebWorkerInterop`, `mono_wasm_install_js_worker_interop` - could become async + - `UninstallWebWorkerInterop`, `mono_wasm_uninstall_js_worker_interop` - could become async + - `RegisterGCRoot`, `mono_wasm_register_root` - could stay on deputy + - `DeregisterGCRoot`, `mono_wasm_deregister_root` - could stay on deputy + - hybrid globalization, could probably stay on deputy - `emscripten_sync_run_in_main_runtime_thread` - in deputy design - can run sync method in UI thread - "comlink" - in sidecar design @@ -459,12 +485,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w ## Get rid of Mono GC boundary breach - related to design **(16)** - `Task`/`Promise` - - `create_task_callback`, `mono_wasm_marshal_promise` - - `JavaScriptImports.MarshalPromise` - - this will need to create something like `GCHandle`/`JSHandle` on the opposite direction and send it instead of creating it with extra call - - which means that we need richer interop stack frame slot, because we need to pack more information - - this is doable by making `MarshalerType` `byte`-based instead of `Int32`-based. This will be also good for better nested generic types if we proceed with it. 
- - this problem with "who owns the proxy", I'm still confused about it after implementing 80% prototype. + - improved in https://github.com/dotnet/runtime/pull/93010 - `MonoString` - `monoStringToString`, `stringToMonoStringRoot` - `mono_wasm_string_get_data_ref` From e84930266d077eea7671e93b73d8b0aec5b39bc7 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 18 Oct 2023 17:30:04 +0200 Subject: [PATCH 034/108] fix --- accepted/2023/wasm-browser-threads.md | 1 - 1 file changed, 1 deletion(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index e06b4d091..bc03076fa 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -587,7 +587,6 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - This will also start in Blazor project, but UI rendering would not work. - we have pre-allocated pool of browser Web Workers which are mapped to pthread dynamically. - we can configure pthread to keep running after synchronous thread_main finished. That's necessary to run any async tasks involving JavaScript interop. - - GC is running on UI thread/worker. - legacy interop has problems with GC boundaries. - JSImport & JSExport work - There is private JSSynchronizationContext implementation which is too synchronous From 346cd7db900a917dde7ca67fbcd6785a73935709 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 18 Oct 2023 17:34:42 +0200 Subject: [PATCH 035/108] perf --- accepted/2023/wasm-browser-threads.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index bc03076fa..8f002bf55 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -512,8 +512,12 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - what's overall perf impact for Blazor's `renderBatch` ? 
## Performance +- as compared to ST build for dotnet wasm - the dispatch between threads (caused by JS object thread affinity) will have negative performance impact on the JS interop - in case of HTTP/WS clients used via Streams, it could be surprizing +- browser performance is lower when working with SharedArrayBuffer +- Mono performance is lower because there are GC safe-points and locks in the VM code +- startup is slower because creation of WebWorker instances is slow ## Spin-waiting in JS - if we want to keep synchronous JS APIs to work on UI thread, we have to spin-wait From 210b2c691db2cc93352df0908d7051403fbd0e33 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 26 Oct 2023 19:03:00 +0200 Subject: [PATCH 036/108] suspended threads and proxy disposal --- accepted/2023/wasm-browser-threads.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 8f002bf55..3331955a6 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -296,7 +296,9 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - all of them need to be used and disposed on correct thread - how to dispatch to correct thread is one of the questions here - all of them are registered to 2 GCs - - maybe `Dispose` could be schedule asynchronously instead of blocking Mono GC + - `Dispose` need to be schedule asynchronously instead of blocking Mono GC + - because of the proxy thread affinity, but the target thread is suspended during GC, so we could not dispatch to it, at that time. + - the JS handles need to be freed only after both sides unregistered it (at the same time). - `JSObject` - have thread ID on them, so we know which thread owns them - `JSException` From 000bad72b1fb7a6e29395515ae39d25a8cea5946 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 26 Oct 2023 19:27:59 +0200 Subject: [PATCH 037/108] VFS notes --- accepted/2023/wasm-browser-threads.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 3331955a6..d116ca0ea 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -520,6 +520,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - browser performance is lower when working with SharedArrayBuffer - Mono performance is lower because there are GC safe-points and locks in the VM code - startup is slower because creation of WebWorker instances is slow +- VFS access is slow because it's dispatched to UI thread ## Spin-waiting in JS - if we want to keep synchronous JS APIs to work on UI thread, we have to spin-wait @@ -566,6 +567,10 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - `JSImport` used for logging: `globalThis.console.debug`, `globalThis.console.error`, `globalThis.console.info`, `globalThis.console.warn`, `Blazor._internal.dotNetCriticalError` - probably could be any JS context +## Virtual filesystem +- we use emscripten's VFS, which is JavaScript implementation in the UI thread. +- the POSIX operations are synchronously dispatched to UI thread. 
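A rough sketch of the proxy-disposal rule mentioned a few commits above: release of a thread-affine JS handle is posted to the owning thread instead of blocking the GC or a finalizer thread. `JSObjectProxy` and `ReleaseJSHandle` are hypothetical stand-ins, not the real `JSObject` internals:

```csharp
using System;
using System.Threading;

sealed class JSObjectProxy : IDisposable
{
    private readonly SynchronizationContext _ownerContext; // captured on the owning JSWebWorker thread
    private readonly IntPtr _jsHandle;
    private int _disposed;

    public JSObjectProxy(SynchronizationContext ownerContext, IntPtr jsHandle)
    {
        _ownerContext = ownerContext;
        _jsHandle = jsHandle;
    }

    public void Dispose()
    {
        if (Interlocked.Exchange(ref _disposed, 1) != 0)
            return;
        // Never block here: the owning thread may be suspended by the GC right now,
        // so the release is queued and runs once that thread resumes its event loop.
        _ownerContext.Post(static state => ReleaseJSHandle((IntPtr)state!), _jsHandle);
    }

    private static void ReleaseJSHandle(IntPtr handle)
    {
        // placeholder for the call that frees the handle on the JS side
    }
}
```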
+ ## WebPack, Rollup friendly - it's not clear how to make this single-file - because web workers need to start separate script(s) via `new Worker('./dotnet.js', {type: 'module'})` From e8c3ae542d08fc5637e0b217496ee54533ffa608 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 26 Oct 2023 19:30:12 +0200 Subject: [PATCH 038/108] stdout --- accepted/2023/wasm-browser-threads.md | 1 + 1 file changed, 1 insertion(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index d116ca0ea..ef5185dc6 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -521,6 +521,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - Mono performance is lower because there are GC safe-points and locks in the VM code - startup is slower because creation of WebWorker instances is slow - VFS access is slow because it's dispatched to UI thread +- console output is slow because it's POSIX stream is dispatched to UI thread, call per `put_char` ## Spin-waiting in JS - if we want to keep synchronous JS APIs to work on UI thread, we have to spin-wait From b70a81ac399e0737be73ff3acfb8a0591ff395e9 Mon Sep 17 00:00:00 2001 From: Jeremy Koritzinsky Date: Thu, 26 Oct 2023 10:50:41 -0700 Subject: [PATCH 039/108] Add initial Swift Interop design doc (#302) Co-authored-by: Aaron Robinson Co-authored-by: Rolf Bjarne Kvinge --- INDEX.md | 1 + proposed/swift-interop.md | 189 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 190 insertions(+) create mode 100644 proposed/swift-interop.md diff --git a/INDEX.md b/INDEX.md index 62799f597..a197805ae 100644 --- a/INDEX.md +++ b/INDEX.md @@ -98,5 +98,6 @@ Use update-index to regenerate it: | | [Rate limits](proposed/rate-limit.md) | [John Luo](https://github.com/juntaoluo), [Sourabh Shirhatti](https://github.com/shirhatti) | | | [Readonly references in C# and IL verification.](proposed/verifiable-ref-readonly.md) | | | | [Ref returns in C# and IL verification.](proposed/verifiable-ref-returns.md) | | +| | [Swift Interop](proposed/swift-interop.md) | [Andy Gocke](https://github.com/agocke), [Jeremy Koritzinsky](https://github.com/jkoritzinsky) | | | [Target AVX2 in R2R images](proposed/vector-instruction-set-default.md) | [Richard Lander](https://github.com/richlander) | diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md new file mode 100644 index 000000000..9bc811275 --- /dev/null +++ b/proposed/swift-interop.md @@ -0,0 +1,189 @@ +# Swift Interop + +**Owner** [Andy Gocke](https://github.com/agocke) | [Jeremy Koritzinsky](https://github.com/jkoritzinsky) + +The [Swift](https://developer.apple.com/swift/) programming language is developed primarily by Apple for use in their product lines. It is a successor to the previous official language for these platforms, Objective-C. C# has had limited Objective-C interop support for quite a while and it powers frameworks like MAUI and Xamarin Forms that run on Apple devices. + +Swift is now becoming the dominant language on Apple devices and eclipsing Objective-C. Many important libraries are now Swift-first or Swift-only. Objective-C binding is also becoming more difficult for these libraries. To continue to interoperate with external libraries and frameworks, C# needs to be able to interoperate with Swift. + +We will aim to provide a solution where C# needs to be able to interoperate directly with Swift, without going through an intermediate language. 
+ +## Scenarios and User Experience + +We would expect that users would be able to write C# code that can do simple Swift interop. Additionally, we would expect that for cases where the interop system does not support seamless interop, developers could write a shim in Swift that could be called from C# code. Developers should not need to write a shim in C or Assembly to interact with Swift APIs. + +## Requirements + +### Goals + +In short, we should completely eliminate the required C/assembly sandwich that's currently required to call Swift from C# code, and potentially vice versa. In particular, neither C# nor Swift users should have to deal with lower-level system state, like registers or stack state, to call Swift from C#. This support must be added into Mono, NativeAOT, and CoreCLR. Ready-to-Run support would be preferable, but is not required for the first release. Support on all Apple platforms (macOS, iOS, tvOS, macCatalyst) is the first priority. + +### Non-Goals + +C# and Swift are different languages with different language semantics. It is not a goal to map every construct from one language to the other. However, there are some terms in both languages that are sufficiently similar that they can be mapped to an identical semantic term in the other language. Interop should be seen as a Venn diagram where each language forms its own circle, and interop is in the (much smaller) space of equivalent terms that are shared between them. Support on non-Apple platforms is not required for the first release of this support. + +## Stakeholders and Reviewers + +- [@dotnet/interop-contrib](https://github.com/orgs/dotnet/teams/interop-contrib) +- [@dotnet/macios](https://github.com/orgs/dotnet/teams/macios) +- [@lambdageek](https://github.com/lambdageek) +- [@SamMonoRT](https://github.com/SamMonoRT) +- [@kotlarmilos](https://github.com/kotlarmilos) +- [@JulieLeeMSFT](https://github.com/JulieLeeMSFT) +- [@amanasifkhalid](https://github.com/amanasifkhalid) +- [@davidortinau](https://github.com/davidortinau) +- [@jkotas](https://github.com/jkotas) +- [@stephen-hawley](https://github.com/stephen-hawley) + +## Design + +We plan to split the work into at least three separate components. There will be work at the runtime layer to handle the basic calling-convention and register-allocation work to enable calling basic Swift functions without needing to write custom C or Assembly code or wrapping every Swift function with a C-style shape. Upstack, there will be suite of code-generation tools to provide a higher-level projection of Swift concepts into .NET. Finally, the product will include some mechanism to easily reference Swift libraries from .NET on platforms that natively provide a Swift runtime and libraries, Apple platforms in particular. + +### Runtime + +#### Calling Conventions + +Swift has two calling conventions at its core, the "Swift" calling convention, and the "SwiftAsync" calling convention. We will begin by focusing on the "Swift" calling convention, as it is the most common. The Swift calling convention has a few elements that we need to contend with. Primarily, it allows passing arguments in more registers. Additionally, it has two dedicated registers for specific components of the calling convention, the self register and the error register. The runtime support for this calling convention must support all of these features. 
The additional registers per argument are relatively easy to support, as each of our compilers must already have some support for this concept to run on Linux or Apple platforms today. We have a few options for the remaining two features we need to support.
+
+The "SwiftAsync" calling convention has an additional "async context" register. When a "SwiftAsync" function is called by a "SwiftAsync" function, it must be tail-called. A function that uses this calling convention also pops the argument area. In LLVM-IR, this calling convention is referred to as `swifttailcc`. In Clang, this convention is specified as `swiftasynccall`. Additionally, the "SwiftAsync" calling convention does not have the error register and must not have a return value. See the [LLVM-IR language reference](https://github.com/llvm/llvm-project/blob/54fe7ef70069a48c252a7e1b0c6ed8efda0bc440/llvm/docs/LangRef.rst#L452) and the [Clang attribute reference](https://clang.llvm.org/docs/AttributeReference.html#swiftasynccall) for an explanation of the calling convention.
+
+In addition to these two calling conventions, Swift allows developers to export their functions with different calling conventions. The `@convention(c)` option exports a function using the standard `C` calling convention. In .NET, developers can call functions with this calling convention using traditional interop solutions. The `@convention(block)` option exports a function using the Objective-C Block calling convention. The Swift interop experience will not support calling methods exposed with this option as the primary use case is for interop with Objective-C. The `@convention(swift)` option is the same as not specifying the option.
+
+Swift also provides a strong Objective-C interop story through the `@objc` attribute. Types with the `@objc` attribute are exposed through the Swift ABI as well as through Objective-C selectors. We will not call into Swift APIs using the Objective-C interop in this interop story. However, the upstack projection tooling may need to do additional work for types that may need to have Objective-C interop support exposed (in the case that a .NET user type derives from a Swift type that had the `@objc` attribute).
+
+##### Self register
+
+We have selected the following option for supporting the Self register in the calling convention:
+
+1. Use a specially-typed argument with a type, `SwiftSelf`, to represent which parameter should go into the self register.
+
+We will provide a `SwiftSelf` type to specify "this argument goes in the self register". Specifying the type twice in a signature would generate an `InvalidProgramException`. This would allow the `self` argument to be specified anywhere in the argument list. Alternatively, many sections of the Swift ABI documentation, as well as the Clang docs, refer to this parameter as the "context" argument instead of the "self" argument, so an alternative name could be `SwiftContext`.
+
+For reference, explicitly declaring a function with the Swift or SwiftAsync calling conventions in Clang requires the "context" argument, the value that goes in the "self" register, as the last parameter or the penultimate parameter followed by the error parameter.
+
+The self register, like the error register and the async context register discussed later, is always a pointer-sized value. As a result, if we introduce any special intrinsic types for the calling convention, we don't need to make the type generic as we can always use a `void*` to represent the value at the lowest level.
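As an illustration only, a P/Invoke that uses the proposed `SwiftSelf` type together with the proposed `CallConvSwift` modifier could look like the sketch below; the library name and entry point are made up:

```csharp
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

internal static class CounterBindings
{
    // `self` is placed into the Swift self (context) register by the calling convention.
    [DllImport("libCounterLibrary.dylib", EntryPoint = "counter_get_count")]
    [UnmanagedCallConv(CallConvs = new[] { typeof(CallConvSwift) })]
    public static extern nint GetCount(SwiftSelf self);
}
```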
+ +Additionally, we have selected this design as this provides consistency with the error register and async context register handling, discussed below. + +We have rejected the following designs for the reasons mentioned below: + +1. Use an attribute like `[SwiftSelf]` on a parameter to specify it should go in the self register. +2. Specify that combining the `Swift` calling convention with the `MemberFunction` calling convention means that the first argument is the `self` argument. + +Rejected Option 1 seems like a natural fit, but there is one significant limitation: Attributes cannot be used in function pointers. Function pointers are a vital scenario for us to support as we want to support virtual method tables as they are used in scenarios like protocol witness tables. + +Rejected Option 2 is a natural fit as the `MemberFunction` calling convention combined with the various C-based calling conventions specifies that there is a `this` argument as the first argument. Defining `Swift` + `MemberFunction` to imply/require the `self` argument is a great conceptual extension of the existing model. Although, in Swift, sometimes the `self` register is used for non-instance state. For example, in static functions, the type metadata is passed as the `self` argument. Since static functions are not member functions, we may want to not use the `MemberFunction` calling convention. As a result, we have rejected this option. + +###### Error register + +We have selected an approach for handling the error register in the Swift calling convention: + +1. Use a special type named something like `SwiftError*` to indicate the error parameter + +This approach expresses that the to-be-called Swift function uses the Error Register in the signature and they both require signature manipulation in the JIT/AOT compilers. Like with `SwiftSelf`, we would throw an `InvalidProgramException` for a signature with multiple `SwiftError` parameters. We use a pointer-to-`SwiftError` type to indicate that the error register is a by-ref/out parameter. We don't use managed pointers as our modern JITs can reason about unmanaged pointers well enough that we do not end up losing any performance taking this route. The `UnmanagedCallersOnly` implementation will require a decent amount of JIT work to emulate a local variable for the register value, but we have prior art in the Clang implementation of the Swift error register that we can fall back on. + +Additionally, we have selected this design as this provides consistency with the self register and async context register handling, discussed below. + +We've rejected the following designs for the reasons mentioned below: + +1. Use an attribute like `[SwiftError]` on a `ref` or `out` parameter to indicate the error parameter. +2. Use a special return type `SwiftReturn` to indicate the error parameter and combine it with the return value. +3. Use special helper functions `Marshal.Get/SetLastSwiftError` to get and set the last Swift error. Our various compilers will automatically emit a call to `SetLastSwiftError` from Swift functions and emit a call to `GetLastSwiftError` in the epilog of `UnmanagedCallersOnly` methods. The projection generation will emit calls to `Marshal.Get/SetLastSwiftError` to get and set these values to and from exceptions. +4. Implicitly transform the error register into an exception at the interop boundary. +5. Use `SwiftError` on a `ref` or `out` parameter to indicate the error parameter. 
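For concreteness, a hedged sketch of the selected approach (the pointer-to-`SwiftError` parameter described above, not one of the rejected designs just listed); the library, the entry point, and the raw-pointer `Value` accessor on the proposed type are assumptions:

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

internal static unsafe class ThrowingBindings
{
    // The Swift error register is surfaced as a pointer parameter.
    [DllImport("libMySwiftLibrary.dylib", EntryPoint = "my_swift_parse")]
    [UnmanagedCallConv(CallConvs = new[] { typeof(CallConvSwift) })]
    public static extern double Parse(byte* utf8, nint length, SwiftError* error);

    public static double ParseOrThrow(byte* utf8, nint length)
    {
        SwiftError error = default;
        double result = Parse(utf8, length, &error);
        if (error.Value != null) // assumption: the proposed type exposes the raw error pointer
            throw new InvalidOperationException("The Swift callee set the error register.");
        return result;
    }
}
```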
+ +We have a prototype that uses Rejected Option 1; however, using an attribute has the same limitations as the attribute option for the self register. + +Rejected Option 2 provides a more functional-programming-style API that is more intuitive than the `SwiftError` options above, but likely more expensive to implement. As a result, Option 2 would be better as a higher-level option and not the option implemented by the JIT/AOT compilers. + +Rejected Option 3 provides an alternative approach. As the "Error Register" is always register-sized, we can use runtime-supported helper functions to stash away and retrieve the error register. Unlike options 2 or 3, we don't need to do any signature manipulation as the concept of "this Swift call can return an error" is not represented in the signature. Responsibility to convert the returned error value into a .NET type, such as an exception, would fall to the projection tooling. Since option 4 does not express whether or not the target function throws at the signature level, the JIT/AOT compilers would always need to emit the calls to the helpers when compiling calls to and from Swift code. If the projection of Swift types into .NET will always use exception and not pass Swift errors as the error types directly, then Option 3 reduces the design burden on the runtime teams by removing the type that would only be used at the lowest level of the Swift/.NET interop. However, Option 3 would leave some performance on the table as it would effectively require us to store and read the error value into thread-local storage when we stash and retrieve the error value instead of reading the error value from the register directly. Due to the performance hit of reading/writing thread-local storage at each transition, we have rejected this option. + +Rejected Option 4 would be most similar to the Objective-C interop experience. However, this experience would require more work in the JIT/AOT compilers and would make the translation between .NET exception and Swift error codes inflexible. Modern .NET interop solutions generally push error-exception translation mechanisms to be controlled by higher-level interop code generators instead of the runtime for flexibility. Not selecting this option would require all `CallConvSwift` `UnmanagedCallersOnly` methods to wrap their contents in a `try-catch` to translate any exceptions. This is already done for the COM source generator and had to be done by Binding Tools for Swift, so the pattern has a lot of implementation expertise. + +Rejected Option 5 would have provided a better ".NET" shape than our selected alternative; however, the additional complexity in the UnmanagedCallersOnly case is not worth the cost. + +##### Async Context Register + +In the SwiftAsync calling convention, there is also an Async Context register, similar to the Self and Error registers. Like the error register, the async context must be passed by a pointer value. As a result, similar options apply here, with the same constraints. Additionally, we don't already have an existing `CallConvAsync` calling convention modifier, so going the calling convention route like proposed for the self register is not practical. As a result, we will likely need to use a special type like `SwiftAsyncContext` to represent the async context register, similar to the proposals for the error register and the self register. + +##### Tuples + +In the Swift language, tuples are "unpacked" in the calling convention. Each element of a tuple is passed as a separate parameter. 
C# has its own language-level tuple feature in the ValueTuple family of types. The Swift/.NET interop story could choose to automatically handle tuple types in a similar way for a `CallConvSwift` function call. However, processing the tuple types in the JIT/AOT compilers is complicated and expensive. It would be much cheaper to defer this work upstack to the language projection, which can split out the tuple into individual parameters to pass to the underlying function. The runtime and libraries teams could add analyzers to detect value-tuples being passed to `CallConvSwift` functions at compile time to help avoid pits of failure. As we expect most developers to use the higher-level tooling and to not use `CallConvSwift` directly, we would likely defer any analyzer work until we have a suitable use case.
+
+##### Automatic Reference Counting and Lifetime Management
+
+Swift has a very strongly-defined lifetime and ownership model. This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. The Binding Tools for Swift tooling handles these explicit lifetime semantics with some generated Swift code. In the new Swift/.NET interop, management of these lifetime semantics will be done by the Swift projection and not by the raw calling-convention support. If any GC interaction is required to handle the lifetime semantics correctly, we should take an approach more similar to the ComWrappers support (higher-level, less complex interop interface) than the Objective-C interop support (lower-level, basically only usable by the ObjCRuntime implementation).
+
+##### Structs/Enums
+
+Like .NET, Swift has both value types and class types. The value types in Swift include both structs and enums. When passed at the ABI layer, they are generally treated as their composite structure and passed by value in registers. However, when Library Evolution mode is enabled, struct layouts are considered "opaque" and their size, alignment, and layout can vary between compile-time and runtime. As a result, all structs and enums in Library Evolution mode are passed by a pointer instead of in registers. Frozen structs and enums (annotated with the `@frozen` attribute) are not considered opaque and will be enregistered. We plan to interoperate with Swift through the Library Evolution mode, so we will generally be able to pass structures using opaque layout.
+
+When calling a function that returns an opaque struct, the Swift ABI always requires the struct to be passed by a return buffer. Since we cannot know the size at compile time, we'll need to handle this case manually in the calling convention signature as well, by manually introducing the struct return buffer parameter. Alternatively, we could enlighten the JIT/AOT compilers about how to look up the correct buffer size for an opaque struct. This would likely be quite expensive, so we should avoid this option and just manually handle the return buffer. We can determine the return buffer size by using the value witness table for the struct type. We could always move to the JIT-intrinsic model in the future if we desired.
+
+At the lowest level of the calling convention, we do not consider Library Evolution to be a different calling convention than the Swift calling convention. Library Evolution requires that some types are passed by a pointer/reference, but it does not fundamentally change the calling convention. Effectively, Library Evolution forces the least optimizable choice to be taken at every possible point.
As a result, we should not handle Library Evolution as a separate calling convention and instead we can manually handle it at the projection layer. + +### Projecting Swift into .NET + +The majority of the work for Swift/.NET interop is determining how a type that exists in Swift should exist in .NET and what shape it should have. This section is a work in progress and will discuss how each feature in Swift will be projected into .NET, particularly in cases where there is not a corresponding .NET or C# language feature. Each feature should have a subheading describing how the projection will look and how any mechanisms to make it work will be designed. + +All designs in this section should be designed such that they are trimming and AOT-compatible by construction. We should work to ensure that no feature requires whole-program analysis (such as custom steps in the IL Linker) to be trim or AOT compatible. + +#### Swift to .NET Language Feature Projections + +##### Structs/Value Types + +Unlike .NET, Swift's struct types have strong lifetime semantics more similar to C++ types than .NET structs. At the Swift ABI layer, there are broadly three types of structs/enums: "POD/Trivial" structs, "Bitwise Takable/Movable" structs, and non-bitwise movable structs. The [Swift documentation](https://github.com/apple/swift/blob/main/docs/ABIStabilityManifesto.md#layout-and-properties-of-types) covers these different kinds of structs. Let's look at how we could map each of these categories of structs into .NET. + +"POD/Trivial" structs have no memory management required and no special logic for copying/moving/deleting the struct instance. Structs of this category can be represented as C# structs with the same field layout. + +"Bitwise Takable/Movable" structs have some memory management logic and require calls to Swift's ref-counting machinery to maintain expected lifetimes. Structs of this category can be projected into C# as a struct. When creating this C# struct, we would semantically treat each field as a separate local, create the C# projection of it, and save this "local" value into a field in the C# struct. + +Structs that are non-bitwise-movable are more difficult. They cannot be moved by copying their bits; their copy constructors must be used in all copy scenarios. When mapping these structs to C#, we must take care that we do not copy the underlying memory and to call the deallocate function when the C# usage of the struct falls out of scope. These use cases best match up to C# class semantics, not struct semantics. + +We plan to interop with Swift's Library Evolution mode, which brings an additional wrinkle into the Swift struct story. Swift's Library Evolution mode abstracts away all type layout and semantic information unless a type is explicitly marked as `@frozen`. In the Library Evolution case, all structs have "opaque" layout, meaning that their exact layout and category cannot be determined until runtime. As a result, we need to treat all "opaque" layout structs as possibly non-bitwise-movable at compile time as we will not know until runtime what the exact layout is. Swift/C++ interop is not required to use the Library Evolution mode in all cases as it can statically link against Swift libraries, so it is not limited by opaque struct layouts in every case. 
The size and layout information of a struct is available in its [Value Witness Table](https://github.com/apple/swift/blob/main/docs/ABIStabilityManifesto.md#value-witness-table), so we can look up this information at runtime for allocating struct instances and manipulating struct memory correctly. + +##### Tuples + +If possible, Swift tuples should be represented as `ValueTuple`s in .NET. If this is not possible, then they should be represented as types with a `Deconstruct` method similar to `ValueTuple` to allow a tuple-like experience in C#. + +#### Projection Tooling Components + +The projection tooling should be split into these components: + +##### Importing Swift into .NET + +1. A tool that takes in a `.swiftinterface` file or Swift sources and produces C# code. +2. A library that provides the basic support for Swift interop that the generated code builds on. +3. User tooling to easily generate Swift projections for a given set of `.framework`s. + - This tooling would build a higher-level interface on top of the tool in item 1 that is more user-friendly and project-system-integrated. +4. (optional) A NuGet package, possibly referencable by `FrameworkReference` or automatically included when targeting macOS, Mac Catalyst, iOS, or tvOS platforms that exposes the platform APIs for each `.framework` that is exposed from Swift to .NET. + - This would be required to provide a single source of truth for Swift types so they can be exposed across an assembly boundary. + +##### Exporting .NET to Swift + +There are two components to exporting .NET to Swift: Implementing existing Swift types in .NET and passing instances of those types to Swift, and exposing novel types from .NET code to Swift code to be created from Swift. Exposing novel types from .NET code to Swift code is considered out of scope at this time. + +For implementing existing Swift types in .NET, we will require one of the following tooling options: + +1. A Roslyn source generator to generate any supporting code needed to produce any required metadata, such as type metadata and witness tables, to pass instances of Swift-type-implementing .NET types defined in the current project to Swift. +2. An IL-post-processing tool to generate the supporting code and metadata from the compiled assembly. + +If we were to use an IL-post-processing tool here, we would break Hot Reload in assemblies that implement Swift types, even for .NET-only code, due to introducing new tokens that the Hot Reload "client" (aka Roslyn) does not know about. As a result, we should prefer the Roslyn source generator approach. + +### Distribution + +The calling convention work will be implemented by the .NET runtimes in dotnet/runtime. + +The projection tooling will not ship as part of the runtime. It should be available as a separate NuGet package, possibly as a .NET CLI tool package. The projections should either be included automatically as part of the TPMs for macOS, iOS, and tvOS, or should be easily referenceable. + +## Q & A + +- How does this interop interact with the existing Objective-C interop experience? + - This interop story will exist separately from the Objective-C interop story. We will not provide additional support for passing representations of Swift types to Objective-C projections or vice-versa. We may re-evaluate this based on user pain points and cost. +- Library Evolution mode seems to add a lot of complexity. Do we need to interact with it for our v1 solution? 
+ - We need to use LibraryEvolution mode for our Swift interop as that is the only ABI-stable story for Swift. Otherwise we'd need to re-compile the Swift code per-OS-version for each possible target OS, instead of building the managed code once for all target iOS or macOS versions (which is how .NET generally works today). Also, we wouldn't be able to use the documented `.swiftinterface` files. We'd need to use the compiler-specific `.swiftmodule` files or parse Swift code directly, both of which are much more expensive. Additionally, many core Swift libraries are only exposed with Library Evolution enabled. + +## Related GitHub Issues + +- Top level issue in dotnet/runtime: https://github.com/dotnet/runtime/issues/93631 +- API proposal for CallConvSwift calling convention: https://github.com/dotnet/runtime/issues/64215 From 58791ad22ef9eecfc38d4642773a146790cc17c6 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 6 Nov 2023 19:59:42 +0100 Subject: [PATCH 040/108] wip --- accepted/2023/wasm-browser-threads.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index ef5185dc6..ed6d93af3 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -592,6 +592,9 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - we could synchronously wait for another thread to do async operations - to fetch another DLL which was not pre-downloaded +## New pthreads +- with deputy design we could set `PTHREAD_POOL_SIZE_STRICT=0` and enable threads to be created dynamically + # Current state 2023 Sep - we already ship MT version of the runtime in the wasm-tools workload. - It's enabled by `true` and it requires COOP HTTP headers. From 4f289d889dd8db32dd555113e83320476cbf17b2 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 8 Nov 2023 14:12:03 +0100 Subject: [PATCH 041/108] wip --- accepted/2023/wasm-browser-threads.md | 1 + 1 file changed, 1 insertion(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index ed6d93af3..8da6bf94c 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -343,6 +343,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - or we could throw PNSE, but it may be difficult for user code to - know what thread created the client - have means how to dispatch the call there + - other unknowing users are `XmlUrlResolver`, `XmlDownloadManager`, `X509ResourceClient`, ... - because we could have blocking wait now, we could also implement synchronous APIs of HTTP/WS - so that existing user code bases would just work without change - at the moment they throw PNSE From 39a37b54798a0f49ccae0af314ecafb71d722b3e Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 9 Nov 2023 11:31:45 +0100 Subject: [PATCH 042/108] responsive --- accepted/2023/wasm-browser-threads.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 8da6bf94c..0631076a1 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -21,6 +21,7 @@ - sync JS to async JS to sync C# - allow calls to synchronous JSExport from UI thread (callback) - don't prevent future marshaling of JS [transferable objects](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Transferable_objects), like streams and canvas. 
+- offload CPU intensive part of WASM startup to WebWorker, os that the pre-rendered (blazor) UI could stay responsive during Mono VM startup. † Note: all the text below discusses MT build only, unless explicit about ST build. @@ -73,6 +74,9 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w **5)** Blazor's `renderBatch` is using direct memory access +**6)** Dynamic creation of new WebWorker requires async operations on emscripten main thread. +- we could pre-allocate fixed size pthread pool. But one size doesn't fit all and it's expensive to create too large pool. + ## Define terms - UI thread - this is the main browser "thread", the one with DOM on it @@ -273,6 +277,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - benefit: get closer to purity of sidecar design without loosing perf - this could be done later as purity optimization - in this design the mono could be started on deputy thread + - this will keep UI responsive during startup - UI would not be mono attached thread. - See [details](#Get_rid_of_Mono_GC_boundary_breach) From 62e8b9e85ee607d7902ab0a939524b6d1303359f Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 9 Nov 2023 13:09:08 +0100 Subject: [PATCH 043/108] goals and non-goals --- accepted/2023/wasm-browser-threads.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 0631076a1..01e4b1eef 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -1,14 +1,17 @@ # Multi-threading on a browser ## Goals -- CPU intensive workloads on dotnet thread pool +- CPU intensive workloads on dotnet thread pool. +- Allow user to start new managed threads using `new Thread` and join it. +- Add new C# API for creating web workers with JS interop. Allow JS async/promises via external event loop. - enable blocking `Task.Wait` and `lock()` like APIs from C# user code on all threads - Current public API throws PNSE for it - This is core part on MT value proposition. - If people want to use existing MT code-bases, most of the time, the code is full of locks. - People want to use existing desktop/server multi-threaded code as is. -- allow HTTP and WS C# APIs to be used from any thread despite underlying JS object affinity -- JSImport/JSExport interop in maximum possible extent +- allow HTTP and WS C# APIs to be used from any thread despite underlying JS object affinity. +- Blazor `BeginInvokeDotNet`/`EndInvokeDotNetAfterTask` APIs work correctly in multithreaded apps. +- JSImport/JSExport interop in maximum possible extent. - don't change/break single threaded build. † ## Lower priority goals @@ -23,6 +26,9 @@ - don't prevent future marshaling of JS [transferable objects](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Transferable_objects), like streams and canvas. - offload CPU intensive part of WASM startup to WebWorker, os that the pre-rendered (blazor) UI could stay responsive during Mono VM startup. +## Non-goals +- interact with JS on managed threads other than UI thread or dedicated `JSWebWorker` + † Note: all the text below discusses MT build only, unless explicit about ST build. 
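As an illustration of the goals above, this is the kind of ordinary multi-threaded user code that should work unchanged in the MT build, assuming it runs on a managed worker thread rather than on the browser UI thread:

```csharp
using System.Threading;
using System.Threading.Tasks;

static class Sample
{
    private static readonly object Gate = new();

    public static void DoWork()
    {
        var worker = new Thread(() =>
        {
            lock (Gate)
            {
                // CPU intensive work shared with existing desktop/server code
            }
        });
        worker.Start();
        worker.Join();                    // blocking join is allowed off the UI thread

        Task.Run(Compute).Wait();         // blocking Task.Wait is allowed off the UI thread
    }

    private static int Compute() => 42;
}
```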
## Key idea in this proposal From 3e165627155f38c9cf56fbd3417b1523fecd8d91 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 9 Nov 2023 13:54:40 +0100 Subject: [PATCH 044/108] clarify --- accepted/2023/wasm-browser-threads.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 01e4b1eef..4fa5483ab 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -24,10 +24,10 @@ - sync JS to async JS to sync C# - allow calls to synchronous JSExport from UI thread (callback) - don't prevent future marshaling of JS [transferable objects](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Transferable_objects), like streams and canvas. -- offload CPU intensive part of WASM startup to WebWorker, os that the pre-rendered (blazor) UI could stay responsive during Mono VM startup. +- offload CPU intensive part of WASM startup to WebWorker, so that the pre-rendered (blazor) UI could stay responsive during Mono VM startup. ## Non-goals -- interact with JS on managed threads other than UI thread or dedicated `JSWebWorker` +- interact with JS state on `WebWorker` of managed threads other than UI thread or dedicated `JSWebWorker` † Note: all the text below discusses MT build only, unless explicit about ST build. From 346c9056c43c45f5d2a22100c86afe05608409bd Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 9 Nov 2023 14:04:36 +0100 Subject: [PATCH 045/108] clarify --- accepted/2023/wasm-browser-threads.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 4fa5483ab..364ebeb16 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -19,7 +19,7 @@ - sync C# to async JS - dynamic creation of new pthread - implement crypto via `subtle` browser API - - allow lazy `[DLLImport]` to download from the server + - allow MonoVM to lazily download DLLs from the server, instead of during startup. - implement synchronous APIs of the HTTP and WS clients. At the moment they throw PNSE. 
- sync JS to async JS to sync C# - allow calls to synchronous JSExport from UI thread (callback) @@ -174,7 +174,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - where to run the C# main entrypoint - **p)** could be on the UI thread - **q)** could be on the "deputy" or "sidecar" thread -- where to implement sync-to-async: crypto/DLLImport/HTTP APIs/ +- where to implement sync-to-async: crypto/DLL download/HTTP APIs/ - **r)** out of scope - **s)** in the UI thread - **t)** in a dedicated web worker @@ -599,7 +599,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - we could synchronously wait for another thread to do async operations - and use [async API of subtle crypto](https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto) -## Lazy DLLImport - download +## Lazy DLL download - once we have have all managed threads outside of the UI thread - we could synchronously wait for another thread to do async operations - to fetch another DLL which was not pre-downloaded From 293e31be9dcc61eb424d63051e70dbcf053d5fb9 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 9 Nov 2023 14:52:21 +0100 Subject: [PATCH 046/108] reorganize the doc --- accepted/2023/wasm-browser-threads.md | 395 ++++++++++++++------------ 1 file changed, 207 insertions(+), 188 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 364ebeb16..1120acd2a 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -83,6 +83,38 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w **6)** Dynamic creation of new WebWorker requires async operations on emscripten main thread. - we could pre-allocate fixed size pthread pool. But one size doesn't fit all and it's expensive to create too large pool. +# Summary + +## (14) Deputy + emscripten dispatch to UI + JSWebWorker + without sync JSExport + +This is Pavel's preferred design based on experiments and tests so far. +For other possible design options [see below](#Interesting_combinations). + +- Emscripten startup on UI thread + - C functions of emscripten +- MonoVM startup on UI thread + - non-GC C functions of mono are still available + - there is risk that UI will be suspended by pending GC + - it keeps `renderBatch` working as is + - it could be later optimized for purity to **(16)**. Pavel would like this. + - but it's difficult to get rid of many mono C functions we currently use +- managed `Main()` would be dispatched onto dedicated web worker called "deputy thread" + - because the UI thread would be mostly idling, it could + - render UI, keep debugger working + - dynamically create pthreads +- sync JSExports would not be supported on UI thread + - later sync calls could opt-in and we implement **(13)** via spin-wait +- JS interop only on dedicated `JSWebWorker` + +## Sidecar options + +There are few downsides to them +- if we keep main managed thread and emscripten thread the same, pthreads can't be created dynamically + - we could upgrade it to design **(15)** and have extra thread for running managed `Main()` +- we will have to implement extra layer of dispatch from UI to sidecar + - this could be pure JS via `postMessage`, which is slow and can't do spin-wait. + - we could have `SharedArrayBuffer` for the messages, but we would have to implement (another?) marshaling. 
+ ## Define terms - UI thread - this is the main browser "thread", the one with DOM on it @@ -111,182 +143,6 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - we already have prototype of the similar functionality - which can spin-wait -## Implementation options (only some combinations are possible) -- how to deal with blocking C# code on UI thread - - **A)** pretend it's not a problem (this we already have) - - **B)** move user C# code to web worker - - **C)** move all Mono to web worker -- how to deal with blocking in synchronous JS calls from UI thread (like `onClick` callback) - - **D)** pretend it's not a problem (this we already have) - - **E)** throw PNSE when synchronous JSExport is called on UI thread - - **F)** dispatch calls to synchronous JSExport to web worker and spin-wait on JS side of UI thread. -- how to implement JS interop between managed main thread and UI thread (DOM) - - **G)** put it out of scope for MT, manually implement what Blazor needs - - **H)** pure JS dispatch between threads, [comlink](https://github.com/GoogleChromeLabs/comlink) style - - **I)** C/emscripten dispatch of infrastructure to marshal individual parameters - - **J)** C/emscripten dispatch of method binding and invoke, but marshal parameters on UI thread - - **K)** pure C# dispatch between threads -- how to implement JS interop on non-main web worker - - **L)** disable it for all non-main threads - - **M)** disable it for managed thread pool threads - - **N)** allow it only for threads created as dedicated resource `WebWorker` via new API - - **O)** enables it on all workers (let user deal with JS state) -- how to dispatch calls to the right JS thread context - - **P)** via `SynchronizationContext` before `JSImport` stub, synchronously, stack frames - - **Q)** via `SynchronizationContext` inside `JSImport` C# stub - - **R)** via `emscripten_dispatch_to_thread_async` inside C code of `` -- how to implement GC/dispose of `JSObject` proxies - - **S)** per instance: synchronous dispatch the call to correct thread via `SynchronizationContext` - - **T)** per instance: async schedule the cleanup - - at the detach of the thread. We already have `forceDisposeProxies` - - could target managed thread be paused during GC ? -- where to instantiate initial user JS modules (like Blazor's) - - **U)** in the UI thread - - **V)** in the deputy/sidecar thread -- where to instantiate `JSHost.ImportAsync` modules - - **W)** in the UI thread - - **X)** in the deputy/sidecar thread - - **Y)** allow it only for dedicated `JSWebWorker` threads - - **Z)** disable it - - same for `JSHost.GlobalThis`, `JSHost.DotnetInstance` -- how to implement Blazor's `renderBatch` - - **a)** keep as is, wrap it with GC pause, use legacy JS interop on UI thread - - **b)** extract some of the legacy JS interop into Blazor codebase - - **c)** switch to Blazor server mode. Web worker create the batch of bytes and UI thread apply it to DOM -- where to create HTTP+WS JS objects - - **d)** in the UI thread - - **e)** in the managed main thread - - **f)** in first calling `JSWebWorker` managed thread -- how to dispatch calls to HTTP+WS JS objects - - **g)** try to stick to the same thread via `ConfigureAwait(false)`. - - doesn't really work. 
`Task` migrate too freely - - **h)** via C# `SynchronizationContext` - - **i)** via `emscripten_dispatch_to_thread_async` - - **j)** via `postMessage` - - **k)** same whatever we choose for `JSImport` - - note there are some synchronous calls on WS -- where to create the emscripten instance - - **l)** could be on the UI thread - - **m)** could be on the "sidecar" thread -- where to start the Mono VM - - **n)** could be on the UI thread - - **o)** could be on the "sidecar" thread -- where to run the C# main entrypoint - - **p)** could be on the UI thread - - **q)** could be on the "deputy" or "sidecar" thread -- where to implement sync-to-async: crypto/DLL download/HTTP APIs/ - - **r)** out of scope - - **s)** in the UI thread - - **t)** in a dedicated web worker - - **z)** in the sidecar or deputy -- where to marshal JSImport/JSExport parameters/return/exception - - **u)** could be only values types, proxies out of scope - - **v)** could be on UI thread (with deputy design and Mono there) - - **w)** could be on sidecar (with double proxies of parameters via comlink) - - **x)** could be on sidecar (with comlink calls per parameter) - -# Interesting combinations - -## (8) Minimal support -- **A,D,G,L,P,S,U,Y,a,f,h,l,n,p,v** -- this is what we [already have today](#Current-state-2023-Sep) -- it could deadlock or die, -- JS interop on threads requires lot of user code attention -- Keeps problems **1,2,3,4** - -## (9) Sidecar + no JS interop + narrow Blazor support -- **C,E,G,L,P,S,U,Z,c,d,h,m,o,q,u** -- minimal effort, low risk, low capabilities -- move both emscripten and Mono VM sidecar thread -- no user code JS interop on any thread -- internal solutions for Blazor needs -- Ignores problems **1,2,3,4,5** - -## (10) Sidecar + only async just JS proxies UI + JSWebWorker + Blazor WASM server -- **C,E,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** -- no C or managed code on UI thread - - this architectural clarity is major selling point for sidecar design -- no support for blocking sync JSExport calls from UI thread (callbacks) - - it will throw PNSE -- this will create double proxy for `Task`, `JSObject`, `Func<>` etc - - difficult to GC, difficult to debug -- double marshaling of parameters -- Solves **1,2** for managed code. -- Avoids **1,2** for JS callback - - emscripten main loop stays responsive only when main managed thread is idle -- Solves **3,4,5** - -## (11) Sidecar + async & sync just JS proxies UI + JSWebWorker + Blazor WASM server -- **C,F,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** -- no C or managed code on UI thread -- support for blocking sync JSExport calls from UI thread (callbacks) - - at blocking the UI is at least well isolated from runtime code - - it makes responsibility for sync call clear -- this will create double proxy for `Task`, `JSObject`, `Func<>` etc - - difficult to GC, difficult to debug -- double marshaling of parameters -- Solves **1,2** for managed code - - unless there is sync `JSImport`->`JSExport` call -- Ignores **1,2** for JS callback - - emscripten main loop stays responsive only when main managed thread is idle -- Solves **3,4,5** - -## (12) Deputy + managed dispatch to UI + JSWebWorker + with sync JSExport -- **B,F,K,N,Q,S/T,U,W,a/b/c,d+f,h,l,n,s/z,v** -- this uses `JSSynchronizationContext` to dispatch calls to UI thread - - this is "dirty" as compared to sidecar because some managed code is actually running on UI thread - - it needs to also use `SynchronizationContext` for `JSExport` and callbacks, to dispatch to deputy. 
-- blazor render could be both legacy render or Blazor server style - - because we have both memory and mono on the UI thread -- Solves **1,2** for managed code - - unless there is sync `JSImport`->`JSExport` call -- Ignores **1,2** for JS callback - - emscripten main loop could deadlock on sync JSExport -- Solves **3,4,5** - -## (13) Deputy + emscripten dispatch to UI + JSWebWorker + with sync JSExport -- **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** -- is variation of **(12)** - - with emscripten dispatch and marshaling in UI thread -- this uses `emscripten_dispatch_to_thread_async` for `call_entry_point`, `complete_task`, `cwraps.mono_wasm_invoke_method_bound`, `mono_wasm_invoke_bound_function`, `mono_wasm_invoke_import`, `call_delegate_method` to get to the UI thread. -- it uses other `cwraps` locally on UI thread, like `mono_wasm_new_root`, `stringToMonoStringRoot`, `malloc`, `free`, `create_task_callback_method` - - it means that interop related managed runtime code is running on the UI thread, but not the user code. - - it means that parameter marshalling is fast (compared to sidecar) - - this deputy design is major selling point #2 - - it still needs to enter GC barrier and so it could block UI for GC run shortly -- blazor render could be both legacy render or Blazor server style - - because we have both memory and mono on the UI thread -- Solves **1,2** for managed code - - unless there is sync `JSImport`->`JSExport` call -- Ignores **1,2** for JS callback - - emscripten main loop could deadlock on sync JSExport -- Solves **3,4,5** - -## (14) Deputy + emscripten dispatch to UI + JSWebWorker + without sync JSExport -- **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** -- is variation of **(13)** - - without support for synchronous JSExport -- Solves **1,2** for managed code - - emscripten main loop stays responsive - - unless there is sync `JSImport`->`JSExport` call -- Avoids **2** for JS callback - - by throwing PNSE -- Solves **3,4,5** - -## (15) Deputy + Sidecar + UI thread -- 2 levels of indirection. -- benefit: blocking JSExport from UI thread doesn't block emscripten loop -- downside: complex and more resource intensive - -## (16) Deputy without Mono, no GC barrier breach for interop -- variation on **(13)** or **(14)** where we get rid of per-parameter calls to Mono -- benefit: get closer to purity of sidecar design without loosing perf - - this could be done later as purity optimization -- in this design the mono could be started on deputy thread - - this will keep UI responsive during startup -- UI would not be mono attached thread. -- See [details](#Get_rid_of_Mono_GC_boundary_breach) - # Details ## JSImport and marshaled JS functions @@ -622,17 +478,180 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - Many unit tests fail on MT https://github.com/dotnet/runtime/pull/91536 - there are MT C# ref assemblies, which don't throw PNSE for MT build of the runtime for blocking APIs. -## Task breakdown -- [ ] rename `WebWorker` API to `JSWebWorker` ? -- [ ] `ToManaged(out Task)` to be called before the actual JS method -- [ ] public API for `JSHost.SynchronizationContext` which could be used by code generator. -- [ ] reimplement `JSSynchronizationContext` to be more async -- [ ] implement Blazor's `WebAssemblyDispatcher` + [feedback](https://github.com/dotnet/aspnetcore/pull/48991) -- [ ] optional: make underlying emscripten WebWorker pool allocation dynamic, or provide C# API for that. 
-- [ ] optional: implement async function/delegate marshaling in JSImport/JSExport parameters. -- [ ] optional: enable blocking HTTP/WS APIs -- [ ] optional: enable lazy DLL download by blocking the caller -- [ ] optional: implement crypto -- [ ] measure perf impact +## Implementation options (only some combinations are possible) +- how to deal with blocking C# code on UI thread + - **A)** pretend it's not a problem (this we already have) + - **B)** move user C# code to web worker + - **C)** move all Mono to web worker +- how to deal with blocking in synchronous JS calls from UI thread (like `onClick` callback) + - **D)** pretend it's not a problem (this we already have) + - **E)** throw PNSE when synchronous JSExport is called on UI thread + - **F)** dispatch calls to synchronous JSExport to web worker and spin-wait on JS side of UI thread. +- how to implement JS interop between managed main thread and UI thread (DOM) + - **G)** put it out of scope for MT, manually implement what Blazor needs + - **H)** pure JS dispatch between threads, [comlink](https://github.com/GoogleChromeLabs/comlink) style + - **I)** C/emscripten dispatch of infrastructure to marshal individual parameters + - **J)** C/emscripten dispatch of method binding and invoke, but marshal parameters on UI thread + - **K)** pure C# dispatch between threads +- how to implement JS interop on non-main web worker + - **L)** disable it for all non-main threads + - **M)** disable it for managed thread pool threads + - **N)** allow it only for threads created as dedicated resource `WebWorker` via new API + - **O)** enables it on all workers (let user deal with JS state) +- how to dispatch calls to the right JS thread context + - **P)** via `SynchronizationContext` before `JSImport` stub, synchronously, stack frames + - **Q)** via `SynchronizationContext` inside `JSImport` C# stub + - **R)** via `emscripten_dispatch_to_thread_async` inside C code of `` +- how to implement GC/dispose of `JSObject` proxies + - **S)** per instance: synchronous dispatch the call to correct thread via `SynchronizationContext` + - **T)** per instance: async schedule the cleanup + - at the detach of the thread. We already have `forceDisposeProxies` + - could target managed thread be paused during GC ? +- where to instantiate initial user JS modules (like Blazor's) + - **U)** in the UI thread + - **V)** in the deputy/sidecar thread +- where to instantiate `JSHost.ImportAsync` modules + - **W)** in the UI thread + - **X)** in the deputy/sidecar thread + - **Y)** allow it only for dedicated `JSWebWorker` threads + - **Z)** disable it + - same for `JSHost.GlobalThis`, `JSHost.DotnetInstance` +- how to implement Blazor's `renderBatch` + - **a)** keep as is, wrap it with GC pause, use legacy JS interop on UI thread + - **b)** extract some of the legacy JS interop into Blazor codebase + - **c)** switch to Blazor server mode. Web worker create the batch of bytes and UI thread apply it to DOM +- where to create HTTP+WS JS objects + - **d)** in the UI thread + - **e)** in the managed main thread + - **f)** in first calling `JSWebWorker` managed thread +- how to dispatch calls to HTTP+WS JS objects + - **g)** try to stick to the same thread via `ConfigureAwait(false)`. + - doesn't really work. 
`Task` migrate too freely + - **h)** via C# `SynchronizationContext` + - **i)** via `emscripten_dispatch_to_thread_async` + - **j)** via `postMessage` + - **k)** same whatever we choose for `JSImport` + - note there are some synchronous calls on WS +- where to create the emscripten instance + - **l)** could be on the UI thread + - **m)** could be on the "sidecar" thread +- where to start the Mono VM + - **n)** could be on the UI thread + - **o)** could be on the "sidecar" thread +- where to run the C# main entrypoint + - **p)** could be on the UI thread + - **q)** could be on the "deputy" or "sidecar" thread +- where to implement sync-to-async: crypto/DLL download/HTTP APIs/ + - **r)** out of scope + - **s)** in the UI thread + - **t)** in a dedicated web worker + - **z)** in the sidecar or deputy +- where to marshal JSImport/JSExport parameters/return/exception + - **u)** could be only values types, proxies out of scope + - **v)** could be on UI thread (with deputy design and Mono there) + - **w)** could be on sidecar (with double proxies of parameters via comlink) + - **x)** could be on sidecar (with comlink calls per parameter) + +# Interesting combinations + +## (8) Minimal support +- **A,D,G,L,P,S,U,Y,a,f,h,l,n,p,v** +- this is what we [already have today](#Current-state-2023-Sep) +- it could deadlock or die, +- JS interop on threads requires lot of user code attention +- Keeps problems **1,2,3,4** + +## (9) Sidecar + no JS interop + narrow Blazor support +- **C,E,G,L,P,S,U,Z,c,d,h,m,o,q,u** +- minimal effort, low risk, low capabilities +- move both emscripten and Mono VM sidecar thread +- no user code JS interop on any thread +- internal solutions for Blazor needs +- Ignores problems **1,2,3,4,5** + +## (10) Sidecar + only async just JS proxies UI + JSWebWorker + Blazor WASM server +- **C,E,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** +- no C or managed code on UI thread + - this architectural clarity is major selling point for sidecar design +- no support for blocking sync JSExport calls from UI thread (callbacks) + - it will throw PNSE +- this will create double proxy for `Task`, `JSObject`, `Func<>` etc + - difficult to GC, difficult to debug +- double marshaling of parameters +- Solves **1,2** for managed code. +- Avoids **1,2** for JS callback + - emscripten main loop stays responsive only when main managed thread is idle +- Solves **3,4,5** + +## (11) Sidecar + async & sync just JS proxies UI + JSWebWorker + Blazor WASM server +- **C,F,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** +- no C or managed code on UI thread +- support for blocking sync JSExport calls from UI thread (callbacks) + - at blocking the UI is at least well isolated from runtime code + - it makes responsibility for sync call clear +- this will create double proxy for `Task`, `JSObject`, `Func<>` etc + - difficult to GC, difficult to debug +- double marshaling of parameters +- Solves **1,2** for managed code + - unless there is sync `JSImport`->`JSExport` call +- Ignores **1,2** for JS callback + - emscripten main loop stays responsive only when main managed thread is idle +- Solves **3,4,5** + +## (12) Deputy + managed dispatch to UI + JSWebWorker + with sync JSExport +- **B,F,K,N,Q,S/T,U,W,a/b/c,d+f,h,l,n,s/z,v** +- this uses `JSSynchronizationContext` to dispatch calls to UI thread + - this is "dirty" as compared to sidecar because some managed code is actually running on UI thread + - it needs to also use `SynchronizationContext` for `JSExport` and callbacks, to dispatch to deputy. 
+- blazor render could be both legacy render or Blazor server style + - because we have both memory and mono on the UI thread +- Solves **1,2** for managed code + - unless there is sync `JSImport`->`JSExport` call +- Ignores **1,2** for JS callback + - emscripten main loop could deadlock on sync JSExport +- Solves **3,4,5** + +## (13) Deputy + emscripten dispatch to UI + JSWebWorker + with sync JSExport +- **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** +- is variation of **(12)** + - with emscripten dispatch and marshaling in UI thread +- this uses `emscripten_dispatch_to_thread_async` for `call_entry_point`, `complete_task`, `cwraps.mono_wasm_invoke_method_bound`, `mono_wasm_invoke_bound_function`, `mono_wasm_invoke_import`, `call_delegate_method` to get to the UI thread. +- it uses other `cwraps` locally on UI thread, like `mono_wasm_new_root`, `stringToMonoStringRoot`, `malloc`, `free`, `create_task_callback_method` + - it means that interop related managed runtime code is running on the UI thread, but not the user code. + - it means that parameter marshalling is fast (compared to sidecar) + - this deputy design is major selling point #2 + - it still needs to enter GC barrier and so it could block UI for GC run shortly +- blazor render could be both legacy render or Blazor server style + - because we have both memory and mono on the UI thread +- Solves **1,2** for managed code + - unless there is sync `JSImport`->`JSExport` call +- Ignores **1,2** for JS callback + - emscripten main loop could deadlock on sync JSExport +- Solves **3,4,5** + +## (14) Deputy + emscripten dispatch to UI + JSWebWorker + without sync JSExport +- **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** +- is variation of **(13)** + - without support for synchronous JSExport +- Solves **1,2** for managed code + - emscripten main loop stays responsive + - unless there is sync `JSImport`->`JSExport` call +- Avoids **2** for JS callback + - by throwing PNSE +- Solves **3,4,5** + +## (15) Deputy + Sidecar + UI thread +- 2 levels of indirection. +- benefit: blocking JSExport from UI thread doesn't block emscripten loop +- downside: complex and more resource intensive + +## (16) Deputy without Mono, no GC barrier breach for interop +- variation on **(13)** or **(14)** where we get rid of per-parameter calls to Mono +- benefit: get closer to purity of sidecar design without loosing perf + - this could be done later as purity optimization +- in this design the mono could be started on deputy thread + - this will keep UI responsive during startup +- UI would not be mono attached thread. +- See [details](#Get_rid_of_Mono_GC_boundary_breach) Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 \ No newline at end of file From 28cbd014a7e0da1704f0c95965a4c6670f835408 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 9 Nov 2023 15:04:08 +0100 Subject: [PATCH 047/108] links --- accepted/2023/wasm-browser-threads.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 1120acd2a..e1cf4ad2c 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -88,7 +88,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w ## (14) Deputy + emscripten dispatch to UI + JSWebWorker + without sync JSExport This is Pavel's preferred design based on experiments and tests so far. -For other possible design options [see below](#Interesting_combinations). 
+For other possible design options [see below](#Interesting-combinations). - Emscripten startup on UI thread - C functions of emscripten @@ -652,6 +652,6 @@ There are few downsides to them - in this design the mono could be started on deputy thread - this will keep UI responsive during startup - UI would not be mono attached thread. -- See [details](#Get_rid_of_Mono_GC_boundary_breach) +- See [details](#Get-rid-of-Mono-GC-boundary-breach) Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 \ No newline at end of file From 963a3020edebbcee39c7bb81a8b6e1d274a51518 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 13 Nov 2023 13:06:35 +0100 Subject: [PATCH 048/108] more --- accepted/2023/wasm-browser-threads.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index e1cf4ad2c..64b1178f9 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -97,7 +97,8 @@ For other possible design options [see below](#Interesting-combinations). - there is risk that UI will be suspended by pending GC - it keeps `renderBatch` working as is - it could be later optimized for purity to **(16)**. Pavel would like this. - - but it's difficult to get rid of many mono C functions we currently use + - the mono startup is CPU heavy and it blocks rendering even for server side rendered UI. + - but it's difficult to get rid of many mono [C functions we currently use](#Move-Mono-startup-to-deputy) - managed `Main()` would be dispatched onto dedicated web worker called "deputy thread" - because the UI thread would be mostly idling, it could - render UI, keep debugger working @@ -352,8 +353,9 @@ There are few downsides to them - doesn't block and doesn't propagate exceptions - this is slow -## Get rid of Mono GC boundary breach +## Move Mono startup to deputy - related to design **(16)** +- Get rid of Mono GC boundary breach - `Task`/`Promise` - improved in https://github.com/dotnet/runtime/pull/93010 - `MonoString` @@ -382,7 +384,7 @@ There are few downsides to them - what's overall perf impact for Blazor's `renderBatch` ? 
## Performance -- as compared to ST build for dotnet wasm +As compared to ST build for dotnet wasm: - the dispatch between threads (caused by JS object thread affinity) will have negative performance impact on the JS interop - in case of HTTP/WS clients used via Streams, it could be surprizing - browser performance is lower when working with SharedArrayBuffer @@ -390,6 +392,7 @@ There are few downsides to them - startup is slower because creation of WebWorker instances is slow - VFS access is slow because it's dispatched to UI thread - console output is slow because it's POSIX stream is dispatched to UI thread, call per `put_char` +- any VFS access is slow because it dispatched to UI thread ## Spin-waiting in JS - if we want to keep synchronous JS APIs to work on UI thread, we have to spin-wait From b4530bc0ea4a68f8182f96b5ed87eb90a79073a0 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 13 Nov 2023 17:41:22 +0100 Subject: [PATCH 049/108] JSImport dispatch --- accepted/2023/wasm-browser-threads.md | 153 ++++++++++++++++++++++++-- 1 file changed, 143 insertions(+), 10 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 64b1178f9..71b83410c 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -249,15 +249,27 @@ There are few downsides to them - see `emscripten_dispatch_to_thread_async` below # Dispatching JSImport - what should happen -- is normally bound to JS context of the calling managed thread -- but it could be called with `JSObject` parameters which are bound to another thread - - if targets don't match each other throw `ArgumentException` ? - - if it's called from thread-pool thread - - which is not `JSWebWorker` - - should we dispatch it by affinity of the parameters ? - - if parameters affinity do match each other but not match current `JSWebWorker` - - should we dispatch it by affinity of the parameters ? - - this would solve HTTP/WS scenarios +- when there is no extra code-gen flag + - for backward compatibility, dispatch handled by user + - assert that we are on `JSWebWorker` or main thread + - assert all parameters affinity to current thread +- when generated with `JSImportAttribute.Affinity==UI` + - dispatch to UI thread + - assert all parameters affinity to UI thread + - could be called from any thread, including thread pool + - there is no `JSSynchronizationContext` in deputy's UI, use emscripten. + - emscripten can't block caller +- when generated with `JSImportAttribute.Affinity==ByParams` + - dispatch to thread of parameters + - assert all parameters have same affinity + - could be called from any thread, including thread pool +- how to obtain `JSHandle` of function in the target thread ? + - call `JSFunctionBinding.BindJSFunction` inside of generated dispatch callback +- how to dispatch to UI in deputy design ? + - A) double dispatch, C# -> main, emscripten -> UI + - B) make whole dispatch emscripten only, implement blocking wait in C for emscripten sync calls. 
+ - C) only allow sync call on non-UI target + - see scratch area at the bottom # Dispatching JSExport - what should happen - when caller is UI, we need to dispatch back to managed thread @@ -269,6 +281,7 @@ There are few downsides to them # Dispatching call - options - `JSSynchronizationContext` - in deputy design + - this would not work for dispatch to UI thread as it doesn't have sync context - is implementation of `SynchronizationContext` installed to - managed thread with `JSWebWorker` - or main managed thread @@ -315,6 +328,7 @@ There are few downsides to them - `resolve(null)` ? - `reject(null)` ? - from C# to JS (UI) + - how to obtain JSHandle of function in the target thread ? - async - needs to deal with `stackalloc` in C# generated stub, by copying the buffer - sync @@ -657,4 +671,123 @@ As compared to ST build for dotnet wasm: - UI would not be mono attached thread. - See [details](#Get-rid-of-Mono-GC-boundary-breach) -Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 \ No newline at end of file +Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 + + +---------------------- Scratch area + +```cs +[ThreadStaticAttribute] +JSFunctionBinding __signature_Log_2101499449; + +[global::System.Diagnostics.DebuggerNonUserCode] +public static partial void Log(JSObject ws) +{ + global::System.Span __arguments_buffer = stackalloc JSMarshalerArgument[3]; + ref JSMarshalerArgument __arg_exception = ref __arguments_buffer[0]; + __arg_exception.Initialize(); + ref JSMarshalerArgument __arg_return = ref __arguments_buffer[1]; + __arg_return.Initialize(); + + ref JSMarshalerArgument __ws_native__js_arg = ref __arguments_buffer[2]; + __ws_native__js_arg.ToJS(ws); + + JSFunctionBinding.Post(ws.SynchronizationContext, static (object? 
x) => { + if (__signature_Log_2101499449 == null) + { + __signature_Log_2101499449 = JSFunctionBinding.BindJSFunction("xxx", null, new JSMarshalerType[] { JSMarshalerType.Discard, JSMarshalerType.JSObject }); + } + JSFunctionBinding.InvokeJS(__signature_Log_2101499449, x); + }, __arguments_buffer); +} +``` + + +```cs +[ThreadStaticAttribute] +JSFunctionBinding __signature_Log_2101499449; + +[global::System.Diagnostics.DebuggerNonUserCode] +public static partial void Log(JSObject ws) +{ + global::System.Span __arguments_buffer = stackalloc JSMarshalerArgument[3]; + ref JSMarshalerArgument __arg_exception = ref __arguments_buffer[0]; + __arg_exception.Initialize(); + ref JSMarshalerArgument __arg_return = ref __arguments_buffer[1]; + __arg_return.Initialize(); + + ref JSMarshalerArgument __ws_native__js_arg = ref __arguments_buffer[2]; + __ws_native__js_arg.ToJS(ws); + + JSFunctionBinding.InvokeJSAt(ws.SynchronizationContext, static () => { + if (__signature_Log_2101499449 == null) + { + __signature_Log_2101499449 = JSFunctionBinding.BindJSFunction("xxx", null, new JSMarshalerType[] { JSMarshalerType.Discard, JSMarshalerType.JSObject }); + } + return __signature_Log_2101499449; + }, __arguments_buffer); +} + + +public static partial string MemberEcho(string message) +{ + Span __arguments_buffer = stackalloc JSMarshalerArgument[3]; + + __message_native__js_arg.ToJS(message); + + JSFunctionBinding.InvokeJSAtSync(ws.SynchronizationContext, static () => { + if (__signature_Log_2101499449 == null) + { + __signature_Log_2101499449 = JSFunctionBinding.BindJSFunction("xxx", null, new JSMarshalerType[] { JSMarshalerType.Discard, JSMarshalerType.JSObject }); + } + return __signature_Log_2101499449; + }, __arguments_buffer); + + __arg_return.ToManaged(out __retVal); + + return __retVal; +} +[ThreadStaticAttribute] +static JSFunctionBinding __signature_MemberEcho_630990033; + +Task JSFunctionBinding.InvokeJSAtAsync(IntPtr targetThreadId, Func jsFuncProvider, Span args){ + emscripten_dispatch_async(targetThreadId, jsFuncProvider, args); + return somePromise; +} + +void JSFunctionBinding.InvokeJSAtSync(IntPtr targetThreadId, Func jsFuncProvider, Span args){ + var sem=Semaphore() + emscripten_dispatch_async(targetThreadId, jsFuncProvider, args); + sem.Wait(); +} + +``` + + +current in Net7, Net8 +```cs +[ThreadStaticAttribute] +static JSFunctionBinding __signature_Log_2101499449; + +[global::System.Diagnostics.DebuggerNonUserCode] +public static partial void Log(string message) +{ + if (__signature_Log_2101499449 == null) + { + __signature_Log_2101499449 = JSFunctionBinding.BindJSFunction("globalThis.console.log", null, new JSMarshalerType[] { JSMarshalerType.Discard, JSMarshalerType.String }); + } + + global::System.Span __arguments_buffer = stackalloc JSMarshalerArgument[3]; + ref JSMarshalerArgument __arg_exception = ref __arguments_buffer[0]; + __arg_exception.Initialize(); + ref JSMarshalerArgument __arg_return = ref __arguments_buffer[1]; + __arg_return.Initialize(); + + ref JSMarshalerArgument __message_native__js_arg = ref __arguments_buffer[2]; + + __message_native__js_arg.ToJS(message); + JSFunctionBinding.InvokeJS(__signature_Log_2101499449, __arguments_buffer); +} +``` + + From 4b424ff00d30476ad8633d63d30d543b7ee0d154 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 15 Nov 2023 16:22:11 +0100 Subject: [PATCH 050/108] JSImport ideas --- accepted/2023/wasm-browser-threads.md | 128 +++++++------------------- 1 file changed, 32 insertions(+), 96 deletions(-) diff --git 
a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 71b83410c..5529d8d00 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -240,13 +240,16 @@ There are few downsides to them - it needs to stay backward compatible with Net7, Net8 already generated code - how to detect that there is new version of generated code ? - it needs to do it via public C# API - - possibly new API `JSHost.Post` or `JSHost.Send` + - possibly new API `JSHost.Post` and `JSHost.Send` + - or `JSHost.InvokeInTargetSync` and `JSHost.InvokeInTargetAsync` - it needs to re-consider current `stackalloc` - probably by re-ordering Roslyn generated code of `__arg_return.ToManaged(out __retVal);` before `JSFunctionBinding.InvokeJS` - it needs to propagate exceptions -- Roslyn generator: JSExport - if we make it responsible for the dispatch +- Roslyn generator: JSExport - can't be used + - this is just the UI -> deputy dispatch, which is not C# code - Mono/C/JS internal layer - see `emscripten_dispatch_to_thread_async` below +- TODO: API SynCContext as parameter of `JSImport` # Dispatching JSImport - what should happen - when there is no extra code-gen flag @@ -263,13 +266,33 @@ There are few downsides to them - dispatch to thread of parameters - assert all parameters have same affinity - could be called from any thread, including thread pool -- how to obtain `JSHandle` of function in the target thread ? - - call `JSFunctionBinding.BindJSFunction` inside of generated dispatch callback + +# Dispatching JSImport in deputy design - how to do it - how to dispatch to UI in deputy design ? - A) double dispatch, C# -> main, emscripten -> UI - B) make whole dispatch emscripten only, implement blocking wait in C for emscripten sync calls. - C) only allow sync call on non-UI target - - see scratch area at the bottom +- how to obtain `JSHandle` of function in the target thread ? + - there are 2 dimensions: the thread and the method + - there are 2 steps: + - A) obtain existing `JSHandle` on next call (if available) + - to avoid double dispatch, this needs to be accessible + - by any caller thread + - or by UI thread C code (not managed) + - B) if this is first call to the method on the target thread, create the target `JSHandle` by binding existing JS function + - collecting the metadata is generated C# code + - therefore we need to get the metadata buffer on caller main thread: double dispatch + - store new `JSHandle` somewhere +- possible solution + assign `static` unique ID to the function on C# side during first call. + - A) Call back to C# if the method was not bound yet (which thread ?). + - B) Keep the metadata buffer + - make `JSFunctionBinding` registration static (not thread-static) + - never free the buffer + - pass the buffer on each call to the target + - late bind `JSHandle` + - store the `JSHandle` on JS side (thread static) associated with method ID + # Dispatching JSExport - what should happen - when caller is UI, we need to dispatch back to managed thread @@ -277,7 +300,7 @@ There are few downsides to them - when caller is `JSWebWorker`, - we are probably on correct thread already - when caller is callback from HTTP/WS we could dispatch to any managed thread -- caller can't be managed thread pool, because they would not use JS `self` context +- callers are not from managed thread pool, by design. Because we don't want any JS code running there. 
# Dispatching call - options - `JSSynchronizationContext` - in deputy design @@ -500,6 +523,7 @@ As compared to ST build for dotnet wasm: - **A)** pretend it's not a problem (this we already have) - **B)** move user C# code to web worker - **C)** move all Mono to web worker + - **D)** like **A)** just move call of the C# `Main()` to `JSWebWorker` - how to deal with blocking in synchronous JS calls from UI thread (like `onClick` callback) - **D)** pretend it's not a problem (this we already have) - **E)** throw PNSE when synchronous JSExport is called on UI thread @@ -674,97 +698,9 @@ As compared to ST build for dotnet wasm: Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 ----------------------- Scratch area - -```cs -[ThreadStaticAttribute] -JSFunctionBinding __signature_Log_2101499449; - -[global::System.Diagnostics.DebuggerNonUserCode] -public static partial void Log(JSObject ws) -{ - global::System.Span __arguments_buffer = stackalloc JSMarshalerArgument[3]; - ref JSMarshalerArgument __arg_exception = ref __arguments_buffer[0]; - __arg_exception.Initialize(); - ref JSMarshalerArgument __arg_return = ref __arguments_buffer[1]; - __arg_return.Initialize(); - - ref JSMarshalerArgument __ws_native__js_arg = ref __arguments_buffer[2]; - __ws_native__js_arg.ToJS(ws); - - JSFunctionBinding.Post(ws.SynchronizationContext, static (object? x) => { - if (__signature_Log_2101499449 == null) - { - __signature_Log_2101499449 = JSFunctionBinding.BindJSFunction("xxx", null, new JSMarshalerType[] { JSMarshalerType.Discard, JSMarshalerType.JSObject }); - } - JSFunctionBinding.InvokeJS(__signature_Log_2101499449, x); - }, __arguments_buffer); -} -``` - - -```cs -[ThreadStaticAttribute] -JSFunctionBinding __signature_Log_2101499449; - -[global::System.Diagnostics.DebuggerNonUserCode] -public static partial void Log(JSObject ws) -{ - global::System.Span __arguments_buffer = stackalloc JSMarshalerArgument[3]; - ref JSMarshalerArgument __arg_exception = ref __arguments_buffer[0]; - __arg_exception.Initialize(); - ref JSMarshalerArgument __arg_return = ref __arguments_buffer[1]; - __arg_return.Initialize(); - - ref JSMarshalerArgument __ws_native__js_arg = ref __arguments_buffer[2]; - __ws_native__js_arg.ToJS(ws); - - JSFunctionBinding.InvokeJSAt(ws.SynchronizationContext, static () => { - if (__signature_Log_2101499449 == null) - { - __signature_Log_2101499449 = JSFunctionBinding.BindJSFunction("xxx", null, new JSMarshalerType[] { JSMarshalerType.Discard, JSMarshalerType.JSObject }); - } - return __signature_Log_2101499449; - }, __arguments_buffer); -} - - -public static partial string MemberEcho(string message) -{ - Span __arguments_buffer = stackalloc JSMarshalerArgument[3]; - - __message_native__js_arg.ToJS(message); - - JSFunctionBinding.InvokeJSAtSync(ws.SynchronizationContext, static () => { - if (__signature_Log_2101499449 == null) - { - __signature_Log_2101499449 = JSFunctionBinding.BindJSFunction("xxx", null, new JSMarshalerType[] { JSMarshalerType.Discard, JSMarshalerType.JSObject }); - } - return __signature_Log_2101499449; - }, __arguments_buffer); - - __arg_return.ToManaged(out __retVal); - - return __retVal; -} -[ThreadStaticAttribute] -static JSFunctionBinding __signature_MemberEcho_630990033; - -Task JSFunctionBinding.InvokeJSAtAsync(IntPtr targetThreadId, Func jsFuncProvider, Span args){ - emscripten_dispatch_async(targetThreadId, jsFuncProvider, args); - return somePromise; -} - -void JSFunctionBinding.InvokeJSAtSync(IntPtr targetThreadId, Func jsFuncProvider, Span 
args){ - var sem=Semaphore() - emscripten_dispatch_async(targetThreadId, jsFuncProvider, args); - sem.Wait(); -} - -``` - +## Scratch pad -current in Net7, Net8 +current generated `JSImport` in Net7, Net8 ```cs [ThreadStaticAttribute] static JSFunctionBinding __signature_Log_2101499449; From e197b3c4be417f6eda1f9737e37f674fde64e0cd Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 20 Nov 2023 17:32:14 +0100 Subject: [PATCH 051/108] more --- accepted/2023/wasm-browser-threads.md | 42 ++++++++++++++++++++------- 1 file changed, 31 insertions(+), 11 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 5529d8d00..ebbeecaa2 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -292,6 +292,7 @@ There are few downsides to them - pass the buffer on each call to the target - late bind `JSHandle` - store the `JSHandle` on JS side (thread static) associated with method ID +- TODO: double dispatch in Blazor # Dispatching JSExport - what should happen @@ -697,33 +698,52 @@ As compared to ST build for dotnet wasm: Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 +## (17) Emscripten em_queue in deputy, managed UI thread +- is interesting because it avoids cross-thread dispatch to UI + - including double dispatch in Blazor's `RendererSynchronizationContext` +- avoids solving **1,2** +- low level hacking of emscripten design assumptions ## Scratch pad current generated `JSImport` in Net7, Net8 ```cs -[ThreadStaticAttribute] -static JSFunctionBinding __signature_Log_2101499449; - -[global::System.Diagnostics.DebuggerNonUserCode] -public static partial void Log(string message) +[DebuggerNonUserCode] +public static partial Task WebSocketReceive(JSObject webSocket, nint bufferPtr, int bufferLength) { - if (__signature_Log_2101499449 == null) + if (__signature_WebSocketReceive_1144640460 == null) { - __signature_Log_2101499449 = JSFunctionBinding.BindJSFunction("globalThis.console.log", null, new JSMarshalerType[] { JSMarshalerType.Discard, JSMarshalerType.String }); + __signature_WebSocketReceive_1144640460 = JSFunctionBinding.BindJSFunction("INTERNAL.ws_wasm_receive", null, new JSMarshalerType[] { + JSMarshalerType.Task(), + JSMarshalerType.JSObject, + JSMarshalerType.IntPtr, + JSMarshalerType.Int32 + }); } - global::System.Span __arguments_buffer = stackalloc JSMarshalerArgument[3]; + Span __arguments_buffer = stackalloc JSMarshalerArgument[5]; ref JSMarshalerArgument __arg_exception = ref __arguments_buffer[0]; __arg_exception.Initialize(); ref JSMarshalerArgument __arg_return = ref __arguments_buffer[1]; __arg_return.Initialize(); + Task __retVal; + + ref JSMarshalerArgument __bufferLength_native__js_arg = ref __arguments_buffer[4]; + ref JSMarshalerArgument __bufferPtr_native__js_arg = ref __arguments_buffer[3]; + ref JSMarshalerArgument __webSocket_native__js_arg = ref __arguments_buffer[2]; + + __bufferLength_native__js_arg.ToJS(bufferLength); + __bufferPtr_native__js_arg.ToJS(bufferPtr); + __webSocket_native__js_arg.ToJS(webSocket); - ref JSMarshalerArgument __message_native__js_arg = ref __arguments_buffer[2]; + JSFunctionBinding.InvokeJS(__signature_WebSocketReceive_1144640460, __arguments_buffer); - __message_native__js_arg.ToJS(message); - JSFunctionBinding.InvokeJS(__signature_Log_2101499449, __arguments_buffer); + __arg_return.ToManaged(out __retVal); + + return __retVal; } + +static JSFunctionBinding __signature_WebSocketReceive_1144640460; ``` From 
16aeb0a954bec051e4fca5c3f3f7483aa2439cdc Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Mon, 20 Nov 2023 19:16:54 +0100 Subject: [PATCH 052/108] more --- accepted/2023/wasm-browser-threads.md | 1 + 1 file changed, 1 insertion(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index ebbeecaa2..486002414 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -743,6 +743,7 @@ public static partial Task WebSocketReceive(JSObject webSocket, nint bufferPtr, return __retVal; } +[ThreadStaticAttribute] static JSFunctionBinding __signature_WebSocketReceive_1144640460; ``` From cf32eb03bf98d389ef8c0b6dfdc6287ef0610893 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Tue, 21 Nov 2023 11:20:22 +0100 Subject: [PATCH 053/108] more --- accepted/2023/wasm-browser-threads.md | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 486002414..5392e12cb 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -122,10 +122,12 @@ There are few downsides to them - it can't block-wait, only spin-wait - "sidecar" thread - possible design - is a web worker with emscripten and mono VM started on it + - there is no emscripten on UI thread - for Blazor rendering MAUI/BlazorWebView use the same concept - doing this allows all managed threads to allow blocking wait - "deputy" thread - possible design - is a web worker and pthread with C# `Main` entrypoint + - emscripten startup stays on UI thread - doing this allows all managed threads to allow blocking wait - "managed thread" - is a thread with emscripten pthread and Mono VM attached thread and GC barriers @@ -745,6 +747,30 @@ public static partial Task WebSocketReceive(JSObject webSocket, nint bufferPtr, [ThreadStaticAttribute] static JSFunctionBinding __signature_WebSocketReceive_1144640460; + +[DebuggerNonUserCode] +internal static unsafe void __Wrapper_Dummy_1616792047(JSMarshalerArgument* __arguments_buffer) +{ + Task meaningPromise; + ref JSMarshalerArgument __arg_exception = ref __arguments_buffer[0]; + ref JSMarshalerArgument __arg_return = ref __arguments_buffer[1]; + + ref JSMarshalerArgument __meaningPromise_native__js_arg = ref __arguments_buffer[2]; + try + { + + __meaningPromise_native__js_arg.ToManaged(out meaningPromise, + static (ref JSMarshalerArgument __task_result_arg, out int __task_result) => + { + __task_result_arg.ToManaged(out __task_result); + }); + Sample.Test.Dummy(meaningPromise); + } + catch (global::System.Exception ex) + { + __arg_exception.ToJS(ex); + } +} ``` From 40d53d79a170194ee6b5b4e65fd6fa91575230fd Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Wed, 22 Nov 2023 10:28:24 +0100 Subject: [PATCH 054/108] more --- accepted/2023/wasm-browser-threads.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 5392e12cb..012773bf7 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -706,10 +706,29 @@ Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 - avoids solving **1,2** - low level hacking of emscripten design assumptions +## (18) Soft deputy +- keep both Mono and emscripten in the UI thread +- use `SynchronizationContext` to do the dispatch +- make it easy and default to run any user code in deputy thread + - all Blazor events and callbacks 
like `onClick` to deputy + - move SignalR to deputy + - move Blazor entry point to deputy +- hope that UI thread is mostly idle + - enable dynamic thread allocation + - throw exceptions in dev loop when UI thread does `lock` or `.Wait` in user code + ## Scratch pad current generated `JSImport` in Net7, Net8 + ```cs + +[JSImport(Dispatch.UI)] +public static partial Task WebSocketReceive(JSObject webSocket, nint bufferPtr, int bufferLength); + +[JSImport(Dispatch.Params)] +public static partial Task WebSocketReceive(JSObject webSocket, nint bufferPtr, int bufferLength); + [DebuggerNonUserCode] public static partial Task WebSocketReceive(JSObject webSocket, nint bufferPtr, int bufferLength) { From 4fd24c9f186e60dfe6b1c699820a4980268dddc8 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 23 Nov 2023 10:07:16 +0100 Subject: [PATCH 055/108] promise + .Wait deadlock --- accepted/2023/wasm-browser-threads.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 012773bf7..64bf1bfe6 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -83,6 +83,8 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w **6)** Dynamic creation of new WebWorker requires async operations on emscripten main thread. - we could pre-allocate fixed size pthread pool. But one size doesn't fit all and it's expensive to create too large pool. +**7)** There could be pending HTTP promise (which needs browser event loop to resolve) and blocking `.Wait` on the same thread and same task/chain. Leading to deadlock. + # Summary ## (14) Deputy + emscripten dispatch to UI + JSWebWorker + without sync JSExport From bf0a7674796b9b966717822617c69bf4b41a3357 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 23 Nov 2023 10:08:49 +0100 Subject: [PATCH 056/108] more --- accepted/2023/wasm-browser-threads.md | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 64bf1bfe6..c4a15398c 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -203,6 +203,7 @@ There are few downsides to them - could do full JSImport/JSExport to it's own JS `self` context - there is `JSSynchronizationContext`` installed on it - so that user code could dispatch back to it, in case that it needs to call `JSObject` proxy (with thread affinity) +- this thread needs to throw on any `.Wait` because of the problem **7** ## HTTP and WS clients - are implemented in terms of `JSObject` and `Promise` proxies @@ -280,7 +281,7 @@ There are few downsides to them - there are 2 dimensions: the thread and the method - there are 2 steps: - A) obtain existing `JSHandle` on next call (if available) - - to avoid double dispatch, this needs to be accessible + - to avoid double dispatch, this needs to be accessible - by any caller thread - or by UI thread C code (not managed) - B) if this is first call to the method on the target thread, create the target `JSHandle` by binding existing JS function @@ -719,7 +720,7 @@ Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 - enable dynamic thread allocation - throw exceptions in dev loop when UI thread does `lock` or `.Wait` in user code -## Scratch pad +## Scratch pad current generated `JSImport` in Net7, Net8 @@ -736,11 +737,11 @@ public static partial Task WebSocketReceive(JSObject webSocket, nint bufferPtr, { 
if (__signature_WebSocketReceive_1144640460 == null) { - __signature_WebSocketReceive_1144640460 = JSFunctionBinding.BindJSFunction("INTERNAL.ws_wasm_receive", null, new JSMarshalerType[] { - JSMarshalerType.Task(), - JSMarshalerType.JSObject, - JSMarshalerType.IntPtr, - JSMarshalerType.Int32 + __signature_WebSocketReceive_1144640460 = JSFunctionBinding.BindJSFunction("INTERNAL.ws_wasm_receive", null, new JSMarshalerType[] { + JSMarshalerType.Task(), + JSMarshalerType.JSObject, + JSMarshalerType.IntPtr, + JSMarshalerType.Int32 }); } @@ -750,7 +751,7 @@ public static partial Task WebSocketReceive(JSObject webSocket, nint bufferPtr, ref JSMarshalerArgument __arg_return = ref __arguments_buffer[1]; __arg_return.Initialize(); Task __retVal; - + ref JSMarshalerArgument __bufferLength_native__js_arg = ref __arguments_buffer[4]; ref JSMarshalerArgument __bufferPtr_native__js_arg = ref __arguments_buffer[3]; ref JSMarshalerArgument __webSocket_native__js_arg = ref __arguments_buffer[2]; @@ -780,7 +781,7 @@ internal static unsafe void __Wrapper_Dummy_1616792047(JSMarshalerArgument* __ar try { - __meaningPromise_native__js_arg.ToManaged(out meaningPromise, + __meaningPromise_native__js_arg.ToManaged(out meaningPromise, static (ref JSMarshalerArgument __task_result_arg, out int __task_result) => { __task_result_arg.ToManaged(out __task_result); From be0990f9173cdd22d946028ac2558a2d138ec577 Mon Sep 17 00:00:00 2001 From: Chet Husk Date: Mon, 4 Dec 2023 16:59:58 -0600 Subject: [PATCH 057/108] initial blurb --- INDEX.md | 1 + proposed/sdk-analysis-level.md | 167 +++++++++++++++++++++++++++++++++ 2 files changed, 168 insertions(+) create mode 100644 proposed/sdk-analysis-level.md diff --git a/INDEX.md b/INDEX.md index a197805ae..bacdfc0d9 100644 --- a/INDEX.md +++ b/INDEX.md @@ -98,6 +98,7 @@ Use update-index to regenerate it: | | [Rate limits](proposed/rate-limit.md) | [John Luo](https://github.com/juntaoluo), [Sourabh Shirhatti](https://github.com/shirhatti) | | | [Readonly references in C# and IL verification.](proposed/verifiable-ref-readonly.md) | | | | [Ref returns in C# and IL verification.](proposed/verifiable-ref-returns.md) | | +| | [SDK Analysis Level Property and Usage](proposed/sdk-analysis-level.md) | (PM) [Chet Husk](https://github.com/baronfel), (Engineering) [Daniel Plaisted](https://github.com/dsplaisted) | | | [Swift Interop](proposed/swift-interop.md) | [Andy Gocke](https://github.com/agocke), [Jeremy Koritzinsky](https://github.com/jkoritzinsky) | | | [Target AVX2 in R2R images](proposed/vector-instruction-set-default.md) | [Richard Lander](https://github.com/richlander) | diff --git a/proposed/sdk-analysis-level.md b/proposed/sdk-analysis-level.md new file mode 100644 index 000000000..af00aabbe --- /dev/null +++ b/proposed/sdk-analysis-level.md @@ -0,0 +1,167 @@ +# SDK Analysis Level Property and Usage + + +**Owner** (PM) [Chet Husk](https://github.com/baronfel) | (Engineering) [Daniel Plaisted](https://github.com/dsplaisted) + +Today, users of the .NET SDK have a large degree of control over the way that the .NET SDK and the +tools bundled in it emit diagnostics (warnings and errors). This control is provided in part by a +series of MSBuild properties that control the severity levels of specific warnings, if certain messages +should be treated as diagnostics, even a coarse-grained way to set the entire analysis baseline for a project. 
+
+The default values for these properties are often driven by the Target Framework(s) chosen for a project, and
+as a result users have developed an expectation of behaviors around changing the Target Framework of a project.
+It's generally understood that when a TFM is changed, new kinds of analysis may be enabled, and as a result what
+was a successful build may now have errors.
+
+This cadence is predictable and repeatable for users - new TFMs are generally only introduced once a year - but
+for tooling developers this cadence means that important changes can generally only be tied to a new TFM release.
+New diagnostics can be introduced mid cycle, but they can only be enabled by default in a new TFM. Failure to adhere
+to this pattern results in pain for our users, as they are often not in control of the versions of the SDK used
+to build their code, often because their environment is managed by an external force like an IT department or
+a build environment.
+
+Some changes cannot be logically tied to the TFM, however, and we have no toolset-level parallels
+to the existing MSBuild Properties of `AnalysisLevel`, `WarningLevel`, `NoWarn`, etc. This means we have no way to introduce
+changes that are activated just by installing a particular version of the SDK, and we have no clear way
+to communicate intent to tools that naturally operate outside of the scope of a project - like NuGet at
+the repo/solution level.
+
+To fill this gap, and to provide users a way to simply and easily control the behaviors of the SDK and tools,
+I propose that we:
+
+* Add a new property called `SdkAnalysisLevel` to the base .NET SDK Targets, right at the beginning of target evaluation
+* Set this property's default value to be the `MAJOR.Minor.FeatureBand` of the current SDK (e.g. 7.0.100, 7.0.400, 8.0.100)
+* Increment this value in line with the SDK’s actual version as it advances across releases
+* Use this property to determine the default values of existing properties like `AnalysisLevel` and `WarningLevel` in the absence of any user-provided defaults
+* Pass this value wholesale to tools like the compilers – where there is a complicated decision matrix for determining the effective verbosity of any given diagnostic
+
+## Scenarios and User Experience
+
+### Scenario 1: Jolene doesn't control her CI/CD environment
+
+In this scenario Jolene is a developer on a team that uses a CI/CD environment that is
+managed by an external team. The infrastructure team has decided that the version of the .NET SDK that
+will be preinstalled on the environment will be 8.0.200, but this version introduced a new
+warning that is treated as an error by default. Jolene's team doesn't have time to fix the
+diagnostic until next month, so for now she instructs the build to behave as if it were
+using the 8.0.100 SDK by setting the `SdkAnalysisLevel` property to `8.0.100` in a
+`Directory.Build.props` file in her repo root:
+
+```xml
+<Project>
+  <PropertyGroup>
+    <SdkAnalysisLevel>8.0.100</SdkAnalysisLevel>
+  </PropertyGroup>
+</Project>
+```
+
+### Scenario 2: Marcus is on a GPO-managed device
+
+Marcus is working on a small single-project .NET MAUI app, and his company manages his hardware via GPO.
+On release day, the company pushed out Visual Studio updates to his team, and as a result his prior
+feature band of `8.0.200` is no longer available - it's been replaced by `8.0.300`. He's not sure
+about these new warnings, so he unblocks himself by setting `SdkAnalysisLevel` to `8.0.200`
+in his project file until he has time to investigate.
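
A sketch of what that could look like in Marcus's project file, using the `SdkAnalysisLevel` property proposed above (the rest of the project content is omitted):

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <!-- Keep the analysis behavior of the previously installed 8.0.200 feature band. -->
    <SdkAnalysisLevel>8.0.200</SdkAnalysisLevel>
  </PropertyGroup>
</Project>
```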
+ +--- + +In both of these scenarios, the user is *not in control* of their environment. They can control their code, +but they do not have the time to address the full set of problems introduced by an update. They use the new +property to request older behavior, for a _limited time_, until they can address the issues. In addition, +this single property was able to control the behavior of the SDK, the compilers, and any other tools that +have been onboarded to the new scheme, without having to look up, comment out, or add many `NoWarn` properties +to their project files. + + + +## Requirements + +### Goals + + + +### Non-Goals + + + +## Stakeholders and Reviewers + + + +## Design + + + +## Q & A + + From b8a61149efbc5c4d370b1e9b81ab52c511d4dcc4 Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Tue, 5 Dec 2023 11:50:27 +0100 Subject: [PATCH 058/108] Update .NET Swift interop memory management documentation This commit adds context related to .NET Swift interop memory management. It highlights the details of managing memory correctly when Swift and .NET code interact. It covers aspects such as ownership of memory, interaction between ARC and .NET GC, and the role of projection tools. --- proposed/swift-interop.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index 9bc811275..c78a1f918 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -74,7 +74,7 @@ Rejected Option 1 seems like a natural fit, but there is one significant limitat Rejected Option 2 is a natural fit as the `MemberFunction` calling convention combined with the various C-based calling conventions specifies that there is a `this` argument as the first argument. Defining `Swift` + `MemberFunction` to imply/require the `self` argument is a great conceptual extension of the existing model. Although, in Swift, sometimes the `self` register is used for non-instance state. For example, in static functions, the type metadata is passed as the `self` argument. Since static functions are not member functions, we may want to not use the `MemberFunction` calling convention. As a result, we have rejected this option. -###### Error register +##### Error register We have selected an approach for handling the error register in the Swift calling convention: @@ -110,10 +110,6 @@ In the SwiftAsync calling convention, there is also an Async Context register, s In the Swift language, tuples are "unpacked" in the calling convention. Each element of a tuple is passed as a separate parameter. C# has its own language-level tuple feature in the ValueTuple family of types. The Swift/.NET interop story could choose to automatically handle tuple types in a similar way for a `CallConvSwift` function call. However, processing the tuple types in the JIT/AOT compilers is complicated and expensive. It would be much cheaper to defer this work upstack to the language projection, which can split out the tuple into individual parameters to pass to the underlying function. The runtime and libraries teams could add analyzers to detect value-tuples being passed to `CallConvSwift` functions at compile time to help avoid pits of failure. As we expect most developers to use the higher-level tooling and to not use `CallConvSwift` directly, we would likely defer any analyzer work until we have a suitable use case. -##### Automatic Reference Counting and Lifetime Management - -Swift has a very strongly-defined lifetime and ownership model. 
This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. The Binding Tools for Swift tooling handles these explicit lifetime semantics with some generated Swift code. In the new Swift/.NET interop, management of these lifetime semantics will be done by the Swift projection and not by the raw calling-convention support. If any GC interation is required to handle the lifetime semantics correctly, we should take an approach more similar to the ComWrappers support (higher-level, less complex interop interface) than the Objective-C interop support (lower-level, basically only usable by the ObjCRuntime implementation). - ##### Structs/Enums Like .NET, Swift has both value types and class types. The value types in Swift include both structs and enums. When passed at the ABI layer, they are generally treated as their composite structure and passed by value in registers. However, when Library Evolution mode is enabled, struct layouts are considered "opaque" and their size, alignment, and layout can vary between compile-time and runtime. As a result, all structs and enums in Library Evolution mode are passed by a pointer instead of in registers. Frozen structs and enums (annotated with the `@frozen` attribute) are not considered opaque and will be enregistered. We plan to interoperate with Swift through the Library Evolution mode, so we will generally be able to pass structures using opaque layout. @@ -122,6 +118,12 @@ When calling a function that returns an opaque struct, the Swift ABI always requ At the lowest level of the calling convention, we do not consider Library Evolution to be a different calling convention than the Swift calling convention. Library Evolution requires that some types are passed by a pointer/reference, but it does not fundamentally change the calling convention. Effectively, Library Evolution forces the least optimizable choice to be taken at every possible point. As a result, we should not handle Library Evolution as a separate calling convention and instead we can manually handle it at the projection layer. +##### Automatic Reference Counting and Lifetime Management + +Swift has a very strongly-defined lifetime and ownership model. This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. When .NET calls into Swift, the .NET GC is responsible for managing all managed objects. Unmanaged objects from C# should either implement IDisposable or utilize a designated thin wrapper over the C API, such as the NativeMemory class to explicitly release memory. It's important to ensure that when a Swift callee allocates an object that is returned to .NET, ownership of the memory is not retained after the call returns. In another scenario, when Swift calls into .NET, the Swift should explicitly dereference all objects after the call to release memory. If a C# managed object is allocated and returned to Swift, the .NET GC will eventually collect it, but Swift will keep track using ARC, which represents an invalid case and should be handled by projection tools. If C# unmanaged objects are returned to Swift, they are tracked as normal Swift objects through ARC, and by dereferencing them, Swift will release the memory. + +The Binding Tools for Swift tooling handles these explicit lifetime semantics with some generated Swift code. 
In the new Swift/.NET interop, management of these lifetime semantics will be done by the Swift projection and not by the raw calling-convention support. If any GC interation is required to handle the lifetime semantics correctly, we should take an approach more similar to the ComWrappers support (higher-level, less complex interop interface) than the Objective-C interop support (lower-level, basically only usable by the ObjCRuntime implementation). + ### Projecting Swift into .NET The majority of the work for Swift/.NET interop is determining how a type that exists in Swift should exist in .NET and what shape it should have. This section is a work in progress and will discuss how each feature in Swift will be projected into .NET, particularly in cases where there is not a corresponding .NET or C# language feature. Each feature should have a subheading describing how the projection will look and how any mechanisms to make it work will be designed. From 56a0a7f6f2b04778e8a8f7e8cbdc3e8440c07316 Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Tue, 5 Dec 2023 11:56:43 +0100 Subject: [PATCH 059/108] Update paragraph --- proposed/swift-interop.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index c78a1f918..b2fe05a4e 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -120,7 +120,7 @@ At the lowest level of the calling convention, we do not consider Library Evolut ##### Automatic Reference Counting and Lifetime Management -Swift has a very strongly-defined lifetime and ownership model. This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. When .NET calls into Swift, the .NET GC is responsible for managing all managed objects. Unmanaged objects from C# should either implement IDisposable or utilize a designated thin wrapper over the C API, such as the NativeMemory class to explicitly release memory. It's important to ensure that when a Swift callee allocates an object that is returned to .NET, ownership of the memory is not retained after the call returns. In another scenario, when Swift calls into .NET, the Swift should explicitly dereference all objects after the call to release memory. If a C# managed object is allocated and returned to Swift, the .NET GC will eventually collect it, but Swift will keep track using ARC, which represents an invalid case and should be handled by projection tools. If C# unmanaged objects are returned to Swift, they are tracked as normal Swift objects through ARC, and by dereferencing them, Swift will release the memory. +Swift has a very strongly-defined lifetime and ownership model. This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. When .NET calls into Swift, the .NET GC is responsible for managing all managed objects. Unmanaged objects from C# should either implement IDisposable or utilize a designated thin wrapper over the C API, such as the NativeMemory class to explicitly release memory. It's important to ensure that when a Swift callee allocates an object that is returned to .NET, ownership of the memory is not retained after the call returns. In another scenario, when Swift calls into .NET, the Swift should explicitly dereference all objects after the call to release memory. 
If a C# managed object is allocated and returned to Swift, the .NET GC will eventually collect it, but Swift will keep track using ARC, which represents an invalid case and should be handled by projection tools. When a C# unmanaged object is returned to Swift, it is treated as a regular Swift object through ARC, and Swift releases the memory by dereferencing it. The Binding Tools for Swift tooling handles these explicit lifetime semantics with some generated Swift code. In the new Swift/.NET interop, management of these lifetime semantics will be done by the Swift projection and not by the raw calling-convention support. If any GC interation is required to handle the lifetime semantics correctly, we should take an approach more similar to the ComWrappers support (higher-level, less complex interop interface) than the Objective-C interop support (lower-level, basically only usable by the ObjCRuntime implementation). From b7b7dfb1f5bec57b0b944abb6eda5f89bf1818a4 Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Tue, 5 Dec 2023 16:18:58 +0100 Subject: [PATCH 060/108] Update proposed/swift-interop.md Co-authored-by: Aaron Robinson --- proposed/swift-interop.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index b2fe05a4e..c7e8b4b24 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -120,7 +120,7 @@ At the lowest level of the calling convention, we do not consider Library Evolut ##### Automatic Reference Counting and Lifetime Management -Swift has a very strongly-defined lifetime and ownership model. This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. When .NET calls into Swift, the .NET GC is responsible for managing all managed objects. Unmanaged objects from C# should either implement IDisposable or utilize a designated thin wrapper over the C API, such as the NativeMemory class to explicitly release memory. It's important to ensure that when a Swift callee allocates an object that is returned to .NET, ownership of the memory is not retained after the call returns. In another scenario, when Swift calls into .NET, the Swift should explicitly dereference all objects after the call to release memory. If a C# managed object is allocated and returned to Swift, the .NET GC will eventually collect it, but Swift will keep track using ARC, which represents an invalid case and should be handled by projection tools. When a C# unmanaged object is returned to Swift, it is treated as a regular Swift object through ARC, and Swift releases the memory by dereferencing it. +Swift has a strongly-defined lifetime and ownership model. This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. When .NET calls into Swift, the .NET GC is responsible for managing all managed objects. Unmanaged objects from C# should either implement `IDisposable` or utilize a designated thin wrapper over the Swift memory allocator, currently accessible through the `NativeMemory` class, to explicitly release memory. It's important to ensure that when a Swift callee allocates an object that is returned to .NET, ownership of the memory is not retained after the call returns. In another scenario, when Swift calls into .NET, Swift should explicitly dereference all objects after the call to release memory. 
If a C# managed object is allocated and returned to Swift, the .NET GC will eventually collect it, but Swift will keep track using ARC, which represents an invalid case and should be handled by projection tools. When a C# unmanaged object is returned to Swift, it is treated as a regular Swift object through ARC, and Swift releases the memory by dereferencing it. The Binding Tools for Swift tooling handles these explicit lifetime semantics with some generated Swift code. In the new Swift/.NET interop, management of these lifetime semantics will be done by the Swift projection and not by the raw calling-convention support. If any GC interation is required to handle the lifetime semantics correctly, we should take an approach more similar to the ComWrappers support (higher-level, less complex interop interface) than the Objective-C interop support (lower-level, basically only usable by the ObjCRuntime implementation). From 412c53496077353ebd552f12ebfd3db8edb16282 Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Tue, 5 Dec 2023 16:19:03 +0100 Subject: [PATCH 061/108] Update proposed/swift-interop.md Co-authored-by: Aaron Robinson --- proposed/swift-interop.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index c7e8b4b24..ba8347adf 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -122,7 +122,7 @@ At the lowest level of the calling convention, we do not consider Library Evolut Swift has a strongly-defined lifetime and ownership model. This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. When .NET calls into Swift, the .NET GC is responsible for managing all managed objects. Unmanaged objects from C# should either implement `IDisposable` or utilize a designated thin wrapper over the Swift memory allocator, currently accessible through the `NativeMemory` class, to explicitly release memory. It's important to ensure that when a Swift callee allocates an object that is returned to .NET, ownership of the memory is not retained after the call returns. In another scenario, when Swift calls into .NET, Swift should explicitly dereference all objects after the call to release memory. If a C# managed object is allocated and returned to Swift, the .NET GC will eventually collect it, but Swift will keep track using ARC, which represents an invalid case and should be handled by projection tools. When a C# unmanaged object is returned to Swift, it is treated as a regular Swift object through ARC, and Swift releases the memory by dereferencing it. -The Binding Tools for Swift tooling handles these explicit lifetime semantics with some generated Swift code. In the new Swift/.NET interop, management of these lifetime semantics will be done by the Swift projection and not by the raw calling-convention support. If any GC interation is required to handle the lifetime semantics correctly, we should take an approach more similar to the ComWrappers support (higher-level, less complex interop interface) than the Objective-C interop support (lower-level, basically only usable by the ObjCRuntime implementation). +The Binding Tools for Swift tooling handles these explicit lifetime semantics with generated Swift code. In the new Swift/.NET interop, management of these lifetime semantics will be done by the Swift projection layer and not by the raw calling-convention support. 
If any GC interaction is required to handle the lifetime semantics correctly, we should take an approach more similar to the `ComWrappers` support (higher-level, less complex interop interface) rather than the Objective-C interop support (lower-level, basically only usable by the ObjCRuntime implementation). ### Projecting Swift into .NET From 44283b9390a8a9c9bdeaa84aa9641dcfe6247ee4 Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Tue, 5 Dec 2023 16:30:15 +0100 Subject: [PATCH 062/108] Fix wording in a paragraph --- proposed/swift-interop.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index ba8347adf..f3cda17cb 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -120,7 +120,7 @@ At the lowest level of the calling convention, we do not consider Library Evolut ##### Automatic Reference Counting and Lifetime Management -Swift has a strongly-defined lifetime and ownership model. This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. When .NET calls into Swift, the .NET GC is responsible for managing all managed objects. Unmanaged objects from C# should either implement `IDisposable` or utilize a designated thin wrapper over the Swift memory allocator, currently accessible through the `NativeMemory` class, to explicitly release memory. It's important to ensure that when a Swift callee allocates an object that is returned to .NET, ownership of the memory is not retained after the call returns. In another scenario, when Swift calls into .NET, Swift should explicitly dereference all objects after the call to release memory. If a C# managed object is allocated and returned to Swift, the .NET GC will eventually collect it, but Swift will keep track using ARC, which represents an invalid case and should be handled by projection tools. When a C# unmanaged object is returned to Swift, it is treated as a regular Swift object through ARC, and Swift releases the memory by dereferencing it. +Swift has a strongly-defined lifetime and ownership model. This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. When .NET calls into Swift, the .NET GC is responsible for managing all managed objects. Unmanaged objects from C# should either implement `IDisposable` or utilize a designated thin wrapper over the Swift memory allocator, currently accessible through the `NativeMemory` class, to explicitly release memory. It's important to ensure that when a Swift callee function allocates an object that is returned to .NET, the memory is not dereferenced after the call returns. In another scenario, when Swift calls into .NET, a developer in Swift should explicitly dereference all objects after the call to release memory. If a C# managed object is allocated and returned to Swift, the .NET GC will eventually collect it, but Swift will keep track using ARC, which represents an invalid case and should be handled by projection tools. When a C# unmanaged object is returned to Swift, it is treated as a regular Swift object through ARC, and Swift releases the memory by dereferencing it. The Binding Tools for Swift tooling handles these explicit lifetime semantics with generated Swift code. In the new Swift/.NET interop, management of these lifetime semantics will be done by the Swift projection layer and not by the raw calling-convention support. 
If any GC interaction is required to handle the lifetime semantics correctly, we should take an approach more similar to the `ComWrappers` support (higher-level, less complex interop interface) rather than the Objective-C interop support (lower-level, basically only usable by the ObjCRuntime implementation). From 18674de57402a69c116c9c04dbda19305c9aa645 Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Tue, 5 Dec 2023 22:24:34 +0100 Subject: [PATCH 063/108] Resolve PR comments --- proposed/swift-interop.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index f3cda17cb..b9657ccec 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -120,7 +120,7 @@ At the lowest level of the calling convention, we do not consider Library Evolut ##### Automatic Reference Counting and Lifetime Management -Swift has a strongly-defined lifetime and ownership model. This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. When .NET calls into Swift, the .NET GC is responsible for managing all managed objects. Unmanaged objects from C# should either implement `IDisposable` or utilize a designated thin wrapper over the Swift memory allocator, currently accessible through the `NativeMemory` class, to explicitly release memory. It's important to ensure that when a Swift callee function allocates an object that is returned to .NET, the memory is not dereferenced after the call returns. In another scenario, when Swift calls into .NET, a developer in Swift should explicitly dereference all objects after the call to release memory. If a C# managed object is allocated and returned to Swift, the .NET GC will eventually collect it, but Swift will keep track using ARC, which represents an invalid case and should be handled by projection tools. When a C# unmanaged object is returned to Swift, it is treated as a regular Swift object through ARC, and Swift releases the memory by dereferencing it. +Swift has a strongly-defined lifetime and ownership model. This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. When .NET calls into Swift, the .NET GC is responsible for managing all managed objects. Unmanaged objects from C# should either implement `IDisposable` or utilize a designated thin wrapper over the Swift memory allocator, currently accessible through the `NativeMemory` class, to explicitly release memory. It's important to ensure that when a Swift callee function allocates an "unsafe" or "raw" pointer types, such as UnsafeMutablePointer and UnsafeRawPointer, where explicit control over memory is needed, and the pointer is returned to .NET, the memory is not dereferenced after the call returns. Also, if a C# managed object is allocated in a callee function and returned to Swift, the .NET GC will eventually collect it, but Swift will keep track using ARC, which represents an invalid case and should be handled by projection tools. The Binding Tools for Swift tooling handles these explicit lifetime semantics with generated Swift code. In the new Swift/.NET interop, management of these lifetime semantics will be done by the Swift projection layer and not by the raw calling-convention support. 
If any GC interaction is required to handle the lifetime semantics correctly, we should take an approach more similar to the `ComWrappers` support (higher-level, less complex interop interface) rather than the Objective-C interop support (lower-level, basically only usable by the ObjCRuntime implementation). From 4f9d4a2fe6c27e1c3d6732cbd7e560f48d4d456a Mon Sep 17 00:00:00 2001 From: Chet Husk Date: Wed, 6 Dec 2023 18:46:30 -0600 Subject: [PATCH 064/108] finish v1 of the doc --- proposed/sdk-analysis-level.md | 121 +++++++++++++++++++-------------- 1 file changed, 70 insertions(+), 51 deletions(-) diff --git a/proposed/sdk-analysis-level.md b/proposed/sdk-analysis-level.md index af00aabbe..2491cd0c5 100644 --- a/proposed/sdk-analysis-level.md +++ b/proposed/sdk-analysis-level.md @@ -42,7 +42,35 @@ I propose that we: In this scenario Jolene is a developer on a team that uses a CI/CD environment that is managed by an external team. The infrastructure team has decided that the version of the .NET SDK that will be preinstalled on the environment will be 8.0.200, but this version introduced a new -warning that is treated as an error by default. Jolene's team doesn't have time to fix the +warning that is treated as an error by default. At this point, Jolene has a few choices to make about how to resolve this error. + +* Resolve the new warning. This may take an indeterminate amount of time, and blocks the teams progress in the meantime. +* Add a `NoWarn` for the new warning(s) and continue working. This unblocks the team but can be a bit of a whack-a-mole situation, as new warnings will require new `NoWarn`s. +* Add a `SdkAnalysisLevel` with a value of the SDK she was previously using. This unblocks the team and informs tooling of the desired unified warnings experience that Jolene's team is coming from. + +```mermaid +graph TD + classDef good fill:green + classDef bad fill:red + start([New SDK Update]) + opt1{Resolve the warning} + opt2{Add NoWarn} + opt3{Add `SdkAnalysisLevel`} + result1[Unknown time, blocks team] + class result1 bad + result2[Unblocks team, not future-proof] + class result2 bad + result3[Unblocks team, future-proof] + class result3 good + start-->opt1 + start-->opt2 + start-->opt3 + opt1-->result1 + opt2-->result2 + opt3-->result3 +``` + +After assessing these choices Jolene's team doesn't have time to fix the diagnostic until next month, so for now she instructs the build to behave as if it were using the 8.0.100 SDK by setting the `SdkAnalysisLevel` property to `8.0.100` in a `Directory.Build.props` file in her repo root: @@ -72,35 +100,16 @@ this single property was able to control the behavior of the SDK, the compilers, have been onboarded to the new scheme, without having to look up, comment out, or add many `NoWarn` properties to their project files. 
- ## Requirements ### Goals +* Users have a unified way to manage diagnostics for all the tools in the SDK +* Tooling has a way to understand what compatibility level a user explicitly wants +* Users are able to upgrade across SDKs without being forced to immediately resolve warnings +* Tools bundled in the SDK adopt `SdkAnalysisLevel` as a guideline + +| Team | Representatives | +| ---- | --------------- | +| SDK | @marcpopmsft @dsplaisted | +| MSBuild | @rainersigwald @ladipro | +| NuGet | @aortiz-msft @nkolev92 | +| Roslyn | @jaredpar @CyrusNajmabadi | +| F# | @vzarytovskii @KevinRansom | ## Design - +### Valid values of `SdkAnalysisLevel` + +The implementation of `SdkAnalysisLevel` is itself quite straightforward - the default value of `SdkAnalysisLevel` is always +the stable 'SDK Feature Band' of the SDK that is currently being run, and the only valid values for the property are other SDK Feature Bands. +An SDK Feature Band is a Semantic Version of the form `..`, where the `` version is a multiple of 100. Some +valid feature bands for this discussion might be 6.0.100, 7.0.300, or 8.0.200. + +### Where to define `SdkAnalysisLevel` + +The more interesting question is where such a value might be set. Users typically set values for properties in one of three ways: + +* as a MSBuild global property via an environment variable or `-p` option of MSBuild/.NET CLI +* as a MSBuild local property in a Project file (or MSBuild logic Imported by a project file) +* as a MSBuild local property in a Directory.Build.props file (this is broadly the same as option 2 but happens implicitly) + +If we would like users to be able to set this new property in any of these, the SDK cannot set a default value until after the Project file has been evaluated. +If we want all of the build logic in `.targets` files to be able to consume or derive from the value then we must set the value as early as possible. +These two constraints point to setting the property as early as possible during the `.targets` evaluation of the SDK - for this reason +we should calculate the new property [at the beginning of the base SDK targets](https://github.com/dotnet/sdk/blob/558ea28cd054702d01aac87e547d51be4656d3e5/src/Tasks/Microsoft.NET.Build.Tasks/targets/Microsoft.NET.Sdk.targets#L11). + +### Relation to existing WarningLevel and AnalysisLevel properties + +These properties cover a lot of the same ground, but are defined/imported [too late in project evalation](https://github.com/dotnet/sdk/blob/558ea28cd054702d01aac87e547d51be4656d3e5/src/Tasks/Microsoft.NET.Build.Tasks/targets/Microsoft.NET.Sdk.targets#L1315) to be usable by the rest of the SDK. In addition, +we cannot safely move their evaluation earlier in the overall process due to compatibility concerns. Since `SdkAnalysisLevel` will be defined earlier, +we should define how it impacts these two. Curently WarningLevel and AnalysisLevel are [hardcoded](https://github.com/dotnet/sdk/blob/558ea28cd054702d01aac87e547d51be4656d3e5/src/Tasks/Microsoft.NET.Build.Tasks/targets/Microsoft.NET.Sdk.Analyzers.targets#L25C6-L25C26) and must be updated with each new SDK. We will change this to infer the AnalysisLevel based on the value of `SdkAnalysisLevel` - the `` portion of the `SdkAnalysisLevel` will become the value of `AnalysisLevel`, which itself influences the default for `WarningLevel`. 
## Q & A From 5a48262c40e4966b0fa36aa05c3443f10b113293 Mon Sep 17 00:00:00 2001 From: Chet Husk Date: Wed, 6 Dec 2023 18:52:35 -0600 Subject: [PATCH 065/108] Can't forget my fellow PMs --- proposed/sdk-analysis-level.md | 1 + 1 file changed, 1 insertion(+) diff --git a/proposed/sdk-analysis-level.md b/proposed/sdk-analysis-level.md index 2491cd0c5..82d6ec2d5 100644 --- a/proposed/sdk-analysis-level.md +++ b/proposed/sdk-analysis-level.md @@ -144,6 +144,7 @@ are brought that you need to scope out. | NuGet | @aortiz-msft @nkolev92 | | Roslyn | @jaredpar @CyrusNajmabadi | | F# | @vzarytovskii @KevinRansom | +| PM | @KathleenDollard @MadsTorgersen | ## Design From 23111d8c3cae69187a0939d79ed6f5edeb0b8f81 Mon Sep 17 00:00:00 2001 From: Chet Husk Date: Wed, 13 Dec 2023 10:19:03 -0600 Subject: [PATCH 066/108] clarify ordering --- proposed/sdk-analysis-level.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/proposed/sdk-analysis-level.md b/proposed/sdk-analysis-level.md index 82d6ec2d5..a60d80241 100644 --- a/proposed/sdk-analysis-level.md +++ b/proposed/sdk-analysis-level.md @@ -163,11 +163,13 @@ The more interesting question is where such a value might be set. Users typicall * as a MSBuild local property in a Project file (or MSBuild logic Imported by a project file) * as a MSBuild local property in a Directory.Build.props file (this is broadly the same as option 2 but happens implicitly) -If we would like users to be able to set this new property in any of these, the SDK cannot set a default value until after the Project file has been evaluated. +If we would like users to be able to set this new property in any of these, the SDK cannot determine the 'final' value of `SdkAnalysisLevel` until after the Project file has been evaluated. If we want all of the build logic in `.targets` files to be able to consume or derive from the value then we must set the value as early as possible. These two constraints point to setting the property as early as possible during the `.targets` evaluation of the SDK - for this reason we should calculate the new property [at the beginning of the base SDK targets](https://github.com/dotnet/sdk/blob/558ea28cd054702d01aac87e547d51be4656d3e5/src/Tasks/Microsoft.NET.Build.Tasks/targets/Microsoft.NET.Sdk.targets#L11). +While an earlier location could be chosen (e.g. Microsoft.NET.Sdk.props) this would open up a layering/timing issue where the value of `SdkAnalysisLevel` could be set to a default, then computed by some other MSBuild props before the users' input had been evaluated. This would be confusing and error-prone for users, and would require a lot of extra work to ensure that the value was always correct. For this reason we choose to set the value in the base SDK targets as described above. + ### Relation to existing WarningLevel and AnalysisLevel properties These properties cover a lot of the same ground, but are defined/imported [too late in project evalation](https://github.com/dotnet/sdk/blob/558ea28cd054702d01aac87e547d51be4656d3e5/src/Tasks/Microsoft.NET.Build.Tasks/targets/Microsoft.NET.Sdk.targets#L1315) to be usable by the rest of the SDK. 
In addition, From 5d9f282a6edecabad2c17970098706d546c5da9d Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Thu, 21 Dec 2023 14:19:09 +0100 Subject: [PATCH 067/108] Specify delegates with CallConvSwift and UnmanagedCallersOnly calling convention --- proposed/swift-interop.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index b9657ccec..e0f764cdc 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -57,7 +57,7 @@ We have selected the following option for supporting the Self register in the ca 1. Use specially-typed argument with a type, `SwiftSelf`, to represent which parameter should go into the self register. -We will provide a `SwiftSelf` type to specify "this argument goes in the self register". Specifying the type twice in a signature would generate an `InvalidProgramException`. This would allow the `self` argument to be specified anywhere in the argument list. Alternatively, many sections of the Swift ABI documentation, as well as the Clang docs refer to this parameter as the "context" argument instead of the "self" argument, so an alternative name could be `SwiftContext`. +We will provide a `SwiftSelf` type to specify "this argument goes in the self register". Specifying the type twice in a signature would generate an `InvalidProgramException`. This would allow the `self` argument to be specified anywhere in the argument list. Alternatively, many sections of the Swift ABI documentation, as well as the Clang docs refer to this parameter as the "context" argument instead of the "self" argument, so an alternative name could be `SwiftContext`. The `UnmanagedCallersOnly` leverages the existing proposal and treats delegates annotated with `UnmanagedCallersOnly` as Swift-like functions capable of handling context registers. In the prolog of the delegate, the self register value will be loaded into the `SwiftSelf` argument. For reference, explicitly declaring a function with the Swift or SwiftAsync calling conventions in Clang requires the "context" argument, the value that goes in the "self" register, as the last parameter or the penultimate parameter followed by the error parameter. @@ -80,7 +80,7 @@ We have selected an approach for handling the error register in the Swift callin 1. Use a special type named something like `SwiftError*` to indicate the error parameter -This approach expresses that the to-be-called Swift function uses the Error Register in the signature and they both require signature manipulation in the JIT/AOT compilers. Like with `SwiftSelf`, we would throw an `InvalidProgramException` for a signature with multiple `SwiftError` parameters. We use a pointer-to-`SwiftError` type to indicate that the error register is a by-ref/out parameter. We don't use managed pointers as our modern JITs can reason about unmanaged pointers well enough that we do not end up losing any performance taking this route. The `UnmanagedCallersOnly` implementation will require a decent amount of JIT work to emulate a local variable for the register value, but we have prior art in the Clang implementation of the Swift error register that we can fall back on. +This approach expresses that the to-be-called Swift function uses the Error Register in the signature and they both require signature manipulation in the JIT/AOT compilers. Like with `SwiftSelf`, we would throw an `InvalidProgramException` for a signature with multiple `SwiftError` parameters. 
We use a pointer-to-`SwiftError` type to indicate that the error register is a by-ref/out parameter. We don't use managed pointers as our modern JITs can reason about unmanaged pointers well enough that we do not end up losing any performance taking this route. The `UnmanagedCallersOnly` implementation will require a decent amount of JIT work to emulate a local variable for the register value, but we have prior art in the Clang implementation of the Swift error register that we can fall back on. Delegates annotated with `UnmanagedCallersOnly` will store the `SwiftError*` value in the error register in their epilog. Additionally, we have selected this design as this provides consistency with the self register and async context register handling, discussed below. From 62087cf6b2076dfb48f118706802e8b711499e2b Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Thu, 21 Dec 2023 18:30:13 +0100 Subject: [PATCH 068/108] Update proposed/swift-interop.md Co-authored-by: Jan Kotas --- proposed/swift-interop.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index e0f764cdc..ab16bf9ee 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -80,7 +80,7 @@ We have selected an approach for handling the error register in the Swift callin 1. Use a special type named something like `SwiftError*` to indicate the error parameter -This approach expresses that the to-be-called Swift function uses the Error Register in the signature and they both require signature manipulation in the JIT/AOT compilers. Like with `SwiftSelf`, we would throw an `InvalidProgramException` for a signature with multiple `SwiftError` parameters. We use a pointer-to-`SwiftError` type to indicate that the error register is a by-ref/out parameter. We don't use managed pointers as our modern JITs can reason about unmanaged pointers well enough that we do not end up losing any performance taking this route. The `UnmanagedCallersOnly` implementation will require a decent amount of JIT work to emulate a local variable for the register value, but we have prior art in the Clang implementation of the Swift error register that we can fall back on. Delegates annotated with `UnmanagedCallersOnly` will store the `SwiftError*` value in the error register in their epilog. +This approach expresses that the to-be-called Swift function uses the Error Register in the signature and they both require signature manipulation in the JIT/AOT compilers. Like with `SwiftSelf`, we would throw an `InvalidProgramException` for a signature with multiple `SwiftError` parameters. We use a pointer-to-`SwiftError` type to indicate that the error register is a by-ref/out parameter. We don't use managed pointers as our modern JITs can reason about unmanaged pointers well enough that we do not end up losing any performance taking this route. The `UnmanagedCallersOnly` implementation will require a decent amount of JIT work to emulate a local variable for the register value, but we have prior art in the Clang implementation of the Swift error register that we can fall back on. Methods annotated with `UnmanagedCallersOnly` will pass the `SwiftError*` value in the error register. Additionally, we have selected this design as this provides consistency with the self register and async context register handling, discussed below. 
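
As a rough illustration of how a caller might consume the error register under this proposal, consider the sketch below. The `SwiftError` shape shown here, the library name, and the exported function are all assumptions made for the example, not a settled API.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Placeholder for the proposed SwiftError type; the real shape is still being designed.
internal unsafe struct SwiftError
{
    public void* Value;
}

internal static unsafe class SwiftErrorExample
{
    // Hypothetical Swift export; both the library and entry point names are made up.
    [DllImport("libSampleSwiftLibrary.dylib", EntryPoint = "sample_parse_date")]
    [UnmanagedCallConv(CallConvs = new[] { typeof(CallConvSwift) })]
    private static extern double ParseDate(double input, SwiftError* error);

    public static double ParseDateOrThrow(double input)
    {
        SwiftError error = default;
        double result = ParseDate(input, &error);

        if (error.Value != null)
        {
            // A projection layer would translate the Swift error object into a .NET exception
            // (and release the Swift-side error) instead of throwing a generic one.
            throw new InvalidOperationException("The Swift call reported an error.");
        }

        return result;
    }
}
```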
From 77597bb03cb6919169a1fb60755ab1e76cee5765 Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Thu, 21 Dec 2023 18:39:19 +0100 Subject: [PATCH 069/108] Update self register handling in function pointers --- proposed/swift-interop.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index ab16bf9ee..a82a7f90b 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -57,7 +57,7 @@ We have selected the following option for supporting the Self register in the ca 1. Use specially-typed argument with a type, `SwiftSelf`, to represent which parameter should go into the self register. -We will provide a `SwiftSelf` type to specify "this argument goes in the self register". Specifying the type twice in a signature would generate an `InvalidProgramException`. This would allow the `self` argument to be specified anywhere in the argument list. Alternatively, many sections of the Swift ABI documentation, as well as the Clang docs refer to this parameter as the "context" argument instead of the "self" argument, so an alternative name could be `SwiftContext`. The `UnmanagedCallersOnly` leverages the existing proposal and treats delegates annotated with `UnmanagedCallersOnly` as Swift-like functions capable of handling context registers. In the prolog of the delegate, the self register value will be loaded into the `SwiftSelf` argument. +We will provide a `SwiftSelf` type to specify "this argument goes in the self register". Specifying the type twice in a signature would generate an `InvalidProgramException`. This would allow the `self` argument to be specified anywhere in the argument list. Alternatively, many sections of the Swift ABI documentation, as well as the Clang docs refer to this parameter as the "context" argument instead of the "self" argument, so an alternative name could be `SwiftContext`. Function pointers annotated with `UnmanagedCallersOnly` will be treated as Swift-like functions capable of handling context registers. The value stored in the self register will be loaded into the `SwiftSelf` argument. For reference, explicitly declaring a function with the Swift or SwiftAsync calling conventions in Clang requires the "context" argument, the value that goes in the "self" register, as the last parameter or the penultimate parameter followed by the error parameter. From e041ba1a7f868df9a789cab3e30a5e44edcdaa81 Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Thu, 21 Dec 2023 19:00:21 +0100 Subject: [PATCH 070/108] Update proposed/swift-interop.md Co-authored-by: Jan Kotas --- proposed/swift-interop.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index a82a7f90b..1f84e0d27 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -57,7 +57,7 @@ We have selected the following option for supporting the Self register in the ca 1. Use specially-typed argument with a type, `SwiftSelf`, to represent which parameter should go into the self register. -We will provide a `SwiftSelf` type to specify "this argument goes in the self register". Specifying the type twice in a signature would generate an `InvalidProgramException`. This would allow the `self` argument to be specified anywhere in the argument list. Alternatively, many sections of the Swift ABI documentation, as well as the Clang docs refer to this parameter as the "context" argument instead of the "self" argument, so an alternative name could be `SwiftContext`. 
Function pointers annotated with `UnmanagedCallersOnly` will be treated as Swift-like functions capable of handling context registers. The value stored in the self register will be loaded into the `SwiftSelf` argument. +We will provide a `SwiftSelf` type to specify "this argument goes in the self register". Specifying the type twice in a signature would generate an `InvalidProgramException`. This would allow the `self` argument to be specified anywhere in the argument list. Alternatively, many sections of the Swift ABI documentation, as well as the Clang docs refer to this parameter as the "context" argument instead of the "self" argument, so an alternative name could be `SwiftContext`. Method annotated with `UnmanagedCallersOnly` will be treated as Swift-like functions capable of handling context registers. The value stored in the self register will be loaded into the `SwiftSelf` argument. For reference, explicitly declaring a function with the Swift or SwiftAsync calling conventions in Clang requires the "context" argument, the value that goes in the "self" register, as the last parameter or the penultimate parameter followed by the error parameter. From e0d11ae08f28564635b81e2257838d50f2da6142 Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Thu, 21 Dec 2023 19:19:49 +0100 Subject: [PATCH 071/108] Add sentence for CallConvSwift and its application in various contexts --- proposed/swift-interop.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index 1f84e0d27..23a896a91 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -43,7 +43,7 @@ We plan to split the work into at least three separate components. There will be #### Calling Conventions -Swift has two calling conventions at its core, the "Swift" calling convention, and the "SwiftAsync" calling convention. We will begin by focusing on the "Swift" calling convention, as it is the most common. The Swift calling convention has a few elements that we need to contend with. Primarily, it allows passing arguments in more registers. Additionally, it has two dedicated registers for specific components of the calling convention, the self register and the error register. The runtime support for this calling convention must support all of these features. The additional registers per argument is relatively easy to support, as each of our compilers must already have some support for this concept to run on Linux or Apple platforms today. We have a few options for the remaining two features we need to support. +Swift has two calling conventions at its core, the "Swift" calling convention, and the "SwiftAsync" calling convention. We will begin by focusing on the "Swift" calling convention, as it is the most common. The Swift calling convention has a few elements that we need to contend with. Primarily, it allows passing arguments in more registers. Additionally, it has two dedicated registers for specific components of the calling convention, the self register and the error register. The runtime support for this calling convention must support all of these features. The additional registers per argument is relatively easy to support, as each of our compilers must already have some support for this concept to run on Linux or Apple platforms today. We have a few options for the remaining two features we need to support. The `CallConvSwift` modifier can be applied in all contexts where other existing calling convention modifiers can be used. 
This means it can be used with UnmanagedCallConvAttribute for P/Invoke (`[UnmanagedCallConv(CallConvs = new Type[] { typeof(CallConvSwift) })]`), UnmanagedCallersOnlyAttribute.CallConv for reverse P/Invoke (`[UnmanagedCallersOnly(CallConvs = new Type[] { typeof(CallConvSwift) })]`), and with unmanaged function pointers (`delegate* unmanaged[Swift]<>`). The "SwiftAsync" calling convention has an additional "async context" register. When a "SwiftAsync" function is called by a "SwiftAsync" function, it must be tail-called. A function that uses this calling convention also pops the argument area. In LLVM-IR, this calling convention is referred to as `swifttailcc`. In Clang, this convention is specified as `swiftasynccall`. Additionally, the "SwiftAsync" calling convention does not have the error register and must not have a return value. See the [LLVM-IR language reference](https://github.com/llvm/llvm-project/blob/54fe7ef70069a48c252a7e1b0c6ed8efda0bc440/llvm/docs/LangRef.rst#L452) and the [Clang attribute reference](https://clang.llvm.org/docs/AttributeReference.html#swiftasynccall) for an explaination of the calling convention. From 812056e30c0b912ad56936ec14b923083a83e5a0 Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Thu, 21 Dec 2023 20:01:45 +0100 Subject: [PATCH 072/108] Update proposed/swift-interop.md Co-authored-by: Jan Kotas --- proposed/swift-interop.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index 23a896a91..86fc7de8e 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -57,7 +57,7 @@ We have selected the following option for supporting the Self register in the ca 1. Use specially-typed argument with a type, `SwiftSelf`, to represent which parameter should go into the self register. -We will provide a `SwiftSelf` type to specify "this argument goes in the self register". Specifying the type twice in a signature would generate an `InvalidProgramException`. This would allow the `self` argument to be specified anywhere in the argument list. Alternatively, many sections of the Swift ABI documentation, as well as the Clang docs refer to this parameter as the "context" argument instead of the "self" argument, so an alternative name could be `SwiftContext`. Method annotated with `UnmanagedCallersOnly` will be treated as Swift-like functions capable of handling context registers. The value stored in the self register will be loaded into the `SwiftSelf` argument. +We will provide a `SwiftSelf` type to specify "this argument goes in the self register". Specifying the type twice in a signature would generate an `InvalidProgramException`. This would allow the `self` argument to be specified anywhere in the argument list. Alternatively, many sections of the Swift ABI documentation, as well as the Clang docs refer to this parameter as the "context" argument instead of the "self" argument, so an alternative name could be `SwiftContext`. Methods annotated with `UnmanagedCallersOnly` will be treated as Swift-like functions capable of handling context registers. The value stored in the self register will be loaded into the `SwiftSelf` argument. For reference, explicitly declaring a function with the Swift or SwiftAsync calling conventions in Clang requires the "context" argument, the value that goes in the "self" register, as the last parameter or the penultimate parameter followed by the error parameter. 
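
To make those usage sites concrete, a sketch is shown below. The `SwiftSelf` placeholder, the library and entry point names, and the exact signatures are assumptions of the sketch; only the placement of the `CallConvSwift` modifier is the point being illustrated.

```csharp
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Placeholder for the proposed SwiftSelf type; its final shape is not specified here.
internal unsafe struct SwiftSelf
{
    public void* Value;
}

internal static unsafe class CallConvSwiftUsageExample
{
    // 1) P/Invoke into a Swift function (hypothetical library and entry point names).
    [DllImport("libSampleSwiftLibrary.dylib", EntryPoint = "sample_get_count")]
    [UnmanagedCallConv(CallConvs = new[] { typeof(CallConvSwift) })]
    public static extern nint GetCount(SwiftSelf self);

    // 2) Reverse P/Invoke: a managed callback that Swift invokes with the Swift convention.
    [UnmanagedCallersOnly(CallConvs = new[] { typeof(CallConvSwift) })]
    public static nint OnCountRequested(SwiftSelf self) => 42;

    // 3) An unmanaged function pointer typed with the Swift calling convention.
    public static void Register()
    {
        delegate* unmanaged[Swift]<SwiftSelf, nint> callback = &OnCountRequested;
        // Passing 'callback' to Swift is omitted from this sketch.
    }
}
```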
From db1c74b402b7bc322cf2b35d8522b76918bd9073 Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Wed, 3 Jan 2024 09:31:07 +0100 Subject: [PATCH 073/108] Update proposed/swift-interop.md Co-authored-by: Aaron Robinson --- proposed/swift-interop.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index 86fc7de8e..cc4ae353b 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -43,7 +43,7 @@ We plan to split the work into at least three separate components. There will be #### Calling Conventions -Swift has two calling conventions at its core, the "Swift" calling convention, and the "SwiftAsync" calling convention. We will begin by focusing on the "Swift" calling convention, as it is the most common. The Swift calling convention has a few elements that we need to contend with. Primarily, it allows passing arguments in more registers. Additionally, it has two dedicated registers for specific components of the calling convention, the self register and the error register. The runtime support for this calling convention must support all of these features. The additional registers per argument is relatively easy to support, as each of our compilers must already have some support for this concept to run on Linux or Apple platforms today. We have a few options for the remaining two features we need to support. The `CallConvSwift` modifier can be applied in all contexts where other existing calling convention modifiers can be used. This means it can be used with UnmanagedCallConvAttribute for P/Invoke (`[UnmanagedCallConv(CallConvs = new Type[] { typeof(CallConvSwift) })]`), UnmanagedCallersOnlyAttribute.CallConv for reverse P/Invoke (`[UnmanagedCallersOnly(CallConvs = new Type[] { typeof(CallConvSwift) })]`), and with unmanaged function pointers (`delegate* unmanaged[Swift]<>`). +Swift has two calling conventions at its core, the "Swift" calling convention, and the "SwiftAsync" calling convention. We will begin by focusing on the "Swift" calling convention, as it is the most common. The Swift calling convention has a few elements that we need to contend with. Primarily, it allows passing arguments in more registers. Additionally, it has two dedicated registers for specific components of the calling convention, the self register and the error register. The runtime support for this calling convention must support all of these features. The additional registers per argument is relatively easy to support, as each of our compilers must already have some support for this concept to run on Linux or Apple platforms today. We have a few options for the remaining two features we need to support. The `CallConvSwift` modifier can be applied in all contexts where other existing calling convention modifiers can be used. This means it can be used with [`UnmanagedCallConvAttribute`](https://learn.microsoft.com/dotnet/api/system.runtime.interopservices.unmanagedcallconvattribute) for P/Invoke (`[UnmanagedCallConv(CallConvs = new Type[] { typeof(CallConvSwift) })]`), [`UnmanagedCallersOnlyAttribute.CallConv`](https://learn.microsoft.com/dotnet/api/system.runtime.interopservices.unmanagedcallersonlyattribute.callconvs) for reverse P/Invoke (`[UnmanagedCallersOnly(CallConvs = new Type[] { typeof(CallConvSwift) })]`), and with unmanaged function pointers (`delegate* unmanaged[Swift]<>`). The "SwiftAsync" calling convention has an additional "async context" register. 
When a "SwiftAsync" function is called by a "SwiftAsync" function, it must be tail-called. A function that uses this calling convention also pops the argument area. In LLVM-IR, this calling convention is referred to as `swifttailcc`. In Clang, this convention is specified as `swiftasynccall`. Additionally, the "SwiftAsync" calling convention does not have the error register and must not have a return value. See the [LLVM-IR language reference](https://github.com/llvm/llvm-project/blob/54fe7ef70069a48c252a7e1b0c6ed8efda0bc440/llvm/docs/LangRef.rst#L452) and the [Clang attribute reference](https://clang.llvm.org/docs/AttributeReference.html#swiftasynccall) for an explaination of the calling convention. From 9ab1bc6ce59b75760fb9e377ff378b912a9ca29c Mon Sep 17 00:00:00 2001 From: Milos Kotlar Date: Wed, 3 Jan 2024 09:31:45 +0100 Subject: [PATCH 074/108] Update proposed/swift-interop.md Co-authored-by: Aaron Robinson --- proposed/swift-interop.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index cc4ae353b..1cd87403e 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -57,7 +57,7 @@ We have selected the following option for supporting the Self register in the ca 1. Use specially-typed argument with a type, `SwiftSelf`, to represent which parameter should go into the self register. -We will provide a `SwiftSelf` type to specify "this argument goes in the self register". Specifying the type twice in a signature would generate an `InvalidProgramException`. This would allow the `self` argument to be specified anywhere in the argument list. Alternatively, many sections of the Swift ABI documentation, as well as the Clang docs refer to this parameter as the "context" argument instead of the "self" argument, so an alternative name could be `SwiftContext`. Methods annotated with `UnmanagedCallersOnly` will be treated as Swift-like functions capable of handling context registers. The value stored in the self register will be loaded into the `SwiftSelf` argument. +We will provide a `SwiftSelf` type to specify "this argument goes in the self register". Specifying the type twice in a signature would generate an `InvalidProgramException`. This would allow the `self` argument to be specified anywhere in the argument list. Alternatively, many sections of the Swift ABI documentation, as well as the Clang docs refer to this parameter as the "context" argument instead of the "self" argument, so an alternative name could be `SwiftContext`. Methods annotated with `UnmanagedCallersOnly` will be capable of handling the self register; the value stored in the self register will be loaded into the `SwiftSelf` argument. For reference, explicitly declaring a function with the Swift or SwiftAsync calling conventions in Clang requires the "context" argument, the value that goes in the "self" register, as the last parameter or the penultimate parameter followed by the error parameter. 
From 6a42f0e4fdadfa63893f9dc096ff06e32316af54 Mon Sep 17 00:00:00 2001 From: Jeremy Koritzinsky Date: Mon, 8 Jan 2024 08:08:24 -0800 Subject: [PATCH 075/108] Handle part of Swift frozen struct ABI in the projection layers (#310) --- proposed/swift-interop.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index b9657ccec..1c5ddf05c 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -118,6 +118,8 @@ When calling a function that returns an opaque struct, the Swift ABI always requ At the lowest level of the calling convention, we do not consider Library Evolution to be a different calling convention than the Swift calling convention. Library Evolution requires that some types are passed by a pointer/reference, but it does not fundamentally change the calling convention. Effectively, Library Evolution forces the least optimizable choice to be taken at every possible point. As a result, we should not handle Library Evolution as a separate calling convention and instead we can manually handle it at the projection layer. +For frozen structs and enums, Swift has a complicated lowering process where the struct or enum type's layout are recursively flattened to a sequence of primitives. If this sequence is length 4 or less, the values of this type are split into the elements of this sequence for parameter passing instead of passing the struct as a whole. Structs and enums that cannot be broken down in this way are passed by-reference to their specified frozen layout. Due to high implementation cost in the RyuJIT, in particular in the `UnmanagedCallersOnly` scenario, we should implement this first pass of lowering in the projection layer; the only types allowed for `CallConvSwift` calling convention in method or function pointer signatures are primitives, our special Swift register types, and pointer types. For reference, this lowering pass is done in the Swift compiler when lowering from Swift IL to LLVM IR. This design decision reinforces our direction of having the Runtime layer of Swift interop support similar features as the LLVM IR representation of Swift. + ##### Automatic Reference Counting and Lifetime Management Swift has a strongly-defined lifetime and ownership model. This model is specified in the Swift ABI and is similar to Objective-C's ARC (Automatic Reference Counting) system. When .NET calls into Swift, the .NET GC is responsible for managing all managed objects. Unmanaged objects from C# should either implement `IDisposable` or utilize a designated thin wrapper over the Swift memory allocator, currently accessible through the `NativeMemory` class, to explicitly release memory. It's important to ensure that when a Swift callee function allocates an "unsafe" or "raw" pointer types, such as UnsafeMutablePointer and UnsafeRawPointer, where explicit control over memory is needed, and the pointer is returned to .NET, the memory is not dereferenced after the call returns. Also, if a C# managed object is allocated in a callee function and returned to Swift, the .NET GC will eventually collect it, but Swift will keep track using ARC, which represents an invalid case and should be handled by projection tools. 
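A minimal sketch of the deterministic-release pattern described above: a projected wrapper that owns an unsafe pointer and frees it in `Dispose`. It assumes the buffer can be released through `NativeMemory`; a real projection would call the matching Swift deallocation routine when that assumption does not hold.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed unsafe class SwiftBuffer : IDisposable
{
    private void* _pointer;

    public SwiftBuffer(nuint byteCount) => _pointer = NativeMemory.Alloc(byteCount);

    public void* Pointer
    {
        get
        {
            if (_pointer == null)
                throw new ObjectDisposedException(nameof(SwiftBuffer));
            return _pointer;
        }
    }

    public void Dispose()
    {
        if (_pointer != null)
        {
            // Release the unmanaged memory exactly once, never relying on the GC.
            NativeMemory.Free(_pointer);
            _pointer = null;
        }
    }
}
```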
From 1b507bda1cff8f86cfe44af3842a54911205ac3d Mon Sep 17 00:00:00 2001 From: Jeremy Koritzinsky Date: Mon, 29 Jan 2024 13:42:03 -0800 Subject: [PATCH 076/108] Add sections talking about Swift SIMD types and interop with them (#313) --- proposed/swift-interop.md | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/proposed/swift-interop.md b/proposed/swift-interop.md index 4879c018b..29b00ee97 100644 --- a/proposed/swift-interop.md +++ b/proposed/swift-interop.md @@ -118,7 +118,15 @@ When calling a function that returns an opaque struct, the Swift ABI always requ At the lowest level of the calling convention, we do not consider Library Evolution to be a different calling convention than the Swift calling convention. Library Evolution requires that some types are passed by a pointer/reference, but it does not fundamentally change the calling convention. Effectively, Library Evolution forces the least optimizable choice to be taken at every possible point. As a result, we should not handle Library Evolution as a separate calling convention and instead we can manually handle it at the projection layer. -For frozen structs and enums, Swift has a complicated lowering process where the struct or enum type's layout are recursively flattened to a sequence of primitives. If this sequence is length 4 or less, the values of this type are split into the elements of this sequence for parameter passing instead of passing the struct as a whole. Structs and enums that cannot be broken down in this way are passed by-reference to their specified frozen layout. Due to high implementation cost in the RyuJIT, in particular in the `UnmanagedCallersOnly` scenario, we should implement this first pass of lowering in the projection layer; the only types allowed for `CallConvSwift` calling convention in method or function pointer signatures are primitives, our special Swift register types, and pointer types. For reference, this lowering pass is done in the Swift compiler when lowering from Swift IL to LLVM IR. This design decision reinforces our direction of having the Runtime layer of Swift interop support similar features as the LLVM IR representation of Swift. +For frozen structs and enums, Swift has a complicated lowering process where the struct or enum type's layout are recursively flattened to a sequence of primitives. If this sequence is length 4 or less, the values of this type are split into the elements of this sequence for parameter passing instead of passing the struct as a whole. Structs and enums that cannot be broken down in this way are passed by-reference to their specified frozen layout. When a frozen struct or enum with a primitive sequence of 4 elements or less is returned from a function, it is returned as if it were a structure of the elements of the primitive sequence. We will implement this pass in the VM/JIT layer. Direct users of the `CallConvSwift` calling convention will be allowed to specify blittable struct types, which the runtime and JIT/AOT compilers will inspect and classify to create identical behavior as the Swift compiler's primitive sequence lowering combined with LLVM calling convention register allocation. + +The projection tooling will provide blittable representations of any projected frozen struct or enum types, at minimum for any of these types that are passed by value to any Swift APIs, to support the above ABI handling. 
These blittable types are not required to be lowered to primitive types; they may have fields of other blittable representations of other Swift types or other blittable structs. These blittable representations will be the "struct or enum type layouts" mentioned in the paragraph above. + +##### SIMD Types + +We will pass the `System.Runtime.Intrinsics.VectorX` types in SIMD registers in the same way the default unmanaged calling convention specifies. We will treat the `Vector2/3/4` types as non-SIMD types as we will not project any Swift SIMD types to these types (see the SIMD Types section in the projection design). + +CoreCLR and NativeAOT currently block the `VectorX` types from P/Invokes as this behavior is currently not well-supported by RyuJIT. As the target libraries for .NET 9 do not use the Swift `SIMDX` types, we can defer SIMD support until a future release of Swift interop. Unlike the existing P/Invoke blocking behavior however, we should block these types in the JIT if possible as the existing `VectorX` blocking for P/Invokes is not robust. ##### Automatic Reference Counting and Lifetime Management @@ -150,6 +158,17 @@ We plan to interop with Swift's Library Evolution mode, which brings an addition If possible, Swift tuples should be represented as `ValueTuple`s in .NET. If this is not possible, then they should be represented as types with a `Deconstruct` method similar to `ValueTuple` to allow a tuple-like experience in C#. +##### SIMD types + +Swift has its own built-in SIMD types; however they're named based on the number of elements, not based on the width of the vector type. For example, Swift has `SIMD2`, `SIMD4`, up to `SIMD64`. When the instantiations of these types correspond to an intrinsic vector type, they are treated as that type. Otherwise, they are treated as a struct of vectors. In .NET, our vector types are named based on their vector with, so `Vector128`, `Vector256`, etc. + +For instantiated generic types that are within the size of an processor intrinsic vector type, there exists a correspondence between a Swift SIMD type and a .NET SIMD type. For example, `SIMD4` corresponds to `Vector128`. +However, this correspondence breaks down for SIMD types larger than the largest vector register width (i.e. larger than 512 bytes) or for unconstrained generic types like `SIMD4`. These cases; however, should be quite rare. In the "too-large" case, the values are passed into Swift as though the type is a struct of vectors. In the case of unconstrained generic types, the SIMD values are passed indirectly. Both of these cases are suboptimal and we don't know of any public Swift APIs that fall into either of these scenarios. + +We recommend that the projection tooling will map each Swift SIMD instantiation to the corresponding `VectorX` type in .NET. For cases where there is no corresponding type or where an API takes or returns an unconstrained generic `SIMDX` value, we can map the APIs to regular projected structs for the SIMD types based on the above rules. + +As mentioned in the calling-convention section above, none of the libraries we are targetting in .NET 9 expose APIs that take the SIMD types. As a result, we can defer this support until the future and block projecting of SIMD types in the meantime. 
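To illustrate the frozen struct handling described earlier in this section, the sketch below shows a blittable representation a projection might emit for a hypothetical Swift frozen struct, together with a hypothetical `CallConvSwift` P/Invoke that passes it by value; none of the names refer to a real library.

```csharp
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Blittable representation for a hypothetical Swift frozen struct
// `struct Interval { var start: Double; var length: Double }`. Its flattened
// primitive sequence (two doubles) has four or fewer elements, so under the
// scheme above it is passed as those elements rather than by reference.
[StructLayout(LayoutKind.Sequential)]
public struct Interval
{
    public double Start;
    public double Length;
}

public static class FrozenStructSketch
{
    // Hypothetical projected entry point; the runtime classifies the blittable
    // struct and lowers it to match the Swift compiler's own lowering.
    [DllImport("libExample.dylib", EntryPoint = "interval_midpoint")]
    [UnmanagedCallConv(CallConvs = new[] { typeof(CallConvSwift) })]
    public static extern double IntervalMidpoint(Interval interval);
}
```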
+ #### Projection Tooling Components The projection tooling should be split into these components: From 9a8cafb54f68fb738a8f1bc58ba9bd35bd008d05 Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Wed, 8 Nov 2023 11:50:17 -0800 Subject: [PATCH 077/108] Supporting local SDK deployment in global.json This proposal extends global.json such that it supports locally deployed instances of the .NET SDK. This will significantly reduce friction our local deployment has with developer tools like Visual Studio, VS Code and make our own developer story much simpler. --- proposed/local-sdk-global-json.md | 83 +++++++++++++++++++++++++++++++ 1 file changed, 83 insertions(+) create mode 100644 proposed/local-sdk-global-json.md diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md new file mode 100644 index 000000000..3e3f056e4 --- /dev/null +++ b/proposed/local-sdk-global-json.md @@ -0,0 +1,83 @@ + +Provide SDK hint paths in global.json +=== + +## Summary +This proposal adds two new properties to the `sdk` object in [global.json](https://learn.microsoft.com/en-us/dotnet/core/tools/global-json#globaljson-schema). + +```json +{ + "sdk": { + "additionalPaths": [ ".dotnet" ], + "additionalPathsOnly": false, + } +} +``` + +These properties will be considered by the resolver host when attempting to locate a compatible .NET SDK. This particular configuration would cause the local directory `.dotnet` to be considered _in addition_ to the current set of locations. + +## Motivation +There is currently a disconnect between the ways the .NET SDK is deployed in practice and what the host resolver can discover when searching for compatible SDKs. By default the host resolver is only going to search for SDKS in machine wide locations: `C:\Program Files\dotnet`, `%PATH%`, etc ... The .NET SDK though is commonly deployed to local locations: `%LocalAppData%\Microsoft\dotnet`, `$HOME/.dotnet`. Many repos embrace this and restore the correct .NET for their builds into a local `.dotnet` directory. + +The behavior of the host resolver is incompatible with local based deployments. It simply will not find these deployments and instead search in machine wide locations. That means tools like Visual Studio and VS Code simply do not work with local deployment by default. Developers must take additional steps like manipulating `%PATH%` before launching these editors. That reduces the usefulness of items like the quick launch bar, short cuts, etc ... + +This is further complicated when developers mix local and machine wide installations. The host resolver will find the first `dotnet` according to its lookup rules and search only there for a compatible SDK. Once developers manipulate `%PATH%` to prefer local SDKS the resolver will stop considering machine wide SDKS. That can lead to situations where there is machine wide SDK that works for a given global.json but the host resolver will not consider it because the developer setup `%PATH%` to consider a locally installed SDK. That can be very frustrating for end users. + +This disconnect between the resolver and deployment has lead to customers introducing a number of creative work arounds: + +- [scripts](https://github.com/dotnet/razor/pull/9550) to launch VS Code while considering locally deployed .NET SDKs +- [docs and scripts](https://github.com/dotnet/sdk/blob/518c60dbe98b51193b3a9ad9fc44e055e6e10fa0/documentation/project-docs/developer-guide.md?plain=1#L38) to setup the environment and launch VS so it can find the deployed .NET SDKs. 
+- [scritps](https://github.com/dotnet/runtime/blob/main/dotnet.cmd) that wrap `dotnet` to find the _correct_ `dotnet` to use during build. + +These scripts are not one offs, they are increasingly common items in repos in `github.com/dotnet` to attempt to fix the disconnect. Even so many of these solutions are incomplete because they themselves only consider local deployment. They dont't fully support the full set of ways the SDK can be deployed. + +This is not a problem specific to the .NET team. Indeed this problem is felt sharply there due to arcade infrastructure using xcopy style deployment into `.dotnet`. Our customers feel this problem as well. Consider the following examples: + +- This [issue](https://github.com/dotnet/sdk/issues/8254) from 2017 attempting to solve this problsem. It gets several hits a year from customers who are similarly struggling with our toolings inability to handle local deployment. +- This [internal discussion](https://teams.microsoft.com/l/message/19:ed7a508bf00c4b088a7760359f0d0308@thread.skype/1698341652961?tenantId=72f988bf-86f1-41af-91ab-2d7cd011db47&groupId=4ba7372f-2799-4677-89f0-7a1aaea3706c&parentMessageId=1698341652961&teamName=.NET%20Developer%20Experience&channelName=InfraSwat&createdTime=1698341652961) from a C# team member. They wanted to use VS as the product is shipped to customers and got blocked when we shipped an SDK that didn't have a corresponding MSI and hence VS couldn't load Roslyn anymore. +- [VS code](https://github.com/dotnet/vscode-csharp/issues/6471) having to adjust to consider local directories for SDK because our resolver can't find them. + +## Detailed Design +The global.json file will support two new properties under the `sdk` object: + +- `"additionalLocations"`: this is a list of paths that the host resolver should consider when looking for compatible SDKs. Relative paths will be interpreted relative to the global.json file. +- `"additionalLocationOnly"`: when true the resolver will _only_ consider paths in `additionalLocations`. It will not consider any machine wide locations (unless they are specified in `additionalLocations`). The default for this property is `false`. + +The `additionalLocations` works similar to how multi-level lookup works. It adds additional locations that the host resolver should consider when trying to resolve a compatible .NET SDK. For example: + +```json +{ + "sdk": { + "additionalPaths": [ ".dotnet" ], + } +} +``` + +In this configuration the host resolver would find a compatible SDK if it exists in `.dotnet` or a machine wide location. + +The values in the `additionalLocations` property can be a relative or absolute path. When a relative path is used it will be resolved relative to the location of global.json. These values also support the following substitutions: + +- `"$local$"`: this matches the local installation point of .NET SDK for the current operating system: `%LocalAppData%\Microsoft\dotnet`` on Windows and `$HOME/.dotnet`` on Linux/macOS. +- `"$machine$"`: this matches the machine installation point of .NET for the current operating system. +- `%VARIABLE%/$VARIABLE`: environment variables will be substituted. Either the Windows or Unix format can be used here and will be normalized for the operating system the host resolver executes on. + +The host resolver will consider the `additionalLocations` in the order they are defined and will stop at the first match. 
If none of the locations have a matching SDK then it it will fall back to considering machine wide installations (unless `additionalLocationsOnly` is `true`). + +This design requires us to only change the host resolver. That means other tooling like Visual Studio, VS Code, MSBuild, etc ... would transparently benefit from this change. Repositories could update global.json to have `additionalLocations` support `.dotnet` and Visual Studio would automatically find it without any design changes. + +## Considerations +### Installation Points +One item to keep in mind when considering this area is the .NET SDK can be installed in many locations. The most common are: + +- Machine wide +- User wide: `%LocalAppData%\Microsoft\dotnet`` on Windows and `$HOME/.dotnet`` on Linux/macOS. +- Repo: `.dotnet` + +Our installation tooling tends to avoid redundant installations. For example if restoring a repository that requires 7.0.400, the tooling will not install it locally if 7.0.400 is installed machine wide. It also will not necessarily delete the local `.dotnet` folder or the user wide folder. That means developers end up with .NET SDK installs in all three locations but only the machine wide install has the correct ones. + +As a result solutions like "just use .dotnet if it exists" fall short. It will work in a lot of casse but will fail in more complex scenarios. To completely close the disconnect here we need to consider all the possible locations. + +### Do we need additionalLocationsOnly? +The necessity of this property is questionable. The design includes it for completeness and understanding that the goal of some developers is complete isolation from machine state. As long as we're considering designs that embrace local deployment, it seemed sensible to extend the design to embrace _only_ local deployment. + +At the same time the motivation for this is much smaller. It would be reasonable to cut this from the design and consider it at a future time when the motivation is higher. From d4c7b8271885a0b57a6d5b3c993a1ace4d2201ae Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Wed, 8 Nov 2023 14:23:40 -0800 Subject: [PATCH 078/108] Apply suggestions from code review Co-authored-by: Rainer Sigwald Co-authored-by: Sandy Armstrong --- proposed/local-sdk-global-json.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index 3e3f056e4..7885ab525 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -19,7 +19,7 @@ These properties will be considered by the resolver host when attempting to loca ## Motivation There is currently a disconnect between the ways the .NET SDK is deployed in practice and what the host resolver can discover when searching for compatible SDKs. By default the host resolver is only going to search for SDKS in machine wide locations: `C:\Program Files\dotnet`, `%PATH%`, etc ... The .NET SDK though is commonly deployed to local locations: `%LocalAppData%\Microsoft\dotnet`, `$HOME/.dotnet`. Many repos embrace this and restore the correct .NET for their builds into a local `.dotnet` directory. -The behavior of the host resolver is incompatible with local based deployments. It simply will not find these deployments and instead search in machine wide locations. That means tools like Visual Studio and VS Code simply do not work with local deployment by default. Developers must take additional steps like manipulating `%PATH%` before launching these editors. 
That reduces the usefulness of items like the quick launch bar, short cuts, etc ... +The behavior of the host resolver is incompatible with local based deployments. It will not find these deployments without additional environment variable configuration and instead search in machine wide locations. That means tools like Visual Studio and VS Code simply do not work with local deployment by default. Developers must take additional steps like manipulating `%PATH%` before launching these editors. That reduces the usefulness of items like the quick launch bar, short cuts, etc ... This is further complicated when developers mix local and machine wide installations. The host resolver will find the first `dotnet` according to its lookup rules and search only there for a compatible SDK. Once developers manipulate `%PATH%` to prefer local SDKS the resolver will stop considering machine wide SDKS. That can lead to situations where there is machine wide SDK that works for a given global.json but the host resolver will not consider it because the developer setup `%PATH%` to consider a locally installed SDK. That can be very frustrating for end users. @@ -57,7 +57,7 @@ In this configuration the host resolver would find a compatible SDK if it exists The values in the `additionalLocations` property can be a relative or absolute path. When a relative path is used it will be resolved relative to the location of global.json. These values also support the following substitutions: -- `"$local$"`: this matches the local installation point of .NET SDK for the current operating system: `%LocalAppData%\Microsoft\dotnet`` on Windows and `$HOME/.dotnet`` on Linux/macOS. +- `"$user$"`: this matches the user-local installation point of .NET SDK for the current operating system: `%LocalAppData%\Microsoft\dotnet`` on Windows and `$HOME/.dotnet`` on Linux/macOS. - `"$machine$"`: this matches the machine installation point of .NET for the current operating system. - `%VARIABLE%/$VARIABLE`: environment variables will be substituted. Either the Windows or Unix format can be used here and will be normalized for the operating system the host resolver executes on. @@ -77,7 +77,7 @@ Our installation tooling tends to avoid redundant installations. For example if As a result solutions like "just use .dotnet if it exists" fall short. It will work in a lot of casse but will fail in more complex scenarios. To completely close the disconnect here we need to consider all the possible locations. -### Do we need additionalLocationsOnly? +### Do we need additionalPathsOnly? The necessity of this property is questionable. The design includes it for completeness and understanding that the goal of some developers is complete isolation from machine state. As long as we're considering designs that embrace local deployment, it seemed sensible to extend the design to embrace _only_ local deployment. At the same time the motivation for this is much smaller. It would be reasonable to cut this from the design and consider it at a future time when the motivation is higher. 
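For illustration, a repository that wants to resolve SDKs only from its locally restored copy and the user-wide install, ignoring machine wide locations, could combine the properties proposed above as follows (the version number is arbitrary):

```json
{
  "sdk": {
    "version": "7.0.200",
    "rollForward": "latestFeature",
    "additionalPaths": [ ".dotnet", "$user$" ],
    "additionalPathsOnly": true
  }
}
```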
From b68c9825f31c4a4e071dcca145a78dcff9fefe1d Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Wed, 8 Nov 2023 14:35:40 -0800 Subject: [PATCH 079/108] consistency --- proposed/local-sdk-global-json.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index 7885ab525..fbffec3ba 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -40,10 +40,10 @@ This is not a problem specific to the .NET team. Indeed this problem is felt sha ## Detailed Design The global.json file will support two new properties under the `sdk` object: -- `"additionalLocations"`: this is a list of paths that the host resolver should consider when looking for compatible SDKs. Relative paths will be interpreted relative to the global.json file. -- `"additionalLocationOnly"`: when true the resolver will _only_ consider paths in `additionalLocations`. It will not consider any machine wide locations (unless they are specified in `additionalLocations`). The default for this property is `false`. +- `"additionalPaths"`: this is a list of paths that the host resolver should consider when looking for compatible SDKs. Relative paths will be interpreted relative to the global.json file. +- `"additionalPathsOnly"`: when true the resolver will _only_ consider paths in `additionalPaths`. It will not consider any machine wide locations (unless they are specified in `additionalPaths`). The default for this property is `false`. -The `additionalLocations` works similar to how multi-level lookup works. It adds additional locations that the host resolver should consider when trying to resolve a compatible .NET SDK. For example: +The `additionalPaths` works similar to how multi-level lookup works. It adds additional locations that the host resolver should consider when trying to resolve a compatible .NET SDK. For example: ```json { @@ -55,15 +55,15 @@ The `additionalLocations` works similar to how multi-level lookup works. It adds In this configuration the host resolver would find a compatible SDK if it exists in `.dotnet` or a machine wide location. -The values in the `additionalLocations` property can be a relative or absolute path. When a relative path is used it will be resolved relative to the location of global.json. These values also support the following substitutions: +The values in the `additionalPaths` property can be a relative or absolute path. When a relative path is used it will be resolved relative to the location of global.json. These values also support the following substitutions: - `"$user$"`: this matches the user-local installation point of .NET SDK for the current operating system: `%LocalAppData%\Microsoft\dotnet`` on Windows and `$HOME/.dotnet`` on Linux/macOS. - `"$machine$"`: this matches the machine installation point of .NET for the current operating system. - `%VARIABLE%/$VARIABLE`: environment variables will be substituted. Either the Windows or Unix format can be used here and will be normalized for the operating system the host resolver executes on. -The host resolver will consider the `additionalLocations` in the order they are defined and will stop at the first match. If none of the locations have a matching SDK then it it will fall back to considering machine wide installations (unless `additionalLocationsOnly` is `true`). +The host resolver will consider the `additionalPaths` in the order they are defined and will stop at the first match. 
If none of the locations have a matching SDK then it it will fall back to considering machine wide installations (unless `additionalPathsOnly` is `true`). -This design requires us to only change the host resolver. That means other tooling like Visual Studio, VS Code, MSBuild, etc ... would transparently benefit from this change. Repositories could update global.json to have `additionalLocations` support `.dotnet` and Visual Studio would automatically find it without any design changes. +This design requires us to only change the host resolver. That means other tooling like Visual Studio, VS Code, MSBuild, etc ... would transparently benefit from this change. Repositories could update global.json to have `additionalPaths` support `.dotnet` and Visual Studio would automatically find it without any design changes. ## Considerations ### Installation Points From 498a01200758eb673136e2ca97d2f53b357b9716 Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Wed, 8 Nov 2023 15:11:38 -0800 Subject: [PATCH 080/108] PR feedback --- proposed/local-sdk-global-json.md | 32 ++++++++++++++++++++++++++----- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index fbffec3ba..15c09e047 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -1,4 +1,3 @@ - Provide SDK hint paths in global.json === @@ -31,7 +30,7 @@ This disconnect between the resolver and deployment has lead to customers introd These scripts are not one offs, they are increasingly common items in repos in `github.com/dotnet` to attempt to fix the disconnect. Even so many of these solutions are incomplete because they themselves only consider local deployment. They dont't fully support the full set of ways the SDK can be deployed. -This is not a problem specific to the .NET team. Indeed this problem is felt sharply there due to arcade infrastructure using xcopy style deployment into `.dotnet`. Our customers feel this problem as well. Consider the following examples: +This problem also manifests in how customers naturally want to use our development tools like Visual Studio or VS Code. It's felt sharply on the .NET team, or any external customer who wants to contribute to .NET, due to how arcade infrastructure uses xcopy deployment into `.dotnet`. External teams like Unity also feel this pain in their development: - This [issue](https://github.com/dotnet/sdk/issues/8254) from 2017 attempting to solve this problsem. It gets several hits a year from customers who are similarly struggling with our toolings inability to handle local deployment. - This [internal discussion](https://teams.microsoft.com/l/message/19:ed7a508bf00c4b088a7760359f0d0308@thread.skype/1698341652961?tenantId=72f988bf-86f1-41af-91ab-2d7cd011db47&groupId=4ba7372f-2799-4677-89f0-7a1aaea3706c&parentMessageId=1698341652961&teamName=.NET%20Developer%20Experience&channelName=InfraSwat&createdTime=1698341652961) from a C# team member. They wanted to use VS as the product is shipped to customers and got blocked when we shipped an SDK that didn't have a corresponding MSI and hence VS couldn't load Roslyn anymore. @@ -53,7 +52,22 @@ The `additionalPaths` works similar to how multi-level lookup works. It adds add } ``` -In this configuration the host resolver would find a compatible SDK if it exists in `.dotnet` or a machine wide location. +In this configuration the host resolver would find a compatible SDK if it exists in `.dotnet` or a machine wide location. 
The host resolver will consider the `additionalPaths` in the order they are defined and will stop at the first match. If none of the locations have a matching SDK then + will fall back to the existing SDK resolution strategy: `%PATH%` followed by machine wide installations. + +This lookup will stop on the first match which means it won't necessarily find the best match. Consider a scenario with a global.json that has: + +```json +{ + "sdk": { + "additionalPaths": [ ".dotnet" ], + "version": "7.0.200", + "rollForward": "latestFeature" + } +} +``` + +In a scenario where the `.dotnet` directory had 7.0.200 SDK but there was a machine wide install of 7.0.300 SDK, the host resolver would pick 7.0.200 out of `.dotnet`. That location is considered first, it has a matching SDK and hence discovery stops there. The values in the `additionalPaths` property can be a relative or absolute path. When a relative path is used it will be resolved relative to the location of global.json. These values also support the following substitutions: @@ -61,8 +75,6 @@ The values in the `additionalPaths` property can be a relative or absolute path. - `"$machine$"`: this matches the machine installation point of .NET for the current operating system. - `%VARIABLE%/$VARIABLE`: environment variables will be substituted. Either the Windows or Unix format can be used here and will be normalized for the operating system the host resolver executes on. -The host resolver will consider the `additionalPaths` in the order they are defined and will stop at the first match. If none of the locations have a matching SDK then it it will fall back to considering machine wide installations (unless `additionalPathsOnly` is `true`). - This design requires us to only change the host resolver. That means other tooling like Visual Studio, VS Code, MSBuild, etc ... would transparently benefit from this change. Repositories could update global.json to have `additionalPaths` support `.dotnet` and Visual Studio would automatically find it without any design changes. ## Considerations @@ -81,3 +93,13 @@ As a result solutions like "just use .dotnet if it exists" fall short. It will w The necessity of this property is questionable. The design includes it for completeness and understanding that the goal of some developers is complete isolation from machine state. As long as we're considering designs that embrace local deployment, it seemed sensible to extend the design to embrace _only_ local deployment. At the same time the motivation for this is much smaller. It would be reasonable to cut this from the design and consider it at a future time when the motivation is higher. + +### Best match or first match? +This proposal is designed at giving global.json more control over how SDKs are found. If the global.json asked for a specific path to be considered and it has a matching SDK but a different SDK was chosen, that seems counter intuitive. Even in the case where the chosen SDK was _better_. This is a motivating scenario for CI where certainty around SDK is often more desirable than _better_. This is why the host discovery stops at first match vs. looking at all location and choosing the best match. + +Best match is a valid approach though. Can certainly see the argument for some customers wanting that. Feel like it cuts against the proposal a bit because it devalues `additionalPaths` a bit. If the resolver is switched to best match then feel like the need for `additionalPathsOnly` is much stronger. 
There would certainly be a customer segment that wanted to isolate from machine state in that case. + + + +The host resolver search stops at the first matching SDK. This proposal + From 7763bfde068ebd166f22cb7e5f1d32280761777a Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Wed, 8 Nov 2023 15:16:24 -0800 Subject: [PATCH 081/108] Update proposed/local-sdk-global-json.md Co-authored-by: Igor Velikorossov --- proposed/local-sdk-global-json.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index 15c09e047..4ccf3bc2e 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -71,7 +71,7 @@ In a scenario where the `.dotnet` directory had 7.0.200 SDK but there was a mach The values in the `additionalPaths` property can be a relative or absolute path. When a relative path is used it will be resolved relative to the location of global.json. These values also support the following substitutions: -- `"$user$"`: this matches the user-local installation point of .NET SDK for the current operating system: `%LocalAppData%\Microsoft\dotnet`` on Windows and `$HOME/.dotnet`` on Linux/macOS. +- `"$user$"`: this matches the user-local installation point of .NET SDK for the current operating system: `%LocalAppData%\Microsoft\dotnet` on Windows and `$HOME/.dotnet` on Linux/macOS. - `"$machine$"`: this matches the machine installation point of .NET for the current operating system. - `%VARIABLE%/$VARIABLE`: environment variables will be substituted. Either the Windows or Unix format can be used here and will be normalized for the operating system the host resolver executes on. From e8e7b0f938709ca347ffab3847ae750646df847f Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Wed, 8 Nov 2023 16:31:41 -0800 Subject: [PATCH 082/108] Apply suggestions from code review Co-authored-by: Igor Velikorossov --- proposed/local-sdk-global-json.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index 4ccf3bc2e..0a3d9e72e 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -18,7 +18,7 @@ These properties will be considered by the resolver host when attempting to loca ## Motivation There is currently a disconnect between the ways the .NET SDK is deployed in practice and what the host resolver can discover when searching for compatible SDKs. By default the host resolver is only going to search for SDKS in machine wide locations: `C:\Program Files\dotnet`, `%PATH%`, etc ... The .NET SDK though is commonly deployed to local locations: `%LocalAppData%\Microsoft\dotnet`, `$HOME/.dotnet`. Many repos embrace this and restore the correct .NET for their builds into a local `.dotnet` directory. -The behavior of the host resolver is incompatible with local based deployments. It will not find these deployments without additional environment variable configuration and instead search in machine wide locations. That means tools like Visual Studio and VS Code simply do not work with local deployment by default. Developers must take additional steps like manipulating `%PATH%` before launching these editors. That reduces the usefulness of items like the quick launch bar, short cuts, etc ... +The behavior of the host resolver is incompatible with local based deployments. 
It will not find these deployments without additional environment variable configuration and instead search in machine wide locations. That means tools like Visual Studio and VS Code simply do not work with local deployment by default. Developers must take additional steps like manipulating `%PATH%` before launching these editors. That reduces the usefulness of items like the quick launch bar, short cuts, etc. This is further complicated when developers mix local and machine wide installations. The host resolver will find the first `dotnet` according to its lookup rules and search only there for a compatible SDK. Once developers manipulate `%PATH%` to prefer local SDKS the resolver will stop considering machine wide SDKS. That can lead to situations where there is machine wide SDK that works for a given global.json but the host resolver will not consider it because the developer setup `%PATH%` to consider a locally installed SDK. That can be very frustrating for end users. @@ -40,7 +40,7 @@ This problem also manifests in how customers naturally want to use our developme The global.json file will support two new properties under the `sdk` object: - `"additionalPaths"`: this is a list of paths that the host resolver should consider when looking for compatible SDKs. Relative paths will be interpreted relative to the global.json file. -- `"additionalPathsOnly"`: when true the resolver will _only_ consider paths in `additionalPaths`. It will not consider any machine wide locations (unless they are specified in `additionalPaths`). The default for this property is `false`. +- `"additionalPathsOnly"`: when `true` the resolver will _only_ consider paths in `additionalPaths`. It will not consider any machine wide locations (unless they are specified in `additionalPaths`). The default for this property is `false`. The `additionalPaths` works similar to how multi-level lookup works. It adds additional locations that the host resolver should consider when trying to resolve a compatible .NET SDK. For example: @@ -52,7 +52,7 @@ The `additionalPaths` works similar to how multi-level lookup works. It adds add } ``` -In this configuration the host resolver would find a compatible SDK if it exists in `.dotnet` or a machine wide location. The host resolver will consider the `additionalPaths` in the order they are defined and will stop at the first match. If none of the locations have a matching SDK then +In this configuration the host resolver would find a compatible SDK, if it exists in `.dotnet` or a machine wide location. The host resolver will consider the `additionalPaths` in the order they are defined and will stop at the first match. If none of the locations have a matching SDK then will fall back to the existing SDK resolution strategy: `%PATH%` followed by machine wide installations. This lookup will stop on the first match which means it won't necessarily find the best match. Consider a scenario with a global.json that has: @@ -82,12 +82,12 @@ This design requires us to only change the host resolver. That means other tooli One item to keep in mind when considering this area is the .NET SDK can be installed in many locations. The most common are: - Machine wide -- User wide: `%LocalAppData%\Microsoft\dotnet`` on Windows and `$HOME/.dotnet`` on Linux/macOS. +- User wide: `%LocalAppData%\Microsoft\dotnet` on Windows and `$HOME/.dotnet` on Linux/macOS. - Repo: `.dotnet` -Our installation tooling tends to avoid redundant installations. 
For example if restoring a repository that requires 7.0.400, the tooling will not install it locally if 7.0.400 is installed machine wide. It also will not necessarily delete the local `.dotnet` folder or the user wide folder. That means developers end up with .NET SDK installs in all three locations but only the machine wide install has the correct ones. +Our installation tooling tends to avoid redundant installations. For example, if restoring a repository that requires 7.0.400, the tooling will not install it locally if 7.0.400 is installed machine wide. It also will not necessarily delete the local `.dotnet` folder or the user wide folder. That means developers end up with .NET SDK installs in all three locations but only the machine wide install has the correct ones. -As a result solutions like "just use .dotnet if it exists" fall short. It will work in a lot of casse but will fail in more complex scenarios. To completely close the disconnect here we need to consider all the possible locations. +As a result solutions like "just use .dotnet, if it exists" fall short. It will work in a lot of casse but will fail in more complex scenarios. To completely close the disconnect here we need to consider all the possible locations. ### Do we need additionalPathsOnly? The necessity of this property is questionable. The design includes it for completeness and understanding that the goal of some developers is complete isolation from machine state. As long as we're considering designs that embrace local deployment, it seemed sensible to extend the design to embrace _only_ local deployment. From 19d0fd6cea2f627bcebf5835dc099fa976c21a1a Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Wed, 8 Nov 2023 15:57:44 -0800 Subject: [PATCH 083/108] more --- proposed/local-sdk-global-json.md | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index 0a3d9e72e..72f2540aa 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -99,7 +99,14 @@ This proposal is designed at giving global.json more control over how SDKs are f Best match is a valid approach though. Can certainly see the argument for some customers wanting that. Feel like it cuts against the proposal a bit because it devalues `additionalPaths` a bit. If the resolver is switched to best match then feel like the need for `additionalPathsOnly` is much stronger. There would certainly be a customer segment that wanted to isolate from machine state in that case. +### Other Designs +https://github.com/dotnet/designs/blob/main/accepted/2022/version-selection.md#local-dotnet +This is a proposal similar in nature to this one. There are a few differences: -The host resolver search stops at the first matching SDK. This proposal +1. This proposal is more configurable and supports all standard local installation points, not just the `.dotnet` variant. +2. This proposal doesn't change what SDK is chosen: the rules for global.json on what SDKs are allowed still apply. It simply changes the locations where the SDK is looked for. +3. No consideration for changing the command line. This is completely driven through global.json changes. + +Otherwise the proposals are very similar in nature. 
From 87b9a0e81a089f78244a55d1e6118804d8b3013b Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Mon, 27 Nov 2023 16:31:14 -0800 Subject: [PATCH 084/108] Apply suggestions from code review Co-authored-by: Jan Jones Co-authored-by: Fred Silberberg Co-authored-by: Rolf Bjarne Kvinge --- proposed/local-sdk-global-json.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index 72f2540aa..e36c8c1e4 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -16,7 +16,7 @@ This proposal adds two new properties to the `sdk` object in [global.json](https These properties will be considered by the resolver host when attempting to locate a compatible .NET SDK. This particular configuration would cause the local directory `.dotnet` to be considered _in addition_ to the current set of locations. ## Motivation -There is currently a disconnect between the ways the .NET SDK is deployed in practice and what the host resolver can discover when searching for compatible SDKs. By default the host resolver is only going to search for SDKS in machine wide locations: `C:\Program Files\dotnet`, `%PATH%`, etc ... The .NET SDK though is commonly deployed to local locations: `%LocalAppData%\Microsoft\dotnet`, `$HOME/.dotnet`. Many repos embrace this and restore the correct .NET for their builds into a local `.dotnet` directory. +There is currently a disconnect between the ways the .NET SDK is deployed in practice and what the host resolver can discover when searching for compatible SDKs. By default the host resolver is only going to search for SDKs in machine wide locations: `C:\Program Files\dotnet`, `%PATH%`, etc ... The .NET SDK though is commonly deployed to local locations: `%LocalAppData%\Microsoft\dotnet`, `$HOME/.dotnet`. Many repos embrace this and restore the correct .NET for their builds into a local `.dotnet` directory. The behavior of the host resolver is incompatible with local based deployments. It will not find these deployments without additional environment variable configuration and instead search in machine wide locations. That means tools like Visual Studio and VS Code simply do not work with local deployment by default. Developers must take additional steps like manipulating `%PATH%` before launching these editors. That reduces the usefulness of items like the quick launch bar, short cuts, etc. @@ -26,13 +26,13 @@ This disconnect between the resolver and deployment has lead to customers introd - [scripts](https://github.com/dotnet/razor/pull/9550) to launch VS Code while considering locally deployed .NET SDKs - [docs and scripts](https://github.com/dotnet/sdk/blob/518c60dbe98b51193b3a9ad9fc44e055e6e10fa0/documentation/project-docs/developer-guide.md?plain=1#L38) to setup the environment and launch VS so it can find the deployed .NET SDKs. -- [scritps](https://github.com/dotnet/runtime/blob/main/dotnet.cmd) that wrap `dotnet` to find the _correct_ `dotnet` to use during build. +- [scripts](https://github.com/dotnet/runtime/blob/main/dotnet.cmd) that wrap `dotnet` to find the _correct_ `dotnet` to use during build. These scripts are not one offs, they are increasingly common items in repos in `github.com/dotnet` to attempt to fix the disconnect. Even so many of these solutions are incomplete because they themselves only consider local deployment. They dont't fully support the full set of ways the SDK can be deployed. 
This problem also manifests in how customers naturally want to use our development tools like Visual Studio or VS Code. It's felt sharply on the .NET team, or any external customer who wants to contribute to .NET, due to how arcade infrastructure uses xcopy deployment into `.dotnet`. External teams like Unity also feel this pain in their development: -- This [issue](https://github.com/dotnet/sdk/issues/8254) from 2017 attempting to solve this problsem. It gets several hits a year from customers who are similarly struggling with our toolings inability to handle local deployment. +- This [issue](https://github.com/dotnet/sdk/issues/8254) from 2017 attempting to solve this problem. It gets several hits a year from customers who are similarly struggling with our toolings inability to handle local deployment. - This [internal discussion](https://teams.microsoft.com/l/message/19:ed7a508bf00c4b088a7760359f0d0308@thread.skype/1698341652961?tenantId=72f988bf-86f1-41af-91ab-2d7cd011db47&groupId=4ba7372f-2799-4677-89f0-7a1aaea3706c&parentMessageId=1698341652961&teamName=.NET%20Developer%20Experience&channelName=InfraSwat&createdTime=1698341652961) from a C# team member. They wanted to use VS as the product is shipped to customers and got blocked when we shipped an SDK that didn't have a corresponding MSI and hence VS couldn't load Roslyn anymore. - [VS code](https://github.com/dotnet/vscode-csharp/issues/6471) having to adjust to consider local directories for SDK because our resolver can't find them. @@ -53,7 +53,7 @@ The `additionalPaths` works similar to how multi-level lookup works. It adds add ``` In this configuration the host resolver would find a compatible SDK, if it exists in `.dotnet` or a machine wide location. The host resolver will consider the `additionalPaths` in the order they are defined and will stop at the first match. If none of the locations have a matching SDK then - will fall back to the existing SDK resolution strategy: `%PATH%` followed by machine wide installations. +the host resolver will fall back to the existing SDK resolution strategy: `%PATH%` followed by machine wide installations. This lookup will stop on the first match which means it won't necessarily find the best match. Consider a scenario with a global.json that has: @@ -87,7 +87,7 @@ One item to keep in mind when considering this area is the .NET SDK can be insta Our installation tooling tends to avoid redundant installations. For example, if restoring a repository that requires 7.0.400, the tooling will not install it locally if 7.0.400 is installed machine wide. It also will not necessarily delete the local `.dotnet` folder or the user wide folder. That means developers end up with .NET SDK installs in all three locations but only the machine wide install has the correct ones. -As a result solutions like "just use .dotnet, if it exists" fall short. It will work in a lot of casse but will fail in more complex scenarios. To completely close the disconnect here we need to consider all the possible locations. +As a result solutions like "just use .dotnet, if it exists" fall short. It will work in a lot of cases but will fail in more complex scenarios. To completely close the disconnect here we need to consider all the possible locations. ### Do we need additionalPathsOnly? The necessity of this property is questionable. The design includes it for completeness and understanding that the goal of some developers is complete isolation from machine state. 
As long as we're considering designs that embrace local deployment, it seemed sensible to extend the design to embrace _only_ local deployment. From 88c3c25d8e3776403699446d7e81f1d119a80d06 Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Wed, 31 Jan 2024 09:12:01 -0800 Subject: [PATCH 085/108] make the linter happy --- .vscode/settings.json | 5 + designs.sln | 30 +++++ proposed/local-sdk-global-json.md | 212 +++++++++++++++++++++++------- 3 files changed, 196 insertions(+), 51 deletions(-) create mode 100644 .vscode/settings.json create mode 100644 designs.sln diff --git a/.vscode/settings.json b/.vscode/settings.json new file mode 100644 index 000000000..6052b0607 --- /dev/null +++ b/.vscode/settings.json @@ -0,0 +1,5 @@ +{ + "[markdown]": { + "editor.rulers": [80] + } +} diff --git a/designs.sln b/designs.sln new file mode 100644 index 000000000..bfa4a19b7 --- /dev/null +++ b/designs.sln @@ -0,0 +1,30 @@ + +Microsoft Visual Studio Solution File, Format Version 12.00 +# Visual Studio Version 17 +VisualStudioVersion = 17.5.002.0 +MinimumVisualStudioVersion = 10.0.40219.1 +Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "tools", "tools", "{7ED864FD-FEA8-49A4-8196-9A9BA39499B4}" +EndProject +Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "update-index", "tools\update-index\update-index.csproj", "{524C2FE5-739D-4F3D-847F-17D0CA46EC69}" +EndProject +Global + GlobalSection(SolutionConfigurationPlatforms) = preSolution + Debug|Any CPU = Debug|Any CPU + Release|Any CPU = Release|Any CPU + EndGlobalSection + GlobalSection(ProjectConfigurationPlatforms) = postSolution + {524C2FE5-739D-4F3D-847F-17D0CA46EC69}.Debug|Any CPU.ActiveCfg = Debug|Any CPU + {524C2FE5-739D-4F3D-847F-17D0CA46EC69}.Debug|Any CPU.Build.0 = Debug|Any CPU + {524C2FE5-739D-4F3D-847F-17D0CA46EC69}.Release|Any CPU.ActiveCfg = Release|Any CPU + {524C2FE5-739D-4F3D-847F-17D0CA46EC69}.Release|Any CPU.Build.0 = Release|Any CPU + EndGlobalSection + GlobalSection(SolutionProperties) = preSolution + HideSolutionNode = FALSE + EndGlobalSection + GlobalSection(NestedProjects) = preSolution + {524C2FE5-739D-4F3D-847F-17D0CA46EC69} = {7ED864FD-FEA8-49A4-8196-9A9BA39499B4} + EndGlobalSection + GlobalSection(ExtensibilityGlobals) = postSolution + SolutionGuid = {A3EEE113-2E4D-478E-B28F-37A8D74AAF3A} + EndGlobalSection +EndGlobal diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index e36c8c1e4..3fa1bcc84 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -1,8 +1,9 @@ -Provide SDK hint paths in global.json -=== +# Provide SDK hint paths in global.json ## Summary -This proposal adds two new properties to the `sdk` object in [global.json](https://learn.microsoft.com/en-us/dotnet/core/tools/global-json#globaljson-schema). + +This proposal adds two new properties to the `sdk` object in +[global.json][global-json-schema] ```json { @@ -13,36 +14,83 @@ This proposal adds two new properties to the `sdk` object in [global.json](https } ``` -These properties will be considered by the resolver host when attempting to locate a compatible .NET SDK. This particular configuration would cause the local directory `.dotnet` to be considered _in addition_ to the current set of locations. +These properties will be considered by the resolver host when attempting to +locate a compatible .NET SDK. This particular configuration would cause the +local directory `.dotnet` to be considered _in addition_ to the current set of +locations. 
## Motivation -There is currently a disconnect between the ways the .NET SDK is deployed in practice and what the host resolver can discover when searching for compatible SDKs. By default the host resolver is only going to search for SDKs in machine wide locations: `C:\Program Files\dotnet`, `%PATH%`, etc ... The .NET SDK though is commonly deployed to local locations: `%LocalAppData%\Microsoft\dotnet`, `$HOME/.dotnet`. Many repos embrace this and restore the correct .NET for their builds into a local `.dotnet` directory. - -The behavior of the host resolver is incompatible with local based deployments. It will not find these deployments without additional environment variable configuration and instead search in machine wide locations. That means tools like Visual Studio and VS Code simply do not work with local deployment by default. Developers must take additional steps like manipulating `%PATH%` before launching these editors. That reduces the usefulness of items like the quick launch bar, short cuts, etc. - -This is further complicated when developers mix local and machine wide installations. The host resolver will find the first `dotnet` according to its lookup rules and search only there for a compatible SDK. Once developers manipulate `%PATH%` to prefer local SDKS the resolver will stop considering machine wide SDKS. That can lead to situations where there is machine wide SDK that works for a given global.json but the host resolver will not consider it because the developer setup `%PATH%` to consider a locally installed SDK. That can be very frustrating for end users. - -This disconnect between the resolver and deployment has lead to customers introducing a number of creative work arounds: - -- [scripts](https://github.com/dotnet/razor/pull/9550) to launch VS Code while considering locally deployed .NET SDKs -- [docs and scripts](https://github.com/dotnet/sdk/blob/518c60dbe98b51193b3a9ad9fc44e055e6e10fa0/documentation/project-docs/developer-guide.md?plain=1#L38) to setup the environment and launch VS so it can find the deployed .NET SDKs. -- [scripts](https://github.com/dotnet/runtime/blob/main/dotnet.cmd) that wrap `dotnet` to find the _correct_ `dotnet` to use during build. - -These scripts are not one offs, they are increasingly common items in repos in `github.com/dotnet` to attempt to fix the disconnect. Even so many of these solutions are incomplete because they themselves only consider local deployment. They dont't fully support the full set of ways the SDK can be deployed. - -This problem also manifests in how customers naturally want to use our development tools like Visual Studio or VS Code. It's felt sharply on the .NET team, or any external customer who wants to contribute to .NET, due to how arcade infrastructure uses xcopy deployment into `.dotnet`. External teams like Unity also feel this pain in their development: - -- This [issue](https://github.com/dotnet/sdk/issues/8254) from 2017 attempting to solve this problem. It gets several hits a year from customers who are similarly struggling with our toolings inability to handle local deployment. -- This [internal discussion](https://teams.microsoft.com/l/message/19:ed7a508bf00c4b088a7760359f0d0308@thread.skype/1698341652961?tenantId=72f988bf-86f1-41af-91ab-2d7cd011db47&groupId=4ba7372f-2799-4677-89f0-7a1aaea3706c&parentMessageId=1698341652961&teamName=.NET%20Developer%20Experience&channelName=InfraSwat&createdTime=1698341652961) from a C# team member. 
They wanted to use VS as the product is shipped to customers and got blocked when we shipped an SDK that didn't have a corresponding MSI and hence VS couldn't load Roslyn anymore. -- [VS code](https://github.com/dotnet/vscode-csharp/issues/6471) having to adjust to consider local directories for SDK because our resolver can't find them. +There is currently a disconnect between the ways the .NET SDK is deployed in +practice and what the host resolver can discover when searching for compatible +SDKs. By default the host resolver is only going to search for SDKs in machine +wide locations: `C:\Program Files\dotnet`, `%PATH%`, etc ... The .NET SDK +though is commonly deployed to local locations: `%LocalAppData%\Microsoft\dotnet`, +`$HOME/.dotnet`. Many repos embrace this and restore the correct .NET for their +builds into a local `.dotnet` directory. + +The behavior of the host resolver is incompatible with local based deployments. +It will not find these deployments without additional environment variable +configuration and instead search in machine wide locations. That means tools +like Visual Studio and VS Code simply do not work with local deployment by +default. Developers must take additional steps like manipulating `%PATH%` before +launching these editors. That reduces the usefulness of items like the quick +launch bar, short cuts, etc. + +This is further complicated when developers mix local and machine wide +installations. The host resolver will find the first `dotnet` according to its +lookup rules and search only there for a compatible SDK. Once developers +manipulate `%PATH%` to prefer local SDKS the resolver will stop considering +machine wide SDKS. That can lead to situations where there is machine wide SDK +that works for a given global.json but the host resolver will not consider it +because the developer setup `%PATH%` to consider a locally installed SDK. That +can be very frustrating for end users. + +This disconnect between the resolver and deployment has lead to customers +introducing a number of creative work arounds: + +- [scripts][example-scripts-razor] to launch VS Code while considering locally +deployed .NET SDKs +- [docs and scripts][example-scripts-build] to setup the environment and launch +VS so it can find the deployed .NET SDKs. +- [scripts][example-scripts-dotnet] that wrap `dotnet` to find the _correct_ +`dotnet` to use during build. + +These scripts are not one offs, they are increasingly common items in repos in +`github.com/dotnet` to attempt to fix the disconnect. Even so many of these +solutions are incomplete because they themselves only consider local deployment. +They dont't fully support the full set of ways the SDK can be deployed. + +This problem also manifests in how customers naturally want to use our +development tools like Visual Studio or VS Code. It's felt sharply on the .NET +team, or any external customer who wants to contribute to .NET, due to how +arcade infrastructure uses xcopy deployment into `.dotnet`. External teams +like Unity also feel this pain in their development: + +- This [issue][cases-sdk-issue] from 2017 attempting +to solve this problsem. It gets several hits a year from customers who are +similarly struggling with our toolings inability to handle local deployment. +- This [internal discussion][cases-internal-discussion] from a C# team member. +They wanted to use VS as the product is shipped to customers and got blocked +when we shipped an SDK that didn't have a corresponding MSI and hence VS +couldn't load Roslyn anymore. 
+- [VS code][cases-vscode] having to adjust to consider local directories for SDK +because our resolver can't find them. ## Detailed Design + The global.json file will support two new properties under the `sdk` object: -- `"additionalPaths"`: this is a list of paths that the host resolver should consider when looking for compatible SDKs. Relative paths will be interpreted relative to the global.json file. -- `"additionalPathsOnly"`: when `true` the resolver will _only_ consider paths in `additionalPaths`. It will not consider any machine wide locations (unless they are specified in `additionalPaths`). The default for this property is `false`. +- `"additionalPaths"`: this is a list of paths that the host resolver should +consider when looking for compatible SDKs. Relative paths will be interpreted +relative to the global.json file. +- `"additionalPathsOnly"`: when `true` the resolver will _only_ consider paths +in `additionalPaths`. It will not consider any machine wide locations (unless +they are specified in `additionalPaths`). The default for this property is +`false`. -The `additionalPaths` works similar to how multi-level lookup works. It adds additional locations that the host resolver should consider when trying to resolve a compatible .NET SDK. For example: +The `additionalPaths` works similar to how multi-level lookup works. It adds +additional locations that the host resolver should consider when trying to +resolve a compatible .NET SDK. For example: ```json { @@ -52,10 +100,15 @@ The `additionalPaths` works similar to how multi-level lookup works. It adds add } ``` -In this configuration the host resolver would find a compatible SDK, if it exists in `.dotnet` or a machine wide location. The host resolver will consider the `additionalPaths` in the order they are defined and will stop at the first match. If none of the locations have a matching SDK then -the host resolver will fall back to the existing SDK resolution strategy: `%PATH%` followed by machine wide installations. +In this configuration the host resolver would find a compatible SDK, if it +exists in `.dotnet` or a machine wide location. The host resolver will consider +the `additionalPaths` in the order they are defined and will stop at the first +match. If none of the locations have a matching SDK then the host resolver will +fall back to the existing SDK resolution strategy: `%PATH%` followed by machine +wide installations. -This lookup will stop on the first match which means it won't necessarily find the best match. Consider a scenario with a global.json that has: +This lookup will stop on the first match which means it won't necessarily find +the best match. Consider a scenario with a global.json that has: ```json { @@ -67,46 +120,103 @@ This lookup will stop on the first match which means it won't necessarily find t } ``` -In a scenario where the `.dotnet` directory had 7.0.200 SDK but there was a machine wide install of 7.0.300 SDK, the host resolver would pick 7.0.200 out of `.dotnet`. That location is considered first, it has a matching SDK and hence discovery stops there. - -The values in the `additionalPaths` property can be a relative or absolute path. When a relative path is used it will be resolved relative to the location of global.json. These values also support the following substitutions: - -- `"$user$"`: this matches the user-local installation point of .NET SDK for the current operating system: `%LocalAppData%\Microsoft\dotnet` on Windows and `$HOME/.dotnet` on Linux/macOS. 
-- `"$machine$"`: this matches the machine installation point of .NET for the current operating system. -- `%VARIABLE%/$VARIABLE`: environment variables will be substituted. Either the Windows or Unix format can be used here and will be normalized for the operating system the host resolver executes on. - -This design requires us to only change the host resolver. That means other tooling like Visual Studio, VS Code, MSBuild, etc ... would transparently benefit from this change. Repositories could update global.json to have `additionalPaths` support `.dotnet` and Visual Studio would automatically find it without any design changes. +In a scenario where the `.dotnet` directory had 7.0.200 SDK but there was a +machine wide install of 7.0.300 SDK, the host resolver would pick 7.0.200 out +of `.dotnet`. That location is considered first, it has a matching SDK and +hence discovery stops there. + +The values in the `additionalPaths` property can be a relative or absolute path. +When a relative path is used it will be resolved relative to the location of +global.json. These values also support the following substitutions: + +- `"$user$"`: this matches the user-local installation point of .NET SDK for the +current operating system: `%LocalAppData%\Microsoft\dotnet` on Windows and +`$HOME/.dotnet` on Linux/macOS. +- `"$machine$"`: this matches the machine installation point of .NET for the +current operating system. +- `%VARIABLE%/$VARIABLE`: environment variables will be substituted. Either the +Windows or Unix format can be used here and will be normalized for the operating +system the host resolver executes on. + +This design requires us to only change the host resolver. That means other +tooling like Visual Studio, VS Code, MSBuild, etc ... would transparently +benefit from this change. Repositories could update global.json to have +`additionalPaths` support `.dotnet` and Visual Studio would automatically find +it without any design changes. ## Considerations + ### Installation Points -One item to keep in mind when considering this area is the .NET SDK can be installed in many locations. The most common are: + +One item to keep in mind when considering this area is the .NET SDK can be +installed in many locations. The most common are: - Machine wide -- User wide: `%LocalAppData%\Microsoft\dotnet` on Windows and `$HOME/.dotnet` on Linux/macOS. +- User wide: `%LocalAppData%\Microsoft\dotnet` on Windows and `$HOME/.dotnet` +on Linux/macOS. - Repo: `.dotnet` -Our installation tooling tends to avoid redundant installations. For example, if restoring a repository that requires 7.0.400, the tooling will not install it locally if 7.0.400 is installed machine wide. It also will not necessarily delete the local `.dotnet` folder or the user wide folder. That means developers end up with .NET SDK installs in all three locations but only the machine wide install has the correct ones. +Our installation tooling tends to avoid redundant installations. For example, if +restoring a repository that requires 7.0.400, the tooling will not install it +locally if 7.0.400 is installed machine wide. It also will not necessarily +delete the local `.dotnet` folder or the user wide folder. That means developers +end up with .NET SDK installs in all three locations but only the machine wide +install has the correct ones. -As a result solutions like "just use .dotnet, if it exists" fall short. It will work in a lot of cases but will fail in more complex scenarios. 
To completely close the disconnect here we need to consider all the possible locations. +As a result solutions like "just use .dotnet, if it exists" fall short. It will +work in a lot of cases but will fail in more complex scenarios. To completely +close the disconnect here we need to consider all the possible locations. ### Do we need additionalPathsOnly? -The necessity of this property is questionable. The design includes it for completeness and understanding that the goal of some developers is complete isolation from machine state. As long as we're considering designs that embrace local deployment, it seemed sensible to extend the design to embrace _only_ local deployment. -At the same time the motivation for this is much smaller. It would be reasonable to cut this from the design and consider it at a future time when the motivation is higher. +The necessity of this property is questionable. The design includes it for +completeness and understanding that the goal of some developers is complete +isolation from machine state. As long as we're considering designs that embrace +local deployment, it seemed sensible to extend the design to embrace _only_ +local deployment. + +At the same time the motivation for this is much smaller. It would be reasonable +to cut this from the design and consider it at a future time when the motivation +is higher. ### Best match or first match? -This proposal is designed at giving global.json more control over how SDKs are found. If the global.json asked for a specific path to be considered and it has a matching SDK but a different SDK was chosen, that seems counter intuitive. Even in the case where the chosen SDK was _better_. This is a motivating scenario for CI where certainty around SDK is often more desirable than _better_. This is why the host discovery stops at first match vs. looking at all location and choosing the best match. -Best match is a valid approach though. Can certainly see the argument for some customers wanting that. Feel like it cuts against the proposal a bit because it devalues `additionalPaths` a bit. If the resolver is switched to best match then feel like the need for `additionalPathsOnly` is much stronger. There would certainly be a customer segment that wanted to isolate from machine state in that case. +This proposal is designed at giving global.json more control over how SDKs are +found. If the global.json asked for a specific path to be considered and it has +a matching SDK but a different SDK was chosen, that seems counter intuitive. +Even in the case where the chosen SDK was _better_. This is a motivating +scenario for CI where certainty around SDK is often more desirable than +_better_. This is why the host discovery stops at first match vs. looking at +all location and choosing the best match. + +Best match is a valid approach though. Can certainly see the argument for some +customers wanting that. Feel like it cuts against the proposal a bit because it +devalues `additionalPaths` a bit. If the resolver is switched to best match then +feel like the need for `additionalPathsOnly` is much stronger. There would +certainly be a customer segment that wanted to isolate from machine state in +that case. ### Other Designs -https://github.com/dotnet/designs/blob/main/accepted/2022/version-selection.md#local-dotnet -This is a proposal similar in nature to this one. There are a few differences: +[This is a proposal][designs-other] similar in nature to this one. There are a +few differences: -1. 
This proposal is more configurable and supports all standard local installation points, not just the `.dotnet` variant. -2. This proposal doesn't change what SDK is chosen: the rules for global.json on what SDKs are allowed still apply. It simply changes the locations where the SDK is looked for. -3. No consideration for changing the command line. This is completely driven through global.json changes. +1. This proposal is more configurable and supports all standard local +installation points, not just the `.dotnet` variant. +2. This proposal doesn't change what SDK is chosen: the rules for global.json +on what SDKs are allowed still apply. It simply changes the locations where the +SDK is looked for. +3. No consideration for changing the command line. This is completely driven +through global.json changes. Otherwise the proposals are very similar in nature. +[global-json-schema]: https://learn.microsoft.com/en-us/dotnet/core/tools/global-json#globaljson-schema +[example-scripts-razor]: https://github.com/dotnet/razor/pull/9550 +[example-scripts-build]: https://github.com/dotnet/sdk/blob/518c60dbe98b51193b3a9ad9fc44e055e6e10fa0/documentation/project-docs/developer-guide.md?plain=1#L38 +[example-scripts-dotnet]: https://github.com/dotnet/runtime/blob/main/dotnet.cmd +[cases-sdk-issue]: https://github.com/dotnet/sdk/issues/8254 +[cases-internal-discussion]: https://teams.microsoft.com/l/message/19:ed7a508bf00c4b088a7760359f0d0308@thread.skype/1698341652961?tenantId=72f988bf-86f1-41af-91ab-2d7cd011db47&groupId=4ba7372f-2799-4677-89f0-7a1aaea3706c&parentMessageId=1698341652961&teamName=.NET%20Developer%20Experience&channelName=InfraSwat&createdTime=1698341652961 +[cases-vscode]: https://github.com/dotnet/vscode-csharp/issues/6471 +[designs-other]: https://github.com/dotnet/designs/blob/main/accepted/2022/version-selection.md#local-dotnet + From 3a62442e1ca0fc71b32c27e0ee1961e8df4eb14e Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Wed, 31 Jan 2024 10:10:58 -0800 Subject: [PATCH 086/108] Feedback --- .vscode/settings.json | 6 ++++-- proposed/local-sdk-global-json.md | 17 ++++++++++++----- 2 files changed, 16 insertions(+), 7 deletions(-) diff --git a/.vscode/settings.json b/.vscode/settings.json index 6052b0607..7f06c1459 100644 --- a/.vscode/settings.json +++ b/.vscode/settings.json @@ -1,5 +1,7 @@ { "[markdown]": { - "editor.rulers": [80] - } + "editor.rulers": [80], + "editor.wordWrap": "bounded", + "editor.wordWrapColumn": 80 + }, } diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index 3fa1bcc84..f8eeba370 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -133,10 +133,8 @@ global.json. These values also support the following substitutions: current operating system: `%LocalAppData%\Microsoft\dotnet` on Windows and `$HOME/.dotnet` on Linux/macOS. - `"$machine$"`: this matches the machine installation point of .NET for the -current operating system. -- `%VARIABLE%/$VARIABLE`: environment variables will be substituted. Either the -Windows or Unix format can be used here and will be normalized for the operating -system the host resolver executes on. +current operating system. This should match how the product is +[installed][installation-doc] on the current operating system. This design requires us to only change the host resolver. That means other tooling like Visual Studio, VS Code, MSBuild, etc ... would transparently @@ -196,6 +194,15 @@ feel like the need for `additionalPathsOnly` is much stronger. 
There would certainly be a customer segment that wanted to isolate from machine state in that case. +### Environment variables + +Previous versions of this proposal included support for using environment +variables inside `additionalPaths`. This was removed due to lack of motivating +scenarios and potential for creating user confusion as different machines can +reasonably have different environment variables. + +This could be reconsidered if motivating scenarios are found. + ### Other Designs [This is a proposal][designs-other] similar in nature to this one. There are a @@ -219,4 +226,4 @@ Otherwise the proposals are very similar in nature. [cases-internal-discussion]: https://teams.microsoft.com/l/message/19:ed7a508bf00c4b088a7760359f0d0308@thread.skype/1698341652961?tenantId=72f988bf-86f1-41af-91ab-2d7cd011db47&groupId=4ba7372f-2799-4677-89f0-7a1aaea3706c&parentMessageId=1698341652961&teamName=.NET%20Developer%20Experience&channelName=InfraSwat&createdTime=1698341652961 [cases-vscode]: https://github.com/dotnet/vscode-csharp/issues/6471 [designs-other]: https://github.com/dotnet/designs/blob/main/accepted/2022/version-selection.md#local-dotnet - +[installation-doc]: https://github.com/dotnet/designs/blob/main/accepted/2021/install-location-per-architecture.md From cb6cb7c6d8f90d882e9b845a9bbfa562ee7d9696 Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Wed, 31 Jan 2024 17:06:02 -0800 Subject: [PATCH 087/108] PR feedback --- proposed/local-sdk-global-json.md | 84 +++++++++++++------------------ 1 file changed, 35 insertions(+), 49 deletions(-) diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index f8eeba370..617d65562 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -8,18 +8,25 @@ This proposal adds two new properties to the `sdk` object in ```json { "sdk": { - "additionalPaths": [ ".dotnet" ], - "additionalPathsOnly": false, + "paths": [ ".dotnet", "$host" ], + "errorMessage": "The .NET SDK could not be found, please run ./install.sh." } } ``` -These properties will be considered by the resolver host when attempting to -locate a compatible .NET SDK. This particular configuration would cause the -local directory `.dotnet` to be considered _in addition_ to the current set of -locations. +These properties will be considered by the resolver host during .NET SDK +resolution. The `paths` property lists the locations that the resolver should +consider when attempting to locate a compatible .NET SDK. The `errorMessage` +property controls what the resolver displays when it cannot find a compatible +.NET SDK. + +This particular configuration would cause the local directory `.dotnet` to be +considered _in addition_ to the current set of locations. Further if resolution +failed the resolver would display the contents of `errorMessage` instead of +the default error message. ## Motivation + There is currently a disconnect between the ways the .NET SDK is deployed in practice and what the host resolver can discover when searching for compatible SDKs. By default the host resolver is only going to search for SDKs in machine @@ -80,32 +87,34 @@ because our resolver can't find them. The global.json file will support two new properties under the `sdk` object: -- `"additionalPaths"`: this is a list of paths that the host resolver should -consider when looking for compatible SDKs. Relative paths will be interpreted -relative to the global.json file. 
-- `"additionalPathsOnly"`: when `true` the resolver will _only_ consider paths -in `additionalPaths`. It will not consider any machine wide locations (unless -they are specified in `additionalPaths`). The default for this property is -`false`. +- `"paths"`: this is a list of paths that the host resolver should +consider when looking for compatible SDKs. In the case this property is `null` +or not specified, the host resolver will behave as it does today. +- `"errorMessage"`: when the host resolver cannot find a compatible .NET SDK it +will display the contents of this property instead of the default error message. +In the case this property is `null` or not specified, the current error message +will be displayed. -The `additionalPaths` works similar to how multi-level lookup works. It adds -additional locations that the host resolver should consider when trying to -resolve a compatible .NET SDK. For example: +The values in the `paths` property can be a relative path, absolute path or +`$host$`. When a relative path is used it will be resolved relative to the +location of the containing global.json. The value `$host$` is a special value +that represents the machine wide installation path of .NET SDK for the +[current host][installation-doc]. + +The values in `paths` are considered in the order they are defined. The host +resolver will stop when it finds the first path with a compatible .NET SDK. +For example: ```json { "sdk": { - "additionalPaths": [ ".dotnet" ], + "paths": [ ".dotnet", "$host$" ], } } ``` -In this configuration the host resolver would find a compatible SDK, if it -exists in `.dotnet` or a machine wide location. The host resolver will consider -the `additionalPaths` in the order they are defined and will stop at the first -match. If none of the locations have a matching SDK then the host resolver will -fall back to the existing SDK resolution strategy: `%PATH%` followed by machine -wide installations. +In this configuration the host resolver would find a compatible .NET SDK, if it +exists in `.dotnet` or a machine wide location. This lookup will stop on the first match which means it won't necessarily find the best match. Consider a scenario with a global.json that has: @@ -113,7 +122,7 @@ the best match. Consider a scenario with a global.json that has: ```json { "sdk": { - "additionalPaths": [ ".dotnet" ], + "paths": [ ".dotnet", "$host$" ], "version": "7.0.200", "rollForward": "latestFeature" } @@ -122,20 +131,9 @@ the best match. Consider a scenario with a global.json that has: In a scenario where the `.dotnet` directory had 7.0.200 SDK but there was a machine wide install of 7.0.300 SDK, the host resolver would pick 7.0.200 out -of `.dotnet`. That location is considered first, it has a matching SDK and +of `.dotnet`. That location is considered first, it has a matching .NET SDK and hence discovery stops there. -The values in the `additionalPaths` property can be a relative or absolute path. -When a relative path is used it will be resolved relative to the location of -global.json. These values also support the following substitutions: - -- `"$user$"`: this matches the user-local installation point of .NET SDK for the -current operating system: `%LocalAppData%\Microsoft\dotnet` on Windows and -`$HOME/.dotnet` on Linux/macOS. -- `"$machine$"`: this matches the machine installation point of .NET for the -current operating system. This should match how the product is -[installed][installation-doc] on the current operating system. 
- This design requires us to only change the host resolver. That means other tooling like Visual Studio, VS Code, MSBuild, etc ... would transparently benefit from this change. Repositories could update global.json to have @@ -165,18 +163,6 @@ As a result solutions like "just use .dotnet, if it exists" fall short. It will work in a lot of cases but will fail in more complex scenarios. To completely close the disconnect here we need to consider all the possible locations. -### Do we need additionalPathsOnly? - -The necessity of this property is questionable. The design includes it for -completeness and understanding that the goal of some developers is complete -isolation from machine state. As long as we're considering designs that embrace -local deployment, it seemed sensible to extend the design to embrace _only_ -local deployment. - -At the same time the motivation for this is much smaller. It would be reasonable -to cut this from the design and consider it at a future time when the motivation -is higher. - ### Best match or first match? This proposal is designed at giving global.json more control over how SDKs are @@ -197,7 +183,7 @@ that case. ### Environment variables Previous versions of this proposal included support for using environment -variables inside `additionalPaths`. This was removed due to lack of motivating +variables inside `paths`. This was removed due to lack of motivating scenarios and potential for creating user confusion as different machines can reasonably have different environment variables. From d7b311b42752779bfc6cd48f465a660b4700ac12 Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Wed, 31 Jan 2024 17:07:43 -0800 Subject: [PATCH 088/108] Remove unused file --- designs.sln | 30 ------------------------------ 1 file changed, 30 deletions(-) delete mode 100644 designs.sln diff --git a/designs.sln b/designs.sln deleted file mode 100644 index bfa4a19b7..000000000 --- a/designs.sln +++ /dev/null @@ -1,30 +0,0 @@ - -Microsoft Visual Studio Solution File, Format Version 12.00 -# Visual Studio Version 17 -VisualStudioVersion = 17.5.002.0 -MinimumVisualStudioVersion = 10.0.40219.1 -Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "tools", "tools", "{7ED864FD-FEA8-49A4-8196-9A9BA39499B4}" -EndProject -Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "update-index", "tools\update-index\update-index.csproj", "{524C2FE5-739D-4F3D-847F-17D0CA46EC69}" -EndProject -Global - GlobalSection(SolutionConfigurationPlatforms) = preSolution - Debug|Any CPU = Debug|Any CPU - Release|Any CPU = Release|Any CPU - EndGlobalSection - GlobalSection(ProjectConfigurationPlatforms) = postSolution - {524C2FE5-739D-4F3D-847F-17D0CA46EC69}.Debug|Any CPU.ActiveCfg = Debug|Any CPU - {524C2FE5-739D-4F3D-847F-17D0CA46EC69}.Debug|Any CPU.Build.0 = Debug|Any CPU - {524C2FE5-739D-4F3D-847F-17D0CA46EC69}.Release|Any CPU.ActiveCfg = Release|Any CPU - {524C2FE5-739D-4F3D-847F-17D0CA46EC69}.Release|Any CPU.Build.0 = Release|Any CPU - EndGlobalSection - GlobalSection(SolutionProperties) = preSolution - HideSolutionNode = FALSE - EndGlobalSection - GlobalSection(NestedProjects) = preSolution - {524C2FE5-739D-4F3D-847F-17D0CA46EC69} = {7ED864FD-FEA8-49A4-8196-9A9BA39499B4} - EndGlobalSection - GlobalSection(ExtensibilityGlobals) = postSolution - SolutionGuid = {A3EEE113-2E4D-478E-B28F-37A8D74AAF3A} - EndGlobalSection -EndGlobal From 481efede11973636b326ed34174846ab24546ebc Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Wed, 31 Jan 2024 17:18:28 -0800 Subject: [PATCH 089/108] Update index --- INDEX.md 
| 1 + 1 file changed, 1 insertion(+) diff --git a/INDEX.md b/INDEX.md index a197805ae..7efc8fcc2 100644 --- a/INDEX.md +++ b/INDEX.md @@ -95,6 +95,7 @@ Use update-index to regenerate it: |Year|Title|Owners| |----|-----|------| +| | [Provide SDK hint paths in global.json](proposed/local-sdk-global-json.md) | | | | [Rate limits](proposed/rate-limit.md) | [John Luo](https://github.com/juntaoluo), [Sourabh Shirhatti](https://github.com/shirhatti) | | | [Readonly references in C# and IL verification.](proposed/verifiable-ref-readonly.md) | | | | [Ref returns in C# and IL verification.](proposed/verifiable-ref-returns.md) | | From 26987e39e3f222ce5e78b937375009abecab696c Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Thu, 1 Feb 2024 09:37:20 -0800 Subject: [PATCH 090/108] Apply suggestions from code review Co-authored-by: Rolf Bjarne Kvinge Co-authored-by: Igor Velikorossov --- proposed/local-sdk-global-json.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index 617d65562..a0f97b1a7 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -65,22 +65,22 @@ VS so it can find the deployed .NET SDKs. These scripts are not one offs, they are increasingly common items in repos in `github.com/dotnet` to attempt to fix the disconnect. Even so many of these solutions are incomplete because they themselves only consider local deployment. -They dont't fully support the full set of ways the SDK can be deployed. +They don't fully support the full set of ways the SDK can be deployed. This problem also manifests in how customers naturally want to use our development tools like Visual Studio or VS Code. It's felt sharply on the .NET team, or any external customer who wants to contribute to .NET, due to how -arcade infrastructure uses xcopy deployment into `.dotnet`. External teams +.NET Arcade infrastructure uses xcopy deployment into `.dotnet`. External teams like Unity also feel this pain in their development: - This [issue][cases-sdk-issue] from 2017 attempting -to solve this problsem. It gets several hits a year from customers who are +to solve this problem. It gets several hits a year from customers who are similarly struggling with our toolings inability to handle local deployment. - This [internal discussion][cases-internal-discussion] from a C# team member. They wanted to use VS as the product is shipped to customers and got blocked when we shipped an SDK that didn't have a corresponding MSI and hence VS couldn't load Roslyn anymore. -- [VS code][cases-vscode] having to adjust to consider local directories for SDK +- [VS Code][cases-vscode] having to adjust to consider local directories for SDK because our resolver can't find them. ## Detailed Design @@ -92,7 +92,7 @@ consider when looking for compatible SDKs. In the case this property is `null` or not specified, the host resolver will behave as it does today. - `"errorMessage"`: when the host resolver cannot find a compatible .NET SDK it will display the contents of this property instead of the default error message. -In the case this property is `null` or not specified, the current error message +In the case this property is `null` or not specified, the default error message will be displayed. 
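For concreteness, a sketch of the two properties used together with the existing `version` key (the SDK version number and the `./install.sh` message are illustrative, following the example earlier in this proposal):

```json
{
  "sdk": {
    "version": "8.0.100",
    "paths": [ ".dotnet", "$host$" ],
    "errorMessage": "The .NET SDK could not be found, please run ./install.sh."
  }
}
```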
The values in the `paths` property can be a relative path, absolute path or From eb96a2a21197b2d92745f8c3f1118fd82322faf1 Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Thu, 1 Feb 2024 13:49:54 -0800 Subject: [PATCH 091/108] Exclude IPNetwork --- accepted/2023/net8.0-polyfills/net8.0-polyfills.md | 8 -------- 1 file changed, 8 deletions(-) diff --git a/accepted/2023/net8.0-polyfills/net8.0-polyfills.md b/accepted/2023/net8.0-polyfills/net8.0-polyfills.md index c4d6fa5ce..6e5ca7849 100644 --- a/accepted/2023/net8.0-polyfills/net8.0-polyfills.md +++ b/accepted/2023/net8.0-polyfills/net8.0-polyfills.md @@ -10,20 +10,12 @@ which includes .NET Standard and .NET Framework. | Polyfill | Assembly | Package | Existing? | API | Contacts | | ------------ | -------------------------- | -------------------------- | --------- | ---------------------- | -------------------------- | | TimeProvider | Microsoft.Bcl.TimeProvider | Microsoft.Bcl.TimeProvider | No | [dotnet/runtime#36617] | [@tarekgh] [@geeknoid] | -| IPNetwork | Microsoft.Bcl.IPNetwork | Microsoft.Bcl.IPNetwork | No | [dotnet/runtime#79946] | [@antonfirsov] [@geeknoid] | * `TimeProvider`. This type is an abstraction for the current time and time zone. In order to be useful, it's an exchange type that needs to be plumbed through several layers, which includes framework code (such as `Task.Delay`) and user code. -* `IPNetwork`. It's a utilitarian type that is used across several parties. Not - necessarily a critical exchange type but if we don't ship it downlevel, - parties (such as [@geeknoid]'s team) will end up shipping their own copy and - wouldn't use the framework provided type, even on .NET 8.0. - [@tarekgh]: https://github.com/tarekgh [@geeknoid]: https://github.com/geeknoid -[@antonfirsov]: https://github.com/antonfirsov [dotnet/runtime#36617]: https://github.com/dotnet/runtime/issues/36617 -[dotnet/runtime#79946]: https://github.com/dotnet/runtime/issues/79946 From ee9093fcf8aa0168188c3aae31279b1b06b38c28 Mon Sep 17 00:00:00 2001 From: Andy Gocke Date: Thu, 1 Feb 2024 19:46:05 -0800 Subject: [PATCH 092/108] Update proposed/local-sdk-global-json.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Alexander Köplinger --- proposed/local-sdk-global-json.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index a0f97b1a7..5ebb6e357 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -8,7 +8,7 @@ This proposal adds two new properties to the `sdk` object in ```json { "sdk": { - "paths": [ ".dotnet", "$host" ], + "paths": [ ".dotnet", "$host$" ], "errorMessage": "The .NET SDK could not be found, please run ./install.sh." } } From 2b926e83ddacf1f326e3c5430903635d8243be8d Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Mon, 5 Feb 2024 10:26:48 -0800 Subject: [PATCH 093/108] Apply suggestions from code review Co-authored-by: Elinor Fung --- proposed/local-sdk-global-json.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index 5ebb6e357..c55f60005 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -29,15 +29,17 @@ the default error message. There is currently a disconnect between the ways the .NET SDK is deployed in practice and what the host resolver can discover when searching for compatible -SDKs. 
By default the host resolver is only going to search for SDKs in machine -wide locations: `C:\Program Files\dotnet`, `%PATH%`, etc ... The .NET SDK +SDKs. By default the host resolver is only going to search for SDKs next to +the running `dotnet`. This often means machine-wide locations, since users +and tools typically rely on `dotnet` already being on the user's path when +launching, instead of specifying a full path to the executable. The .NET SDK though is commonly deployed to local locations: `%LocalAppData%\Microsoft\dotnet`, `$HOME/.dotnet`. Many repos embrace this and restore the correct .NET for their builds into a local `.dotnet` directory. The behavior of the host resolver is incompatible with local based deployments. It will not find these deployments without additional environment variable -configuration and instead search in machine wide locations. That means tools +configuration and only search next to the running `dotnet`. That means tools like Visual Studio and VS Code simply do not work with local deployment by default. Developers must take additional steps like manipulating `%PATH%` before launching these editors. That reduces the usefulness of items like the quick @@ -137,8 +139,8 @@ hence discovery stops there. This design requires us to only change the host resolver. That means other tooling like Visual Studio, VS Code, MSBuild, etc ... would transparently benefit from this change. Repositories could update global.json to have -`additionalPaths` support `.dotnet` and Visual Studio would automatically find -it without any design changes. +`paths` support `.dotnet` and Visual Studio would automatically find it without +any design changes. ## Considerations @@ -175,8 +177,8 @@ all location and choosing the best match. Best match is a valid approach though. Can certainly see the argument for some customers wanting that. Feel like it cuts against the proposal a bit because it -devalues `additionalPaths` a bit. If the resolver is switched to best match then -feel like the need for `additionalPathsOnly` is much stronger. There would +devalues `paths` a bit. If the resolver is switched to best match then, the need +for configuration around best versus first match is much stronger. There would certainly be a customer segment that wanted to isolate from machine state in that case. From 0c8ad2ef858ce61f391d5e38731af26884cbf396 Mon Sep 17 00:00:00 2001 From: Jared Parsons Date: Tue, 6 Feb 2024 10:35:03 -0800 Subject: [PATCH 094/108] PR feedback --- proposed/local-sdk-global-json.md | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/proposed/local-sdk-global-json.md b/proposed/local-sdk-global-json.md index c55f60005..2523a2c4f 100644 --- a/proposed/local-sdk-global-json.md +++ b/proposed/local-sdk-global-json.md @@ -182,6 +182,24 @@ for configuration around best versus first match is much stronger. There would certainly be a customer segment that wanted to isolate from machine state in that case. +### dotnet exec + +This proposal only impacts how .NET SDK commands do runtime discovery. The +command `dotnet exec` is not an .NET SDK command but instead a way to invoke +the app directly using the runtime installed with `dotnet`. + +It is reasonable for complex builds to build and use small tools. For example +building a tool for linting the build, running complex validation, etc ... To +work with local SDK discovery these builds need to leverage `dotnet run` to +execute such tools instead of `dotnet exec`. 
+ +```cmd +# Avoid +> dotnet exec artifacts/bin/MyTool/Release/net8.0/MyTool.dll +# Prefer +> dotnet run --no-build --framework net7.0 src/Tools/MyTool/MyTool.csproj +``` + ### Environment variables Previous versions of this proposal included support for using environment From 8d232d66a44069120cc746664aa96612d053d9fd Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 18 Apr 2024 13:43:24 +0200 Subject: [PATCH 095/108] more --- accepted/2023/wasm-browser-threads.md | 705 +++++++++++--------------- 1 file changed, 299 insertions(+), 406 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index c4a15398c..4d899add5 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -1,6 +1,14 @@ # Multi-threading on a browser -## Goals +## Table of content +- [Goals](#goals) +- [Key ideas](#key-ideas) +- [State April 2024](#state-2024-april) +- [Design details](#design-details) +- [State September 2023](#state-2023-sep) +- [Alternatives](#alternatives---as-considered-2023-sep) + +# Goals - CPU intensive workloads on dotnet thread pool. - Allow user to start new managed threads using `new Thread` and join it. - Add new C# API for creating web workers with JS interop. Allow JS async/promises via external event loop. @@ -31,7 +39,7 @@ † Note: all the text below discusses MT build only, unless explicit about ST build. -## Key idea in this proposal +# Key ideas Move all managed user code out of UI/DOM thread, so that it becomes consistent with all other threads. @@ -54,10 +62,6 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w - It eats your battery - Browser will kill your tab at random point (Aw, snap). - It's not deterministic and you can't really test your app to prove it harmless. -- Firefox (still) has synchronous `XMLHttpRequest` which could be captured by async code in service worker - - it's [deprecated legacy API](https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/Synchronous_and_Asynchronous_Requests#synchronous_request) - - [but other browsers don't](https://wpt.fyi/results/service-workers/service-worker/fetch-request-xhr-sync.https.html?label=experimental&label=master&aligned) and it's unlikely they will implement it - - there are deployment and security challenges with it - all the other threads/workers could synchronously block - `Atomics.wait()` works as expected - if we will have managed thread on the UI thread, any `lock` or Mono GC barrier could cause spin-wait @@ -85,38 +89,54 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w **7)** There could be pending HTTP promise (which needs browser event loop to resolve) and blocking `.Wait` on the same thread and same task/chain. Leading to deadlock. -# Summary +# State 2024 April -## (14) Deputy + emscripten dispatch to UI + JSWebWorker + without sync JSExport +## What was implemented in Net9 - Deputy thread design -This is Pavel's preferred design based on experiments and tests so far. -For other possible design options [see below](#Interesting-combinations). +For other possible design options we considered [see below](#Alternatives). 
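For orientation, the JS interop surface referred to throughout this section is the source-generated `[JSImport]`/`[JSExport]` interop in `System.Runtime.InteropServices.JavaScript`. A minimal sketch of the kind of declarations whose threading behavior the list below summarizes (the class, method, and JS function names are illustrative, not part of the design):

```csharp
using System.Runtime.InteropServices.JavaScript;
using System.Threading.Tasks;

public static partial class ExampleInterop
{
    // Asynchronous import: the returned Task completes when the underlying
    // JS Promise resolves, so the calling thread's event loop is not blocked.
    [JSImport("globalThis.fetch")]
    internal static partial Task<JSObject> FetchAsync(string url);

    // Asynchronous export: JS callers receive a Promise while managed code runs.
    [JSExport]
    internal static async Task<string> GreetAsync(string name)
    {
        await Task.Yield();
        return $"Hello, {name}!";
    }
}
```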
-- Emscripten startup on UI thread - - C functions of emscripten -- MonoVM startup on UI thread +- Introduce dedicated web worker called "deputy thread" + - managed `Main()` is dispatched onto deputy thread +- MonoVM startup on deputy thread - non-GC C functions of mono are still available - - there is risk that UI will be suspended by pending GC - - it keeps `renderBatch` working as is - - it could be later optimized for purity to **(16)**. Pavel would like this. - - the mono startup is CPU heavy and it blocks rendering even for server side rendered UI. - - but it's difficult to get rid of many mono [C functions we currently use](#Move-Mono-startup-to-deputy) -- managed `Main()` would be dispatched onto dedicated web worker called "deputy thread" - - because the UI thread would be mostly idling, it could +- Emscripten startup stays on UI thread + - C functions of emscripten + - download of assets and into WASM memory +- UI/DOM thread + - because the UI thread would be mostly idling, it could: - render UI, keep debugger working - dynamically create pthreads -- sync JSExports would not be supported on UI thread - - later sync calls could opt-in and we implement **(13)** via spin-wait -- JS interop only on dedicated `JSWebWorker` - -## Sidecar options - -There are few downsides to them -- if we keep main managed thread and emscripten thread the same, pthreads can't be created dynamically - - we could upgrade it to design **(15)** and have extra thread for running managed `Main()` -- we will have to implement extra layer of dispatch from UI to sidecar - - this could be pure JS via `postMessage`, which is slow and can't do spin-wait. - - we could have `SharedArrayBuffer` for the messages, but we would have to implement (another?) marshaling. + - UI thread stays attached to Mono VM for Blazor's reasons (for Net9) + - it keeps `renderBatch` working as is, bu it's far from ideal + - there is risk that UI could be suspended by pending GC + - It would be ideal change Blazor so that it doesn't touch managed objects via naked pointers during render. + - we strive to detach the UI thread from Mono +- I/O thread + - is helper thread which allows `Task` to be resolved by UI's `Promise` even when deputy thread is blocked in `.Wait` +- JS interop from any thread is marshaled to UI thread's JavaScript +- HTTP and WS clients are implemented in JS of UI thread +- There is draft of `JSWebWorker` API + - it allows C# users to create dedicated JS thread + - the `JSImport` calls are dispatched to it if you are on the that thread + - or if you pass `JSObject` proxy with affinity to that thread as `JSImport` parameter. + - The API was not made public in Net9 yet +- calling synchronous `JSExports` is not supported on UI thread + - this could be changed by configuration option but it's dangerous. +- calling asynchronous `JSExports` is supported +- calling asynchronous `JSImport` is supported +- calling synchronous `JSImport` is supported without synchronous callback to C# +- Strings are marshaled by value + - as opposed to by reference optimization we have in single-threaded build +- Emscripten VFS and other syscalls + - file system operations are single-threaded and always marshaled to UI thread +- Emscripten pool of pthreads + - browser threads are expensive (as compared to normal OS) + - creation of `WebWorker` requires UI thread to do it + - there is quite complex and slow setup for `WebWorker` to become pthread and then to attach as Mono thread. 
+ - that's why Emscripten pre-allocates pthreads + - this allows `pthread_create` to be synchronous and faster + +# Design details ## Define terms - UI thread @@ -148,23 +168,8 @@ There are few downsides to them - we already have prototype of the similar functionality - which can spin-wait -# Details - -## JSImport and marshaled JS functions -- both sync and async could be called on all `JSWebWorker` threads -- both sync and async could be called on main managed thread (even when running on UI) - - unless there is loop back to blocking `JSExport`, it could not deadlock - -## JSExport & C# delegates -- async could be called on all `JSWebWorker` threads -- sync could be called on `JSWebWorker` -- sync could be called on from UI thread is problematic - - with spin-wait in UI in JS it has **2)** problems - - with spin-wait in UI when emscripten is there could also deadlock the rest of the app - - this means that combination of sync JSExport and deputy design is dangerous - ## Proxies - thread affinity -- all of them have thread affinity +- all proxies of JS objects have thread affinity - all of them need to be used and disposed on correct thread - how to dispatch to correct thread is one of the questions here - all of them are registered to 2 GCs @@ -204,6 +209,8 @@ There are few downsides to them - there is `JSSynchronizationContext`` installed on it - so that user code could dispatch back to it, in case that it needs to call `JSObject` proxy (with thread affinity) - this thread needs to throw on any `.Wait` because of the problem **7** +- alternatively we could disable C# code on this thread and treat it similar to UI thread +- alternatively we could have I/O threads ## HTTP and WS clients - are implemented in terms of `JSObject` and `Promise` proxies @@ -219,24 +226,251 @@ There are few downsides to them - other unknowing users are `XmlUrlResolver`, `XmlDownloadManager`, `X509ResourceClient`, ... - because we could have blocking wait now, we could also implement synchronous APIs of HTTP/WS - so that existing user code bases would just work without change - - at the moment they throw PNSE - this would also require separate thread, doing the async job + - we could use I/O thread for it ## JSImport calls on threads without JSWebWorker - those are - thread-pool threads - main managed thread in deputy design -- what should happen when it calls JSImport directly ? -- what should happen when it calls HTTP/WS clients ? 
-- we could dispatch it to UI thread +- we dispatch it to UI thread - easy to understand default behavior - - downside is blocking the UI and emscripten loops with CPU intensive activity - - in sidecar design, also extra copy of buffers -- we could instead create dedicated `JSWebWorker` managed thread - - more difficult to reason about - - this extra worker could also serve all the sync-to-async jobs -# Dispatching call, who is responsible +## Performance +As compared to ST build for dotnet wasm: +- the dispatch between threads (caused by JS object thread affinity) will have negative performance impact on the JS interop +- in case of HTTP/WS clients used via Streams, it could be surprizing +- browser performance is lower when working with SharedArrayBuffer +- Mono performance is lower because there are GC safe-points and locks in the VM code +- startup is slower because creation of WebWorker instances is slow +- VFS access is slow because it's dispatched to UI thread +- console output is slow because it's POSIX stream is dispatched to UI thread, call per line + +# State 2023 September + - we already ship MT version of the runtime in the wasm-tools workload. + - It's enabled by `true` and it requires COOP HTTP headers. + - It will serve extra file `dotnet.native.worker.js`. + - This will also start in Blazor project, but UI rendering would not work. + - we have pre-allocated pool of browser Web Workers which are mapped to pthread dynamically. + - we can configure pthread to keep running after synchronous thread_main finished. That's necessary to run any async tasks involving JavaScript interop. + - legacy interop has problems with GC boundaries. + - JSImport & JSExport work + - There is private JSSynchronizationContext implementation which is too synchronous + - There is draft of public C# API for creating JSWebWorker with JS interop. It must be dedicated un-managed resource, because we could not cleanup JS state created by user code. + - There is MT version of HTTP & WS clients, which could be called from any thread but it's also too synchronous implementation. + - Many unit tests fail on MT https://github.com/dotnet/runtime/pull/91536 + - there are MT C# ref assemblies, which don't throw PNSE for MT build of the runtime for blocking APIs. + +# Alternatives - as considered 2023 Sep +- how to deal with blocking C# code on UI thread + - **A)** pretend it's not a problem (this we already have) + - **B)** move user C# code to web worker + - **C)** move all Mono to web worker + - **D)** like **A)** just move call of the C# `Main()` to `JSWebWorker` +- how to deal with blocking in synchronous JS calls from UI thread (like `onClick` callback) + - **D)** pretend it's not a problem (this we already have) + - **E)** throw PNSE when synchronous JSExport is called on UI thread + - **F)** dispatch calls to synchronous JSExport to web worker and spin-wait on JS side of UI thread. 
+- how to implement JS interop between managed main thread and UI thread (DOM) + - **G)** put it out of scope for MT, manually implement what Blazor needs + - **H)** pure JS dispatch between threads, [comlink](https://github.com/GoogleChromeLabs/comlink) style + - **I)** C/emscripten dispatch of infrastructure to marshal individual parameters + - **J)** C/emscripten dispatch of method binding and invoke, but marshal parameters on UI thread + - **K)** pure C# dispatch between threads +- how to implement JS interop on non-main web worker + - **L)** disable it for all non-main threads + - **M)** disable it for managed thread pool threads + - **N)** allow it only for threads created as dedicated resource `WebWorker` via new API + - **O)** enables it on all workers (let user deal with JS state) +- how to dispatch calls to the right JS thread context + - **P)** via `SynchronizationContext` before `JSImport` stub, synchronously, stack frames + - **Q)** via `SynchronizationContext` inside `JSImport` C# stub + - **R)** via `emscripten_dispatch_to_thread_async` inside C code of `` +- how to implement GC/dispose of `JSObject` proxies + - **S)** per instance: synchronous dispatch the call to correct thread via `SynchronizationContext` + - **T)** per instance: async schedule the cleanup + - at the detach of the thread. We already have `forceDisposeProxies` + - could target managed thread be paused during GC ? +- where to instantiate initial user JS modules (like Blazor's) + - **U)** in the UI thread + - **V)** in the deputy/sidecar thread +- where to instantiate `JSHost.ImportAsync` modules + - **W)** in the UI thread + - **X)** in the deputy/sidecar thread + - **Y)** allow it only for dedicated `JSWebWorker` threads + - **Z)** disable it + - same for `JSHost.GlobalThis`, `JSHost.DotnetInstance` +- how to implement Blazor's `renderBatch` + - **a)** keep as is, wrap it with GC pause, use legacy JS interop on UI thread + - **b)** extract some of the legacy JS interop into Blazor codebase + - **c)** switch to Blazor server mode. Web worker create the batch of bytes and UI thread apply it to DOM +- where to create HTTP+WS JS objects + - **d)** in the UI thread + - **e)** in the managed main thread + - **f)** in first calling `JSWebWorker` managed thread +- how to dispatch calls to HTTP+WS JS objects + - **g)** try to stick to the same thread via `ConfigureAwait(false)`. + - doesn't really work. 
`Task` migrate too freely + - **h)** via C# `SynchronizationContext` + - **i)** via `emscripten_dispatch_to_thread_async` + - **j)** via `postMessage` + - **k)** same whatever we choose for `JSImport` + - note there are some synchronous calls on WS +- where to create the emscripten instance + - **l)** could be on the UI thread + - **m)** could be on the "sidecar" thread +- where to start the Mono VM + - **n)** could be on the UI thread + - **o)** could be on the "sidecar" thread +- where to run the C# main entrypoint + - **p)** could be on the UI thread + - **q)** could be on the "deputy" or "sidecar" thread +- where to implement sync-to-async: crypto/DLL download/HTTP APIs/ + - **r)** out of scope + - **s)** in the UI thread + - **t)** in a dedicated web worker + - **z)** in the sidecar or deputy +- where to marshal JSImport/JSExport parameters/return/exception + - **u)** could be only values types, proxies out of scope + - **v)** could be on UI thread (with deputy design and Mono there) + - **w)** could be on sidecar (with double proxies of parameters via comlink) + - **x)** could be on sidecar (with comlink calls per parameter) + +## Interesting combinations + +### (8) Minimal support +- **A,D,G,L,P,S,U,Y,a,f,h,l,n,p,v** +- this is what we [already have today](#Current-state-2023-Sep) +- it could deadlock or die, +- JS interop on threads requires lot of user code attention +- Keeps problems **1,2,3,4** + +### (9) Sidecar + no JS interop + narrow Blazor support +- **C,E,G,L,P,S,U,Z,c,d,h,m,o,q,u** +- minimal effort, low risk, low capabilities +- move both emscripten and Mono VM sidecar thread +- no user code JS interop on any thread +- internal solutions for Blazor needs +- Ignores problems **1,2,3,4,5** + +### (10) Sidecar + only async just JS proxies UI + JSWebWorker + Blazor WASM server +- **C,E,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** +- no C or managed code on UI thread + - this architectural clarity is major selling point for sidecar design +- no support for blocking sync JSExport calls from UI thread (callbacks) + - it will throw PNSE +- this will create double proxy for `Task`, `JSObject`, `Func<>` etc + - difficult to GC, difficult to debug +- double marshaling of parameters +- Solves **1,2** for managed code. +- Avoids **1,2** for JS callback + - emscripten main loop stays responsive only when main managed thread is idle +- Solves **3,4,5** + +### (11) Sidecar + async & sync just JS proxies UI + JSWebWorker + Blazor WASM server +- **C,F,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** +- no C or managed code on UI thread +- support for blocking sync JSExport calls from UI thread (callbacks) + - at blocking the UI is at least well isolated from runtime code + - it makes responsibility for sync call clear +- this will create double proxy for `Task`, `JSObject`, `Func<>` etc + - difficult to GC, difficult to debug +- double marshaling of parameters +- Solves **1,2** for managed code + - unless there is sync `JSImport`->`JSExport` call +- Ignores **1,2** for JS callback + - emscripten main loop stays responsive only when main managed thread is idle +- Solves **3,4,5** + +### (12) Deputy + managed dispatch to UI + JSWebWorker + with sync JSExport +- **B,F,K,N,Q,S/T,U,W,a/b/c,d+f,h,l,n,s/z,v** +- this uses `JSSynchronizationContext` to dispatch calls to UI thread + - this is "dirty" as compared to sidecar because some managed code is actually running on UI thread + - it needs to also use `SynchronizationContext` for `JSExport` and callbacks, to dispatch to deputy. 
+- blazor render could be both legacy render or Blazor server style + - because we have both memory and mono on the UI thread +- Solves **1,2** for managed code + - unless there is sync `JSImport`->`JSExport` call +- Ignores **1,2** for JS callback + - emscripten main loop could deadlock on sync JSExport +- Solves **3,4,5** + +### (13) Deputy + emscripten dispatch to UI + JSWebWorker + with sync JSExport +- **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** +- is variation of **(12)** + - with emscripten dispatch and marshaling in UI thread +- this uses `emscripten_dispatch_to_thread_async` for `call_entry_point`, `complete_task`, `cwraps.mono_wasm_invoke_method_bound`, `mono_wasm_invoke_bound_function`, `mono_wasm_invoke_import`, `call_delegate_method` to get to the UI thread. +- it uses other `cwraps` locally on UI thread, like `mono_wasm_new_root`, `stringToMonoStringRoot`, `malloc`, `free`, `create_task_callback_method` + - it means that interop related managed runtime code is running on the UI thread, but not the user code. + - it means that parameter marshalling is fast (compared to sidecar) + - this deputy design is major selling point #2 + - it still needs to enter GC barrier and so it could block UI for GC run shortly +- blazor render could be both legacy render or Blazor server style + - because we have both memory and mono on the UI thread +- Solves **1,2** for managed code + - unless there is sync `JSImport`->`JSExport` call +- Ignores **1,2** for JS callback + - emscripten main loop could deadlock on sync JSExport +- Solves **3,4,5** + +### (14) Deputy + emscripten dispatch to UI + JSWebWorker + without sync JSExport +- **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** +- is variation of **(13)** + - without support for synchronous JSExport +- Solves **1,2** for managed code + - emscripten main loop stays responsive + - unless there is sync `JSImport`->`JSExport` call +- Avoids **2** for JS callback + - by throwing PNSE +- Solves **3,4,5** + +### (15) Deputy + Sidecar + UI thread +- 2 levels of indirection. +- benefit: blocking JSExport from UI thread doesn't block emscripten loop +- downside: complex and more resource intensive + +### (16) Deputy without Mono, no GC barrier breach for interop +- variation on **(13)** or **(14)** where we get rid of per-parameter calls to Mono +- benefit: get closer to purity of sidecar design without loosing perf + - this could be done later as purity optimization +- in this design the mono could be started on deputy thread + - this will keep UI responsive during startup +- UI would not be mono attached thread. 
+- See [details](#Get-rid-of-Mono-GC-boundary-breach) + +Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 + +### (17) Emscripten em_queue in deputy, managed UI thread +- is interesting because it avoids cross-thread dispatch to UI + - including double dispatch in Blazor's `RendererSynchronizationContext` +- avoids solving **1,2** +- low level hacking of emscripten design assumptions + +### (18) Soft deputy +- keep both Mono and emscripten in the UI thread +- use `SynchronizationContext` to do the dispatch +- make it easy and default to run any user code in deputy thread + - all Blazor events and callbacks like `onClick` to deputy + - move SignalR to deputy + - move Blazor entry point to deputy +- hope that UI thread is mostly idle + - enable dynamic thread allocation + - throw exceptions in dev loop when UI thread does `lock` or `.Wait` in user code + +### (19) Single threaded build in a WebWorker +- this already works well in Net8 +- when the developer is able to start dotnet in the worker himself and also handle all the messaging. +- there are known existing examples in the community + +## Sidecar options +There are few downsides to them +- if we keep main managed thread and emscripten thread the same, pthreads can't be created dynamically + - we could upgrade it to design **(15)** and have extra thread for running managed `Main()` +- we will have to implement extra layer of dispatch from UI to sidecar + - this could be pure JS via `postMessage`, which is slow and can't do spin-wait. + - we could have `SharedArrayBuffer` for the messages, but we would have to implement (another?) marshaling. + +## Dispatching call, who is responsible - User code - this is difficult and complex task which many will fail to do right - it can't be user code for HTTP/WS clients because there is no direct call via Streams @@ -254,9 +488,8 @@ There are few downsides to them - this is just the UI -> deputy dispatch, which is not C# code - Mono/C/JS internal layer - see `emscripten_dispatch_to_thread_async` below -- TODO: API SynCContext as parameter of `JSImport` -# Dispatching JSImport - what should happen +## Dispatching JSImport - what should happen - when there is no extra code-gen flag - for backward compatibility, dispatch handled by user - assert that we are on `JSWebWorker` or main thread @@ -272,7 +505,7 @@ There are few downsides to them - assert all parameters have same affinity - could be called from any thread, including thread pool -# Dispatching JSImport in deputy design - how to do it +## Dispatching JSImport in deputy design - how to do it - how to dispatch to UI in deputy design ? - A) double dispatch, C# -> main, emscripten -> UI - B) make whole dispatch emscripten only, implement blocking wait in C for emscripten sync calls. @@ -299,8 +532,7 @@ There are few downsides to them - store the `JSHandle` on JS side (thread static) associated with method ID - TODO: double dispatch in Blazor - -# Dispatching JSExport - what should happen +## Dispatching JSExport - alternatives - when caller is UI, we need to dispatch back to managed thread - preferably deputy or sidecar thread - when caller is `JSWebWorker`, @@ -308,7 +540,7 @@ There are few downsides to them - when caller is callback from HTTP/WS we could dispatch to any managed thread - callers are not from managed thread pool, by design. Because we don't want any JS code running there. 
-# Dispatching call - options +## Dispatching call - alternatives - `JSSynchronizationContext` - in deputy design - this would not work for dispatch to UI thread as it doesn't have sync context - is implementation of `SynchronizationContext` installed to @@ -331,8 +563,6 @@ There are few downsides to them - `emscripten_dispatch_to_thread_async` - in deputy design - can dispatch async call to C function on the timer loop of target pthread - doesn't block and doesn't propagate results and exceptions - - this would not work in sidecar design - - because UI is not pthread there - from JS (UI) to C# managed main - only necessary for deputy/sidecar, not for HTTP - async @@ -396,47 +626,6 @@ There are few downsides to them - doesn't block and doesn't propagate exceptions - this is slow -## Move Mono startup to deputy -- related to design **(16)** -- Get rid of Mono GC boundary breach -- `Task`/`Promise` - - improved in https://github.com/dotnet/runtime/pull/93010 -- `MonoString` - - `monoStringToString`, `stringToMonoStringRoot` - - `mono_wasm_string_get_data_ref` - - `mono_wasm_string_from_utf16_ref` - - `get_string_root` -> `mono_wasm_new_external_root` - - we could start passing just a buffer instead of `MonoString` - - we will lose the optimization for interned strings -- managed instances in `MonoArray`, like `MonoString`, `JSObject` or `System.Object` - - `mono_wasm_register_root`, `mono_wasm_deregister_root` - - `Interop.Runtime.DeregisterGCRoot`, `Interop.Runtime.RegisterGCRoot` -- this is about GC and Dispose(): `ManagedObject`, `ErrorObject` - - `release_js_owned_object_by_gc_handle`, `setup_managed_proxy`, `teardown_managed_proxy` - - `JavaScriptExports.ReleaseJSOwnedObjectByGCHandle`, `CreateTaskCallback` -- this is about GC and Dispose(): `JSObject`, `JSException` - - `Interop.Runtime.ReleaseCSOwnedObject` -- `mono_wasm_get_assembly_exports` -> `__Register_` - - `mono_wasm_assembly_load`, `mono_wasm_assembly_find_class`, `mono_wasm_assembly_find_method` - - this logic could be moved to deputy or sidecar thread -- `mono_wasm_bind_js_function`, `mono_wasm_bind_cs_function` - - `mono_wasm_new_external_root` -- `invoke_method_and_handle_exception` - - `mono_wasm_new_root` -- not problem for deputy design: `Module.stackAlloc`, `Module.stackSave`, `Module.stackRestore` -- what's overall perf impact for Blazor's `renderBatch` ? - -## Performance -As compared to ST build for dotnet wasm: -- the dispatch between threads (caused by JS object thread affinity) will have negative performance impact on the JS interop -- in case of HTTP/WS clients used via Streams, it could be surprizing -- browser performance is lower when working with SharedArrayBuffer -- Mono performance is lower because there are GC safe-points and locks in the VM code -- startup is slower because creation of WebWorker instances is slow -- VFS access is slow because it's dispatched to UI thread -- console output is slow because it's POSIX stream is dispatched to UI thread, call per `put_char` -- any VFS access is slow because it dispatched to UI thread - ## Spin-waiting in JS - if we want to keep synchronous JS APIs to work on UI thread, we have to spin-wait - we probably should have opt-in configuration flag for this @@ -457,11 +646,6 @@ As compared to ST build for dotnet wasm: - it could still deadlock if there is synchronous JSImport call to UI thread while UI thread is spin-waiting on it. 
- this would be clearly user code mistake -## Debugging -- VS debugger would work as usual -- Chrome dev tools would only see the events coming from `postMessage` or `Atomics.waitAsync` -- Chrome dev tools debugging C# could be bit different, it possibly works already. The C# code would be in different node of the "source" tree view - ## Blazor - as compared to single threaded runtime, the major difference would be no synchronous callbacks. - for example from DOM `onClick`. This is one of the reasons people prefer ST WASM over Blazor Server. @@ -482,10 +666,6 @@ As compared to ST build for dotnet wasm: - `JSImport` used for logging: `globalThis.console.debug`, `globalThis.console.error`, `globalThis.console.info`, `globalThis.console.warn`, `Blazor._internal.dotNetCriticalError` - probably could be any JS context -## Virtual filesystem -- we use emscripten's VFS, which is JavaScript implementation in the UI thread. -- the POSIX operations are synchronously dispatched to UI thread. - ## WebPack, Rollup friendly - it's not clear how to make this single-file - because web workers need to start separate script(s) via `new Worker('./dotnet.js', {type: 'module'})` @@ -506,293 +686,6 @@ As compared to ST build for dotnet wasm: - we could synchronously wait for another thread to do async operations - to fetch another DLL which was not pre-downloaded -## New pthreads -- with deputy design we could set `PTHREAD_POOL_SIZE_STRICT=0` and enable threads to be created dynamically - -# Current state 2023 Sep - - we already ship MT version of the runtime in the wasm-tools workload. - - It's enabled by `true` and it requires COOP HTTP headers. - - It will serve extra file `dotnet.native.worker.js`. - - This will also start in Blazor project, but UI rendering would not work. - - we have pre-allocated pool of browser Web Workers which are mapped to pthread dynamically. - - we can configure pthread to keep running after synchronous thread_main finished. That's necessary to run any async tasks involving JavaScript interop. - - legacy interop has problems with GC boundaries. - - JSImport & JSExport work - - There is private JSSynchronizationContext implementation which is too synchronous - - There is draft of public C# API for creating JSWebWorker with JS interop. It must be dedicated un-managed resource, because we could not cleanup JS state created by user code. - - There is MT version of HTTP & WS clients, which could be called from any thread but it's also too synchronous implementation. - - Many unit tests fail on MT https://github.com/dotnet/runtime/pull/91536 - - there are MT C# ref assemblies, which don't throw PNSE for MT build of the runtime for blocking APIs. - -## Implementation options (only some combinations are possible) -- how to deal with blocking C# code on UI thread - - **A)** pretend it's not a problem (this we already have) - - **B)** move user C# code to web worker - - **C)** move all Mono to web worker - - **D)** like **A)** just move call of the C# `Main()` to `JSWebWorker` -- how to deal with blocking in synchronous JS calls from UI thread (like `onClick` callback) - - **D)** pretend it's not a problem (this we already have) - - **E)** throw PNSE when synchronous JSExport is called on UI thread - - **F)** dispatch calls to synchronous JSExport to web worker and spin-wait on JS side of UI thread. 
-- how to implement JS interop between managed main thread and UI thread (DOM) - - **G)** put it out of scope for MT, manually implement what Blazor needs - - **H)** pure JS dispatch between threads, [comlink](https://github.com/GoogleChromeLabs/comlink) style - - **I)** C/emscripten dispatch of infrastructure to marshal individual parameters - - **J)** C/emscripten dispatch of method binding and invoke, but marshal parameters on UI thread - - **K)** pure C# dispatch between threads -- how to implement JS interop on non-main web worker - - **L)** disable it for all non-main threads - - **M)** disable it for managed thread pool threads - - **N)** allow it only for threads created as dedicated resource `WebWorker` via new API - - **O)** enables it on all workers (let user deal with JS state) -- how to dispatch calls to the right JS thread context - - **P)** via `SynchronizationContext` before `JSImport` stub, synchronously, stack frames - - **Q)** via `SynchronizationContext` inside `JSImport` C# stub - - **R)** via `emscripten_dispatch_to_thread_async` inside C code of `` -- how to implement GC/dispose of `JSObject` proxies - - **S)** per instance: synchronous dispatch the call to correct thread via `SynchronizationContext` - - **T)** per instance: async schedule the cleanup - - at the detach of the thread. We already have `forceDisposeProxies` - - could target managed thread be paused during GC ? -- where to instantiate initial user JS modules (like Blazor's) - - **U)** in the UI thread - - **V)** in the deputy/sidecar thread -- where to instantiate `JSHost.ImportAsync` modules - - **W)** in the UI thread - - **X)** in the deputy/sidecar thread - - **Y)** allow it only for dedicated `JSWebWorker` threads - - **Z)** disable it - - same for `JSHost.GlobalThis`, `JSHost.DotnetInstance` -- how to implement Blazor's `renderBatch` - - **a)** keep as is, wrap it with GC pause, use legacy JS interop on UI thread - - **b)** extract some of the legacy JS interop into Blazor codebase - - **c)** switch to Blazor server mode. Web worker create the batch of bytes and UI thread apply it to DOM -- where to create HTTP+WS JS objects - - **d)** in the UI thread - - **e)** in the managed main thread - - **f)** in first calling `JSWebWorker` managed thread -- how to dispatch calls to HTTP+WS JS objects - - **g)** try to stick to the same thread via `ConfigureAwait(false)`. - - doesn't really work. 
`Task` migrate too freely - - **h)** via C# `SynchronizationContext` - - **i)** via `emscripten_dispatch_to_thread_async` - - **j)** via `postMessage` - - **k)** same whatever we choose for `JSImport` - - note there are some synchronous calls on WS -- where to create the emscripten instance - - **l)** could be on the UI thread - - **m)** could be on the "sidecar" thread -- where to start the Mono VM - - **n)** could be on the UI thread - - **o)** could be on the "sidecar" thread -- where to run the C# main entrypoint - - **p)** could be on the UI thread - - **q)** could be on the "deputy" or "sidecar" thread -- where to implement sync-to-async: crypto/DLL download/HTTP APIs/ - - **r)** out of scope - - **s)** in the UI thread - - **t)** in a dedicated web worker - - **z)** in the sidecar or deputy -- where to marshal JSImport/JSExport parameters/return/exception - - **u)** could be only values types, proxies out of scope - - **v)** could be on UI thread (with deputy design and Mono there) - - **w)** could be on sidecar (with double proxies of parameters via comlink) - - **x)** could be on sidecar (with comlink calls per parameter) - -# Interesting combinations - -## (8) Minimal support -- **A,D,G,L,P,S,U,Y,a,f,h,l,n,p,v** -- this is what we [already have today](#Current-state-2023-Sep) -- it could deadlock or die, -- JS interop on threads requires lot of user code attention -- Keeps problems **1,2,3,4** - -## (9) Sidecar + no JS interop + narrow Blazor support -- **C,E,G,L,P,S,U,Z,c,d,h,m,o,q,u** -- minimal effort, low risk, low capabilities -- move both emscripten and Mono VM sidecar thread -- no user code JS interop on any thread -- internal solutions for Blazor needs -- Ignores problems **1,2,3,4,5** - -## (10) Sidecar + only async just JS proxies UI + JSWebWorker + Blazor WASM server -- **C,E,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** -- no C or managed code on UI thread - - this architectural clarity is major selling point for sidecar design -- no support for blocking sync JSExport calls from UI thread (callbacks) - - it will throw PNSE -- this will create double proxy for `Task`, `JSObject`, `Func<>` etc - - difficult to GC, difficult to debug -- double marshaling of parameters -- Solves **1,2** for managed code. -- Avoids **1,2** for JS callback - - emscripten main loop stays responsive only when main managed thread is idle -- Solves **3,4,5** - -## (11) Sidecar + async & sync just JS proxies UI + JSWebWorker + Blazor WASM server -- **C,F,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** -- no C or managed code on UI thread -- support for blocking sync JSExport calls from UI thread (callbacks) - - at blocking the UI is at least well isolated from runtime code - - it makes responsibility for sync call clear -- this will create double proxy for `Task`, `JSObject`, `Func<>` etc - - difficult to GC, difficult to debug -- double marshaling of parameters -- Solves **1,2** for managed code - - unless there is sync `JSImport`->`JSExport` call -- Ignores **1,2** for JS callback - - emscripten main loop stays responsive only when main managed thread is idle -- Solves **3,4,5** - -## (12) Deputy + managed dispatch to UI + JSWebWorker + with sync JSExport -- **B,F,K,N,Q,S/T,U,W,a/b/c,d+f,h,l,n,s/z,v** -- this uses `JSSynchronizationContext` to dispatch calls to UI thread - - this is "dirty" as compared to sidecar because some managed code is actually running on UI thread - - it needs to also use `SynchronizationContext` for `JSExport` and callbacks, to dispatch to deputy. 
-- blazor render could be both legacy render or Blazor server style - - because we have both memory and mono on the UI thread -- Solves **1,2** for managed code - - unless there is sync `JSImport`->`JSExport` call -- Ignores **1,2** for JS callback - - emscripten main loop could deadlock on sync JSExport -- Solves **3,4,5** - -## (13) Deputy + emscripten dispatch to UI + JSWebWorker + with sync JSExport -- **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** -- is variation of **(12)** - - with emscripten dispatch and marshaling in UI thread -- this uses `emscripten_dispatch_to_thread_async` for `call_entry_point`, `complete_task`, `cwraps.mono_wasm_invoke_method_bound`, `mono_wasm_invoke_bound_function`, `mono_wasm_invoke_import`, `call_delegate_method` to get to the UI thread. -- it uses other `cwraps` locally on UI thread, like `mono_wasm_new_root`, `stringToMonoStringRoot`, `malloc`, `free`, `create_task_callback_method` - - it means that interop related managed runtime code is running on the UI thread, but not the user code. - - it means that parameter marshalling is fast (compared to sidecar) - - this deputy design is major selling point #2 - - it still needs to enter GC barrier and so it could block UI for GC run shortly -- blazor render could be both legacy render or Blazor server style - - because we have both memory and mono on the UI thread -- Solves **1,2** for managed code - - unless there is sync `JSImport`->`JSExport` call -- Ignores **1,2** for JS callback - - emscripten main loop could deadlock on sync JSExport -- Solves **3,4,5** - -## (14) Deputy + emscripten dispatch to UI + JSWebWorker + without sync JSExport -- **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** -- is variation of **(13)** - - without support for synchronous JSExport -- Solves **1,2** for managed code - - emscripten main loop stays responsive - - unless there is sync `JSImport`->`JSExport` call -- Avoids **2** for JS callback - - by throwing PNSE -- Solves **3,4,5** - -## (15) Deputy + Sidecar + UI thread -- 2 levels of indirection. -- benefit: blocking JSExport from UI thread doesn't block emscripten loop -- downside: complex and more resource intensive - -## (16) Deputy without Mono, no GC barrier breach for interop -- variation on **(13)** or **(14)** where we get rid of per-parameter calls to Mono -- benefit: get closer to purity of sidecar design without loosing perf - - this could be done later as purity optimization -- in this design the mono could be started on deputy thread - - this will keep UI responsive during startup -- UI would not be mono attached thread. 
-- See [details](#Get-rid-of-Mono-GC-boundary-breach) - -Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 - -## (17) Emscripten em_queue in deputy, managed UI thread -- is interesting because it avoids cross-thread dispatch to UI - - including double dispatch in Blazor's `RendererSynchronizationContext` -- avoids solving **1,2** -- low level hacking of emscripten design assumptions - -## (18) Soft deputy -- keep both Mono and emscripten in the UI thread -- use `SynchronizationContext` to do the dispatch -- make it easy and default to run any user code in deputy thread - - all Blazor events and callbacks like `onClick` to deputy - - move SignalR to deputy - - move Blazor entry point to deputy -- hope that UI thread is mostly idle - - enable dynamic thread allocation - - throw exceptions in dev loop when UI thread does `lock` or `.Wait` in user code - -## Scratch pad - -current generated `JSImport` in Net7, Net8 - -```cs - -[JSImport(Dispatch.UI)] -public static partial Task WebSocketReceive(JSObject webSocket, nint bufferPtr, int bufferLength); - -[JSImport(Dispatch.Params)] -public static partial Task WebSocketReceive(JSObject webSocket, nint bufferPtr, int bufferLength); - -[DebuggerNonUserCode] -public static partial Task WebSocketReceive(JSObject webSocket, nint bufferPtr, int bufferLength) -{ - if (__signature_WebSocketReceive_1144640460 == null) - { - __signature_WebSocketReceive_1144640460 = JSFunctionBinding.BindJSFunction("INTERNAL.ws_wasm_receive", null, new JSMarshalerType[] { - JSMarshalerType.Task(), - JSMarshalerType.JSObject, - JSMarshalerType.IntPtr, - JSMarshalerType.Int32 - }); - } - - Span __arguments_buffer = stackalloc JSMarshalerArgument[5]; - ref JSMarshalerArgument __arg_exception = ref __arguments_buffer[0]; - __arg_exception.Initialize(); - ref JSMarshalerArgument __arg_return = ref __arguments_buffer[1]; - __arg_return.Initialize(); - Task __retVal; - - ref JSMarshalerArgument __bufferLength_native__js_arg = ref __arguments_buffer[4]; - ref JSMarshalerArgument __bufferPtr_native__js_arg = ref __arguments_buffer[3]; - ref JSMarshalerArgument __webSocket_native__js_arg = ref __arguments_buffer[2]; - - __bufferLength_native__js_arg.ToJS(bufferLength); - __bufferPtr_native__js_arg.ToJS(bufferPtr); - __webSocket_native__js_arg.ToJS(webSocket); - - JSFunctionBinding.InvokeJS(__signature_WebSocketReceive_1144640460, __arguments_buffer); - - __arg_return.ToManaged(out __retVal); - - return __retVal; -} - -[ThreadStaticAttribute] -static JSFunctionBinding __signature_WebSocketReceive_1144640460; - -[DebuggerNonUserCode] -internal static unsafe void __Wrapper_Dummy_1616792047(JSMarshalerArgument* __arguments_buffer) -{ - Task meaningPromise; - ref JSMarshalerArgument __arg_exception = ref __arguments_buffer[0]; - ref JSMarshalerArgument __arg_return = ref __arguments_buffer[1]; - - ref JSMarshalerArgument __meaningPromise_native__js_arg = ref __arguments_buffer[2]; - try - { - - __meaningPromise_native__js_arg.ToManaged(out meaningPromise, - static (ref JSMarshalerArgument __task_result_arg, out int __task_result) => - { - __task_result_arg.ToManaged(out __task_result); - }); - Sample.Test.Dummy(meaningPromise); - } - catch (global::System.Exception ex) - { - __arg_exception.ToJS(ex); - } -} -``` - - +## Remove Mono from UI thread +- Get rid of Mono GC boundary breach +- see https://github.com/dotnet/runtime/issues/100411 From a4eef089586d097ddf0f00cbfff6a1b08a3a93c0 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 18 Apr 2024 13:48:49 +0200 
Subject: [PATCH 096/108] more --- accepted/2023/wasm-browser-threads.md | 9 --------- 1 file changed, 9 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 4d899add5..ef5a284fc 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -209,8 +209,6 @@ For other possible design options we considered [see below](#Alternatives). - there is `JSSynchronizationContext`` installed on it - so that user code could dispatch back to it, in case that it needs to call `JSObject` proxy (with thread affinity) - this thread needs to throw on any `.Wait` because of the problem **7** -- alternatively we could disable C# code on this thread and treat it similar to UI thread -- alternatively we could have I/O threads ## HTTP and WS clients - are implemented in terms of `JSObject` and `Promise` proxies @@ -229,13 +227,6 @@ For other possible design options we considered [see below](#Alternatives). - this would also require separate thread, doing the async job - we could use I/O thread for it -## JSImport calls on threads without JSWebWorker -- those are - - thread-pool threads - - main managed thread in deputy design -- we dispatch it to UI thread - - easy to understand default behavior - ## Performance As compared to ST build for dotnet wasm: - the dispatch between threads (caused by JS object thread affinity) will have negative performance impact on the JS interop From d11a56b278920224e48408d68cc595855902dc26 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Thu, 18 Apr 2024 13:49:23 +0200 Subject: [PATCH 097/108] whitespace --- accepted/2023/wasm-browser-threads.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index ef5a284fc..debe8786e 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -115,9 +115,9 @@ For other possible design options we considered [see below](#Alternatives). - is helper thread which allows `Task` to be resolved by UI's `Promise` even when deputy thread is blocked in `.Wait` - JS interop from any thread is marshaled to UI thread's JavaScript - HTTP and WS clients are implemented in JS of UI thread -- There is draft of `JSWebWorker` API +- There is draft of `JSWebWorker` API - it allows C# users to create dedicated JS thread - - the `JSImport` calls are dispatched to it if you are on the that thread + - the `JSImport` calls are dispatched to it if you are on the that thread - or if you pass `JSObject` proxy with affinity to that thread as `JSImport` parameter. - The API was not made public in Net9 yet - calling synchronous `JSExports` is not supported on UI thread From bce30d6861f441f03ecaf53463bc59272857a5f5 Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Tue, 14 May 2024 12:07:26 +0200 Subject: [PATCH 098/108] feedback --- accepted/2023/wasm-browser-threads.md | 447 +------------------------- 1 file changed, 3 insertions(+), 444 deletions(-) diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index debe8786e..82ce91cef 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -93,7 +93,7 @@ Move all managed user code out of UI/DOM thread, so that it becomes consistent w ## What was implemented in Net9 - Deputy thread design -For other possible design options we considered [see below](#Alternatives). 
+For other possible design options we considered [see below](#alternatives-and-details---as-considered-2023-sep). - Introduce dedicated web worker called "deputy thread" - managed `Main()` is dispatched onto deputy thread @@ -237,446 +237,5 @@ As compared to ST build for dotnet wasm: - VFS access is slow because it's dispatched to UI thread - console output is slow because it's POSIX stream is dispatched to UI thread, call per line -# State 2023 September - - we already ship MT version of the runtime in the wasm-tools workload. - - It's enabled by `true` and it requires COOP HTTP headers. - - It will serve extra file `dotnet.native.worker.js`. - - This will also start in Blazor project, but UI rendering would not work. - - we have pre-allocated pool of browser Web Workers which are mapped to pthread dynamically. - - we can configure pthread to keep running after synchronous thread_main finished. That's necessary to run any async tasks involving JavaScript interop. - - legacy interop has problems with GC boundaries. - - JSImport & JSExport work - - There is private JSSynchronizationContext implementation which is too synchronous - - There is draft of public C# API for creating JSWebWorker with JS interop. It must be dedicated un-managed resource, because we could not cleanup JS state created by user code. - - There is MT version of HTTP & WS clients, which could be called from any thread but it's also too synchronous implementation. - - Many unit tests fail on MT https://github.com/dotnet/runtime/pull/91536 - - there are MT C# ref assemblies, which don't throw PNSE for MT build of the runtime for blocking APIs. - -# Alternatives - as considered 2023 Sep -- how to deal with blocking C# code on UI thread - - **A)** pretend it's not a problem (this we already have) - - **B)** move user C# code to web worker - - **C)** move all Mono to web worker - - **D)** like **A)** just move call of the C# `Main()` to `JSWebWorker` -- how to deal with blocking in synchronous JS calls from UI thread (like `onClick` callback) - - **D)** pretend it's not a problem (this we already have) - - **E)** throw PNSE when synchronous JSExport is called on UI thread - - **F)** dispatch calls to synchronous JSExport to web worker and spin-wait on JS side of UI thread. 
-- how to implement JS interop between managed main thread and UI thread (DOM) - - **G)** put it out of scope for MT, manually implement what Blazor needs - - **H)** pure JS dispatch between threads, [comlink](https://github.com/GoogleChromeLabs/comlink) style - - **I)** C/emscripten dispatch of infrastructure to marshal individual parameters - - **J)** C/emscripten dispatch of method binding and invoke, but marshal parameters on UI thread - - **K)** pure C# dispatch between threads -- how to implement JS interop on non-main web worker - - **L)** disable it for all non-main threads - - **M)** disable it for managed thread pool threads - - **N)** allow it only for threads created as dedicated resource `WebWorker` via new API - - **O)** enables it on all workers (let user deal with JS state) -- how to dispatch calls to the right JS thread context - - **P)** via `SynchronizationContext` before `JSImport` stub, synchronously, stack frames - - **Q)** via `SynchronizationContext` inside `JSImport` C# stub - - **R)** via `emscripten_dispatch_to_thread_async` inside C code of `` -- how to implement GC/dispose of `JSObject` proxies - - **S)** per instance: synchronous dispatch the call to correct thread via `SynchronizationContext` - - **T)** per instance: async schedule the cleanup - - at the detach of the thread. We already have `forceDisposeProxies` - - could target managed thread be paused during GC ? -- where to instantiate initial user JS modules (like Blazor's) - - **U)** in the UI thread - - **V)** in the deputy/sidecar thread -- where to instantiate `JSHost.ImportAsync` modules - - **W)** in the UI thread - - **X)** in the deputy/sidecar thread - - **Y)** allow it only for dedicated `JSWebWorker` threads - - **Z)** disable it - - same for `JSHost.GlobalThis`, `JSHost.DotnetInstance` -- how to implement Blazor's `renderBatch` - - **a)** keep as is, wrap it with GC pause, use legacy JS interop on UI thread - - **b)** extract some of the legacy JS interop into Blazor codebase - - **c)** switch to Blazor server mode. Web worker create the batch of bytes and UI thread apply it to DOM -- where to create HTTP+WS JS objects - - **d)** in the UI thread - - **e)** in the managed main thread - - **f)** in first calling `JSWebWorker` managed thread -- how to dispatch calls to HTTP+WS JS objects - - **g)** try to stick to the same thread via `ConfigureAwait(false)`. - - doesn't really work. 
`Task` migrate too freely - - **h)** via C# `SynchronizationContext` - - **i)** via `emscripten_dispatch_to_thread_async` - - **j)** via `postMessage` - - **k)** same whatever we choose for `JSImport` - - note there are some synchronous calls on WS -- where to create the emscripten instance - - **l)** could be on the UI thread - - **m)** could be on the "sidecar" thread -- where to start the Mono VM - - **n)** could be on the UI thread - - **o)** could be on the "sidecar" thread -- where to run the C# main entrypoint - - **p)** could be on the UI thread - - **q)** could be on the "deputy" or "sidecar" thread -- where to implement sync-to-async: crypto/DLL download/HTTP APIs/ - - **r)** out of scope - - **s)** in the UI thread - - **t)** in a dedicated web worker - - **z)** in the sidecar or deputy -- where to marshal JSImport/JSExport parameters/return/exception - - **u)** could be only values types, proxies out of scope - - **v)** could be on UI thread (with deputy design and Mono there) - - **w)** could be on sidecar (with double proxies of parameters via comlink) - - **x)** could be on sidecar (with comlink calls per parameter) - -## Interesting combinations - -### (8) Minimal support -- **A,D,G,L,P,S,U,Y,a,f,h,l,n,p,v** -- this is what we [already have today](#Current-state-2023-Sep) -- it could deadlock or die, -- JS interop on threads requires lot of user code attention -- Keeps problems **1,2,3,4** - -### (9) Sidecar + no JS interop + narrow Blazor support -- **C,E,G,L,P,S,U,Z,c,d,h,m,o,q,u** -- minimal effort, low risk, low capabilities -- move both emscripten and Mono VM sidecar thread -- no user code JS interop on any thread -- internal solutions for Blazor needs -- Ignores problems **1,2,3,4,5** - -### (10) Sidecar + only async just JS proxies UI + JSWebWorker + Blazor WASM server -- **C,E,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** -- no C or managed code on UI thread - - this architectural clarity is major selling point for sidecar design -- no support for blocking sync JSExport calls from UI thread (callbacks) - - it will throw PNSE -- this will create double proxy for `Task`, `JSObject`, `Func<>` etc - - difficult to GC, difficult to debug -- double marshaling of parameters -- Solves **1,2** for managed code. -- Avoids **1,2** for JS callback - - emscripten main loop stays responsive only when main managed thread is idle -- Solves **3,4,5** - -### (11) Sidecar + async & sync just JS proxies UI + JSWebWorker + Blazor WASM server -- **C,F,H,N,P,S,U,W+Y,c,e+f,h+k,m,o,q,w** -- no C or managed code on UI thread -- support for blocking sync JSExport calls from UI thread (callbacks) - - at blocking the UI is at least well isolated from runtime code - - it makes responsibility for sync call clear -- this will create double proxy for `Task`, `JSObject`, `Func<>` etc - - difficult to GC, difficult to debug -- double marshaling of parameters -- Solves **1,2** for managed code - - unless there is sync `JSImport`->`JSExport` call -- Ignores **1,2** for JS callback - - emscripten main loop stays responsive only when main managed thread is idle -- Solves **3,4,5** - -### (12) Deputy + managed dispatch to UI + JSWebWorker + with sync JSExport -- **B,F,K,N,Q,S/T,U,W,a/b/c,d+f,h,l,n,s/z,v** -- this uses `JSSynchronizationContext` to dispatch calls to UI thread - - this is "dirty" as compared to sidecar because some managed code is actually running on UI thread - - it needs to also use `SynchronizationContext` for `JSExport` and callbacks, to dispatch to deputy. 
-- blazor render could be both legacy render or Blazor server style - - because we have both memory and mono on the UI thread -- Solves **1,2** for managed code - - unless there is sync `JSImport`->`JSExport` call -- Ignores **1,2** for JS callback - - emscripten main loop could deadlock on sync JSExport -- Solves **3,4,5** - -### (13) Deputy + emscripten dispatch to UI + JSWebWorker + with sync JSExport -- **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** -- is variation of **(12)** - - with emscripten dispatch and marshaling in UI thread -- this uses `emscripten_dispatch_to_thread_async` for `call_entry_point`, `complete_task`, `cwraps.mono_wasm_invoke_method_bound`, `mono_wasm_invoke_bound_function`, `mono_wasm_invoke_import`, `call_delegate_method` to get to the UI thread. -- it uses other `cwraps` locally on UI thread, like `mono_wasm_new_root`, `stringToMonoStringRoot`, `malloc`, `free`, `create_task_callback_method` - - it means that interop related managed runtime code is running on the UI thread, but not the user code. - - it means that parameter marshalling is fast (compared to sidecar) - - this deputy design is major selling point #2 - - it still needs to enter GC barrier and so it could block UI for GC run shortly -- blazor render could be both legacy render or Blazor server style - - because we have both memory and mono on the UI thread -- Solves **1,2** for managed code - - unless there is sync `JSImport`->`JSExport` call -- Ignores **1,2** for JS callback - - emscripten main loop could deadlock on sync JSExport -- Solves **3,4,5** - -### (14) Deputy + emscripten dispatch to UI + JSWebWorker + without sync JSExport -- **B,F,J,N,R,T,U,W,a/b/c,d+f,i,l,n,s,v** -- is variation of **(13)** - - without support for synchronous JSExport -- Solves **1,2** for managed code - - emscripten main loop stays responsive - - unless there is sync `JSImport`->`JSExport` call -- Avoids **2** for JS callback - - by throwing PNSE -- Solves **3,4,5** - -### (15) Deputy + Sidecar + UI thread -- 2 levels of indirection. -- benefit: blocking JSExport from UI thread doesn't block emscripten loop -- downside: complex and more resource intensive - -### (16) Deputy without Mono, no GC barrier breach for interop -- variation on **(13)** or **(14)** where we get rid of per-parameter calls to Mono -- benefit: get closer to purity of sidecar design without loosing perf - - this could be done later as purity optimization -- in this design the mono could be started on deputy thread - - this will keep UI responsive during startup -- UI would not be mono attached thread. 
-- See [details](#Get-rid-of-Mono-GC-boundary-breach) - -Related Net8 tracking https://github.com/dotnet/runtime/issues/85592 - -### (17) Emscripten em_queue in deputy, managed UI thread -- is interesting because it avoids cross-thread dispatch to UI - - including double dispatch in Blazor's `RendererSynchronizationContext` -- avoids solving **1,2** -- low level hacking of emscripten design assumptions - -### (18) Soft deputy -- keep both Mono and emscripten in the UI thread -- use `SynchronizationContext` to do the dispatch -- make it easy and default to run any user code in deputy thread - - all Blazor events and callbacks like `onClick` to deputy - - move SignalR to deputy - - move Blazor entry point to deputy -- hope that UI thread is mostly idle - - enable dynamic thread allocation - - throw exceptions in dev loop when UI thread does `lock` or `.Wait` in user code - -### (19) Single threaded build in a WebWorker -- this already works well in Net8 -- when the developer is able to start dotnet in the worker himself and also handle all the messaging. -- there are known existing examples in the community - -## Sidecar options -There are few downsides to them -- if we keep main managed thread and emscripten thread the same, pthreads can't be created dynamically - - we could upgrade it to design **(15)** and have extra thread for running managed `Main()` -- we will have to implement extra layer of dispatch from UI to sidecar - - this could be pure JS via `postMessage`, which is slow and can't do spin-wait. - - we could have `SharedArrayBuffer` for the messages, but we would have to implement (another?) marshaling. - -## Dispatching call, who is responsible -- User code - - this is difficult and complex task which many will fail to do right - - it can't be user code for HTTP/WS clients because there is no direct call via Streams - - authors of 3rd party components would need to do it to hide complexity from users -- Roslyn generator: JSImport - if we make it responsible for the dispatch - - it needs to stay backward compatible with Net7, Net8 already generated code - - how to detect that there is new version of generated code ? - - it needs to do it via public C# API - - possibly new API `JSHost.Post` and `JSHost.Send` - - or `JSHost.InvokeInTargetSync` and `JSHost.InvokeInTargetAsync` - - it needs to re-consider current `stackalloc` - - probably by re-ordering Roslyn generated code of `__arg_return.ToManaged(out __retVal);` before `JSFunctionBinding.InvokeJS` - - it needs to propagate exceptions -- Roslyn generator: JSExport - can't be used - - this is just the UI -> deputy dispatch, which is not C# code -- Mono/C/JS internal layer - - see `emscripten_dispatch_to_thread_async` below - -## Dispatching JSImport - what should happen -- when there is no extra code-gen flag - - for backward compatibility, dispatch handled by user - - assert that we are on `JSWebWorker` or main thread - - assert all parameters affinity to current thread -- when generated with `JSImportAttribute.Affinity==UI` - - dispatch to UI thread - - assert all parameters affinity to UI thread - - could be called from any thread, including thread pool - - there is no `JSSynchronizationContext` in deputy's UI, use emscripten. 
- - emscripten can't block caller -- when generated with `JSImportAttribute.Affinity==ByParams` - - dispatch to thread of parameters - - assert all parameters have same affinity - - could be called from any thread, including thread pool - -## Dispatching JSImport in deputy design - how to do it -- how to dispatch to UI in deputy design ? - - A) double dispatch, C# -> main, emscripten -> UI - - B) make whole dispatch emscripten only, implement blocking wait in C for emscripten sync calls. - - C) only allow sync call on non-UI target -- how to obtain `JSHandle` of function in the target thread ? - - there are 2 dimensions: the thread and the method - - there are 2 steps: - - A) obtain existing `JSHandle` on next call (if available) - - to avoid double dispatch, this needs to be accessible - - by any caller thread - - or by UI thread C code (not managed) - - B) if this is first call to the method on the target thread, create the target `JSHandle` by binding existing JS function - - collecting the metadata is generated C# code - - therefore we need to get the metadata buffer on caller main thread: double dispatch - - store new `JSHandle` somewhere -- possible solution - assign `static` unique ID to the function on C# side during first call. - - A) Call back to C# if the method was not bound yet (which thread ?). - - B) Keep the metadata buffer - - make `JSFunctionBinding` registration static (not thread-static) - - never free the buffer - - pass the buffer on each call to the target - - late bind `JSHandle` - - store the `JSHandle` on JS side (thread static) associated with method ID -- TODO: double dispatch in Blazor - -## Dispatching JSExport - alternatives -- when caller is UI, we need to dispatch back to managed thread - - preferably deputy or sidecar thread -- when caller is `JSWebWorker`, - - we are probably on correct thread already - - when caller is callback from HTTP/WS we could dispatch to any managed thread -- callers are not from managed thread pool, by design. Because we don't want any JS code running there. - -## Dispatching call - alternatives -- `JSSynchronizationContext` - in deputy design - - this would not work for dispatch to UI thread as it doesn't have sync context - - is implementation of `SynchronizationContext` installed to - - managed thread with `JSWebWorker` - - or main managed thread - - it has asynchronous `SynchronizationContext.Post` - - it has synchronous `SynchronizationContext.Send` - - can propagate caller stack frames - - can propagate exceptions from callee thread - - when the method is async - - we can schedule it asynchronously to the `JSWebWorker` or main thread - - propagate result/exceptions via `TaskCompletionSource.SetException` from any managed thread - - when the method is sync - - create internal `TaskCompletionSource` - - we can schedule it asynchronously to the `JSWebWorker` or main thread - - we could block-wait on `Task.Wait` until it's done. 
- - return sync result - - this would not work in sidecar design - - because UI is not managed thread there -- `emscripten_dispatch_to_thread_async` - in deputy design - - can dispatch async call to C function on the timer loop of target pthread - - doesn't block and doesn't propagate results and exceptions - - from JS (UI) to C# managed main - - only necessary for deputy/sidecar, not for HTTP - - async - - `malloc` stack frame and do JS side of marshaling - - re-order `marshal_task_to_js` before `invoke_method_and_handle_exception` - - pre-create `JSHandle` of a `promise_controller` - - pass `JSHandle` instead of receiving it - - send the message via `emscripten_dispatch_to_thread_async` - - return the promise immediately - - await until `mono_wasm_resolve_or_reject_promise` is sent back - - this need to be also dispatched - - how could we make that dispatch same for HTTP cross-thread by `JSObject` affinity ? - - any errors in messaging will `abort()` - - sync - - dispatch C function - - which will lift Atomic semaphore at the end - - spin-wait for semaphore - - stack-frame could stay on stack - - synchronously returning `null` `Task?` - - pass `slot.ElementType = MarshalerType.Discard;` ? - - `abort()` ? - - `resolve(null)` ? - - `reject(null)` ? - - from C# to JS (UI) - - how to obtain JSHandle of function in the target thread ? - - async - - needs to deal with `stackalloc` in C# generated stub, by copying the buffer - - sync - - inside `JSFunctionBinding.InvokeJS`: - - create internal `TaskCompletionSource` - - use async dispatch above - - block-wait on `Task.Wait` until it's done. - - !! this would not keep receiving JS loop events !! - - return sync result - - implementation calls - - `BindJSFunction`, `mono_wasm_bind_js_function` - many out params, need to be sync call to UI - - `BindCSFunction`, `mono_wasm_bind_cs_function` - many out params, need to be sync call to UI - - `ReleaseCSOwnedObject`, `mono_wasm_release_cs_owned_object` - async message to UI - - `ResolveOrRejectPromise`, `mono_wasm_resolve_or_reject_promise` - async message to UI - - `InvokeJSFunction`, `mono_wasm_invoke_bound_function` - depending on signature, via FuncJS.ResMarshaler - - `InvokeImport`, `mono_wasm_invoke_import` - depending on signature, could be sync or async message to UI - - `InstallWebWorkerInterop`, `mono_wasm_install_js_worker_interop` - could become async - - `UninstallWebWorkerInterop`, `mono_wasm_uninstall_js_worker_interop` - could become async - - `RegisterGCRoot`, `mono_wasm_register_root` - could stay on deputy - - `DeregisterGCRoot`, `mono_wasm_deregister_root` - could stay on deputy - - hybrid globalization, could probably stay on deputy -- `emscripten_sync_run_in_main_runtime_thread` - in deputy design - - can run sync method in UI thread -- "comlink" - in sidecar design - - when the method is async - - extract GCHandle of the `TaskCompletionSource` - - convert parameters to JS (sidecar context) - - using sidecar Mono `cwraps` for marshaling - - call UI thread via "comlink" - - will create comlink proxies - - capture JS result/exception from "comlink" - - use stored `TaskCompletionSource` to resolve the `Task` on target thread -- `postMessage` - - can send serializable message to web worker - - can transmit [transferable objects](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Transferable_objects) - - doesn't block and doesn't propagate exceptions - - this is slow - -## Spin-waiting in JS -- if we want to keep synchronous JS APIs to work on UI thread, we have to 
spin-wait - - we probably should have opt-in configuration flag for this - - making user responsible for the consequences -- at the moment emscripten implements spin-wait in wasm - - See [pthread_cond_timedwait.c](https://github.com/emscripten-core/emscripten/blob/cbf4256d651455abc7b3332f1943d3df0698e990/system/lib/libc/musl/src/thread/pthread_cond_timedwait.c#L117-L118) and [__timedwait.c](https://github.com/emscripten-core/emscripten/blob/cbf4256d651455abc7b3332f1943d3df0698e990/system/lib/libc/musl/src/thread/__timedwait.c#L67-L69) - - I was not able to confirm that they would call `emscripten_check_mailbox` during spin-wait - - See also https://emscripten.org/docs/porting/pthreads.html -- in sidecar design - emscripten main is not running in UI thread - - it means it could still pump events and would not deadlock in Mono or managed code - - unless the sidecar thread is blocked, or CPU hogged, which could happen - - we need pure JS version of spin-wait and we have OK enough prototype -- in deputy design - emscripten main is running in UI thread - - but the UI thread is not running managed code - - it means it could still pump events and would not deadlock in Mono or managed code - - this deputy design is major selling point #1 - - unless user code opts-in to call sync JSExport -- it could still deadlock if there is synchronous JSImport call to UI thread while UI thread is spin-waiting on it. - - this would be clearly user code mistake - -## Blazor -- as compared to single threaded runtime, the major difference would be no synchronous callbacks. - - for example from DOM `onClick`. This is one of the reasons people prefer ST WASM over Blazor Server. -- Blazor `renderBatch` - - currently `Blazor._internal.renderBatch` -> `MONO.getI16`, `MONO.getI32`, `MONO.getF32`, `BINDING.js_string_to_mono_string`, `BINDING.conv_string`, `BINDING.unbox_mono_obj` - - we could also [RenderBatchWriter](https://github.com/dotnet/aspnetcore/blob/045afcd68e6cab65502fa307e306d967a4d28df6/src/Components/Shared/src/RenderBatchWriter.cs) in the WASM - - some of them need Mono VM and GC barrier, but could be re-written with GC pause and only memory read -- Blazor's [`IJSInProcessRuntime.Invoke`](https://learn.microsoft.com/en-us/dotnet/api/microsoft.jsinterop.ijsinprocessruntime.invoke) - this is C# -> JS direction - - TODO: which implementation keeps this working ? Which worker is the target ? - - we could use Blazor Server style instead -- Blazor's [`IJSUnmarshalledRuntime`](https://learn.microsoft.com/en-us/dotnet/api/microsoft.jsinterop.ijsunmarshalledruntime) - - this is ICall `InvokeJS` - - TODO: which implementation keeps this working ? Which worker is the target ? -- `JSImport` used for startup, loading and embedding: `INTERNAL.loadLazyAssembly`, `INTERNAL.loadSatelliteAssemblies`, `Blazor._internal.getApplicationEnvironment`, `receiveHotReloadAsync` - - all of them pass simple data types, no proxies -- `JSImport` used for calling user JS code: `Blazor._internal.endInvokeDotNetFromJS`, `Blazor._internal.invokeJSJson`, `Blazor._internal.receiveByteArray`, `Blazor._internal.getPersistedState` - - TODO: which implementation keeps this working ? Which worker is the target ? 
-- `JSImport` used for logging: `globalThis.console.debug`, `globalThis.console.error`, `globalThis.console.info`, `globalThis.console.warn`, `Blazor._internal.dotNetCriticalError` - - probably could be any JS context - -## WebPack, Rollup friendly -- it's not clear how to make this single-file -- because web workers need to start separate script(s) via `new Worker('./dotnet.js', {type: 'module'})` - - we can start a WebWorker with a Blob, but that's not CSP friendly. - - when bundled together with user code, into `./my-react-application.js` we don't have way how to `new Worker('./dotnet.js')` anymore. -- emscripten uses `dotnet.native.worker.js` script. At the moment we don't have nice way how to customize what is inside of it. -- for ST build we have JS API to replace our dynamic `import()` of our modules with provided instance of a module. - - we would have to have some way how 3rd party code could become responsible for doing it also on web worker (which we start) -- what other JS frameworks do when they want to be webpack friendly and create web workers ? - -## Subtle crypto -- once we have have all managed threads outside of the UI thread -- we could synchronously wait for another thread to do async operations -- and use [async API of subtle crypto](https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto) - -## Lazy DLL download -- once we have have all managed threads outside of the UI thread -- we could synchronously wait for another thread to do async operations -- to fetch another DLL which was not pre-downloaded - -## Remove Mono from UI thread -- Get rid of Mono GC boundary breach -- see https://github.com/dotnet/runtime/issues/100411 +# Alternatives and details - as considered 2023 Sep +See https://gist.github.com/pavelsavara/c81ef3a9e4000d67f49ddb0f1b1c2284 \ No newline at end of file From 510d40d09523d0bdf60fe79c642ea0f060659f5b Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Tue, 14 May 2024 12:10:23 +0200 Subject: [PATCH 099/108] index --- INDEX.md | 1 + 1 file changed, 1 insertion(+) diff --git a/INDEX.md b/INDEX.md index 62799f597..c71e5eed2 100644 --- a/INDEX.md +++ b/INDEX.md @@ -83,6 +83,7 @@ Use update-index to regenerate it: | 2022 | [.NET 7 Version Selection Improvements](accepted/2022/version-selection.md) | [Rich Lander](https://github.com/richlander) | | 2023 | [Experimental APIs](accepted/2023/preview-apis/preview-apis.md) | [Immo Landwerth](https://github.com/terrjobst) | | 2023 | [net8.0-browser TFM for applications running in the browser](accepted/2023/net8.0-browser-tfm.md) | [Javier Calvarro](https://github.com/javiercn) | +| 2023 | [Multi-threading on a browser](accepted/2023/wasm-browser-threads.md) | [Pavel Savara](https://github.com/pavelsavara) | ## Drafts From 4e781dcc80e79ce91261ba8468254427d5cc4bea Mon Sep 17 00:00:00 2001 From: pavelsavara Date: Tue, 14 May 2024 12:15:08 +0200 Subject: [PATCH 100/108] fix --- INDEX.md | 2 +- accepted/2023/wasm-browser-threads.md | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/INDEX.md b/INDEX.md index c71e5eed2..9e5fc80fb 100644 --- a/INDEX.md +++ b/INDEX.md @@ -82,8 +82,8 @@ Use update-index to regenerate it: | 2021 | [Tracking Platform Dependencies](accepted/2021/platform-dependencies/platform-dependencies.md) | [Matt Thalman](https://github.com/mthalman) | | 2022 | [.NET 7 Version Selection Improvements](accepted/2022/version-selection.md) | [Rich Lander](https://github.com/richlander) | | 2023 | [Experimental APIs](accepted/2023/preview-apis/preview-apis.md) | [Immo 
Landwerth](https://github.com/terrjobst) | -| 2023 | [net8.0-browser TFM for applications running in the browser](accepted/2023/net8.0-browser-tfm.md) | [Javier Calvarro](https://github.com/javiercn) | | 2023 | [Multi-threading on a browser](accepted/2023/wasm-browser-threads.md) | [Pavel Savara](https://github.com/pavelsavara) | +| 2023 | [net8.0-browser TFM for applications running in the browser](accepted/2023/net8.0-browser-tfm.md) | [Javier Calvarro](https://github.com/javiercn) | ## Drafts diff --git a/accepted/2023/wasm-browser-threads.md b/accepted/2023/wasm-browser-threads.md index 82ce91cef..cefd5cf79 100644 --- a/accepted/2023/wasm-browser-threads.md +++ b/accepted/2023/wasm-browser-threads.md @@ -1,5 +1,7 @@ # Multi-threading on a browser +**Owner** [Pavel Savara](https://github.com/pavelsavara) | + ## Table of content - [Goals](#goals) - [Key ideas](#key-ideas) From 02065da22013ef0ced56a72384bbead781c47345 Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Wed, 15 May 2024 15:53:31 -0700 Subject: [PATCH 101/108] Add design document for issuing a warning when targeting netstandard1.x --- INDEX.md | 1 + accepted/2024/net-standard-recommendation.md | 128 +++++++++++++++++++ 2 files changed, 129 insertions(+) create mode 100644 accepted/2024/net-standard-recommendation.md diff --git a/INDEX.md b/INDEX.md index 5cc6d67dd..07e68ea65 100644 --- a/INDEX.md +++ b/INDEX.md @@ -85,6 +85,7 @@ Use update-index to regenerate it: | 2023 | [Experimental APIs](accepted/2023/preview-apis/preview-apis.md) | [Immo Landwerth](https://github.com/terrjobst) | | 2023 | [Multi-threading on a browser](accepted/2023/wasm-browser-threads.md) | [Pavel Savara](https://github.com/pavelsavara) | | 2023 | [net8.0-browser TFM for applications running in the browser](accepted/2023/net8.0-browser-tfm.md) | [Javier Calvarro](https://github.com/javiercn) | +| 2024 | [.NET Standard Targeting Recommendations](accepted/2024/net-standard-recommendation.md) | [Immo Landwerth](https://github.com/terrajobst), [Viktor Hofer](https://github.com/ViktorHofer) | ## Drafts diff --git a/accepted/2024/net-standard-recommendation.md b/accepted/2024/net-standard-recommendation.md new file mode 100644 index 000000000..564a25173 --- /dev/null +++ b/accepted/2024/net-standard-recommendation.md @@ -0,0 +1,128 @@ +# .NET Standard Targeting Recommendations + +**Owner** [Immo Landwerth](https://github.com/terrajobst) | [Viktor Hofer](https://github.com/ViktorHofer) + +We have already said that there is [no new version of .NET +Standard][net-standard-future] and that the replacement moving forward is to +just target the platform-neutral `netX.Y` [base framework][tfm] (with `x.y >= +5.0`) of .NET Core. + +For customers that need to author code that still needs to be consumed by .NET +Framework, the recommendation is to continue using `netstandard2.0`. There is +very little reason to target .NET Standard 2.1 because you lose .NET Framework +while only gaining very little few additional APIs. And if you don't care about +.NET Framework support, then just target .NET Core directly; at least then you +get access to a lot more functionality, in exchange for the reduced reach. + +There is basically no reason to target .NET Standard 1.x any more as all .NET +implementations that are still supported, can reference .NET Standard 2.0 +libraries. + +However, we have found that many customers still target .NET Standard 1.x. We +found out the reasons boil down to one of two things: + +1. Targeting the lowest possible versions is considered better +2. 
It was like and there is no reason to change + +The first argument has merit in a world where .NET Standard continues to produce +new versions. Since that's no longer the case you might as well update to a +later version that allows you to use more functionality. We found the biggest +jump in productivity is with .NET Standard 2.0 because it also supports +[referencing .NET Framework libraries][netfx-compate-mode], which is useful in +cases where you modernize existing applications and have some mixture until you +can upgrade all the projects. + +The second reason is always valid; however, in cases when you're actively +maintaining a code base it makes sense to upgrade to avoid missing out on +features that could have made your life easier and you didn't realize exist +because code completion simply didn't advertise them. + +## Scenarios and User Experience + +### Building a project targeting .NET Standard + +Jackie is assigned a work item to add a feature that requires writing code in +the shared business logic project. That project happens to target .NET Standard +1.4. She doesn't know why because the project was built before her time on the +team. The fist time she compiles in Visual Studio she sees a warning: + +> warning NETSDK1234: Contoso.BusinessLogic.csproj: Targeting .NET Standard +> prior to 2.0 is no longer recommended. See \ for more details. + +Following the link, she finds a page that informs her that all supported .NET +implementations allow using .NET Standard 2.0 or higher. She asks one of her +colleagues and she remembers that they used .NET Standard 1.4 because at some +point her team also maintained a Windows Phone app (which, to everybody's +disappointment no longer exists). + +Jack decides to upgrade the project to .NET Standard 2.0. She later realizes +that this was a good idea because it enables her to use many of the +`Microsoft.Extensions` libraries that make her life a lot easier, specifically +configuration and DI. + +## Requirements + +### Goals + +* Issue a warning when *building* a project targeting `netstandard1.x` with the + .NET 9 SDK + +### Non-Goals + +* Issue a warning when building a project targeting `netstandard2.x` +* Issue a warning when consuming a library that was built for .NET Standard 1.x. +* Issue a warning when building with a .NET SDK prior to 9.0 + +## Stakeholders and Reviewers + +* Libraries team +* SDK team +* Build team +* NuGet team + +## Design + +The .NET SDK will check the `TargetFramework` property. If it uses +`netstandardX.Y` with `X.Y < 2.0` it will produce the following warning: + +> Targeting .NET Standard prior to 2.0 is no longer recommended. See \ for +> more details. + +* The warning should have a diagnostic ID (prefixed `NETSDK` followed by a four + digit number). +* The warning should be suppressible via the `NoWarn` property. +* The URL should point to the documented ID [here][sdk-errors] (not sure why + it says `sdk-errors` -- I don't believe we have a section for warnings). + +## Q & A + +### Didn't you promise that .NET Standard is supported forever? + +We promised that future .NET implementations will continue to support +*consuming* libraries built for .NET Standard, all the way back to 1.0. That is +different from promising that we'll support *producing* .NET Standard binaries +forever; that would only make sense if .NET Standard is still the best way to +build reusable libraries. And [since .NET 5 and the platform-specific +flavors][tfm] of .NET we think there is now a better way to do so. 
+ +The original promise around consumption still stands: existing libraries built +for .NET Standard 1.0 can still be consumed and we have no plans on changing +that. So any existing investment in building a library that can be consumed in +as many places as possible still carries forward. + +### Can I continue to build libraries targeting .NET Standard 1.x? + +Yes. There *may* come a time where the .NET SDK stops supporting building .NET +Standard libraries, but we currently have no plans on doing so as the targeting +packs for .NET Standard 1.x aren't bundled with the SDK and are only downloaded +on demand already; thus, the benefit to removing the support for building .NET +Standard 1.x seems marginal. + +We believe our ecosystem is better off targeting at least .NET Standard 2.0 +which is why we want to issue a warning, but the motivation isn't in us being +able to remove the support for building. + +[net-standard-future]: https://devblogs.microsoft.com/dotnet/the-future-of-net-standard/ +[tfm]: https://github.com/dotnet/designs/blob/main/accepted/2020/net5/net5.md +[netfx-compate-mode]: https://learn.microsoft.com/en-us/dotnet/core/porting/#net-framework-compatibility-mode +[sdk-errors]: https://learn.microsoft.com/en-us/dotnet/core/tools/sdk-errors/ From 6c33e7063f4b7a0f175f365d4c8dfa1da1a21767 Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Wed, 15 May 2024 17:30:17 -0700 Subject: [PATCH 102/108] Apply suggestions for typos Co-authored-by: Austin Wise Co-authored-by: Weihan Li --- accepted/2024/net-standard-recommendation.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/accepted/2024/net-standard-recommendation.md b/accepted/2024/net-standard-recommendation.md index 564a25173..2f40a93d1 100644 --- a/accepted/2024/net-standard-recommendation.md +++ b/accepted/2024/net-standard-recommendation.md @@ -22,7 +22,7 @@ However, we have found that many customers still target .NET Standard 1.x. We found out the reasons boil down to one of two things: 1. Targeting the lowest possible versions is considered better -2. It was like and there is no reason to change +2. It was already like that and there is no reason to change The first argument has merit in a world where .NET Standard continues to produce new versions. Since that's no longer the case you might as well update to a @@ -44,7 +44,7 @@ because code completion simply didn't advertise them. Jackie is assigned a work item to add a feature that requires writing code in the shared business logic project. That project happens to target .NET Standard 1.4. She doesn't know why because the project was built before her time on the -team. The fist time she compiles in Visual Studio she sees a warning: +team. The first time she compiles in Visual Studio she sees a warning: > warning NETSDK1234: Contoso.BusinessLogic.csproj: Targeting .NET Standard > prior to 2.0 is no longer recommended. See \ for more details. 
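To make the warning and its escape hatch concrete, here is a minimal sketch of a project file that would trigger the proposed warning and suppresses it via the `NoWarn` property described in the design above. `NETSDK1234` is only the placeholder diagnostic ID used in the scenario; the real ID would be assigned when the warning ships.

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <!-- Targeting .NET Standard 1.x is what the proposed warning flags on build -->
    <TargetFramework>netstandard1.4</TargetFramework>
    <!-- NETSDK1234 is the placeholder ID from the scenario above, not a final ID;
         teams that intentionally stay on netstandard1.x could suppress it like this -->
    <NoWarn>$(NoWarn);NETSDK1234</NoWarn>
  </PropertyGroup>
</Project>
```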
From 17964c0bcf39c15de90cf989501f0cd86a885a7f Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Wed, 15 May 2024 20:03:29 -0700 Subject: [PATCH 103/108] Apply fix for typo Co-authored-by: Yaakov --- accepted/2024/net-standard-recommendation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/accepted/2024/net-standard-recommendation.md b/accepted/2024/net-standard-recommendation.md index 2f40a93d1..216c8970e 100644 --- a/accepted/2024/net-standard-recommendation.md +++ b/accepted/2024/net-standard-recommendation.md @@ -55,7 +55,7 @@ colleagues and she remembers that they used .NET Standard 1.4 because at some point her team also maintained a Windows Phone app (which, to everybody's disappointment no longer exists). -Jack decides to upgrade the project to .NET Standard 2.0. She later realizes +Jackie decides to upgrade the project to .NET Standard 2.0. She later realizes that this was a good idea because it enables her to use many of the `Microsoft.Extensions` libraries that make her life a lot easier, specifically configuration and DI. From 83157fdf99fb4e472cb9ef024273837ee73054e1 Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Sat, 1 Jun 2024 14:04:21 -0700 Subject: [PATCH 104/108] Clarify downsides --- accepted/2024/net-standard-recommendation.md | 45 ++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/accepted/2024/net-standard-recommendation.md b/accepted/2024/net-standard-recommendation.md index 216c8970e..3f8f35b73 100644 --- a/accepted/2024/net-standard-recommendation.md +++ b/accepted/2024/net-standard-recommendation.md @@ -96,6 +96,51 @@ The .NET SDK will check the `TargetFramework` property. If it uses ## Q & A +### What is the downside of targeting .NET Standard 1.x? + +There is a major difference in how tooling works for .NET Standard 1.x vs 2.x: + +* .NET Standard 1.x was based on packages, that is, if you target .NET Standard + 1.x you end up referencing about 70-100 packages (depending on whether you + also target a specific runtime identifier). Those are also persisted in the + resulting NuGet package which forces consumers to restore them as well. + +* .NET Standard 2.x doesn't use packages; the resulting NuGet package doesn't + depend on any packages, just like targeting specific frameworks generally + doesn't. + +In principle we could change the way tooling works for .NET Standard 1.x but we +don't really see the value for two reasons: + +1. The additional platforms that .NET Standard 1.x reaches are mostly out of + support. + +2. [.NET Standard itself is no longer updated][net-standard-future]. We believe + the new OS-specific frameworks starting with .NET 5 are a much easier and + better model and therefore supersede .NET Standard. + +If you target .NET Standard 1.x you're limiting yourself to a subset of .NET +Framework 4.5, which shipped about 12 years ago. A lot of innovation has +happened since then that you're missing out on. If you need to support older +frameworks that Microsoft doesn't support, this might be an intentional choice +in which case suppressing the warning makes total sense. + +However, we believe the vast majority of people don't have such a requirement +and would be better off targeting .NET Standard 2.0 or .NET 5+ instead. + +### Should I drop .NET Standard altogether? + +If you need to produce a NuGet package that works on .NET Framework as well as +modern .NET flavors then using .NET Standard 2.0 is still the best solution for +you -- and your consumers. 
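As a minimal sketch (illustrative only, not part of the original proposal text), such a library keeps a single `netstandard2.0` target and remains consumable from .NET Framework, .NET Core, and .NET 5+ alike:

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <!-- One target is enough to reach both .NET Framework and modern .NET consumers -->
    <TargetFramework>netstandard2.0</TargetFramework>
  </PropertyGroup>
</Project>
```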
+ +If you drop .NET Standard and instead just multi-target for .NET Framework and +.NET 5+, you force your consumers to do the same, which is generally not +necessary and thus very much undesirable. + +However, if you don't have to support .NET Framework then you're likely better +off just targeting .NET 5+. + ### Didn't you promise that .NET Standard is supported forever? We promised that future .NET implementations will continue to support From 62c393de88d447fe9a7d114543ca942472df4d93 Mon Sep 17 00:00:00 2001 From: Elinor Fung Date: Tue, 2 Jul 2024 14:20:57 -0700 Subject: [PATCH 105/108] Add doc for embedding install location options in `apphost` (#318) There have been numerous requests around being able to customize how/where the .NET root path will be determined for an application. This document describes a proposed mechanism for basic configuration of how apphost will search for the .NET install location. This is intended to be a step towards a broader (not-yet-designed) SDK experience for building an application or set of applications with a customized .NET runtime path. It does not attempt to tackle the problem of the fuller experience. --- INDEX.md | 1 + proposed/apphost-embed-install-location.md | 152 +++++++++++++++++++ 2 files changed, 153 insertions(+) create mode 100644 proposed/apphost-embed-install-location.md diff --git a/INDEX.md b/INDEX.md index 07e68ea65..9aa18f133 100644 --- a/INDEX.md +++ b/INDEX.md @@ -98,6 +98,7 @@ Use update-index to regenerate it: |Year|Title|Owners| |----|-----|------| +| | [Add ability to embed install location options in apphost](proposed/apphost-embed-install-location.md) | | | | [Provide SDK hint paths in global.json](proposed/local-sdk-global-json.md) | | | | [Rate limits](proposed/rate-limit.md) | [John Luo](https://github.com/juntaoluo), [Sourabh Shirhatti](https://github.com/shirhatti) | | | [Readonly references in C# and IL verification.](proposed/verifiable-ref-readonly.md) | | diff --git a/proposed/apphost-embed-install-location.md b/proposed/apphost-embed-install-location.md new file mode 100644 index 000000000..7ed141aaa --- /dev/null +++ b/proposed/apphost-embed-install-location.md @@ -0,0 +1,152 @@ +# Add ability to embed install location options in apphost + +There have been numerous requests around being able to customize how/where the +.NET root path will be determined for an application. This document describes +a proposed mechanism for basic configuration of how `apphost` will search for +the .NET install location. 
+ +Goals: + +- Enable developers to build an app and configure it such that it can: + - Only consider global .NET installs + - Always look at a relative path for a .NET install +- Create a building block for a fuller, SDK-supported experience that should fit +in regardless of where we land for the full experience + +Non-goals: + +- SDK support for layout and deployment of the app with runtime files in a +corresponding relative path +- SDK and project system experience for building a set of applications using the +same custom runtime install location + +Related: + +- [dotnet/runtime#2572](https://github.com/dotnet/runtime/issues/2572) +- [dotnet/runtime#3453](https://github.com/dotnet/runtime/issues/3453) +- [dotnet/runtime#53834](https://github.com/dotnet/runtime/issues/53834) +- [dotnet/runtime#64430](https://github.com/dotnet/runtime/issues/64430) +- [dotnet/runtime#70975](https://github.com/dotnet/runtime/issues/70975) +- [dotnet/runtime#86801](https://github.com/dotnet/runtime/issues/86801) + +## State in .NET 8 + +The current state is described in detail in +[install-locations](https://github.com/dotnet/designs/blob/main/accepted/2020/install-locations.md) +and [install-locations-per-architecture](https://github.com/dotnet/designs/blob/main/accepted/2021/install-locations-per-architecture.md) + +At a high level, the process and priority for `apphost` determining the install +location is: + + 1. App-local + - Look for the runtime in the app's folder (self-contained apps) + 2. Environment variables + - Read the `DOTNET_ROOT_<arch>` and `DOTNET_ROOT` environment variables + 3. Global install (registered) + - Check for a registered install - registry key on Windows or an install + location file on non-Windows + 4. Global install (default) + - Fall back to a well-known default install location based on the platform + +## Embedded install location options for `apphost` + +Every `apphost` already has a placeholder that gets rewritten to contain the app +binary path during build (that is, `dotnet build`). This proposal adds another +placeholder which would represent the optional configuration of search locations +and app-relative path to an install location. Same as the existing placeholder, it +would conditionally be rewritten based on the app's settings. This rewrite would +[only be done on `publish`](#writing-options-on-publish) of the app currently. +This is similar to the proposal in [dotnet/runtime#64430](https://github.com/dotnet/runtime/issues/64430), +but with additional configuration of which search locations to use. + +### Configuration of search locations + +Some applications do not want to use all the default search locations when they +are deployed - for example, [dotnet/runtime#86801](https://github.com/dotnet/runtime/issues/86801). + +We can allow selection of which search locations the `apphost` will use. When +search locations are configured, only the specified locations will be searched. + +The search location could be specified via a property in the project: + +```xml +<AppHostDotNetSearch>Global</AppHostDotNetSearch> +``` + +where the valid values are `AppLocal`, `AppRelative`, `EnvironmentVariables`, or +`Global`. Multiple values can be specified, delimited by semi-colons. + +When a value is specified, only those locations will be used. For example, if +`Global` is specified, `apphost` will only look at global install locations, not +app-local, at any app-relative path, or at environment variables. 
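To illustrate combining values (a sketch based on the semicolon-delimited syntax described above, not additional proposed behavior), a project could restrict the search to environment variables and global installs only:

```xml
<PropertyGroup>
  <!-- Only the DOTNET_ROOT* environment variables and registered/default global
       installs are searched; app-local and app-relative locations are skipped -->
  <AppHostDotNetSearch>EnvironmentVariables;Global</AppHostDotNetSearch>
</PropertyGroup>
```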
+ +### App-relative path to install location + +When a relative path is written into an `apphost` which is [configured to look +at it](#configuration-of-search-locations), that path will be used as the .NET +install root when running the application. + +The install location could be specified via a property in the project: + +```xml +<AppHostRelativeDotNet>./relative/path/to/runtime</AppHostRelativeDotNet> +``` + +Setting this implies `AppHostDotNetSearch=AppRelative`. This means that only the +relative path will be considered. If a .NET install is not found at the relative +path, other locations - environment variables, global install locations - will +not be considered and the app will fail to run. + +`AppHostDotNetSearch` could also be explicitly set to include a fallback - for +example, `AppHostDotNetSearch=AppRelative;Global` would look at the relative +path and, if it is not found, the global locations. If `AppHostDotNetSearch` is +explicitly set to a value that does not include `AppRelative`, then setting +`AppHostRelativeDotNet` is meaningless - the SDK will not write the relative +path into the `apphost` and the `apphost` will not check for a relative path. + +## Updated behaviour + +At a high level, the updated process and priority for `apphost` determining the +install location would be: + + 1. App-local, if search location not configured + - Look for the runtime in the app's folder (self-contained apps) + 2. App-relative, if specified as a search location + - Use the path written into `apphost`, relative to the app location + 3. Environment variables, if search location not configured or if set as a + search location + - Read the `DOTNET_ROOT_<arch>` and `DOTNET_ROOT` environment variables + 4. Global install (registered), if search location not configured or if set as + a search location + - Check for a registered install - registry key on Windows or an install + location file on non-Windows + 5. Global install (default), if search location not configured or if set as a + search location + - Fall back to a well-known default install location based on the platform + +By default - that is, without any configured install location options - the +effective behaviour remains as in the [current state](#state-in-net-8). + +## Considerations + +### Writing options on publish + +This proposal writes the install location options in the `apphost` on `publish`. +Without SDK support for constructing the layout, updating on `build` would cause +the app to stop working in inner dev loop scenarios (running from output folder, +`dotnet run`, F5) unless developers create a layout with the runtime in the expected +location relative to the app output as part of their build. This introduces a +difference between `build` and `publish` for `apphost` (currently, it is the +same between `build` and `publish` - with the exception of single-file, which is +only a `publish` scenario), but once SDK support is added it can be reconciled. + +### Other hosts + +This proposal is only for `apphost`. It is not relevant for `singlefilehost`, as +that has the runtime itself statically linked. Other hosts (`comhost`, `ijwhost`) +are also not included. In theory, other hosts could be given the same treatment, +but we do not have feedback for scenarios using them. They also do not have the +app-local search or the existing requirement of being partially re-written via a +known placeholder. Changing them would add significant complexity without a +compelling scenario. 
If we see confusion here, we could add an SDK warning if +`AppHostRelativeDotNet` or `AppHostDotNetSearch` is set for those projects. From 272c7f820612c2eb01d612e6c0352c524ea53295 Mon Sep 17 00:00:00 2001 From: Elinor Fung Date: Thu, 1 Aug 2024 17:21:13 -0700 Subject: [PATCH 106/108] Add more details and examples for embedded relative path to apphost-embed-install-location.md (#319) --- proposed/apphost-embed-install-location.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/proposed/apphost-embed-install-location.md b/proposed/apphost-embed-install-location.md index 7ed141aaa..c80dedd4c 100644 --- a/proposed/apphost-embed-install-location.md +++ b/proposed/apphost-embed-install-location.md @@ -104,6 +104,16 @@ explicitly set to a value that does not include `AppRelative`, then setting `AppHostRelativeDotNet` is meaningless - the SDK will not write the relative path into the `apphost` and the `apphost` will not check for a relative path. +The path must not be rooted and will be written into the `apphost` unmodified. +At run time, the path will be considered relative to the app location - that is, +it is appended to the location of the `apphost` itself. The app relies on file +system APIs of the underlying OS to determine if the path is valid and exists. + +When running `C:\dir\app.exe` or `/home/dir/app` with `AppHostRelativeDotNet` set to: + - `my_dotnet`: the app will look at `C:\dir\my_dotnet` or `/home/dir/my_dotnet` + - `./my_dotnet`: the app will look at `C:\dir\my_dotnet` or `/home/dir/my_dotnet` + - `../my_dotnet` the app will look at `C:\my_dotnet` or `/home/my_dotnet` + ## Updated behaviour At a high level, the updated process and priority for `apphost` determining the From b2e41aa776479814d6a9285d4ab40e4f2d5f1b2e Mon Sep 17 00:00:00 2001 From: Marc Paine Date: Wed, 14 Aug 2024 13:41:20 -0700 Subject: [PATCH 107/108] Fix the WORKLOAD_UPDDATE environment variable --- accepted/2020/workloads/workload-manifest.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/accepted/2020/workloads/workload-manifest.md b/accepted/2020/workloads/workload-manifest.md index c164a0a54..fdc1d07f8 100644 --- a/accepted/2020/workloads/workload-manifest.md +++ b/accepted/2020/workloads/workload-manifest.md @@ -147,7 +147,7 @@ To mantain installation coherence, any workload management operation (install, u The .NET tooling will automatically and opportunistically download updated versions of all manifests for the current SDK band and unpack them to `~/.dotnet/sdk-advertising/{sdk-band}/{manifest-id}/`. These user-local updated copies of the manifest are known as *advertising manifests.* The advertising manifests are used to notify that newer versions of installed components are available. They are **not** used in pack resolution or installation. -By default, the .NET SDK will look for newer versions of workload manifests and update the advertising manifests when a `dotnet` CLI command is run which runs NuGet restore, and it has been at least 24 hours since the SDK checked for updated workload manifests. This can be disabled by setting the `WORKLOAD_UPDATE_NOTIFY_DISABLE` environment variable to `true`, and the interval can be controlled with the `WORKLOAD_UPDATE_NOTIFY_INTERVAL_HOURS` environment variable. +By default, the .NET SDK will look for newer versions of workload manifests and update the advertising manifests when a `dotnet` CLI command is run which runs NuGet restore, and it has been at least 24 hours since the SDK checked for updated workload manifests. 
This can be disabled by setting the `DOTNET_CLI_WORKLOAD_UPDATE_NOTIFY_DISABLE` environment variable to `true`, and the interval can be controlled with the `DOTNET_CLI_WORKLOAD_UPDATE_NOTIFY_INTERVAL_HOURS` environment variable. To explicitly update the advertising manifests without also updating workloads, the following command can be used: `dotnet workload update --advertising-manifests-only` From 4f33f5a7a9d0cfbd880ce9c690494d0b0b15c0fc Mon Sep 17 00:00:00 2001 From: Elinor Fung Date: Wed, 20 Nov 2024 14:11:58 -0800 Subject: [PATCH 108/108] Specify that install location is in 32-bit view of registry (#325) Clarify that the InstallLocation key on Windows is in the 32-bit registry. The original doc (https://github.com/dotnet/designs/blob/main/accepted/2020/install-locations.md) already specifies this. The change here makes it clearer in the per-arch install design doc that builds on the original. --- accepted/2021/install-location-per-architecture.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/accepted/2021/install-location-per-architecture.md b/accepted/2021/install-location-per-architecture.md index 477ab009c..71e15851d 100644 --- a/accepted/2021/install-location-per-architecture.md +++ b/accepted/2021/install-location-per-architecture.md @@ -37,8 +37,8 @@ otherwise the apphost will use the global install location lookup. On Windows the multi-arch situation already exists with x86 and x64 architectures. Each installs into a different location and is registered accordingly. -The registration mechanism in registry includes the architecture, -the key path is `HKLM\SOFTWARE\dotnet\Setup\InstalledVersions\<arch>\InstallLocation`. +Install locations are registered in the 32-bit registry under +`HKLM\SOFTWARE\dotnet\Setup\InstalledVersions\<arch>\InstallLocation`. This means that x86 apphost looks only for x86 install location and similarly x64 apphost looks only for x64 install location. Adding another architecture