Unable to use Buildkit with Windows containers #616

Closed
tofflos opened this issue Sep 11, 2018 · 86 comments

Comments

@tofflos

tofflos commented Sep 11, 2018

I'm using the Buildkit version that comes bundled with Docker for Windows 18.06.1 and am having trouble running it with Windows containers. In the log below, a very simple build succeeds without Buildkit and then fails once I enable it. The localized error message "Det går inte att hitta filen" roughly translates to "Unable to find the file". Buildkit works on the same system when running Linux containers. A minimal project that reproduces the error can be found here: test.zip.

PS C:\test> docker version
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:21:34 2018
 OS/Arch:           windows/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.24)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:36:40 2018
  OS/Arch:          windows/amd64
  Experimental:     true
PS C:\test> ls


    Directory: C:\test


Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----       2018-09-11     15:38             74 Dockerfile
-a----       2018-09-11     15:39             23 test.txt


PS C:\test> type .\Dockerfile
FROM microsoft/nanoserver:1803
COPY test.txt /test.txt
RUN type test.txt

PS C:\test> $Env:DOCKER_BUILDKIT=0
PS C:\test> docker build -t test .
Sending build context to Docker daemon  3.072kB
Step 1/3 : FROM microsoft/nanoserver:1803
 ---> 693ff1719e39
Step 2/3 : COPY test.txt /test.txt
 ---> 3cb8bc9e5e2e
Step 3/3 : RUN type test.txt
 ---> Running in 376f873629fd
This is a test message!Removing intermediate container 376f873629fd
 ---> 0cce47564a2d
Successfully built 0cce47564a2d
Successfully tagged test:latest

PS C:\test> $Env:DOCKER_BUILDKIT=1
PS C:\test> docker build -t test .
[+] Building 0.2s (2/2) FINISHED
 => local://dockerfile (Dockerfile)                                      0.1s
 => => transferring dockerfile: 31B                                      0.0s
 => local://context (.dockerignore)                                      0.1s
 => => transferring context: 2B                                          0.0s
failed to read dockerfile: open C:\ProgramData\Docker\tmp\buildkit-mount977689469\Dockerfile: Det går inte att hitta filen.
@tonistiigi
Member

tonistiigi commented Sep 11, 2018

Buildkit is not supported for Windows containers in Docker 18.06/18.09.

@gerich-home

Any plans to support it?

@quangkieu

If there is no Windows container support yet, I think the error message needs to be updated to set expectations.

@olljanat

olljanat commented Jun 1, 2019

@quangkieu it looks like this is described in the documentation: https://docs.docker.com/build/buildkit/#getting-started
Only supported for building Linux containers

@quangkieu

@olljanat I meant the error message from the build process.
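
One way to read the request for a clearer error is a preflight check that fails with an explicit message instead of a missing-file error. The following is a minimal sketch in Python, not Docker's or BuildKit's actual behaviour: the helper name `check_buildkit_supported` and the message wording are hypothetical. It encodes only what this thread states, namely that BuildKit in Docker 18.06/18.09 does not support Windows containers.

```python
import os


def check_buildkit_supported(server_os, env=None):
    """Return an explanatory error string, or None if the build may proceed.

    Hypothetical helper: `server_os` is the OS part of the daemon's
    OS/Arch value, e.g. "windows" from "windows/amd64".
    """
    env = os.environ if env is None else env
    buildkit_requested = env.get("DOCKER_BUILDKIT") == "1"
    if buildkit_requested and server_os.lower() == "windows":
        # Illustrative wording only; not a message emitted by Docker.
        return (
            "BuildKit is not supported for Windows containers in this "
            "Docker version; unset DOCKER_BUILDKIT or switch to Linux containers."
        )
    return None


if __name__ == "__main__":
    print(check_buildkit_supported("windows", {"DOCKER_BUILDKIT": "1"}))
```

The check takes the environment as a parameter so it can be exercised without changing the real process environment.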

@Barsonax

When is Buildkit support coming for Windows?

@TBBle
Collaborator

TBBle commented Nov 12, 2019

Maybe a better question is what needs to be done/what are the outstanding dependencies?

@Iristyle

Has anyone tried using buildctl on Windows via instructions at https://github.com/moby/buildkit#exploring-dockerfiles with buildkit daemon running in a container? Looks like that might be an alternative until docker build works properly on Windows?

@olljanat

@Iristyle if you read that doc more carefully it also says

the buildkitd daemon is only available for Linux currently.

@Barsonax I'm a bit worried that we will never see Windows container support because there are no Microsoft people contributing to this project. Hopefully I'm wrong.

@Iristyle

@olljanat well, I'm using LCOW, which hosts a real Linux kernel, so it's a bit of a grey area (and one that a lot of the Docker folks don't seem to know much about in practical terms). I played around a little and was getting closer to having rootless running per the instructions at https://github.com/moby/buildkit/blob/master/docs/rootless.md#about---oci-worker-no-process-sandbox, noting that --privileged is not supported on Windows at all.

I'll update if I'm able to get it going or hit a dead end.

@olljanat

@Iristyle that is probably possible, but this issue is about real Windows containers, so let's try to keep on topic.

@TBBle
Collaborator

TBBle commented Dec 30, 2019

Since last time I looked into this, containerd gained support for Windows 10 1809/Windows Server 2019, so it's possible no MS involvement in buildkit is needed, if it can get everything it needs for the low-level part via its containerd backend.

Edit: A quick look at the build system for buildkit suggests that you need running buildkit (either locally, or running inside Docker) to build buildkit. I'm somewhat flummoxed by this.

@olljanat

@TBBle hmm. Yeah, there is some info about containerd support at https://docs.microsoft.com/en-us/virtualization/windowscontainers/deploy-containers/containerd so maybe it is possible.

Then someone can probably try building buildkitd.exe for Windows to see where it fails. I also guess that the latest Docker binaries with containerd support are needed (more info about that in moby/moby#38541).

@TBBle
Collaborator

TBBle commented Jan 3, 2020

Ah, thank you. moby/moby#38541 is the PR reference I was looking for earlier.

Poking through, containerd doesn't seem to publish Windows binaries in their releases despite having the new Windows V2 runtime in their 1.3.0 release, and their AppVeyor build pipeline doesn't capture artifacts.

The required hcsshim project does publish artifacts from their AppVeyor pipeline, even though they don't include them in their releases.

Both have recent-enough releases to meet the criteria laid out in moby/moby#38541 but they both also have active work on master which might make a difference.

containerd currently vendors a specific commit of hcsshim (microsoft/hcsshim@d2849cb), binaries for which can be fetched from AppVeyor. For containerd 1.3.2 (microsoft/hcsshim@9e92188) the binaries are also on AppVeyor but will expire in late February. Both of these vendored versions are older than the current hcsshim release, 0.8.7, whose artifacts are also on AppVeyor.

In the end, it's not clear to me if this ecosystem is yet in a state to start trying to get BuildKit working, and containerd/containerd#1920 (which has not been updated since the switch to the Windows V2 API) gives me a reasonable level of doubt.

@TBBle
Collaborator

TBBle commented Jan 4, 2020

Quick correction: Containerd does have nightly builds for Windows, they're at https://github.com/containerd/containerd/actions?query=workflow%3ANightly

@TBBle
Collaborator

TBBle commented Jan 4, 2020

So with a bit of hacking I got containerd working on my Windows 10 Desktop (mostly blocked by a bug recently introduced into containerd master. Edit: Fix pending in containerd/containerd#3929).

I then did a bunch more hacking on BuildKit, including fixing a couple of bugs, and commenting out a lot of stuff.

Buildkitd ran, and tried to build me a package, but failed because it didn't copy the Dockerfile over.

PS C:\Users\paulh\Documents\BuildKit\simpleDocker> buildctl.exe --debug build --frontend=dockerfile.v0 --local context=. --local dockerfile=.
[+] Building 0.0s (0/0)
time="2020-01-05T07:47:33+11:00" level=debug msg="serving grpc connection"
[+] Building 0.1s (2/2) FINISHED
 => [internal] load build definition from Dockerfile                                                                     0.1s
 => => transferring dockerfile: 983B                                                                                     0.0s
 => [internal] load .dockerignore                                                                                        0.1s
 => => transferring context: 2B                                                                                          0.0s
error: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to read dockerfile: open C:\Users\paulh\AppData\Local\Temp\buildkit-mount017874163\Dockerfile: The system cannot find the file specified.
failed to solve
github.com/moby/buildkit/client.(*Client).solve.func2
        C:/Users/paulh/go/src/github.com/moby/buildkit/client/solve.go:203
github.com/moby/buildkit/vendor/golang.org/x/sync/errgroup.(*Group).Go.func1
        C:/Users/paulh/go/src/github.com/moby/buildkit/vendor/golang.org/x/sync/errgroup/errgroup.go:57
runtime.goexit
        c:/go/src/runtime/asm_amd64.s:1357

I assume this is because I commented out too much, and somehow excluded the code that actually copies things into the snapshots, as both created snapshots were empty despite reporting having transferred stuff. The Dockerfile itself did no transfers from the host OS; it's MS's trivial Python example (https://github.com/MicrosoftDocs/Virtualization-Documentation/blob/master/windows-container-samples/python/Dockerfile).

PS C:\Users\paulh\Documents\BuildKit\simpleDocker> buildctl.exe --debug du
ID                                                                      RECLAIMABLE     SIZE    LAST ACCESSED
x86vuhy70whikjae56p5wsfmo*                                              true            0B
m733jropkh4azwwgoknhowicq*                                              true            0B
Reclaimable:    0B
Total:          0B
PS C:\Users\paulh\Documents\BuildKit\simpleDocker> buildctl.exe --debug prune
ID                                                                      RECLAIMABLE     SIZE    LAST ACCESSED
m733jropkh4azwwgoknhowicq*                                              true            0B
x86vuhy70whikjae56p5wsfmo*                                              true            0B
Total:  0B
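
The empty snapshots in the listings above can also be picked out mechanically. The following is a rough sketch in Python, assuming the column layout shown in this comment (ID, RECLAIMABLE, SIZE, with sizes printed as a single token); `empty_records` is a hypothetical helper, not part of buildctl, and it strips the trailing `*` without interpreting it.

```python
def empty_records(du_output):
    """Return the IDs of records whose SIZE column reads "0B".

    Assumes the table layout shown in this thread: a header row starting
    with "ID", data rows of at least three whitespace-separated fields,
    and summary rows such as "Total:" with only two fields.
    """
    ids = []
    for line in du_output.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] == "ID" or fields[0].endswith(":"):
            continue  # skip blank lines, the header, and summary rows
        record_id, _reclaimable, size = fields[:3]
        if size == "0B":
            ids.append(record_id.rstrip("*"))
    return ids


if __name__ == "__main__":
    sample = """ID                          RECLAIMABLE     SIZE    LAST ACCESSED
x86vuhy70whikjae56p5wsfmo*  true            0B
m733jropkh4azwwgoknhowicq*  true            0B
Reclaimable:    0B
Total:          0B
"""
    print(empty_records(sample))
```

With the `du` listing above as input, both record IDs are reported, which matches the observation that the snapshots were created but nothing was copied into them.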

tofflos changed the title from "Unable to use Builtkit with Windows containers" to "Unable to use Buildkit with Windows containers" on Jan 5, 2020
@TBBle
Collaborator

TBBle commented Jan 5, 2020

With #1314, and some more hacking on things, I've gotten to the point where my next failure is coming from inside containerd, or the connection to it.

PS C:\Users\paulh\Documents\BuildKit\supersimpleDocker> buildctl --debug build --frontend=dockerfile.v0 --local context=. --local dockerfile=.
time="2020-01-06T08:03:16+11:00" level=debug msg="serving grpc connection"
[+] Building 4.7s (4/5)
[+] Building 4.7s (5/5) FINISHED
 => [internal] load build definition from Dockerfile                                                                     0.0s
 => => transferring dockerfile: 588B                                                                                     0.0s
 => [internal] load .dockerignore                                                                                        0.0s
 => => transferring context: 2B                                                                                          0.0s
 => [internal] load metadata for mcr.microsoft.com/windows/servercore:1909                                               0.2s
 => CACHED [1/2] FROM mcr.microsoft.com/windows/servercore:1909@sha256:12327ccba5d74921479cc95b56e9422278ac3565740c2a46  0.0s
 => => resolve mcr.microsoft.com/windows/servercore:1909@sha256:12327ccba5d74921479cc95b56e9422278ac3565740c2a46359bf0a  0.0s
 => ERROR [2/2] RUN echo Write-Host -ForegroundColor Red Hello > wr.ps1                                                  4.4s
------
 > [2/2] RUN echo Write-Host -ForegroundColor Red Hello > wr.ps1:
------
error: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to build LLB: executor failed running [powershell -command echo Write-Host -ForegroundColor Red Hello > wr.ps1]: failure waiting for process: rpc error: code = Unknown desc = ttrpc: closed: unknown
failed to solve
github.com/moby/buildkit/client.(*Client).solve.func2
        C:/Users/paulh/go/src/github.com/moby/buildkit/client/solve.go:203
github.com/moby/buildkit/vendor/golang.org/x/sync/errgroup.(*Group).Go.func1
        C:/Users/paulh/go/src/github.com/moby/buildkit/vendor/golang.org/x/sync/errgroup/errgroup.go:57
runtime.goexit
        c:/go/src/runtime/asm_amd64.s:1357

I've pushed one commit that needs more work (breaks the auto tests) plus my hacks onto https://github.com/TBBle/buildkit/tree/hacks_ahoy, in case anyone else wants to play with this.

For reference, I was working with source from containerd/containerd#3929, to fix a blocking bug, and microsoft/hcsshim#749, to let me build without gcc. For hcsshim, had I not been instrumenting the source, I could have used the nightly binary build of the containerd shim, and I'm planning to suggest/submit that their releases include pushing a container for the container managed /opt feature, which would avoid hunting down binaries and adding them to the $PATH. (Edit: microsoft/hcsshim#750)

@TBBle
Collaborator

TBBle commented Jan 7, 2020

The failure I hit in my previous run turned out to be a bug in hcsshim, for which I have posted a fix at microsoft/hcsshim#752.

So now I am able to build a trivial Dockerfile. So trivial it's pointless, except that it worked.

FROM mcr.microsoft.com/windows/servercore:1909
LABEL Description="Built with BuildKit!"
SHELL ["powershell", "-command"]
RUN echo Write-Host -ForegroundColor Red Hello > wr.ps1
CMD ["powershell", ".\\wr.ps1"]
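
Exec-form instructions such as CMD are documented by Docker as being parsed as a JSON array, so a missing comma or an unescaped backslash in a Windows path means the line is not valid JSON and, per the Dockerfile reference, is treated as shell form instead. The sketch below approximates that check with Python's `json` module; `parse_exec_form` is a hypothetical helper and not Docker's actual parser.

```python
import json


def parse_exec_form(argument):
    """Return the argv list if `argument` is a JSON array of strings, else None.

    Hypothetical approximation of the exec-form check: anything that is
    not a valid JSON array of strings is reported as None here.
    """
    try:
        value = json.loads(argument)
    except json.JSONDecodeError:
        return None
    if isinstance(value, list) and all(isinstance(item, str) for item in value):
        return value
    return None


if __name__ == "__main__":
    candidates = [
        r'["powershell" ".\wr.ps1"]',     # missing comma
        r'["powershell", ".\wr.ps1"]',    # unescaped backslash
        r'["powershell", ".\\wr.ps1"]',   # valid JSON array
    ]
    for text in candidates:
        print(text, "->", parse_exec_form(text))
```

Only the last candidate parses, which is why Windows paths in exec form need doubled backslashes.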

I don't know yet if my containers do not have networking set up properly due to my Buildkit spec-generation hacks, or some other aspect of my setup unrelated to Buildkit.

As well as networking issues, filesystem commands do not function on Windows due to an assertion about idmapping support.

I was worried about API issues, so I had vendored containerd master into buildkit, and hcsshim master into containerd. However, I suspect that this wasn't necessary, and I'll back those out next time I look at this.

I've rebased https://github.com/TBBle/buildkit/tree/hacks_ahoy to the current version of #1314, so it should be relatively easy for anyone who wants to try this out, and perhaps try and turn some of my hacks into further valuable commits.

@guillaume86

@TBBle cool to see someone tackling this. Does your fork handle the alternative <pathOfDockerfile>.dockerignore path for .dockerignore files? That is pretty much the only thing I miss for the moment.

@TBBle
Collaborator

TBBle commented Apr 5, 2020

It probably doesn't, but only because all the file-copy APIs in BuildKit fail an assertion on Windows related to permissions support.

I really should get back to this, it got jammed up behind questions about containerd 1.2 support, and then other stuff came up.

@jorgearteiro

jorgearteiro commented Jul 7, 2020

There is an issue logged on the Microsoft Windows Containers repo: microsoft/Windows-Containers#34

@TBBle
Collaborator

TBBle commented Oct 22, 2023

Ah, is the local mounting done by buildctl, not buildkitd? I knew I should have checked that first. Ah well. Yeah, we'll probably have to start with it completely non-containerised in a distinct pipeline with an eye to migrating to HostProcess someday, rather than trying to split the difference.

Even HostProcess containers may not be able to sufficiently-unify the pipelines to make it worth doing. I actually don't recall having heard of anyone running a second containerd in a HostProcess container.

And yeah, the next step is pushing forward into buildx, which will be the point where this can start getting into the hands of a wider group of users. Once we have Docker Desktop for Windows backed by containerd and containerd image store (both exist in the moby repo with CI support-ish but are not yet shipped in Docker Desktop), AFAIK users can just update buildx in-situ to pull improvements and fixes, which is much more reliable than getting the full stack working by hand.

@gabriel-samfira
Collaborator

Ah, is the local mounting done by buildctl, not buildkitd?

I think it is done by buildkitd, but the issue is a mixed bag. I am not (yet) confident that the tests won't check for the existence of files in the path where layers get mounted (see continuity/fstest). Starting next week we will finally have time to start tackling the test suite.

@TBBle
Collaborator

TBBle commented Oct 22, 2023

Ah, good point. The tests would be doing the mounting themselves, so they don't need to see the exact same filesystem as buildkitd, but they do need access to the same containerd backing store. I see now. That'd probably also be true on Linux in the same situation, i.e. if we were trying to run tests in a non-privileged container talking to buildkitd/containerd in a separate container.

@tonistiigi
Member

I am not (yet) confident that the tests won't check for the existence of files in the path where layers get mounted

They do not afaik, the data is exported with --output to registry/local/containerd and then checked.


Could someone explain in more detail what the actual technical problems are that prevent running buildkitd inside containers? This isn't just for tests; I also want it to be possible to use buildx create to run any upstream release of buildkit as an isolated instance. On Linux, by default this means creating a buildkit container. I'm also not sure atm whether the frontend containers work in WCOW or not. That one is a slightly different problem though, as frontends do not require any extra privileges.

@gabriel-samfira
Collaborator

Could someone explain in more detail what the actual technical problems are that prevent running buildkitd inside containers?

We can't say for sure. I can't, at least. Not until we actually try it. At this point we're just guessing based on previous experience in other parts of the ecosystem. My hope is that it will work, if not in process containers, then at least in Hyper-V containers.

We'll know more in the following weeks, and will add more relevant details and/or PRs to enable tests as well as the rest of the ecosystem tooling. The aim is to be as close as possible to the linux version in terms of UX.

@TBBle
Collaborator

TBBle commented Oct 22, 2023

The main limitation is that we can't run "privileged" containers on Windows except Host Process containers (with which my personal experience is basically zero, and AFAIK neither Docker nor nerdctl support them, so I don't know what is needed to make them aesthetic for non-k8s situations) and I suspect therefore that we can't run a containerd instance in a container.

I also suspect that we might not be able to use the localmounter from inside a container, even if the containerd data tree is mapped into the container, as the same non-privileged state means we may not be able to actually mount inside the container using WCIFS. See microsoft/Windows-Containers#268 for a related known limitation. (This might not apply to WCIFS...) microsoft/hcsshim#1699 notes that a related issue also affects Host Process containers. But neither is exactly what we'd be trying to do there.

But as @gabriel-samfira has noted, this is still speculation. I'm not aware of anyone having tried this explicitly.

I expect frontend containers aren't affected by these limitations, but I've not looked at all into how they work, so I allow room to be surprised.

@gabriel-samfira
Collaborator

CC-in from the Microsoft side for greater visibility @lucillex @profnandaa @iankingori

@TBBle
Collaborator

TBBle commented Oct 26, 2023

Since the core of the system is roughly working in master, and AFAIK all the upstream dependencies have released versions we can use, we probably should set some goals for closing this ticket and tracking remaining work that needs further discussion in new tickets.

First question, do we want to keep this ticket around as a meta-tracking ticket? I suspect a lot of people are subscribed and would see this ticket closing as "It works". It makes sense to me to keep using this ticket to track until the feature is release-notable.

I'd love to see WCOW land as supported in 0.13, but feel #3158 and the Platform Matcher issue for Windows 11 should be resolved first, as they represent regressions from the legacy builder in dockerd for common existing Dockerfile patterns. We also need test suite coverage, to identify any other regressions.

It just occurred to me that it might be worth collecting a list of large WCOW-based containers and doing some test builds with them to shake out any other regressions. Since I have history with it, ue4-docker comes immediately to mind, though I don't think my own machine is strong enough for it. (It is probably also going to be bitten by #3158, since it uses RUN powershell frequently.) Core MS tools like PowerShell-Docker and dotnet-docker are other candidates.

We probably should test and document the state of HyperV Isolation support. It'll be interesting for people on Windows 11 and Windows Server 2022 hosts to build Windows Server 2019 containers, but whether those people are numerous enough to make it a release goal, I'm unsure. dotnet-docker is also a test-case for this, they appear to still support LTSC2019 and I think we don't plan to support Windows Server LTSC 2019 as host for buildkitd. (But now I'm questioning that, did I confuse it with Docker 24? Or with LTSC2016 support?)
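
The host/image combinations mentioned here can be reasoned about with a deliberately simplified compatibility rule: process isolation needs the image build number to equal the host's, while Hyper-V isolation accepts an image that is no newer than the host. The Python sketch below encodes only that simplification. It is not BuildKit's Platform Matcher, the helper names are hypothetical, and real hosts have further nuances that are ignored. The build numbers used in the example are 17763 for Windows Server 2019 and 20348 for Windows Server 2022.

```python
def parse_build(os_version):
    """Extract the build number from a version such as "10.0.17763.1234"."""
    return int(os_version.split(".")[2])


def can_run(host_version, image_version, hyperv_isolation):
    """Simplified rule, stated as an assumption rather than the full policy:

    - process isolation: image build must equal the host build
    - Hyper-V isolation: image build must not be newer than the host build
    """
    host = parse_build(host_version)
    image = parse_build(image_version)
    if hyperv_isolation:
        return image <= host
    return image == host


if __name__ == "__main__":
    server_2022 = "10.0.20348.0"
    server_2019 = "10.0.17763.0"
    # A Server 2019 image on a Server 2022 host, with and without Hyper-V isolation.
    print(can_run(server_2022, server_2019, hyperv_isolation=True))
    print(can_run(server_2022, server_2019, hyperv_isolation=False))
```

Under this rule, building Windows Server 2019 images on a newer host is exactly the case that depends on Hyper-V isolation working, which is why its state is worth testing and documenting.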

I don't think LCOW is a release goal here. Although it might be easier to get the test suite running on that, there's probably a bunch of things that are making WCOW assumptions in BuildKit, e.g., the Platform Matcher. And similarly, multi-platform build support probably isn't interesting right now. (WCOW/LCOW would be doable once we have LCOW, I'm not sure if multi-architecture on top of that would be fun to implement, it'd probably be QEMU inside LCOW for Linux, and multi-arch Windows Containers is simultaneously ancient history and unknowable future)

I'm not sure what we'd need in terms of documentation. Presumably documentation of the various Windows-specific limitations is the bare minimum.

And then there's trivial stuff like moving the buildkitd binary into the binaries image and anything else needed to make the released artifact usable. (I hope we don't need to do an installer here. That seems like a bundler issue? nerdctl wants an installer for their "Windows supported" milestone, which would include buildkit for example.)

@TBBle
Collaborator

TBBle commented Oct 27, 2023

I have drafted #4387, which fixes use of, for example, FROM mcr.microsoft.com/powershell:latest (the only case I tested). It should fix all Windows multi-arch images using FROM, and also pre-fix any future surprise corner cases like unexpected cache layer hits on different OS versions.

@profnandaa
Collaborator

Just updating on this thread that there is now experimental support on Windows. See docs/windows.md to get started. We are actively prioritizing any blocking issues coming from the experimental release, so feel free to open individual issues that you come across that are not captured here yet.

@bplasmeijer

Awesome! Thanks to the many community members involved.

@giuseppetrematerra

I don't know if it has already been reported, but I'm experiencing an issue building a Windows image. I can't find anything related reported.

ERROR: failed to solve: process "cmd /S /C pwsh -Command "choco install jre8 -y "" did not complete successfully: buildkit executor not implemented for windows

The dockerfile directive is:
RUN pwsh -Command "choco install jre8 -y"

@profnandaa
Collaborator

@giuseppetrematerra -- did you set up a few things before that step, like pwsh and choco? Can you share your Dockerfile?

@gabriel-samfira
Collaborator

@giuseppetrematerra Docker may not yet have the buildkit executor hooked up to the new buildkitd windows support. You're most likely hitting this:

https://github.com/moby/moby/blob/master/builder/builder-next/executor_nolinux.go#L25

The RUN stanza requires the executor to be implemented in moby as well. You should be able to call into buildkitd directly, using buildctl.exe, if you have the latest version of buildkitd running.
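
Calling buildkitd directly might look like the buildctl invocation used earlier in this thread. Below is a small Python sketch that only assembles that command line; `buildctl_build_argv` is a hypothetical helper, and the flags are the ones that appear in the earlier logs (`--debug`, `build`, `--frontend=dockerfile.v0`, `--local context=...`, `--local dockerfile=...`). It assumes buildctl.exe is on the PATH and that a buildkitd instance is already running.

```python
def buildctl_build_argv(context_dir=".", dockerfile_dir=".", debug=False):
    """Assemble a buildctl command line like the one shown earlier in the thread.

    Hypothetical helper; it does not start buildkitd or run anything itself.
    """
    argv = ["buildctl.exe"]
    if debug:
        argv.append("--debug")
    argv += [
        "build",
        "--frontend=dockerfile.v0",
        "--local", f"context={context_dir}",
        "--local", f"dockerfile={dockerfile_dir}",
    ]
    return argv


if __name__ == "__main__":
    print(" ".join(buildctl_build_argv(debug=True)))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` so that a failed build raises an exception instead of being silently ignored.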

@profnandaa
Collaborator

@786maan -- spammer, reported.

@profnandaa
Collaborator

Closing this issue in favor of specific issues opened on this list here. Please feel free to open any issue you are facing with the current experimental release. We are now racing towards GA.
