ERRO[0025] unlinkat /var/tmp/buildah2410054376/mounts3022885724/bind626918239: device or resource busy #5988

cevich opened this issue Feb 12, 2025 · 19 comments · May be fixed by #6126
Open
Labels
jira Issues which will be sync'd to a card at https://issues.redhat.com/projects/RUN

Comments

@cevich
Member

cevich commented Feb 12, 2025

When building inside a rootless container using buildah's vfs storage driver and chroot isolation (as is very often done to build images in CI environments), specifying read/write bind volumes from other stages results in an error. This behavior does not reproduce using buildah 1.37 or earlier. I also verified the same behavior using a vanilla registry.fedoraproject.org/fedora-minimal image plus dnf5 install buildah. That is to say, I think it's a buildah problem, not a buildah image problem.

Reproduction (host) environment:

  • Fedora 40
  • podman 5.3.1
  • Running as a regular user w/ default podman settings
  • The quay.io/buildah/upstream:latest container image (buildah version 1.40.0-dev (image-spec 1.1.0, runtime-spec 1.2.0))
  • The quay.io/buildah/stable:v1.38 container image
  • The quay.io/buildah/stable:v1.37 container image

Steps to reproduce:

  1. Create the following Containerfile somewhere in the user's home directory
    FROM registry.fedoraproject.org/fedora-minimal:latest as test
    RUN mkdir -p /var/tmp/test
    ADD ./Containerfile /var/tmp/test/
    
    FROM test as final
    RUN --mount=type=bind,from=test,src=/var/tmp/test,dst=/var/tmp/test,rw \
        set -x && \
        date > /var/tmp/test/Containerfile && \
        cat /var/tmp/test/Containerfile
    
  2. Run podman run -it --rm -v ./Containerfile:/root/Containerfile:ro,Z quay.io/buildah/stable:v1.38 buildah --storage-driver=vfs build --isolation=chroot /root
  3. Run the exact same command, but with quay.io/buildah/stable:v1.37 (or any other earlier version)

Unexpected results:

[1/2] STEP 1/3: FROM registry.fedoraproject.org/fedora-minimal:latest AS test
Trying to pull registry.fedoraproject.org/fedora-minimal:latest...
Getting image source signatures
Copying blob 169491f3e4f7 done   |
Copying config e6917e6306 done   |
Writing manifest to image destination
[1/2] STEP 2/3: RUN mkdir -p /var/tmp/test
[1/2] STEP 3/3: ADD ./Containerfile /var/tmp/test/
Getting image source signatures
Copying blob cde90dcf8c1f skipped: already exists
Copying blob cec21250b843 done   |
Copying config 9f9e432f21 done   |
Writing manifest to image destination
--> 9f9e432f21cb
[2/2] STEP 1/2: FROM 9f9e432f21cbb67c928b93d87af3878f3b903cbc2030cc12594f9368829ccc8c AS final
[2/2] STEP 2/2: RUN --mount=type=bind,from=test,src=/var/tmp/test,dst=/var/tmp/test,rw     set -x &&     date > /var/tmp/test/Containerfile &&     cat /var/tmp/test/Containerfile
ERRO[0025] unlinkat /var/tmp/buildah1274147250/mounts4133407440/bind3931917386: device or resource busy
Error: building at STEP "RUN --mount=type=bind,from=test,src=/var/tmp/test,dst=/var/tmp/test,rw set -x &&     date > /var/tmp/test/Containerfile &&     cat /var/tmp/test/Containerfile": resolving mountpoints for container "bb08d8062b4c17b75108492838e53d3236abce647447c8f5bec72cebfcb8ca1b": setting up overlay of "/var/tmp/buildah1274147250/mounts4133407440/bind3931917386": mount overlay:/var/tmp/buildah1274147250/mounts4133407440/overlay/981784139/merge, data: lowerdir=/var/tmp/buildah1274147250/mounts4133407440/bind3931917386,upperdir=/var/tmp/buildah1274147250/mounts4133407440/overlay/981784139/upper,workdir=/var/tmp/buildah1274147250/mounts4133407440/overlay/981784139/work,userxattr: invalid argument

Expected results (from v1.37):

[1/2] STEP 1/3: FROM registry.fedoraproject.org/fedora-minimal:latest AS test
Trying to pull registry.fedoraproject.org/fedora-minimal:latest...
Getting image source signatures
Copying blob 169491f3e4f7 done   |
Copying config e6917e6306 done   |
Writing manifest to image destination
[1/2] STEP 2/3: RUN mkdir -p /var/tmp/test
[1/2] STEP 3/3: ADD ./Containerfile /var/tmp/test/
Getting image source signatures
Copying blob cde90dcf8c1f skipped: already exists
Copying blob b50f8aabd929 done   |
Copying config 71ea00d65f done   |
Writing manifest to image destination
--> 71ea00d65f89
[2/2] STEP 1/2: FROM 71ea00d65f8949486c4441a13b231fd4992b2be2c4170e97a0b9baae11244f71 AS final
[2/2] STEP 2/2: RUN --mount=type=bind,from=test,src=/var/tmp/test,dst=/var/tmp/test,rw     set -x &&     date > /var/tmp/test/Containerfile &&     cat /var/tmp/test/Containerfile
WARN[0000] couldn't find "/var/lib/containers/storage/vfs/dir/7d684fe50918fe44941621b1721c8ee345f7884e2887f8cae36608bacb38e0e8/tmp/test" on host to bind mount into container
+ date
+ cat /var/tmp/test/Containerfile
Wed Feb 12 18:17:34 UTC 2025
[2/2] COMMIT
Getting image source signatures
Copying blob cde90dcf8c1f skipped: already exists
Copying blob b50f8aabd929 skipped: already exists
Copying blob 11db3e39f474 done   |
Copying config 83de1e9298 done   |
Writing manifest to image destination
--> 83de1e9298fe
83de1e9298feac0ce7e01e89b840e42ecd3901a4a67d1b998b3bdbe176fd3a69

Debug output from v1.38 is below (v1.40.0-dev output is substantially similar):

buildah_v1.38_debug.log.txt

Note: Also attempted with the following Containerfile with similar results:

FROM registry.fedoraproject.org/fedora-minimal:latest as test

ADD ./Containerfile /test/
RUN chmod 777 /test/Containerfile

#####

FROM test as final

RUN --mount=type=bind,from=test,src=/test,dst=/test,rw \
    set -x && \
    date > /test/Containerfile && \
    cat /test/Containerfile
@cevich
Member Author

cevich commented Feb 14, 2025

Poking through the debuglog and the code, I'm thinking perhaps this problem is stemming from within containers/storage based on convertToOverlay() getting an error back from overlay.MountWithOptions(). I didn't dig too deep into the storage code, but the ,userxattr suffix on the end of the debug messages made my ears stand up: "Why would that be present or even relevant for a VFS "bind" mount?"

time="2025-02-12T18:19:46Z" level=debug msg="Error building at step
{Env:[container=oci ...cut...: resolving mountpoints for container
...cut...: setting up overlay of \"/var/tmp/buildah3627628243/mounts2014160263/bind3820943893\": 
mount overlay:
...cut...,
workdir=/var/tmp/buildah3627628243/mounts2014160263/overlay/1907194961/work,userxattr: invalid argument"

@ssams

ssams commented Feb 25, 2025

stumbled across what appears to be the same issue in a build (also using VFS storage driver), to me it seems the problem starts to appear with buildah version 1.37.6:

time="2025-02-21T09:00:59Z" level=error msg="unlinkat /var/tmp/buildah1222469549/mounts3222934611/bind1342232015: device or resource busy"
Error: building at STEP "RUN --mount=type=bind,source=requirements.txt,target=/tmp/pip-tmp/requirements.txt [...]": resolving mountpoints for container "8a8dd1c7104a71218d2e85f1b657facd2a45051f9c0ccf56a267ed85046d6d06": setting up overlay of "/var/tmp/buildah1222469549/mounts3222934611/bind1342232015": mount overlay:/var/tmp/buildah1222469549/mounts3222934611/overlay/1549299006/merge, data: lowerdir=/var/tmp/buildah1222469549/mounts3222934611/bind1342232015,upperdir=/var/tmp/buildah1222469549/mounts3222934611/overlay/1549299006/upper,workdir=/var/tmp/buildah1222469549/mounts3222934611/overlay/1549299006/work,userxattr: invalid argument

this is with buildah version 1.37.6 (image-spec 1.1.0, runtime-spec 1.2.0) running via container registry.redhat.io/rhel9/buildah:9.5-1738643435.

everything works as expected with buildah version 1.37.5 (image-spec 1.1.0, runtime-spec 1.2.0) via registry.redhat.io/rhel9/buildah:9.5-1737479141

@cevich
Member Author

cevich commented Feb 25, 2025

Interesting, and thanks for providing details. Knowing this behavior crept in via a patch release is actually really helpful. I just checked, and it was 1.37.5 that fixed the issue for me, which makes sense based on your experience.

Checking the git history, there are only 17 commits between 1.37.5 and 1.37.6. Of these, almost half are merge or changelog update commits. So that narrows things down quite a bit!

@cevich
Member Author

cevich commented Feb 25, 2025

Based on the string "setting up overlay of" in the message, I believe the problem is somewhere in/around convertToOverlay(), which first appeared in 2c70035 (between .5 and .6). Curiously, as near as I can tell, the containers/storage module was last updated in 1.37.5, so that's probably not the root cause.

There are several conditionals that would all emit a similar message, but I think this is coming from the 4th one, dealing with a failure from overlay.MountWithOptions(). However, it's also possible this error is a red herring, and the problem is really coming from GetBindMount(), where convertToOverlay() shouldn't even be used for a VFS mount (clearly we're not reproducing with a mountedImage):

func GetBindMount(...cut...
        ...cut...

        overlayDir := ""
        if mountedImage != "" || mountIsReadWrite(newMount) {
                if newMount, overlayDir, err = convertToOverlay(newMount, store, mountLabel, tmpDir, 0, 0); err != nil {
                        return newMount, "", "", "", err
                }
        }

        succeeded = true
        return newMount, mountedImage, intermediateMount, overlayDir, nil
}
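
To make the condition above easier to reason about, here is a minimal Go sketch that isolates it as a pure function. It is not buildah code: the `strategy` type and the `chooseStrategy` name are hypothetical, and only the `mountedImage != "" || read-write` test is taken from the quoted snippet. Note that the storage driver is not among the inputs to that test.

```go
package main

import "fmt"

// strategy names how a bind mount source would be prepared.
// Hypothetical type, used only for illustration.
type strategy string

const (
	plainBind   strategy = "bind"    // mount the source directly
	overlayBind strategy = "overlay" // wrap the source in a throwaway overlay
)

// chooseStrategy mirrors the quoted condition: an overlay is requested when
// the source is a mounted image or when the mount is read-write.
// The storage driver (vfs, overlay, ...) is deliberately not a parameter,
// because the quoted condition does not consult it.
func chooseStrategy(mountedImage string, readWrite bool) strategy {
	if mountedImage != "" || readWrite {
		return overlayBind
	}
	return plainBind
}

func main() {
	fmt.Println(chooseStrategy("", false))         // read-only bind from a stage
	fmt.Println(chooseStrategy("", true))          // same mount with rw
	fmt.Println(chooseStrategy("someimage", false)) // source is a mounted image
}
```

With an empty `mountedImage`, only the read-write flag changes the result, which is consistent with the reproducer using `rw`.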

@ssams

ssams commented Feb 26, 2025

didn't get to look at the details of the commit, but it sounds very plausible to me. at least I can confirm that removing the rw option makes the mount itself succeed in my case, with the default read-only bind mount it would work. which also further hints towards these changes around read-write mounts.

@ssams

ssams commented Feb 26, 2025

and I noticed that I may have shortened the output in my earlier comment a bit much, so in case it could be helpful, the apparently problematic line in my build is: --mount=type=bind,source=third_party/,target=/tmp/pip-tmp/third_party/,rw (so in my case it's mounted from the host, not from an earlier build stage). as indicated in the last comment, removing the rw makes the mount work, so --mount=type=bind,source=third_party/,target=/tmp/pip-tmp/third_party/ works.
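
Since the failure depends on the rw option, a CI pipeline could flag affected RUN --mount values before building. The following is a minimal Go sketch under stated assumptions: the function name is hypothetical, it only recognizes an explicit type=bind together with a bare rw or readwrite token, and it does not attempt to replicate buildah's actual option parsing.

```go
package main

import (
	"fmt"
	"strings"
)

// isReadWriteBind reports whether a --mount value asks for a read-write bind mount.
// Assumptions: the value is a comma-separated option list, the type is given
// explicitly as "type=bind", and only the bare tokens "rw" and "readwrite"
// are treated as read-write.
func isReadWriteBind(spec string) bool {
	isBind, readWrite := false, false
	for _, tok := range strings.Split(spec, ",") {
		switch strings.TrimSpace(tok) {
		case "type=bind":
			isBind = true
		case "rw", "readwrite":
			readWrite = true
		}
	}
	return isBind && readWrite
}

func main() {
	specs := []string{
		"type=bind,source=third_party/,target=/tmp/pip-tmp/third_party/,rw",
		"type=bind,source=third_party/,target=/tmp/pip-tmp/third_party/",
	}
	for _, s := range specs {
		fmt.Printf("%v\t%s\n", isReadWriteBind(s), s)
	}
}
```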

@cevich
Member Author

cevich commented Feb 26, 2025

All good data points, thanks again for sharing. For VFS I don't think it matters if the source is another stage or w/in the context dir, both should just resolve to directories on the "host" side. SELinux could be to blame, however the way I was reproducing it, nested w/in quay.io/buildah/stable, rules that out.

@cevich
Member Author

cevich commented Feb 28, 2025

Something interesting one of my colleagues noticed:

If you do a sudo dmesg -HW on the host then run the reproducer, there's an overlay error from the kernel at the exact same time as buildah tries the volume mount during the build. This is significant because with --storage-driver=vfs, the expectation is that overlay shouldn't be involved at all.

By my reading of internal/volumes/volumes.go to date, in the case of VFS, either GetBindMount() should never call convertToOverlay() or that function shouldn't be calling overlay.MountWithOptions() (which is overlay specific).

@cevich
Member Author

cevich commented Mar 6, 2025

@nalind I think we need your expert eyes on this; it also affects main IIRC. @dashea and I have exhausted our brick-wall collision quota trying to understand the problem and figure out the correct fix. Significantly, this issue afflicts all/most cases of using buildah in a CI environment to produce an image. So potentially konflux is impacted, as is GitLab CI, and similar container-based automation environments.

@nalind nalind added the jira Issues which will be sync'd to a card at https://issues.redhat.com/projects/RUN label Mar 12, 2025
@nalind
Member

nalind commented Mar 12, 2025

Read-write bind mounts get converted into overlays to match the expectation that writes to them get discarded. This was part of the patch set that we backported to multiple branches for CVE-2024-11218.
It looks like the kernel is pointing out that the upper directory we attempt to use there, since we're in a container, is also on an overlay filesystem, which it doesn't allow. Forcing the storage driver to be vfs instead of the overlay-with-fuse-overlayfs default we have in storage.conf in the image discards the bit of configuration that would have caused fuse-overlayfs to be used, and that would have allowed it to succeed here.
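
The kernel restriction described above suggests a quick diagnostic: check which filesystem the temporary directory lives on before attempting the overlay mount. The following Go sketch does that with statfs on Linux. The helper names are hypothetical, and nothing in this thread says buildah or containers/storage performs this check in this form. The magic number is OVERLAYFS_SUPER_MAGIC from the kernel's linux/magic.h, and the /var/tmp default is taken from the paths in the error messages.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// overlayfsSuperMagic is OVERLAYFS_SUPER_MAGIC from the kernel's linux/magic.h.
const overlayfsSuperMagic = 0x794c7630

// isOverlayMagic reports whether a statfs f_type value identifies kernel overlayfs.
func isOverlayMagic(fsType int64) bool {
	return fsType == overlayfsSuperMagic
}

// onOverlayFS reports whether path lives on a kernel overlay filesystem.
// It relies on the Linux statfs(2) call via the standard syscall package.
func onOverlayFS(path string) (bool, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return false, fmt.Errorf("statfs %s: %w", path, err)
	}
	// The width of st.Type differs between architectures, so normalize it.
	return isOverlayMagic(int64(st.Type)), nil
}

func main() {
	path := "/var/tmp" // default taken from the paths in the error messages
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	overlay, err := onOverlayFS(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("%s on kernel overlayfs: %v\n", path, overlay)
}
```

Run inside the build container, a `true` result for the temporary directory would match the explanation above. A FUSE-based overlay would be expected to report a different filesystem type, so this only detects kernel overlayfs.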

@cevich
Member Author

cevich commented Mar 12, 2025

Thanks for taking a look at this Nalin, I appreciate it. So it is as I/we feared, overlay is being forced. My understanding/belief is this would also be reproducible if the VFS driver was configured in storage.conf rather than on the command line. There are certainly CI environments (like GitLab) where fuse-overlayfs isn't supported.

As a fix, is it possible to detect if VFS is being used in convertToOverlay()? If so, would it be correct for that function to create yet another temporary directory, copy the "lower" content, then arrange for it to be thrown away? Or is there a better way to handle this?
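
The copy-based fallback proposed above could look roughly like the following Go sketch: copy the source tree into a fresh temporary directory, let the build step write there, then delete it so the writes are discarded. All names here (`copyTree`, `copyFile`, `withScratchCopy`, `demo`) are hypothetical, and nothing in this thread confirms that the eventual fix works this way. The sketch handles only regular files and directories; symlinks, ownership, xattrs, and SELinux labels would need real handling.

```go
package main

import (
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// copyTree copies directories and regular files from src into dst.
// Anything else (symlinks, devices, ...) is skipped in this sketch.
func copyTree(src, dst string) error {
	return filepath.WalkDir(src, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(src, path)
		if err != nil {
			return err
		}
		target := filepath.Join(dst, rel)
		info, err := d.Info()
		if err != nil {
			return err
		}
		switch {
		case d.IsDir():
			return os.MkdirAll(target, info.Mode().Perm())
		case info.Mode().IsRegular():
			return copyFile(path, target, info.Mode().Perm())
		default:
			return nil // not handled by this sketch
		}
	})
}

// copyFile copies the contents of one regular file.
func copyFile(src, dst string, perm fs.FileMode) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()
	out, err := os.OpenFile(dst, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, perm)
	if err != nil {
		return err
	}
	if _, err := io.Copy(out, in); err != nil {
		out.Close()
		return err
	}
	return out.Close()
}

// withScratchCopy runs fn against a throwaway copy of lower, then deletes the copy.
func withScratchCopy(lower string, fn func(scratch string) error) error {
	scratch, err := os.MkdirTemp("", "scratch-copy-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(scratch) // discards whatever fn wrote
	if err := copyTree(lower, scratch); err != nil {
		return err
	}
	return fn(scratch)
}

// demo overwrites a file in the scratch copy and returns the original's content.
func demo() (string, error) {
	lower, err := os.MkdirTemp("", "lower-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(lower)
	orig := filepath.Join(lower, "Containerfile")
	if err := os.WriteFile(orig, []byte("original"), 0o644); err != nil {
		return "", err
	}
	err = withScratchCopy(lower, func(scratch string) error {
		return os.WriteFile(filepath.Join(scratch, "Containerfile"), []byte("changed"), 0o644)
	})
	if err != nil {
		return "", err
	}
	data, err := os.ReadFile(orig)
	return string(data), err
}

func main() {
	fmt.Println(demo())
}
```

The trade-off against an overlay is cost: a full copy is slow and space-hungry for large sources, which is also the usual trade-off of the vfs driver itself.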

@dkhater-redhat

dkhater-redhat commented Mar 13, 2025

following this as we are running into this issue as well:

"/home/build/.local/share/containers/storage/vfs/dir/56ea6acb37115a17b8188bc2601914e9fe64077ed0a70d0a4deb5554c56ad23b": mount overlay:/var/tmp/buildah3838130017/mounts3684384049/overlay/2916328100/merge, data: lowerdir=/home/build/.local/share/containers/storage/vfs/dir/56ea6acb37115a17b8188bc2601914e9fe64077ed0a70d0a4deb5554c56ad23b,upperdir=/var/tmp/buildah3838130017/mounts3684384049/overlay/2916328100/upper,workdir=/var/tmp/buildah3838130017/mounts3684384049/overlay/2916328100/work,userxattr: invalid argument

@cgwalters

So potentially konflux is impacted,

Just for the record I discovered Konflux has a fork in https://github.com/konflux-ci/buildah-container/ - notice the buildah submodule there is ~4 months old at this time.

@cgwalters

It looks like the kernel is pointing out that the upper directory we attempt to use there, since we're in a container, is also on an overlay filesystem, which it doesn't allow.

Don't we want to encourage people to provide an "emptydir" (in kube terms) i.e. transient non-overlayfs volume? Or honestly use podman run --read-only-tmpfs so we get /tmp and /var/tmp as non-overlayfs by default. Then c/storage (?) can detect this case and automatically use /var/tmp for the uppers.

@cevich
Member Author

cevich commented Mar 28, 2025

Don't we want to encourage people to provide an "emptydir" (in kube terms)

This is perfectly valid and I agree this is probably a better way to run nested builds. However, two things:

  1. This worked previously, so it's a regression. It impacts several RHEL release branches as well 😭
  2. "Encouragement" is best provided in new major versions, or otherwise in the form of blogs and documentation 😉

@cgwalters

Yes I agree (though I'm not the one writing the patches for this so it's easy to do 😄 )

One observation I would have is I think few of us have nested builds near the top of mind, it's certainly not in my day-to-day usage. But probably one thing that would make sense (tying together with my comment above) is to do "reverse dependency testing" by running the Konflux buildah task against proposed updates to buildah. The Konflux buildah task is a beast but it is how many things get built for production so we certainly need it to continue to work.

@cevich
Member Author

cevich commented Mar 31, 2025

But probably one thing that would make sense (tying together with my comment above) is to do "reverse dependency testing" by running the Konflux buildah task against proposed updates to buildah.

This is a really good suggestion. While konflux may be the eventual destination, there's no reason why the current tests couldn't have caught this. They include chroot tests and VFS tests, so they must simply be missing a test that tries to rw mount from a previous layer.

Edit: There's possibly a secondary avenue as well - containers/image_build actually produces the quay/buildah images, but doesn't test them very well. CI in that repo builds every day, and fires off e-mails on failure. I'll see if I can find a half-hour to add these tests.

cevich added a commit to cevich/image_build that referenced this issue Mar 31, 2025
Ref: containers/buildah#5988

Having this test in the AIO build is merely a convenience, as it will
exercise both buildah and podman packages as they appear in their
respective purpose-built images.

Signed-off-by: Chris Evich <[email protected]>

github-actions bot commented May 1, 2025

A friendly reminder that this issue had no activity for 30 days.

@cevich
Member Author

cevich commented May 5, 2025

X-ref: #6126
(so it's more obvious)

5 participants