WIP: Test GOCACHE/GOMODCACHE caching for unit tests #25951
```diff
@@ -571,6 +571,19 @@ unit_test_task:
     env:
         TEST_FLAVOR: unit
     clone_script: *get_gosrc
+    go_unit_build_cache:
+        fingerprint_script: &go_fingerprint_script # FIXME: Almost certainly insufficient
+            - echo "$DISTRO_NV"
+            - echo "$VM_IMAGE_NAME"
+            - echo "$CTR_FQIN"
+            - echo "$CI_DESIRED_DATABASE"
+            - echo "$CI_DESIRED_STORAGE"
+            - echo "$PRIV_NAME"
+            - cat go.sum
```
Comment: If we use go.sum, then we have a ton of PRs invalidating the cache all the time. It is likely a good thing to start with, since we don't want the cache to keep growing forever, but Podman has times where we merge more than one Renovate PR a day, so I am not sure how much we gain.

Comment: I don't have a good intuition for the choice between having an accurate cache and not needing to build anything, in some cases, at the cost of frequently building without a cache — vs. having a more stable cache, and almost always having some cache benefit, but almost always building something.

At minimum, I think we should definitely aim to have successive small changes to a single PR reuse the cache, so that subsequent force-pushes are tested quickly. And there might be more that we can do, e.g. it might be a big enough win to do caching even restricted to a single PR (compiling and running unit tests on one element of the matrix first, caching the outcome, then running the rest of the matrix afterwards, using the cache; same for integration tests). (Also, we might want to cache build outputs but explicitly discard test outcome caches, to fully exercise the test matrix.)
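For concreteness, the per-PR idea above might look something like the following in Cirrus CI terms. This is only a sketch of mine, not part of the diff; it assumes that keying on $CIRRUS_PR (which Cirrus CI sets to the pull request number for PR builds) is acceptable, so that force-pushes to the same PR reuse the cache while different PRs stay isolated:

```yaml
# Sketch only: a cache scoped to one pull request. $CIRRUS_PR is set by
# Cirrus CI to the PR number; it is unset for branch builds, which would
# therefore all share one key unless handled separately.
go_unit_build_cache:
    folder: $GOCACHE
    fingerprint_script:
        - echo "$VM_IMAGE_NAME"
        - echo "$CIRRUS_PR"   # new key per PR; force-pushes within a PR hit the cache
```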
Comment: Reading the docs: …

Comment: The caches are stored in our GCE/AWS accounts as S3 storage, AFAICT. So we do pay for that, but storage is generally cheaper than CPU time, so caching should most likely always be a win. One thing we run into with netavark (where we already make use of quite large caches) is that if the cache gets too large, you lose all the speed benefit, because the download/upload of the cache takes much longer. So a good fingerprint is important.

Comment: The default lifetime is set to 90 days, I believe; the same for all our CI logs.

Comment: Note that … With that, we can probably be fine using just the … The only downside is that, since the cache is updated/re-uploaded, it will end up accumulating older versions of packages that we no longer need. This vicious cycle will break once the Go version is updated. As for the build cache (…

Comment: The Go version may not change for a long time, i.e. at least around a month on main, but on our old RHEL branches basically never. Sure, such branches should not see many changes, so maybe the cache would not actually grow, but that would worry me. Not sure how much the cache would grow in that time; maybe we could add something like …
Comment: That’s not what I see: …

Also, conceptually, I think we don’t want state. In principle, the tests should be isolated and reproducible; introducing long-term state adds risk. (OTOH, admittedly, we do trust the Go runtime to compile/run things correctly anyway; and, also, the whole point of the cache is to reintroduce some state.) From that point of view, I think it is cleaner to have …

But, after all, this is a thing that can be measured and decided based on numbers. And, ultimately, it’s much more important to do some version of the optimization and have it in than to get this cache design perfect, so please don’t take me all that seriously.
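To illustrate the "stable cache" end of the trade-off discussed in this thread (again a sketch of mine, not something proposed in the PR): keying on the Go toolchain version instead of go.sum means dependency bumps keep reusing the same, slowly growing cache, and the key only rolls over when the toolchain updates:

```yaml
# Sketch only: fingerprint on the toolchain rather than on go.sum. The
# output of `go version` (e.g. "go version go1.23.4 linux/amd64") changes
# only on toolchain updates, so Renovate dependency bumps keep hitting the
# same cache, at the cost of the accumulation described above.
go_unit_mod_cache:
    folder: $GOPATH/pkg/mod
    fingerprint_script:
        - go version
        - echo "$VM_IMAGE_NAME"
```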
```diff
+        folder: $GOCACHE
+    go_unit_mod_cache:
+        fingerprint_script: *go_fingerprint_script
+        folder: $GOPATH/pkg/mod # FIXME: This doesn’t contain any files
     setup_script: *setup
     main_script: *main
     always: *logs_artifacts
```
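One possible way to chase the FIXME on the module-cache folder (my assumption about the cause, untested): pin GOMODCACHE explicitly instead of deriving the path from $GOPATH, so the folder Cirrus archives and the folder Go actually writes to cannot diverge. GOMODCACHE is a standard Go environment variable that defaults to $GOPATH/pkg/mod; the path below is a hypothetical example:

```yaml
# Sketch only: make the module cache location explicit, then cache exactly
# that folder.
env:
    GOMODCACHE: /var/tmp/go/pkg/mod   # hypothetical path
go_unit_mod_cache:
    folder: $GOMODCACHE
    fingerprint_script: *go_fingerprint_script
```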
Comment: Just using VM_IMAGE_NAME sounds best; I don't see why the others would matter here at all.
Comment: Good point that this needs reexamining — the fingerprint should not capture runtime options, like the things we write into storage.conf. We might want to cache builds for different Go versions, or different build tags, individually, to maximize cache coverage.
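Combining the two comments above into one hedged sketch: key only on the dimensions that actually affect compilation, so caches for unrelated configurations don't evict each other. BUILD_TAGS is hypothetical; whatever tag set the unit tests actually build with would go there:

```yaml
# Sketch only: fingerprint on build-relevant inputs, not runtime options.
go_unit_build_cache:
    folder: $GOCACHE
    fingerprint_script:
        - echo "$VM_IMAGE_NAME"   # OS/toolchain image
        - go version              # Go toolchain version
        - echo "$BUILD_TAGS"      # hypothetical: build tags used by the unit tests
```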