
Introduce --test-results-output flag for libtest #11


Open

patskovn opened this issue May 8, 2025 · 4 comments

@patskovn

patskovn commented May 8, 2025

PR: rust-lang/rust#140805

Libtest currently writes machine-readable output only to stdout, respecting the passed --format. This is problematic for CI/build-tool integration, because test results can be corrupted by other dependencies writing to stdout. We propose adding a new flag, --test-results-output <path>, which writes the chosen --format output to the given file.

Prior attempts to address this gap have not landed: e.g. PRs #96290 and #123365 (to make --logfile respect --format) were abandoned, and the testing-devex-team Issue #9 (“Export machine-readable test results to a file”) was closed without merging the initially proposed changes.

The recent deprecation of the --logfile flag in libtest was a positive move away from ambiguous reporting behavior toward clearer solutions. However, removing --logfile without introducing a replacement for machine-readable file output left these issues unresolved.

Motivation

Separating test results from other output avoids contamination. If libtest writes its output only to stdout, any non-test output (log messages, debug prints, etc.) may corrupt the stream and break parsers. Rust’s println! calls are captured by libtest, but anything can (and, in the real world, does) call libc directly, or link C code that uses libc and corrupts stdout. There is no workaround for this stdout corruption problem.
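
To make that failure mode concrete, here is a minimal sketch (mine, not part of the proposal) showing how a direct write to file descriptor 1 bypasses libtest's output capture; it assumes the libc crate is linked, but any FFI code writing to stdout behaves the same way:

```rust
// Minimal sketch: libtest captures `println!` through a thread-local sink,
// but a raw write to file descriptor 1 (stdout) goes around that capture.
// When this test runs under `--format=json`, the bytes below are interleaved
// with the JSON events and break any parser consuming the stream.
#[test]
fn stdout_corruption_demo() {
    let noise = b"non-test noise from C land\n";
    unsafe {
        // Same effect as C code calling write(1, ...) or printf(...).
        libc::write(1, noise.as_ptr() as *const libc::c_void, noise.len());
    }
}
```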

Also, in practice, projects often resort to external post-processing to filter test output. As one tracking discussion notes, “due to limitations of Rust libtest formatters, Rust developers often use a separate tool to postprocess the test results output”. By writing test results directly to a file, we can guarantee the test results are isolated and parseable, without third-party noise.

Writing results to a file aligns with established patterns: Google Test (GTest) uses a flag like --gtest_output=json:path to produce test reports in a file.

Proposed Solution

We propose introducing a new option, --test-results-output <file>, which directs libtest to write the structured test report to the given file. This flag would be independent of --logfile; it would capture test results in the specified format. Key points:

  • Syntax: --test-results-output path/to/results.ext. The path may be relative or absolute; libtest should fail if the file already exists.

  • Respecting --format: The output format (JSON, JUnit XML, etc.) is controlled by the existing --format flag. libtest would open the file and use the same formatter logic as for stdout (see the sketch after this list).

  • Usage Example

cargo test -- -Zunstable-options --test-results-output=tests.json --format=json 
  • Unstable Feature: This flag can initially be gated (e.g. behind -Zunstable-options) until its behavior stabilizes.

  • Error Handling: If the file cannot be written (permissions, etc.), libtest should emit an error to stderr and exit. If multiple binaries produce the same file (or the same test command is executed multiple times), it’s up to the caller to avoid collisions by using a unique file name per invocation or cleaning up the file.

  • Exclusivity: Unlike #123365, which refactored libtest so that results are written to both stdout and the logfile, we propose writing only to the file when the argument is passed, without duplicating output. The motivation is that file output is mostly expected to be consumed by build (and test) tools rather than by a user invoking cargo test from the CLI, and this aligns with previous maintainer decisions.

  • Backward Compatibility: This change does not break existing users of --format or --logfile. It will not alter libtest behavior unless the new flag is explicitly used.
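
To illustrate the intended wiring, here is a hedged sketch (not libtest's actual internals; the function name and structure are hypothetical): the idea is to choose the output sink once and reuse the existing formatter logic unchanged.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: pick the writer the chosen `--format` formatter will
/// use. If `--test-results-output` was passed, results go only to that file;
/// otherwise they go to stdout as today.
fn open_results_sink(results_path: Option<&Path>) -> io::Result<Box<dyn Write>> {
    match results_path {
        Some(path) => {
            let file = OpenOptions::new()
                .write(true)
                // Fail instead of clobbering an existing file, per the proposal.
                .create_new(true)
                .open(path)?;
            Ok(Box::new(file))
        }
        None => Ok(Box::new(io::stdout())),
    }
}
```

An error from opening the file (permissions, already exists) would then be reported on stderr before the run starts, matching the error-handling point above.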

@epage

epage commented May 8, 2025

The motivation here seems to be focused on intermixed output. That was acknowledged in the proposal to close at #9 (comment)

(2) Tests that leak out non-programmatic output that intermixes with programmatic output. We acknowledge this is a problem to be evaluated but we need to make sure we are stepping back and gathering requirements, rather than assuming --logfile will fit the needs.

This proposal makes several assumptions that would be aided by gathering requirements and learning from prior art, including

  • "Exclusivity": should the output only go to one location or be teed?
  • File IO is required for this (iirc pytest allows other options for output capturing, which is different but related)

I'd recommend digging more into people's requirements and how prior art works and what we can learn from it.

@patskovn
Author

patskovn commented May 9, 2025

Thanks for the swift reply! I will gather more input on how this is implemented in various other testing frameworks.

@patskovn
Author

I researched this as you advised and gathered more input here.

| Framework | Reporting to file | Output kind | Can format stdout results | Formats |
| --- | --- | --- | --- | --- |
| GTest | Supported | Teeing | No | Can do junit or json format to the file |
| Pytest | Supported | Teeing | No | Can do junit and json formats at the same time to different files |
| iOS | Supported | Teeing | No | Custom .xcresults bundle |
| Python unittest | Can be implemented by developer | Per implementation | No | Per implementation |
| Jest unit | Can be implemented by developer | Per implementation | No | Per implementation |
| Go test | Not supported | - | Yes | - |
| Rust test (status quo) | Not supported | - | Yes | - |

Interpreting the table, we see three camps. In the first are GTest and Pytest, where test reporting is strict, configurable and handled by the test-harness implementation. Stdout reporting is reserved for humans, and test results are duplicated in a structured format into the requested file.

The second camp is Python unittest and Jest (which I took as the biggest JS testing framework), where the languages are more dynamic by nature. They give a fair level of customisation so the user of the testing framework can set up reporting however they wish. There is a fairly old RFC for custom test frameworks in Rust that suggests a similar-ish route, but considering the amount of time that has passed since that discussion, I don't think it's worth seriously considering as part of this one.

The last camp is Go test and Rust test. Currently they do not have any way to write test results to a file (besides piping, of course) and produce the requested results format only on stdout. For Go, though, I would make an educated guess that Google has an internal-only fork where reporting to a results file is supported (I don't work at Google and don't know for sure).

So to sum it up, test frameworks usually write output to both locations if a results file is specified. But considering the format differs between stdout and the file, that makes some sense for them. I believe the biggest motivation is to properly separate test results that should be interpreted by a machine from the user's output (including user-owned printlns).
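
For completeness, the "teeing" alternative that GTest/Pytest follow (and that epage asked about) could look roughly like this; a sketch with hypothetical names, not a proposed API: a writer that duplicates every byte to both stdout and the results file.

```rust
use std::io::{self, Write};

/// Hypothetical tee writer: everything the formatter emits is written to both
/// sinks, so a human sees output on stdout while a tool reads the file.
struct Tee<A: Write, B: Write> {
    first: A,
    second: B,
}

impl<A: Write, B: Write> Write for Tee<A, B> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Write the whole buffer to both sinks so they never diverge.
        self.first.write_all(buf)?;
        self.second.write_all(buf)?;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        self.first.flush()?;
        self.second.flush()
    }
}
```

As the next paragraph argues, I would still lean towards writing only to the file, but either behaviour can sit behind the same sink abstraction.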

I don't see a clear benefit in duplicating the output at the moment; we could also follow what other frameworks do and stick to plain text on stdout while enforcing formatted output to the file (we can do that because the report format is not stabilised yet), but I don't have a deal-breaking yes/no on that. Whichever option we choose, it solves the main problem I'm trying to address: intermixed output.

For now I picked the least invasive implementation, which doesn't change any (even unstabilised) behaviour, and we can iterate on it.

@epage

epage commented May 12, 2025

Regarding custom test harnesses, that is exactly what we are working towards.

What about

  • format control
  • output destinations
  • what they call it

Or in other words, it'd be good to summarize the feature as a whole and then draw the analysis from that.

Were you able to find context on pytest's or gtest's features? A quick look to see when they were introduced, and a check in common forums for problems, would be good. The presence of something doesn't give us those insights.

For cargo test and other cases, I'd particularly not want to be creating random files just to get programmatic output if there is an alternative.
