Host --output-format json files in addition to generated HTML #1285

Open
deeprobin opened this issue Feb 26, 2021 · 11 comments
Labels
A-builds (Area: Building the documentation for a crate), C-enhancement (Category: This is a new feature)

Comments

@deeprobin

I need this to implement my own docs frontend as a multi-page application.

@syphar
Member

syphar commented Feb 26, 2021

@deeprobin what kind of information would you need?

docs.rs just hosts some metadata and serves the HTML files generated by rustdoc.

@deeprobin
Author

Oh okay. Then would it be better if I opened an issue in the rustdoc repository? I thought this was the right repository for this.

@Nemo157
Member

Nemo157 commented Feb 26, 2021

Rustdoc has an in-progress JSON backend: rust-lang/rust#76578

There has been some very brief discussion of docs.rs generating and hosting these files along with the HTML. But at the moment I think it's quite buggy (e.g. rust-lang/rust#80664 is probably going to fail a large proportion of builds) and the format is unstable. I think it would make sense to wait a little longer until it has stabilized a bit more before we add it here (I know of at least a couple of users who are already building alternative renderers based on the JSON, which will hopefully drive it towards stabilization).

jyn514 changed the title from "Can docs.rs generate .json files?" to "Host --output-format json files in addition to generated HTML" on Feb 26, 2021

jyn514 added the labels A-builds (Area: Building the documentation for a crate), C-enhancement (Category: This is a new feature), and S-blocked (Status: marked as blocked ❌ on something else such as an RFC or other implementation work) on Feb 26, 2021

@DottieDot

Any progress/updates on this?

@syphar
Member

syphar commented Jul 3, 2022

There are still open ICEs when generating the JSON (see this current discussion on Zulip; rust-lang/rust#93518 is one of the fixes).

When this is solved I believe we have to partially finish #795 so we have the additional spare capacity for running the additional builds.

Of course on top of that the actual code change needs to be done.

@jyn514
Member

jyn514 commented Jul 3, 2022

I don't think we should start hosting this until the JSON backend is stabilized. The output format is likely to change several more times before stabilization.

@syphar
Member

syphar commented Jul 3, 2022

Yeah, that's also a valid point. I only had in mind that we could find more ICEs when running it for docs, but the format changing could be annoying for users of the JSON.

@syphar
Member

syphar commented May 14, 2025

  • we're planning on building the JSON with the same docs.rs metadata (= the same feature set) as the HTML docs (this seems to be OK for cargo-semver-checks too, @obi1kenobi).
  • the download will be a single JSON download via CDN, probably gz-compressed.
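
A consumer of such a download would probably decompress it and check the format version before parsing anything else. A minimal sketch, assuming gzip compression as mentioned in the bullet above and relying on the top-level `format_version` field of rustdoc JSON. The helper name and the sample payload (including the version number 45) are made up for illustration:

```python
import gzip
import json


def read_format_version(compressed: bytes) -> int:
    """Decompress a gzip-compressed rustdoc JSON payload and return its format version."""
    document = json.loads(gzip.decompress(compressed))
    return int(document["format_version"])


if __name__ == "__main__":
    # Tiny stand-in payload; a real rustdoc JSON file contains far more data.
    sample = gzip.compress(json.dumps({"format_version": 45, "index": {}}).encode())
    print(read_format_version(sample))
```

Checking the version first lets a tool reject a file it cannot parse before loading the full structure into its own types.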

There are some missing features before we deem the format stable enough (coming from @jyn514):

@syphar
Member

syphar commented May 17, 2025

next set of notes coming from the all-hands / rustweek: we can actually start building & hosting rustdoc json.

about the implementation:

  • memory usage of the JSON build is higher than that of the HTML build. Crates that are already near the memory limit with HTML might hit it with JSON.
  • Files would be compressed on S3, potentially with zstd
  • download API is a redirect to static.docs.rs for download, similar to the archive download.
  • format_version could be part of the filename / API.
  • question: can we keep hosting old format versions? and provide via API?

Normal rebuilds for the latest version of each crate would then also build & upload the rustdoc json. So after a couple of months of this being live, all latest versions have rustdoc json.

There are some pending additions that will come later and will make it easier to link between JSON docs (notably: extending the information we provide in external_crates).

edit: don't forget, we have a separate JSON output per target to store.

syphar removed the S-blocked (Status: marked as blocked ❌ on something else such as an RFC or other implementation work) label on May 17, 2025

@syphar
Member

syphar commented May 17, 2025

implementation notes from first codebase read:

  • rustdoc will clear the /target/doc/ directory before the HTML or JSON builds, so we have to pull out the JSON output before we run the HTML build.
  • we have to store & support multiple targets.
  • due to the potentially higher memory usage of rustdoc json, the JSON build might fail when HTML doesn't. We want HTML to be published even after a failed JSON build. Do we need to store separate logs for crate authors to see this somewhere? Or is our internal sentry / log good enough, similar to coverage-generate?
  • any API would probably redirect to default target if no target is given.
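
The ordering constraints in these notes could be sketched roughly like this. It is not the docs.rs builder code: `build_docs`, the two build callbacks, and `keep_dir` are hypothetical names, and the sketch assumes the JSON build leaves a single `*.json` file in the doc directory.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Optional


def build_docs(
    doc_dir: Path,
    build_json: Callable[[Path], None],
    build_html: Callable[[Path], None],
    keep_dir: Path,
) -> Optional[Path]:
    """Run the JSON build first, move its output aside, then run the HTML build.

    Returns the saved JSON path, or None if the JSON build failed or produced nothing.
    """
    saved_json: Optional[Path] = None
    try:
        build_json(doc_dir)
        produced = sorted(doc_dir.glob("*.json"))
        if produced:
            keep_dir.mkdir(parents=True, exist_ok=True)
            saved_json = keep_dir / produced[0].name
            shutil.move(str(produced[0]), str(saved_json))
    except Exception as err:
        # Assumption: a JSON failure is only reported, never fatal for the HTML docs.
        print(f"JSON build failed, continuing with HTML: {err}")

    # The HTML build may wipe doc_dir; any JSON output is already safe in keep_dir.
    build_html(doc_dir)
    return saved_json


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())

    def fake_json_build(doc_dir: Path) -> None:
        doc_dir.mkdir(parents=True, exist_ok=True)
        (doc_dir / "krate.json").write_text("{}")

    def fake_html_build(doc_dir: Path) -> None:
        shutil.rmtree(doc_dir, ignore_errors=True)  # mimic the directory being cleared
        doc_dir.mkdir(parents=True)
        (doc_dir / "index.html").write_text("<html></html>")

    print(build_docs(root / "doc", fake_json_build, fake_html_build, root / "json-out"))
```

Where the failure gets reported (separate build log versus internal logging) is the open question from the notes; the `print` call is only a placeholder for that decision.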

structure on S3 bucket is currently:

  • rustdoc: /rustdoc/{krate}/{version}.zip
  • sources: /sources/{krate}/{version}.zip
  • build logs: /build-logs/{build_id}/{target}.txt

and legacy locations before archive-storage

  • rustdoc: /rustdoc/{krate}/{version}/
  • sources: /sources/{krate}/{version}/

after each build we store the archive, and run delete_prefix on the legacy locations.

For rustdoc JSON we would prefer a folder structure where we can later run s3 ls and return the available format versions.

This means the best structure is probably a separate root: /rustdoc-json/{krate}/{version}/{target}/{format_version}.json.zst, perhaps duplicating krate, version, and target in the filename for a nicer download. s3 ls could use all files from the target folder.
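
To make the proposed layout concrete, here is a small sketch that builds a key and recovers the available format versions from a listing of one target folder. The function names are hypothetical, the listing is passed in as plain strings rather than fetched from S3, and the keys are written without the leading slash (an assumption about how they would be stored):

```python
import re
from typing import Iterable, List


def target_prefix(krate: str, version: str, target: str) -> str:
    """Folder holding every format version for one crate release and target."""
    return f"rustdoc-json/{krate}/{version}/{target}/"


def json_key(krate: str, version: str, target: str, format_version: int) -> str:
    """Key for one rustdoc JSON file in the proposed layout."""
    return f"{target_prefix(krate, version, target)}{format_version}.json.zst"


def available_format_versions(keys: Iterable[str], prefix: str) -> List[int]:
    """Extract format versions from a listing, ignoring keys that do not match."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.json\.zst")
    versions = []
    for key in keys:
        match = pattern.fullmatch(key)
        if match:
            versions.append(int(match.group(1)))
    return sorted(versions)


if __name__ == "__main__":
    prefix = target_prefix("serde", "1.0.0", "x86_64-unknown-linux-gnu")
    listing = [prefix + "43.json.zst", prefix + "45.json.zst"]
    print(available_format_versions(listing, prefix))
```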

That also means every request to docs.rs for a rustdoc JSON file needs an s3 ls call. The only way to prevent that, so we can statically determine the path, would be to remove the format version from the path, or to additionally maintain a file without the format version that we overwrite on each rebuild.
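
The two options can be compared in a short sketch. Everything here is hypothetical: the function names, the `list_format_versions` callback standing in for an s3 ls call, and the `latest.json.zst` alias standing in for the "file without the format version". The sketch also assumes a request may name a format version explicitly, in which case no listing is needed.

```python
from typing import Callable, List, Optional

BASE_URL = "https://static.docs.rs/rustdoc-json"


def resolve_with_listing(
    krate: str,
    version: str,
    target: str,
    format_version: Optional[int],
    list_format_versions: Callable[[str], List[int]],
) -> Optional[str]:
    """Resolve a download URL, listing the target folder when no version is named."""
    if format_version is None:
        available = list_format_versions(f"rustdoc-json/{krate}/{version}/{target}/")
        if not available:
            return None
        format_version = max(available)
    return f"{BASE_URL}/{krate}/{version}/{target}/{format_version}.json.zst"


def resolve_statically(
    krate: str, version: str, target: str, format_version: Optional[int] = None
) -> str:
    """Resolve a download URL without any listing, using a fixed alias file."""
    name = "latest" if format_version is None else str(format_version)
    return f"{BASE_URL}/{krate}/{version}/{target}/{name}.json.zst"


if __name__ == "__main__":
    print(resolve_with_listing("serde", "1.0.0", "x86_64-unknown-linux-gnu", None, lambda _: [43, 45]))
    print(resolve_statically("serde", "1.0.0", "x86_64-unknown-linux-gnu"))
```

The static variant trades an extra upload per rebuild for a redirect that never has to touch the bucket listing.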

@syphar
Member

syphar commented May 17, 2025

@obi1kenobi following up on your request to be able to fetch old format versions, if we have them.

With my approach from above you would be able to run parallel requests to:

https://static.docs.rs/rustdoc-json/{krate}/{version}/{target}/{format_version}.json.zst

using your list of supported format versions. You'll get either the file or a 404; the 404s would also be cached once I update the CloudFront config.

Would that suffice for your use-case?

(Perhaps I'll also add the other parameters to the filename to make it self-consistent in your download directory.)
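
From the client side, the parallel probing described above might look like the following sketch. It assumes the URL layout proposed in this thread, which may not be what ends up deployed; `fetch_supported` and `http_fetch` are hypothetical names, and the fetch function is injectable so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional

URL_TEMPLATE = (
    "https://static.docs.rs/rustdoc-json/"
    "{krate}/{version}/{target}/{format_version}.json.zst"
)


def http_fetch(url: str) -> Optional[bytes]:
    """Return the response body, or None when the server answers 404."""
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def fetch_supported(
    krate: str,
    version: str,
    target: str,
    supported: Iterable[int],
    fetch: Callable[[str], Optional[bytes]] = http_fetch,
) -> Dict[int, bytes]:
    """Request every supported format version in parallel and keep the hits."""
    versions = list(supported)
    urls = [
        URL_TEMPLATE.format(krate=krate, version=version, target=target, format_version=fv)
        for fv in versions
    ]
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        bodies = list(pool.map(fetch, urls))
    return {fv: body for fv, body in zip(versions, bodies) if body is not None}
```

A caller would then pick the highest format version it understands from the returned mapping and decompress that payload.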

5 participants