Formalise Benchmarks #308
Comments
This issue has been marked as stale because it has been open for 60 days with no activity. If the issue is still relevant then please leave a comment, or else it will be closed in 7 days.
This issue has been closed because it has been stale for 7 days. You can re-open it if it is still relevant.
IMO this is still relevant, it should be re-opened and added to a milestone so that it is not automatically re-closed as stale.
Indeed, I like this PR, just haven't had a chance to properly review it.
I had a similar task in another project and some of the ideas converged to slightly different approaches. I will be happy to update this PR soon, probably during the next weekend.
Sounds good!
Since PythonCall is not v1 yet, we have to decide on how we want to compare the different branches under interface changes. Are we going to keep separate suites for dev and stable or not? |
Rationale
Create a formal benchmark pipeline to compare performance across package versions and related packages, using the pure Python timings as a reference.
Originally posted by @cjdoris in #300 (comment)
Requirements
Comments
Julia Side
Most benchmarking tools in Julia are built on top of BenchmarkTools.jl[^1], so using its interface to define test suites and store results is the way to go. Both PkgBenchmark.jl[^2] and AirspeedVelocity.jl[^3] provide functionality to compare multiple versions of a single package, but neither supports comparisons across multiple packages out of the box, so there will be some homework for us in building the right tools for this slightly more general setting.
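For concreteness, a minimal suite in that format could look like the sketch below; the group and benchmark names are placeholders rather than an agreed layout.

```julia
# benchmark/benchmarks.jl — minimal sketch of a BenchmarkTools suite in the form
# PkgBenchmark expects; the group and benchmark names are illustrative only.
using BenchmarkTools
using PythonCall

const SUITE = BenchmarkGroup()
SUITE["convert"] = BenchmarkGroup()

# Time converting a Python list back into a Julia vector.
SUITE["convert"]["pylist_to_vector"] =
    @benchmarkable pyconvert(Vector{Int}, x) setup = (x = pylist(1:1000))
```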
It is worth noting that PkgBenchmark.jl exposes useful methods in its public API that we could leverage to build what we need, including methods for comparing suites and for exporting the results to Markdown. AirspeedVelocity.jl, by contrast, is only available through its CLI.
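As a rough illustration, assuming the package ships a standard benchmark/benchmarks.jl defining SUITE (the branch names below are placeholders), comparing two refs and exporting a report could look like:

```julia
# Sketch of the PkgBenchmark.jl workflow we could reuse for single-package comparisons.
using PkgBenchmark

# Run the suite on a target ref and a baseline ref, then compare the two results.
judgement = judge("PythonCall", "feature-branch", "main")

# The same public API provides the Markdown export we would build reports from.
export_markdown("benchmark_report.md", judgement)
```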
Python Side
In order to enjoy the same level of detail provided by BenchmarkTools.jl, we should adopt pyperf[^4].
There are many ways to use it, but a few experiments showed that the CLI + JSON interface is probably the desired option.
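A minimal sketch of that interface, driven from Julia (the test snippet, file names, and JSON layout here are assumptions; it relies on pyperf's timeit subcommand and its -o flag for JSON output):

```julia
# Hedged sketch: benchmark one Python statement with pyperf and read back the JSON.
using JSON  # assumed dependency for parsing the output file

PY_CODE = "sum(range(1000))"      # placeholder test case
JSON_PATH = tempname() * ".json"  # temporary path for pyperf's JSON output

# `python -m pyperf timeit` benchmarks a statement; `-o` writes the results as JSON.
run(`python -m pyperf timeit -o $JSON_PATH $PY_CODE`)

raw = JSON.parsefile(JSON_PATH)
# Assumed layout of pyperf's JSON: a list of benchmarks, each holding runs whose
# "values" are timings in seconds (calibration runs may carry no "values" field).
times_s = reduce(vcat, [r["values"] for r in raw["benchmarks"][1]["runs"] if haskey(r, "values")])
```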
For each test case, stored in the `PY_CODE` variable, we would then create a temporary path `JSON_PATH` and run pyperf on it, writing the results to JSON (as in the sketch above). After that, we should be able to parse the output JSON and convert it into a `PkgBenchmark.BenchmarkResults` object; a rough sketch of this conversion is given after the task list. This makes it easier to integrate those results into the overall machinery, reducing the problem to setting the Python result as the reference value.
Tasks

- [ ] Convert the pyperf JSON output into a `PkgBenchmark.BenchmarkResults` object
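A rough sketch of that conversion step, assuming pyperf timings in seconds and the current layout of BenchmarkTools.Trial (params, times, gctimes, memory, allocs), which would need checking against the installed version:

```julia
# Hedged sketch: wrap pyperf timings in a BenchmarkTools.Trial so the Python
# reference can flow through the same comparison machinery as the Julia results.
using BenchmarkTools

function pyperf_to_trial(times_s::Vector{Float64})
    times_ns = times_s .* 1e9  # BenchmarkTools stores times in nanoseconds
    # pyperf reports no GC or allocation data, so fill those fields with zeros.
    return BenchmarkTools.Trial(BenchmarkTools.Parameters(), times_ns, zeros(length(times_ns)), 0, 0)
end
```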
Resources
References
Footnotes
[^1]: BenchmarkTools.jl
[^2]: PkgBenchmark.jl
[^3]: AirspeedVelocity.jl
[^4]: pyperf