Don't push results from eval workflow #129

Merged
merged 2 commits into from
Oct 24, 2024
Conversation

@pamelafox (Contributor) commented Oct 24, 2024

Purpose

Stop pushing results from the eval workflow. We don't have permission to do that in our org.

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[ ] Yes
[X] No

Type of change

[X] Bugfix
[ ] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

N/A

@pamelafox
Contributor Author

/evaluate

Starting evaluation! Check the Actions tab for progress, or wait for a comment with the results.

Evaluation results

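| metric            | stat        | baseline | pr129  |
|-------------------|-------------|----------|--------|
| gpt_groundedness  | mean_rating | 5.0      | 5.0    |
| gpt_groundedness  | pass_rate   | 1.0      | 1.0    |
| gpt_relevance     | mean_rating | 5.0      | 5.0    |
| gpt_relevance     | pass_rate   | 1.0      | 1.0    |
| answer_length     | mean        | 1017.6   | 1013.8 |
| latency           | mean        | 2.56     | 2.46   |
| citations_matched | mean        | 0.73     | 0.73   |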
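
For readers who want to see how far pr129 moved from the baseline, here is a minimal Python sketch that computes the absolute and percent change for each row of the results table above. The numbers are copied from that table; the `results` dictionary, the dotted metric names, and the `deltas` helper are illustrative assumptions and are not part of the eval workflow itself.

```python
# Values copied from the evaluation results table: (baseline, pr129).
# The dotted "metric.stat" keys are an illustrative naming choice.
results = {
    "gpt_groundedness.mean_rating": (5.0, 5.0),
    "gpt_groundedness.pass_rate": (1.0, 1.0),
    "gpt_relevance.mean_rating": (5.0, 5.0),
    "gpt_relevance.pass_rate": (1.0, 1.0),
    "answer_length.mean": (1017.6, 1013.8),
    "latency.mean": (2.56, 2.46),
    "citations_matched.mean": (0.73, 0.73),
}


def deltas(rows):
    """Return {metric: (absolute change, percent change)}.

    Hypothetical helper; assumes every baseline value is nonzero,
    which holds for the table above.
    """
    out = {}
    for name, (baseline, pr) in rows.items():
        diff = round(pr - baseline, 2)
        pct = round(100 * diff / baseline, 2)
        out[name] = (diff, pct)
    return out


changes = deltas(results)
for name, (diff, pct) in changes.items():
    print(f"{name:32} {diff:+8.2f} {pct:+7.2f}%")
```

Only answer_length and latency differ between the two columns. Both differences are small, which is consistent with a PR that changes the workflow rather than the app.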

Check the workflow run for more details.

@pamelafox changed the title from "Wording change" to "Don't push results from eval workflow" on Oct 24, 2024
@pamelafox merged commit 4b3f88f into main on Oct 24, 2024
1 check passed
@pamelafox deleted the evaltest9 branch on October 24, 2024 at 16:37