Port to embedding-3-large #198

Merged
merged 2 commits into main from embedding3l
Apr 30, 2025

pamelafox
Contributor

@pamelafox pamelafox commented Apr 29, 2025

Purpose

This pull request updates the embedding model used across the project from text-embedding-ada-002 to text-embedding-3-large. It modifies configuration files, code, and tests to reflect the new model, its dimensions, and associated parameters. The changes ensure compatibility with the new model and maintain functionality across different environments.

Note that I used 1024 for the dimensions, since pgvector can only index the vector type up to 2,000 dimensions. pgvector now has a halfvec type that can be indexed at higher dimension counts, but DiskANN only supports vector, and I want to maintain compatibility with DiskANN. Fortunately, text-embedding-3-large uses MRL (Matryoshka Representation Learning), so the dimensions can be truncated and still retain good semantic representation.
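To illustrate the truncation idea, here is a minimal sketch in plain Python. It assumes the full embedding is already available as a list of floats; the helper name `truncate_embedding` is hypothetical and is not part of this PR. The OpenAI embeddings API also accepts a `dimensions` parameter for the text-embedding-3 models, which returns shortened vectors directly, so a manual step like this is only needed when truncating vectors that were stored at full length.

```python
import math


def truncate_embedding(vector: list[float], dimensions: int) -> list[float]:
    """Keep the first `dimensions` components, then rescale to unit length.

    Hypothetical helper for illustration only. MRL-trained models place the
    most useful information in the leading components, so a prefix of the
    vector can stand in for the whole vector once it is renormalized.
    """
    if not 0 < dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]


# Toy 3-component vector standing in for a real 3072-component embedding.
shortened = truncate_embedding([3.0, 4.0, 12.0], 2)
print(shortened)
```

Renormalizing matters because cosine and inner-product comparisons assume unit-length vectors, and a raw prefix of a unit vector is generally shorter than 1.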

Embedding Model Updates

  • .env.sample: Updated environment variables to use text-embedding-3-large with dimensions of 1024 and a new embedding column embedding_3l.
  • infra/main.parameters.json, infra/main.bicep: Updated infrastructure parameters and deployment configurations to reflect the new embedding model and its properties.

Backend Code Changes

  • src/backend/fastapi_app/dependencies.py: Updated embedding-related environment variable defaults and dimensions to align with the new model.
  • src/backend/fastapi_app/postgres_models.py: Renamed embedding column, updated dimensions, and adjusted indexing for the new model.
  • src/backend/fastapi_app/setup_postgres_seeddata.py, src/backend/fastapi_app/update_embeddings.py: Updated embedding column references and logic for seeding and updating embeddings.
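The column and index change can be pictured with a small sketch that builds the equivalent SQL statements and rejects dimension counts that pgvector cannot index on the vector type. This is not the code in postgres_models.py; the function name, the `items` table name, the index name, and the choice of an HNSW index with `vector_cosine_ops` are assumptions made for illustration.

```python
# pgvector indexes (HNSW, IVFFlat) on the `vector` type accept at most
# 2,000 dimensions.
MAX_INDEXED_VECTOR_DIMS = 2000


def embedding_column_ddl(table: str, column: str, dimensions: int) -> list[str]:
    """Return SQL that adds an indexed pgvector column (illustrative only).

    Identifiers are interpolated without quoting, so this sketch must only
    be called with trusted, hard-coded names.
    """
    if not 0 < dimensions <= MAX_INDEXED_VECTOR_DIMS:
        raise ValueError(
            f"{dimensions} dimensions cannot be indexed on the vector type "
            f"(limit is {MAX_INDEXED_VECTOR_DIMS})"
        )
    return [
        f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS {column} vector({dimensions});",
        f"CREATE INDEX IF NOT EXISTS {table}_{column}_idx "
        f"ON {table} USING hnsw ({column} vector_cosine_ops);",
    ]


# `items` is a hypothetical table name; `embedding_3l` and 1024 come from the PR.
for statement in embedding_column_ddl("items", "embedding_3l", 1024):
    print(statement)
```

Calling the function with 3072, the full output size of text-embedding-3-large, raises `ValueError`, which mirrors why the PR settles on 1024.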

Documentation Updates

  • README.md: Updated references to the embedding model in instructions for selecting Azure regions.

Test Suite Updates

  • tests/conftest.py: Updated mock environment variables and test configurations to use the new embedding model.
  • tests/test_dependencies.py, tests/test_embeddings.py, tests/test_openai_clients.py: Adjusted test cases to validate the new model and its parameters.

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[X] Yes - this would be breaking if anyone is pulling from main and updating an existing DB, but I don't think that's the case for this repo.
[ ] No

Type of change

[ ] Bugfix
[X] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

  • The current tests all pass (python -m pytest).
  • I added tests that prove my fix is effective or that my feature works
  • I ran python -m pytest --cov to verify 100% coverage of added lines
  • I ran python -m mypy to check for type errors
  • I either used the pre-commit hooks or ran ruff manually on my code.

@pamelafox pamelafox merged commit 1023c8b into main Apr 30, 2025
17 checks passed
@pamelafox pamelafox deleted the embedding3l branch April 30, 2025 04:20