Purpose
This pull request updates the embedding model used across the project from text-embedding-ada-002 to text-embedding-3-large. It modifies configuration files, code, and tests to reflect the new model, its dimensions, and associated parameters. The changes ensure compatibility with the new model and maintain functionality across different environments.

Note that I used 1024 for the dimensions, since the vector type in pgvector only supports indexing up to 2,000 dimensions. pgvector does now have a halfvec type that can be indexed with more dimensions, but DiskANN only supports vector, and I want to maintain compatibility with DiskANN. Fortunately, text-embedding-3-large uses MRL (Matryoshka Representation Learning), so the dimensions can be truncated and still retain good semantic representation.
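The MRL truncation mentioned above can be sketched in a few lines. This is a minimal illustration, not the code in this PR: the helper name `truncate_embedding` is hypothetical, and the toy vector stands in for a real embedding. In practice the embeddings API can be asked for shortened vectors directly through its `dimensions` parameter; when a full-length vector is truncated by hand instead, it should be re-normalized to unit length so that cosine and inner-product comparisons stay meaningful.

```python
import math


def truncate_embedding(vector: list[float], dimensions: int) -> list[float]:
    """Keep the first `dimensions` values of an MRL-trained embedding and re-normalize.

    Hypothetical helper for illustration only; it is not part of this PR.
    """
    if dimensions <= 0 or dimensions > len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    truncated = vector[:dimensions]
    # Re-normalize to unit L2 length after dropping the trailing values.
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0.0:
        return truncated  # an all-zero vector cannot be normalized
    return [x / norm for x in truncated]


# Toy 8-dimensional vector standing in for a full-length embedding.
full = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
short = truncate_embedding(full, 4)
print(len(short))
```

Requesting 1024 dimensions from the API avoids this manual step; the sketch only shows why a shortened vector remains usable for similarity search.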
Embedding Model Updates
.env.sample: Updated environment variables to use text-embedding-3-large with dimensions of 1024 and a new embedding column embedding_3l. [1] [2]
infra/main.parameters.json, infra/main.bicep: Updated infrastructure parameters and deployment configurations to reflect the new embedding model and its properties. [1] [2]
Backend Code Changes
src/backend/fastapi_app/dependencies.py: Updated embedding-related environment variable defaults and dimensions to align with the new model.
src/backend/fastapi_app/postgres_models.py: Renamed embedding column, updated dimensions, and adjusted indexing for the new model. [1] [2]
src/backend/fastapi_app/setup_postgres_seeddata.py, src/backend/fastapi_app/update_embeddings.py: Updated embedding column references and logic for seeding and updating embeddings. [1] [2]
Documentation Updates
README.md: Updated references to the embedding model in instructions for selecting Azure regions.
Test Suite Updates
tests/conftest.py: Updated mock environment variables and test configurations to use the new embedding model. [1] [2] [3] [4]
tests/test_dependencies.py, tests/test_embeddings.py, tests/test_openai_clients.py: Adjusted test cases to validate the new model and its parameters. [1] [2] [3] [4]
Does this introduce a breaking change?
When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.
Type of change
Code quality checklist
See CONTRIBUTING.md for more details.
The current tests all pass (python -m pytest).
I ran python -m pytest --cov to verify 100% coverage of added lines.
I ran python -m mypy to check for type errors.
I ran ruff manually on my code.