|
| 1 | +[[mongo.search]] |
| 2 | += MongoDB Search |
| 3 | + |
| 4 | +MongoDB lets users run keyword (lexical) search as well as vector search over their data using dedicated search indexes. |
| 5 | + |
| 6 | +[[mongo.search.vector]] |
| 7 | +== Vector Search |
| 8 | + |
| 9 | +MongoDB Vector Search uses the `$vectorSearch` aggregation stage to run queries against specialized indexes. |
| 10 | +Please refer to the MongoDB documentation to learn more about requirements and restrictions of `vectorSearch` indexes. |
| 11 | + |
| 12 | +[[mongo.search.vector.index]] |
| 13 | +=== Managing Vector Indexes |
| 14 | + |
| 15 | +`SearchIndexOperationsProvider`, implemented by `MongoTemplate`, is the entry point to `SearchIndexOperations`, which offers various methods for managing vector indexes. |
| 16 | + |
| 17 | +The following snippet shows how to create a vector index for a collection: |
| 18 | + |
| 19 | +.Create a Vector Index |
| 20 | +[tabs] |
| 21 | +====== |
| 22 | +Java:: |
| 23 | ++ |
| 24 | +==== |
| 25 | +[source,java,indent=0,subs="verbatim,quotes",role="primary"] |
| 26 | +---- |
| 27 | +VectorIndex index = new VectorIndex("vector_index") |
| 28 | + .addVector("plotEmbedding", vector -> vector.dimensions(1536).similarity(COSINE)) <1> |
| 29 | + .addFilter("year"); <2> |
| 30 | +
|
| 31 | +mongoTemplate.searchIndexOps(Movie.class) <3> |
| 32 | + .createIndex(index); |
| 33 | +---- |
| 34 | +<1> A vector index may cover multiple vector embeddings that can be added via the `addVector` method. |
| 35 | +<2> Vector indexes can contain additional fields to narrow down search results when running queries. |
| 36 | +<3> Obtain `SearchIndexOperations` bound to the `Movie` type which is used for field name mapping. |
| 37 | +==== |
| 38 | +
|
| 39 | +Mongo Shell:: |
| 40 | ++ |
| 41 | +==== |
| 42 | +[source,console,indent=0,subs="verbatim,quotes",role="secondary"] |
| 43 | +---- |
| 44 | +db.movie.createSearchIndex("vector_index", "vectorSearch", |
| 45 | + { |
| 46 | + "fields": [ |
| 47 | + { |
| 48 | + "type": "vector", |
| 49 | + "numDimensions": 1536, |
| 50 | + "path": "plot_embedding", <1> |
| 51 | + "similarity": "cosine" |
| 52 | + }, |
| 53 | + { |
| 54 | + "type": "filter", |
| 55 | + "path": "year" |
| 56 | + } |
| 57 | + ] |
| 58 | + } |
| 59 | +) |
| 60 | +---- |
| 61 | +<1> The field name `plotEmbedding` is mapped to `plot_embedding` because of a `@Field(name = "...")` annotation on the property. |
| 62 | +==== |
| 63 | +====== |
| 64 | + |
| 65 | +Once created, vector indexes are not immediately ready to use, even though the `exists` check already returns `true`. |
| 66 | +The actual status of a search index can be obtained via `SearchIndexOperations#status(...)`. |
| 67 | +The `READY` state indicates the index is ready to accept queries. |
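Because index creation is asynchronous, callers often poll the status before issuing the first query. The following is a minimal sketch of such a polling loop, not part of the library: `IndexReadiness` and `awaitReady` are hypothetical names, and the status is supplied as a plain `String` so the example stays self-contained. In an application the supplier would typically wrap `SearchIndexOperations#status(...)`, assuming it returns an enum-like value whose name is `READY`.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical helper (not part of Spring Data MongoDB): polls a status source until it reports READY.
final class IndexReadiness {

    private IndexReadiness() {}

    /**
     * Asks statusSupplier for the index status up to maxAttempts times,
     * sleeping for pause between attempts.
     *
     * @return true as soon as the status equals "READY", false if attempts
     *         are exhausted or the waiting thread is interrupted.
     */
    static boolean awaitReady(Supplier<String> statusSupplier, int maxAttempts, Duration pause) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if ("READY".equals(statusSupplier.get())) {
                return true;
            }
            if (attempt < maxAttempts) { // no need to sleep after the last attempt
                try {
                    Thread.sleep(pause.toMillis());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
                    return false;
                }
            }
        }
        return false;
    }
}
```

A call could look like `IndexReadiness.awaitReady(() -> searchIndexOps.status("vector_index").name(), 30, Duration.ofSeconds(2))`, where `searchIndexOps` is the `SearchIndexOperations` instance obtained from the template. The attempt count and pause are illustrative values only.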
| 68 | + |
| 69 | +[[mongo.search.vector.query]] |
| 70 | +=== Querying Vector Indexes |
| 71 | + |
| 72 | +Vector indexes can be queried by issuing an aggregation using a `VectorSearchOperation` via `MongoOperations`, as shown in the following example: |
| 73 | + |
| 74 | +.Query a Vector Index |
| 75 | +[tabs] |
| 76 | +====== |
| 77 | +Java:: |
| 78 | ++ |
| 79 | +==== |
| 80 | +[source,java,indent=0,subs="verbatim,quotes",role="primary"] |
| 81 | +---- |
| 82 | +VectorSearchOperation search = VectorSearchOperation.search("vector_index") <1> |
| 83 | + .path("plotEmbedding") <2> |
| 84 | + .vector( ... ) |
| 85 | + .numCandidates(150) |
| 86 | + .limit(10) |
| 87 | + .quantization(SCALAR) |
| 88 | + .withSearchScore("score"); <3> |
| 89 | +
|
| 90 | +AggregationResults<MovieWithSearchScore> results = mongoTemplate |
| 91 | + .aggregate(newAggregation(Movie.class, search), MovieWithSearchScore.class); |
| 92 | +---- |
| 93 | +<1> Provide the name of the vector index to query, because a collection may hold several. |
| 94 | +<2> The path of the field holding the embeddings that the query vector is compared against. |
| 95 | +<3> Optionally add the search score to the result document under the given field name. |
| 96 | +==== |
| 97 | +
|
| 98 | +Mongo Shell:: |
| 99 | ++ |
| 100 | +==== |
| 101 | +[source,console,indent=0,subs="verbatim,quotes",role="secondary"] |
| 102 | +---- |
| 103 | +db.movie.aggregate([ |
| 104 | + { |
| 105 | + "$vectorSearch": { |
| 106 | + "index": "vector_index", |
| 107 | + "path": "plot_embedding", <1> |
| 108 | + "queryVector": [ ... ], |
| 109 | + "numCandidates": 150, |
| 110 | + "limit": 10, |
| 111 | + "quantization": "scalar" |
| 112 | + } |
| 113 | + }, |
| 114 | + { |
| 115 | + "$addFields": { |
| 116 | + "score": { $meta: "vectorSearchScore" } |
| 117 | + } |
| 118 | + } |
| 119 | +]) |
| 120 | +---- |
| 121 | +<1> The field name `plotEmbedding` is mapped to `plot_embedding` because of a `@Field(name = "...")` annotation on the property. |
| 122 | +==== |
| 123 | +====== |
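To make the `score` field easier to interpret, here is a small sketch of how a cosine-based score can be computed for two embeddings. It is an illustration under assumptions, not the server's implementation: `CosineScore` is a hypothetical class, and the normalization `(1 + cosine) / 2` reflects how the MongoDB documentation describes scoring for the `cosine` similarity function, so verify it against the documentation for your server version.

```java
// Hypothetical illustration (not part of Spring Data MongoDB or the MongoDB driver).
final class CosineScore {

    private CosineScore() {}

    /** Plain cosine similarity in [-1, 1]; both vectors must have the same, non-zero length and magnitude. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Assumed normalization of the cosine similarity to the range [0, 1]. */
    static double normalizedScore(double[] a, double[] b) {
        return (1 + cosine(a, b)) / 2;
    }
}
```

Under this assumption, vectors pointing in the same direction score close to 1, orthogonal vectors score 0.5, and opposite vectors score close to 0.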
| 124 | + |