Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move kNN search and dense vectors to core #87815

Merged
merged 5 commits into from
Jun 23, 2022

Conversation

jtibshirani
Copy link
Contributor

This PR proposes moving kNN search and dense vector support out of an xpack
plugin and into server.

In #87625 we plan to integrate ANN search into the main _search endpoint as a
new top-level component called knn. So kNN will be a dedicated part of the
search request, and we'll have kNN logic within the search phases. The classes
and logic will live in server, matching the other search components like
suggesters, field collapsing, etc.

Why can't we leave dense vector in its own plugin, and just add kNN search
logic in server?

  • This makes testing more difficult/ confusing, since kNN search can only be
    exercised effectively using dense vector fields. All tests for kNN search would
    need to live in the dense vector plugin, although kNN is part of core search.
  • It feels cleanest for all logic related to kNN search to live in the same
    place.

@jtibshirani jtibshirani force-pushed the move-knn-search branch 6 times, most recently from f66350f to 4896d85 Compare June 20, 2022 00:31
@jtibshirani jtibshirani added :Search/Search Search-related issues that do not fall into other categories >refactoring labels Jun 21, 2022
@jtibshirani jtibshirani marked this pull request as ready for review June 21, 2022 18:23
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Jun 21, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@sethmlarson sethmlarson added the Team:Clients Meta label for clients team label Jun 21, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/clients-team (Team:Clients)

@jtibshirani
Copy link
Contributor Author

jtibshirani commented Jun 21, 2022

Summary of changes
Scripting:

  • ScoreScriptUtils with vector functions renamed to VectorScoreScriptUtils, moved to org.elasticsearch.script
  • Rename DenseVectorFunctionTests to VectorScoreScriptUtilsTests
  • Merge whitelist extensions (*.txt files) into existing context files
  • Move script fields API classes to org.elasticsearch.script.field.vectors
  • Move script YAML tests to lang-painless test suite

kNN search:

  • Move RestKnnSearchAction to org.elasticsearch.rest.action.search
  • Move supporting classes like KnnVectorQueryBuilder to org.elasticsearch.search.vectors
  • Move kNN search YAML tests to rest-api-spec test suite

Dense vector field:

  • Move field mapper and helper classes to org.elasticsearch.index.mapper.vectors

Other:

  • Remove hacky interface PerFieldKnnVectorsFormatFieldMapper since it's no longer needed, also remove MappingLookup#getKnnVectorsFormatForField
  • Make sure _knn_search works with new CCS integration tests

Copy link
Contributor

@mayya-sharipova mayya-sharipova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtibshirani Thanks, this arrangement LGTM.

@jtibshirani jtibshirani merged commit 3a9e511 into elastic:master Jun 23, 2022
@jtibshirani jtibshirani deleted the move-knn-search branch June 23, 2022 04:10
@jtibshirani jtibshirani added :Search Relevance/Vectors Vector search and removed :Search/Search Search-related issues that do not fall into other categories labels Jul 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>refactoring :Search Relevance/Vectors Vector search Team:Clients Meta label for clients team Team:Search Meta label for search team v8.4.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants