Skip to content
This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

Implement documents API #324

Merged
merged 1 commit into from
Sep 21, 2021
Merged

Implement documents API #324

merged 1 commit into from
Sep 21, 2021

Conversation

MarinPostma
Copy link
Contributor

@MarinPostma MarinPostma commented Aug 24, 2021

This pr implement the intermediary document representation for milli. The JSON, JSONL and CSV formats are replaced with the format instead, to push the serialization duty on the client side.

The documents module contains the interface to the new document format:

  • The DocumentsBuilder allows the creation of a writer backed document addition, when documents are added either one by one, or as arrays of depth 1. This is made possible by the fact that the seriliazer used by the add_documents methods only accepts [Object] and Object. The related serialization logic is located in the serde.rs file.
  • The DocumentsReader allows to to iterate over the documents created by a DocumentsBuilder. A call to next_document_with_index returns the next obkv reader in the document addition, along with a reference to the index used to map the field ids in the obkv reader to the field names

All references to json, jsonl or csv in the tests have been replaced with the documents! macro, works exaclty like the serde_json::json macro, as a convenient way to create a DocumentsReader.

Rewrote the search cli, to the cli crate, to also allow index manipulation. This only offers basic functionalities for now, but is meant to be easier to extend than http ui

blocked by #308

@curquiza curquiza changed the title implement documents api Implement documents API Aug 31, 2021
@MarinPostma MarinPostma force-pushed the obkv-documents branch 4 times, most recently from bc7b676 to a1ce3cd Compare September 3, 2021 08:12
@MarinPostma MarinPostma marked this pull request as ready for review September 3, 2021 08:30
Copy link
Member

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review your clippy commit as it is too much and must not be in this PR, can you please remove it from there and create a new PR by cherry-picking it, please?

Cargo.toml Outdated Show resolved Hide resolved
cli/src/main.rs Outdated Show resolved Hide resolved
.gitignore Outdated Show resolved Hide resolved
cli/Cargo.toml Show resolved Hide resolved
http-ui/src/main.rs Show resolved Hide resolved
milli/src/update/index_documents/transform.rs Outdated Show resolved Hide resolved
milli/src/update/index_documents/transform.rs Show resolved Hide resolved
milli/src/update/settings.rs Outdated Show resolved Hide resolved
milli/src/update/update_step.rs Show resolved Hide resolved
milli/tests/search/mod.rs Outdated Show resolved Hide resolved
@MarinPostma MarinPostma force-pushed the obkv-documents branch 2 times, most recently from 16b641c to 46b80b8 Compare September 9, 2021 13:56
@MarinPostma MarinPostma force-pushed the obkv-documents branch 3 times, most recently from a99dc21 to 6de1b41 Compare September 13, 2021 14:12
@curquiza curquiza added the DB breaking The related changes break the DB label Sep 20, 2021
http-ui/src/main.rs Show resolved Hide resolved
milli/src/documents/mod.rs Show resolved Hide resolved
milli/src/documents/serde.rs Outdated Show resolved Hide resolved
milli/src/documents/mod.rs Show resolved Hide resolved
milli/src/documents/serde.rs Outdated Show resolved Hide resolved
milli/src/documents/serde.rs Outdated Show resolved Hide resolved
ManyTheFish
ManyTheFish previously approved these changes Sep 21, 2021
@MarinPostma
Copy link
Contributor Author

I have rebased and squashed this pr

milli/Cargo.toml Outdated Show resolved Hide resolved
milli/src/update/index_documents/transform.rs Show resolved Hide resolved
document reader transform

remove update format

support document sequences

fix document transform

clean transform

improve error handling

add documents! macro

fix transform bug

fix tests

remove csv dependency

Add comments on the transform process

replace search cli

fmt

review edits

fix http ui

fix clippy warnings

Revert "fix clippy warnings"

This reverts commit a1ce3cd.

fix review comments

remove smallvec in transform loop

review edits
Copy link
Member

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you marin!
bors merge

@bors
Copy link
Contributor

bors bot commented Sep 21, 2021

@bors bors bot merged commit 9d9010e into main Sep 21, 2021
@bors bors bot deleted the obkv-documents branch September 21, 2021 15:53
@irevoire irevoire mentioned this pull request Sep 22, 2021
bors bot added a commit that referenced this pull request Sep 22, 2021
364: Fix all the benchmarks  r=Kerollmops a=irevoire

#324 broke all benchmarks.
I fixed everything and noticed that `cargo check --all` was insufficient to check the bench in multiple workspaces, so I also updated the CI to use `cargo check --workspace --all-targets`.

Co-authored-by: Tamo <tamo@meilisearch.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
DB breaking The related changes break the DB
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants