Allow for exact matches only? #41
Comments
@buchanae general query like You can get exactly what you need by using the fielded query: or limited to human only:
Ah, ok, thanks! I actually can't even reproduce the results I mentioned now. Wish I had posted the query. These are the queries I tried this morning: https://gist.github.com/buchanae/5cba60894e190c35da1ac3e1c7e5e511
Here's an example I don't understand:
Since I'm not passing
And another.
As far as I can tell, the second match is happening because of a partial match on the string
@buchanae The "alias" field was indexed as free text, because we observed that values of the "alias" field sometimes contain whitespace. We can inspect the alias field further and tune the indexing a bit (e.g. not treating "-" as a word separator).
The "alias" field comes from the entrez_gene collection, which currently contains 21M documents.
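To illustrate the indexing point above, here is a minimal Python sketch of how splitting on "-" versus splitting on whitespace only changes what a query matches. It is not the project's actual analyzer configuration: the tokenizer functions and the alias value `ABC-1` are hypothetical, and the free-text tokenizer only roughly approximates what a standard full-text analyzer does.

```python
import re


def tokenize_free_text(value):
    """Lowercase and split on every non-alphanumeric character.

    Rough stand-in for free-text analysis, where "-" acts as a separator.
    """
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", value) if t]


def tokenize_keep_hyphen(value):
    """Lowercase and split on whitespace only, so "ABC-1" stays one token."""
    return [t.lower() for t in value.split()]


def matches(query, alias, tokenizer):
    """True if every token of the query appears among the alias tokens."""
    alias_tokens = set(tokenizer(alias))
    query_tokens = tokenizer(query)
    return bool(query_tokens) and all(t in alias_tokens for t in query_tokens)


# Hypothetical alias value, used only for illustration.
alias = "ABC-1"
partial_hit = matches("ABC", alias, tokenize_free_text)     # "abc" is a token
no_partial_hit = matches("ABC", alias, tokenize_keep_hyphen)  # only "abc-1" is
```

With the hyphen kept inside the token, a query for `ABC` no longer matches the alias `ABC-1`, while a query for the full `ABC-1` still does.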
Querying something like "BRCA1", I get a lot of seemingly unrelated matches such as "BRAT1".
This is obviously a symptom of how Elasticsearch works. Personally, I think fuzzy matches are dangerous in analytical use cases.
Could we add a query parameter to require an exact match? Or maybe it exists and I'm not seeing the docs?
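Absent such a parameter, one possible workaround is to filter results on the client side. The sketch below is only an illustration under stated assumptions: the hit structure, the `symbol` field name, and the sample records are made up for the example and are not taken from the service's actual response format; `alias` is the field discussed in this thread.

```python
def exact_hits(hits, term, fields=("symbol", "alias")):
    """Keep only hits where one of the given fields equals `term` exactly.

    Comparison is case-insensitive. Field values may be a single string
    or a list of strings. Field names here are assumptions.
    """
    wanted = term.casefold()
    kept = []
    for hit in hits:
        for field in fields:
            value = hit.get(field)
            if value is None:
                continue
            values = [value] if isinstance(value, str) else value
            if any(isinstance(v, str) and v.casefold() == wanted for v in values):
                kept.append(hit)
                break  # one matching field is enough for this hit
    return kept


# Made-up sample records, for illustration only.
sample_hits = [
    {"symbol": "BRCA1", "alias": ["EXAMPLE-A"]},
    {"symbol": "BRAT1", "alias": "EXAMPLE-B"},
]
only_exact = exact_hits(sample_hits, "BRCA1")
```

This drops the `BRAT1` record from the sample while keeping `BRCA1`, at the cost of fetching the loosely matched hits first.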