Get rid of code duplication between Autocomplete and Search #331

duker33 · 2019-05-16T07:32:38Z

Now configuring Search and Autocomplete search_entities list leads to different results.
But they should be identical

ArtemijRodionov · 2019-05-16T18:57:23Z

@duker33 Title of the issue looks like you going to refactor search.views, but after I made a review it seemed to me that you are going to refactor search.search. We already have an issue for it.

What exactly are you going to do in this issue?

duker33 · 2019-05-17T08:24:58Z

@artemiy312 , wow, i didn't notice this issue, sorry.
It will affect search.views too, of course. Since search.views now is strong coupled to search.search one.

You hold #289 throw 05 Mar. So, let me take it plz.
If you agree, we'll mark #289 as duplicate of current issue and move along

ArtemijRodionov · 2019-05-17T13:39:08Z

@duker33 Let's specify an issue we are going to solve here.

We have a mess in search.views, that should be generalized. I think we both agree on that.
search.search looks ok on the background. It just need little bit refactored and cleaned up. It lift up lot of work that should be separated
We have shared logic between frontend and backend:

Backend performs a search query and returns a response with json
Frontend somehow tries to parse the json with heterogeneous data and renders it

All backend logic is generalized in refarm-site, but frontend is uniquely implemented in each client site.
As i remember we have a convention to keep frontend thin

duker33 · 2019-05-17T15:36:01Z

@artemiy312 , i agree with most of the points, but i think that search.views.AutocompleteView is not perfect.
I've just tried to add some new search_entities to Autocomplete and it didn't work. I looked AutocompleteView, broke my brain with it's prepare_fields method.
So, i decided to redesign this strange procedural data adapting to some modular and obvious.

Bad AutocompleteView logic roots from search.search of course. That's why i started from it.
Good search.search will hide all data adapting inside. So AutocompleteView will go out in favor of simple objects composition directly in the client code.
I provided example at the related PR:

# site's side
def autocomplete(request, query):
    # we'll use django orm search based on postgresql.
    # Query sets will already contain searched entities.
    # https://docs.djangoproject.com/en/1.11/topics/db/search/
    return JsonResponse(list(
        ResultsStack([
            Results(
                CategoryQuerySet().filter(name__search=query),
            ),
            Results(
                ProductQuerySet().filter(name__search=query),
            ),
        ]).chain()
    ))

ArtemijRodionov · 2019-05-19T14:04:31Z

@duker33 >Good search.search will hide all data adapting inside

It doesn't mean that we should fully rewrite it, I believe we should create a new abstraction, that will adapt data to needs.

I worry that we will lose some features and get more bugs, but if you sure let's make like this

duker33 · 2019-05-19T18:55:53Z

@artemiy312 , search.search.search is the only data seaching behaviour. The rest of search.search code are strange conditions when we should use one or another search algo. We have no tests, no UI analytics about search efficiency. So, we can throw away current search realization in favor of much more ready and clear django qs based search.

Also we have enough resources to implement good arch. How we can move forward our arch discussion? That's my proposition: let's merge related PR with my arch version. And you'll propose another PR fixing my arch prototype with expreaded code comments.

Let's move?

ArtemijRodionov · 2019-05-20T16:21:19Z

@duker33 >So, we can throw away current search realization in favor of much more ready and clear django qs based search.

Django qs search uses full-text search, in our case it is postgresql full-text search.
We decided to not use it because it doesn't provide fuzzy full-text search, so it can't handle any typos in a query. Instead we are using the trigram extension.

Trigram works only with textual data, so we have to determine the type of searched fields to provide ability to search by non-textual fields too. For those non-textual fields we are using our custom search function. All it does the current implementation, so we can't simply throw away it

We should document this

duker33 · 2019-05-20T16:51:23Z

@artemiy312 , trigram is implemented at postgres side too.
We can choose algo like this. It can solve the problem i think:

def autocomplete(request, query):
    # we'll use django orm search based on postgresql.
    # Query sets will already contain searched entities.
    # https://docs.djangoproject.com/en/1.11/topics/db/search/
    return JsonResponse(list(
        ResultsStack([
            Results(
                CategoryQuerySet().filter(name__search=query),
                CategoryQuerySet().filter(mark__trigram=query),
                CategoryQuerySet().filter(h1__search=query),
            ),
            Results(
                ProductQuerySet().filter(name__search=query),
                ProductQuerySet().filter(mark__trigram=query),
                ProductQuerySet().filter(h1__search=query),
            ),
        ]).chain()
    ))

I'll repeat myself: we have no data if current search algo is working or not. Maybe simple name__search will make more good UX then current algo.
So, we can throw it away in favor of a new flexible system with min start features

ArtemijRodionov · 2019-05-20T17:25:05Z

@duker33 >We can choose algo like this. It can solve the problem i think:

We already have implemented this algo in search.search, but it makes less queries to db. Your algo will make a query per a field, but current implementation makes a query per a model. Soon or later we have to implement the same what we already have.
I want to emphasize that current implementation already use django qs search and postgresql trigram extension, but in more optimized way.

I believe it will be enough to rewrite search.views and refactor search.search

duker33 · 2019-05-20T19:44:42Z

@artemiy312 , let me fix example =)

Soon or later we have to implement the same what we already have.
Of course we have. That's my aim: do the same in more clear way.

def autocomplete(request, query):
    # we'll use django orm search based on postgresql.
    # Query sets will already contain searched entities.
    # https://docs.djangoproject.com/en/1.11/topics/db/search/
    return JsonResponse(list(
        ResultsStack([
            Results(
                CategoryQuerySet().filter(
					SearchQuery(name__search=query)
					| SearchQuery(mark__trigram=query)
					| SearchQuery(h1__search=query)
				),
            ),
            Results(
                ProductQuerySet().filter(
					SearchQuery(name__search=query)
					| SearchQuery(mark__trigram=query)
					| SearchQuery(h1__search=query)
				),
            ),
        ]).chain()
    ))

We also can use sorting by SearchRank and so on.

Generally speaking our task consists of two parts:

1. Search engine
It should be implemented with django querysets. It's django's recommended way and it's have good arch i beleave.
Now we have owful procedural code like this:

if not self.trigram_fields:
	return search(term, self.qs, self.decimal_fields)
elif not self.decimal_fields:
	return _trigram_search(term)
else:
	trigram_data = _trigram_search(term)
	default_data = search(term, self.qs, self.decimal_fields)
	return trigram_data if trigram_data else default_data if default_data else None

We can relatively easily rewrite it in one queryset. If you don't believe it, just let me continue working with code.
I'll propose some real code, you'll review it and we'll discuss.

2. Results list
It has geterogeniouos nature. Every element of the list can have represent different entity.
We should provide clear mech to pack results list for rendering.

I proposed a way to do it. It's definetily not the most perfect. But it's much more good then simple geterogeniouos list merging with no interface and contracts.
Results list template simply don't know (and no one know) what fields are exists and what are not

ArtemijRodionov · 2019-05-25T03:19:54Z

@duker33

do the same in more clear way.

You provided the code that uses full-text search. As I said before, full-text search doesn't support fuzzy search, it doesn't handle any typos in a search query. Also postgres can't combine the full-text search with trigram

We should provide clear mech to pack results list for rendering.

I have concern about the heterogeneous list too, but if we will make it homogeneous we have to produce such template:

{% for r in results %}
  {% if r.name %}
    {{ r.name }} 
  {% endif %}
  {% if r.price %}
    {{ r.price }} 
  {% endif %}
  {% if r.specification %}
    {{ r.specification }} 
  {% endif %}
  and so on
{% endfor %}

It looks unclear to me. Things will get worse, when we add markup.
Also we have to extends and manages Result and Results classes.

I'd prefer to use dict for this purpose:

{% if results.products %}
  {% for p in results.products %}
    {{ p.name }} {{ p.price }} {{ p.category.name }}
  {% endfor %}
{% endif %}
{% if results.series %}
  {% for s in results.series %}
    {{ s.name }} {{ p.specification }}
  {% endfor %}
{% endif %}

But it seems a matter of taste, so it will up to you.

duker33 · 2019-05-25T07:05:42Z

@artemiy312 ,

Also postgres can't combine the full-text search with trigram

It doesn't clear to me why postgres can't mix trigram with the common search. On django QS level trigram is just another query filter (trigram_similar) or can be used with TrigramSimilarity/TrigramDistance queries just as common queryset
https://docs.djangoproject.com/en/2.2/ref/contrib/postgres/search/#trigram-similarity

As I said before, full-text search doesn't support fuzzy search, it doesn't handle any typos in a search query

It's true, but trigram has pros too. I'll not going cover them - it's for the separated task. But in general, we should be able to combine different techniques for different content: trigram/search/like.

we have to produce such template: <...> It looks unclear to me.

It's true problem. To solve we can provide good default empty values at Result class. This code will be much more clear since it will use class instantiation.

Let me generalize the question you've shown: Should search template know about results nature or not?
I don't know the right answer. Only i know, that this knowledge is semantic coupling between templates and search view. So we should clarify why we use it (if we).
If we'll face this issue, we'll fix template processing somehow. It'll not be simple, but will be easy

duker33 · 2019-05-31T12:04:37Z

i close the discussion. We'll continue it at sub issue's PR. We'll discuss more concrete code proposition

* #331 Draft for the search arch * #331 Subtask new search arch implementation * #331 Drop contexts in favor of results processing * #331 Minor imports fix * #331 Review#1 fixes. Approve docs

0pdd · 2019-05-31T12:05:50Z

@duker33 the puzzle #337 is still not solved.

duker33 self-assigned this May 16, 2019

duker33 added a commit that referenced this issue May 16, 2019

#331 Draft for the search arch

66b462a

duker33 added a commit that referenced this issue May 16, 2019

#331 Subtask new search arch implementation

a2099a6

This was referenced May 16, 2019

Use the new search app arch fidals/stroyprombeton#652

Open

#331 Proposition for a new search app arch #332

Merged

duker33 added a commit that referenced this issue May 16, 2019

#331 Drop contexts in favor of results processing

e375136

duker33 added a commit that referenced this issue May 17, 2019

#331 Minor imports fix

a6bf558

duker33 added the blocked issue is blocked by another issue label May 17, 2019

duker33 added a commit that referenced this issue May 17, 2019

#331 Review#1 fixes. Approve docs

59bd500

duker33 added 2 performer can implement issue at his closest convenient time and removed 1 burning issue labels May 19, 2019

duker33 added discuss issue needs to finish discussion before start working and removed blocked issue is blocked by another issue labels May 19, 2019

duker33 removed the discuss issue needs to finish discussion before start working label May 31, 2019

duker33 closed this as completed in #332 May 31, 2019

0pdd mentioned this issue May 31, 2019

arch_draft.py:4-10: Implement the new arch at the search... #337

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Get rid of code duplication between Autocomplete and Search #331

Get rid of code duplication between Autocomplete and Search #331

duker33 commented May 16, 2019 •

edited

Loading

ArtemijRodionov commented May 16, 2019 •

edited

Loading

duker33 commented May 17, 2019

ArtemijRodionov commented May 17, 2019

duker33 commented May 17, 2019 •

edited

Loading

ArtemijRodionov commented May 19, 2019

duker33 commented May 19, 2019

ArtemijRodionov commented May 20, 2019

duker33 commented May 20, 2019

ArtemijRodionov commented May 20, 2019 •

edited

Loading

duker33 commented May 20, 2019

ArtemijRodionov commented May 25, 2019

duker33 commented May 25, 2019

duker33 commented May 31, 2019

0pdd commented May 31, 2019

Get rid of code duplication between Autocomplete and Search #331

Get rid of code duplication between Autocomplete and Search #331

Comments

duker33 commented May 16, 2019 • edited Loading

ArtemijRodionov commented May 16, 2019 • edited Loading

duker33 commented May 17, 2019

ArtemijRodionov commented May 17, 2019

duker33 commented May 17, 2019 • edited Loading

ArtemijRodionov commented May 19, 2019

duker33 commented May 19, 2019

ArtemijRodionov commented May 20, 2019

duker33 commented May 20, 2019

ArtemijRodionov commented May 20, 2019 • edited Loading

duker33 commented May 20, 2019

ArtemijRodionov commented May 25, 2019

duker33 commented May 25, 2019

duker33 commented May 31, 2019

0pdd commented May 31, 2019

duker33 commented May 16, 2019 •

edited

Loading

ArtemijRodionov commented May 16, 2019 •

edited

Loading

duker33 commented May 17, 2019 •

edited

Loading

ArtemijRodionov commented May 20, 2019 •

edited

Loading