Commit

Process subtractive_selectors first
Currently, if a filter rule is more selective than a subtractive selector, the subtractive selector has no effect, because the filter has already discarded the markup the selector would need to match. This commit fixes that by removing the subtracted elements first and then applying the filter rule to what remains.
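
To illustrate why the order matters, here is a minimal sketch using only the standard library. It does not use changedetectionio's `html_tools`; the helpers `remove_elements` and `apply_filter`, the sample `PAGE`, and the ElementTree paths are all hypothetical stand-ins for a subtractive selector and a filter rule.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical sample page: an ad wrapper plus the content we care about.
PAGE = (
    '<html><body><div id="main">'
    '<div class="ad"><p>Buy now</p></div>'
    '<p>Price: 10</p>'
    '</div></body></html>'
)

SUBTRACT = ".//div[@class='ad']"  # stand-in for a subtractive selector
FILTER = ".//p"                   # stand-in for a more selective filter rule


def remove_elements(root, path):
    """Return a copy of root with every descendant matching path removed."""
    root = copy.deepcopy(root)
    doomed = {id(e) for e in root.findall(path)}
    # ElementTree has no parent pointers, so scan each parent's children.
    for parent in list(root.iter()):
        for child in list(parent):
            if id(child) in doomed:
                parent.remove(child)
    return root


def apply_filter(root, path):
    """Return a new wrapper element holding copies of the matching elements."""
    wrapper = ET.Element("div")
    for match in root.findall(path):
        wrapper.append(copy.deepcopy(match))
    return wrapper


def texts(wrapper):
    """Collect the text of each top-level element in the wrapper."""
    return ["".join(child.itertext()) for child in wrapper]


root = ET.fromstring(PAGE)

# Old order: filter first. The div.ad wrapper is already gone from the
# filtered result, so the subtractive selector has nothing to match.
old_order = texts(remove_elements(apply_filter(root, FILTER), SUBTRACT))

# New order: subtract first, then filter what remains.
new_order = texts(apply_filter(remove_elements(root, SUBTRACT), FILTER))

print(old_order)
print(new_order)
```

With the filter applied first, the ad text survives because its wrapping element was dropped before the subtractive selector ran; with the subtraction applied first, only the price paragraph is left.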
emichael committed Sep 27, 2022
1 parent 3ebb2ab commit cb4c7dd
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions changedetectionio/fetch_site_status.py
@@ -157,17 +157,17 @@ def run(self, uuid):
                 stripped_text_from_html = html_content
             else:
                 # Then we assume HTML
+                if has_subtractive_selectors:
+                    html_content = html_tools.element_removal(subtractive_selectors, html_content)
+
                 if has_filter_rule:
                     # For HTML/XML we offer xpath as an option, just start a regular xPath "/.."
                     if css_filter_rule[0] == '/' or css_filter_rule.startswith('xpath:'):
                         html_content = html_tools.xpath_filter(xpath_filter=css_filter_rule.replace('xpath:', ''),
-                                                               html_content=fetcher.content)
+                                                               html_content=html_content)
                     else:
                         # CSS Filter, extract the HTML that matches and feed that into the existing inscriptis::get_text
-                        html_content = html_tools.css_filter(css_filter=css_filter_rule, html_content=fetcher.content)
-
-                if has_subtractive_selectors:
-                    html_content = html_tools.element_removal(subtractive_selectors, html_content)
+                        html_content = html_tools.css_filter(css_filter=css_filter_rule, html_content=html_content)

                 if not is_source:
                     # extract text
