Attempted a fix in 981b8a5, but it causes a StackOverflowError during regex evaluation, even though a small-scale test succeeds. The document is possibly too large to handle this way. See https://stackoverflow.com/a/7510006
It might be time to consider rewriting highlighting of document fragments using something like https://jsoup.org/
If that doesn't seem practical for whatever reason, another alternative is to loop through the document character by character, only using regexes whenever we find a < (and possibly not using them at all for comments or CDATA, which can get large, unlike tags).
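A rough sketch of what that character-by-character loop could look like, using only the Java standard library. The class and method names (`FragmentScanner`, `unclosedTags`) are hypothetical and not taken from the existing code. It uses plain `indexOf` and `startsWith` instead of regexes throughout, skips comments, CDATA sections and processing instructions without inspecting their contents, and assumes attribute values do not contain a literal `>`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper: names are illustrative, not part of the current codebase.
final class FragmentScanner {
    private FragmentScanner() {}

    /**
     * Returns the names of elements that are opened but never closed in the
     * fragment, innermost first (the order in which closing tags would be appended).
     * Anything inside CDATA sections or comments is ignored.
     */
    static List<String> unclosedTags(String xml) {
        Deque<String> open = new ArrayDeque<>();
        int i = 0;
        int n = xml.length();
        while (i < n) {
            if (xml.charAt(i) != '<') {
                i++;
                continue;
            }
            if (xml.startsWith("<![CDATA[", i)) {
                // CDATA can be large; jump straight to its terminator.
                i = skipPast(xml, "]]>", i + 9);
            } else if (xml.startsWith("<!--", i)) {
                i = skipPast(xml, "-->", i + 4);
            } else if (xml.startsWith("<?", i)) {
                i = skipPast(xml, "?>", i + 2);
            } else {
                // Assumption: no literal '>' inside attribute values.
                int end = xml.indexOf('>', i);
                if (end < 0) {
                    break; // truncated tag at the end of the fragment
                }
                String tag = xml.substring(i + 1, end);
                if (tag.startsWith("/")) {
                    // Closing tag: pop only if it matches the innermost open element.
                    String name = tagName(tag.substring(1));
                    if (name.equals(open.peek())) {
                        open.pop();
                    }
                } else if (!tag.isEmpty() && !tag.endsWith("/") && !tag.startsWith("!")) {
                    // Opening tag (not self-closing, not a DOCTYPE-style declaration).
                    open.push(tagName(tag));
                }
                i = end + 1;
            }
        }
        return new ArrayList<>(open);
    }

    // Element name is everything up to the first whitespace character.
    private static String tagName(String tag) {
        int k = 0;
        while (k < tag.length() && !Character.isWhitespace(tag.charAt(k))) {
            k++;
        }
        return tag.substring(0, k);
    }

    // Returns the index just after the terminator, or the end of input if it is missing.
    private static int skipPast(String s, String terminator, int from) {
        int end = s.indexOf(terminator, from);
        return end < 0 ? s.length() : end + terminator.length();
    }
}
```

With this approach, `FragmentScanner.unclosedTags("<doc><p>a <![CDATA[ <b> ]]> b</p>")` reports only `doc` as unclosed, since the `<b>` inside the CDATA section is never treated as a tag. Because the scan is a flat loop with no recursion, its stack use does not grow with document size.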
"Tags" inside a CDATA are seen as actual (unbalanced) XML open tags, and closing tags are added at the end of the document.
Example:
https://portal.clarin.ivdnt.org/blacklab-server-new/opensonar/docs/WR-P-E-C-0000000129/contents?query=%5Bword%3D%22schip%22%5D&wordstart=7000