Changelog

  • N/A
  • 1.9.0 (2024-02-29)
    • Adding support for Python 3.12
    • Accepting only ASCII characters to the left of the TLD
    • Fixing parsing of Markdown links
    • Fixing filter of mixed case hostnames
    • updated list of TLDs
  • 1.8.0 (2022-12-19)
    • adding ability to filter out mixed case hostnames (issue #143)
    • adding the ability to set stop characters inside of scheme - default stop chars ':' (issue #82)
    • Fix index issue with uppercase characters in domain names - by Peng Wang
    • updated GitHub Action test - by Wu Tingfeng
  • 1.7.1 (2022-10-25)
    • fixes AttributeError raised for URLs without an authority
  • 1.7.0 (2022-10-22)
    • correct handling when authority starts with @ symbol
    • remove unreserved characters from the beginning of found URL
    • added typing and mypy checks - by mimi89999
    • updated list of TLDs
  • 1.6.0 (2022-05-17)
    • Add a list of URLs allowed to extract (issue #125) - by khoben
    • correct order of actual and expected in tests
    • updated list of TLDs
  • 1.5.0 (2021-12-22)
    • Fix incorrect indices when TLD is found twice (issue #109)
    • Replace unmaintained appdirs with maintained platformdirs - by Hugo van Kemenade (issue #106)
    • update readme, code style and code formatting using black - by za
    • updated list of TLDs
  • 1.4.0 (2021-10-06)
    • urlextract detects URLs which start with double slash '//' (issue #94)
    • adding ability to return only URLs with schema (issue #96)
    • updated list of TLDs
  • 1.3.0 (2021-06-12)
    • fixing the "None of the cache directory is writable" error (issue #61)
    • fixes RE for IPv4 addresses - by kak-bo-che (issue #86)
    • updated list of TLDs
    • urlextract CLI now tells users to report errors on GitHub
  • 1.2.0 (2020-12-08)
    • ignore a space character before a URL inside an enclosure (parentheses) (issue #77)
    • case insensitive search for TLDs (issue #76)
    • removed methods get_stop_chars, set_stop_chars (deprecated since 0.7)
    • updated list of TLDs
  • 1.1.0 (2020-10-01)
    • possibility to return indices of found URLs - by Benoit Laures (issue #71)
    • fixed typo in error log message - by Yossi Rafelson
    • updated list of TLDs
  • 1.0.0
    • new feature: DNS caching - by John Vandenberg
    • fixed race condition in cache loading and don't hold lock during download #55 (#56) - by Ben Schmidt
    • updated MANIFEST.in (issue #56) - by John Vandenberg
    • fixing 'IPv4Address' object has no attribute 'split' (issue #57)
    • allow using localhost as a TLD (issue #45) - by Diego Mascialino
  • 0.14.0
    • added detection of IPv4 addresses (issue #10)
    • catching PermissionError (issue #25)
    • support for an ignore list - a list of URLs to exclude (issue #40)
  • 0.13.0
    • fixed IPv4Address object has no attribute split (issue #41)
    • updated list of TLDs
  • 0.12.0
    • fixed missing URLs using find_urls (issue #42)
    • updated list of TLDs
    • added config for bump2version
  • 0.11
    • added ability to turn on/off detecting email addresses (issue #37)
    • improved excluding of trailing enclosure characters (issue #38)
    • fixing - Incomplete URL extracted (issue #39)
    • trailing '/' after TLD is kept as part of found URL
    • set auto deploy in Travis CI
  • 0.10
    • only the longest URL is returned when a URL contains other URLs (issue #17)
    • fixed bug ValueError with text from a reference (issue #30)
    • order of returned URLs is preserved (same as order in the input text) while returning unique URLs (issue #31)
    • code refactoring (created separate classes for urlextract logic and cache file manipulation)
    • fixed non-deterministic extraction (issue #33) - by Dmitrii Gerasimov
  • 0.9
    • include list of TLDs in the package
    • added 3 level fallback to cache directory
      • data directory inside package
      • users cache directory (using appdirs)
      • global temp directory
    • removed auto-updates from initialization of class
      • use update() or update_when_older() after creating object
    • updated parsing of URLs surrounded by parentheses (issue #23)
    • urlextract will now return URLs with Authority (e.g. emails)
    • added extracting URL surrounded by enclosure characters; (example.com) -> example.com (issue #14)
    • added methods for setting enclosure pairs
      • get_enclosures()
      • add_enclosure()
      • remove_enclosure()
    • fixing extraction of URLs from markdown (issue #15)
    • code changes:
      • using pytest for unit testing
      • removed python3.3 from automatic testing (unsupported by pytest)
  • 0.8.3
    • urlextract command line tool takes stdin as input when no parameter is set (issue #11).
    • URLExtract class raises exception instead of sys.exit()
    • Fixed issue #9; wrong result for several urls
    • Replaced print with logging module
    • code changes:
      • Console script moved directly to urlextract.py file.
      • PEP8 support
  • 0.7
    • Faster stop char matching
    • Fixing issue #7 by splitting stop characters to left and right. Created new methods:
      • get_stop_chars_left() and set_stop_chars_left()
      • get_stop_chars_right() and set_stop_chars_right()
    • Deprecated:
      • get_stop_chars() and set_stop_chars()
  • 0.6
    • Make setup.py parsable on Python3 with LANG unset - by Dave Pretty (#6)
  • 0.5
    • Fix issue #5 - URL is extracted when it ends with TLD + after_tld_chars (usually: comma, dot, ...)
  • 0.4.1
    • Efficient use of memory in find_urls() method
  • 0.4
    • Adding features:
      • has_urls() - returns True if the text contains at least one URL
      • gen_urls() - returns generator over found URLs
  • 0.3.2.6
    • Centralized version number
    • fixed bug when installing via pip on system without uritools installed
  • 0.3.2
  • 0.3.1
    • Adding badges to README.rst
  • 0.3
    • Adding hostname validation
  • 0.2.7
    • Public release