Add x_tolerance_ratio param to extract_text and similar functions (now properly linted!) #1041

Merged
merged 8 commits into from
Nov 9, 2023

afriedman412
Contributor

Fix #987 (partially)

Passing x_tolerance_ratio to extract_text(), or to any other function that relies on WordExtractor, sets the horizontal tolerance used to decide where words begin and end to x_tolerance_ratio * text size. When provided, it overrides the x_tolerance param.

There is room to build out y_tolerance_ratio too, if need be in the future!
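
For anyone skimming, here is a rough, standalone sketch of the idea rather than the actual WordExtractor code. It assumes each char is a dict with `text`, `x0`, `x1` and `size` keys, already sorted left to right on one line, and it scales the tolerance by the previous char's size; `group_words` and the sample coordinates are made up for illustration.

```python
from typing import Any, Dict, List, Optional

Char = Dict[str, Any]


def group_words(
    chars: List[Char],
    x_tolerance: float = 3.0,
    x_tolerance_ratio: Optional[float] = None,
) -> List[str]:
    """Group the chars of a single line into words (hypothetical helper)."""
    words: List[str] = []
    current = ""
    prev: Optional[Char] = None
    for char in chars:
        if char["text"].isspace():
            # An explicit whitespace char always ends the current word.
            if current:
                words.append(current)
                current = ""
            prev = None
            continue
        if prev is not None:
            # The ratio, when given, overrides the fixed tolerance.
            # Assumption: scale by the previous char's font size.
            if x_tolerance_ratio is None:
                tolerance = x_tolerance
            else:
                tolerance = x_tolerance_ratio * prev["size"]
            if char["x0"] - prev["x1"] > tolerance:
                words.append(current)
                current = ""
        current += char["text"]
        prev = char
    if current:
        words.append(current)
    return words


# 20pt text with a 4pt gap between letters of the same word.
big = [
    {"text": "H", "x0": 0.0, "x1": 12.0, "size": 20.0},
    {"text": "I", "x0": 16.0, "x1": 22.0, "size": 20.0},
]
# 8pt text with a 2.5pt gap that really is a word break.
small = [
    {"text": "a", "x0": 0.0, "x1": 4.0, "size": 8.0},
    {"text": "b", "x0": 6.5, "x1": 10.5, "size": 8.0},
]

print(group_words(big))                          # fixed 3pt tolerance splits the word
print(group_words(big, x_tolerance_ratio=0.25))  # 5pt tolerance keeps it together
print(group_words(small))                        # fixed 3pt tolerance merges two words
print(group_words(small, x_tolerance_ratio=0.25))  # 2pt tolerance separates them
```

The point of the ratio is that a single fixed x_tolerance cannot suit both large and small text on the same page, while a tolerance proportional to text size can. With this PR the corresponding call would look like `page.extract_text(x_tolerance_ratio=0.25)`, where 0.25 is only an illustrative value.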

codecov bot commented Nov 8, 2023

Codecov Report

Merging #1041 (33ac833) into develop (d9561d1) will not change coverage.
Report is 14 commits behind head on develop.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##           develop     #1041   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           18        18           
  Lines         1615      1620    +5     
=========================================
+ Hits          1615      1620    +5     
Files Coverage Δ
pdfplumber/utils/text.py 100.00% <100.00%> (ø)

@jsvine jsvine changed the base branch from stable to develop November 9, 2023 20:13
Owner

jsvine commented Nov 9, 2023

Thanks, and now merging!

@jsvine jsvine merged commit fff846b into jsvine:develop Nov 9, 2023
7 checks passed
@afriedman412 afriedman412 deleted the x_tolerance_ratio branch November 9, 2023 20:28

Successfully merging this pull request may close these issues.

For text extraction, add fractional versions of x/y_tolerance arguments
2 participants