The rotation configuration set to IndirectObject, which is preventing the PDF from being uploaded. #1176

Closed
zzhangyun opened this issue Jul 24, 2024 · 3 comments

zzhangyun commented Jul 24, 2024

Describe the bug

The rotation configuration for the PDF file is set to IndirectObject(12, 0, 4419697344). Uploading this file fails with the error below:

2024-07-19 14:40:21,146 - werkzeug - INFO - 10.131.0.1 - - [19/Jul/2024 14:40:21] "POST /api/v1/project/6698c33997df8cbbfd8770c0/document/?collection=6698c40f97df8cbbfd8770c1 HTTP/1.0" 500 -
Traceback (most recent call last):
  File "/app/server/service/pdf_svc.py", line 767, in get_doc_pages_raw_data
    for num_page_index in range(0, len(pdf.pages)):
  File "/usr/local/lib/python3.9/site-packages/pdfplumber/pdf.py", line 142, in pages
    p = Page(self, page, page_number=page_number, initial_doctop=doctop)
  File "/usr/local/lib/python3.9/site-packages/pdfplumber/page.py", line 226, in __init__
    self.rotation = _rotation % 360
TypeError: unsupported operand type(s) for %: 'NoneType' and 'int'

Have you tried repairing the PDF?

No

PDF file

Split_Part_1.pdf.zip

zzhangyun added the bug label Jul 24, 2024
zzhangyun changed the title from "The pdf lost rotation configuration" to "The rotation configuration set to IndirectObject, which is preventing the PDF from being uploaded." Jul 24, 2024
jsvine (Owner) commented Aug 3, 2024

Hmmmm, as far as I'm aware, IndirectObject(12, 0, 4419697344) is not a valid value for a PDF's rotation. Given that the PDF loads without problems when using pdfplumber.open(path, repair=True), I'm closing this issue, but feel free to continue the discussion here.
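
For anyone who needs the workaround in application code, here is a minimal sketch of a retry with repair=True. The helper name `count_pages` and the injected `open_pdf` parameter are illustrative assumptions and not part of pdfplumber; the only behaviour relied on is that `open_pdf` acts like pdfplumber.open and accepts `repair=True`, as mentioned above.

```python
from typing import Any, Callable


def count_pages(path: str, open_pdf: Callable[..., Any]) -> int:
    """Count pages, retrying on a repaired copy if the first attempt fails.

    `open_pdf` is expected to behave like `pdfplumber.open`. It is passed in
    as an argument (an assumption of this sketch) so the helper does not
    hard-code the dependency.
    """
    try:
        with open_pdf(path) as pdf:
            return len(pdf.pages)
    except TypeError:
        # Same exception class as in the traceback above; retry with repair.
        with open_pdf(path, repair=True) as pdf:
            return len(pdf.pages)
```

With pdfplumber installed, this would be called as `count_pages("Split_Part_1.pdf", pdfplumber.open)`. As far as I know the repair option relies on Ghostscript being available, so treat this as a stopgap rather than a fix.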

jsvine closed this as completed Aug 3, 2024
mkl-public commented

Generally, any value in a PDF can be given as either a direct or an indirect object, with some exceptions explicitly mentioned in the spec.
For the page rotation, no such restriction is mentioned in the spec, so it may be indirect.
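
In practice this means a reader has to dereference the rotation entry before treating it as a number. Below is a rough sketch of that step. `IndirectRef`, the `objects` table, `resolve`, and `page_rotation` are hypothetical stand-ins written for illustration, not pdfplumber's or pdfminer's actual API, and falling back to 0 for an unresolvable reference is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class IndirectRef:
    """Hypothetical stand-in for a PDF indirect reference such as `12 0 R`."""
    obj_num: int
    gen_num: int


ObjectTable = Dict[Tuple[int, int], Any]


def resolve(value: Any, objects: ObjectTable) -> Any:
    """Follow indirect references until a direct object is reached."""
    seen = set()
    while isinstance(value, IndirectRef):
        key = (value.obj_num, value.gen_num)
        if key in seen:
            raise ValueError(f"circular reference at {key}")
        seen.add(key)
        value = objects.get(key)  # None if the object is missing
    return value


def page_rotation(page_attrs: Dict[str, Any], objects: ObjectTable) -> int:
    """Return the page rotation in degrees, normalised to the range 0..359."""
    rotate = resolve(page_attrs.get("Rotate", 0), objects)
    if rotate is None:
        # Assumption: treat an unresolvable reference like a missing entry.
        rotate = 0
    return int(rotate) % 360


# Hypothetical object table: object 12, generation 0 holds the integer 90.
objects = {(12, 0): 90}
direct_page = {"Rotate": 450}
indirect_page = {"Rotate": IndirectRef(12, 0)}
```

Applying `% 360` to the unresolved reference (or to `None`) is what produces a TypeError like the one in the traceback; resolving first makes the direct and indirect forms behave the same.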

jsvine (Owner) commented Aug 4, 2024

Thank you, @mkl-public, and my apologies, @zzhangyun; I misunderstood the issue, thinking that the value was literally set to IndirectObject(12, 0, 4419697344). This should now be fixed in c20cd3b.
