The rotation configuration for the PDF file is set to IndirectObject(12, 0, 4419697344). Uploading this file fails with the error below:
2024-07-19 14:40:21,146 - werkzeug - INFO - 10.131.0.1 - - [19/Jul/2024 14:40:21] "POST /api/v1/project/6698c33997df8cbbfd8770c0/document/?collection=6698c40f97df8cbbfd8770c1 HTTP/1.0" 500 -
Traceback (most recent call last):
  File "/app/server/service/pdf_svc.py", line 767, in get_doc_pages_raw_data
    for num_page_index in range(0, len(pdf.pages)):
  File "/usr/local/lib/python3.9/site-packages/pdfplumber/pdf.py", line 142, in pages
    p = Page(self, page, page_number=page_number, initial_doctop=doctop)
  File "/usr/local/lib/python3.9/site-packages/pdfplumber/page.py", line 226, in __init__
    self.rotation = _rotation % 360
TypeError: unsupported operand type(s) for %: 'NoneType' and 'int'
zzhangyun changed the title from "The pdf lost rotation configuration" to "The rotation configuration set to IndirectObject, which is preventing the PDF from being uploaded." on Jul 24, 2024
Hmmmm, as far as I'm aware, IndirectObject(12, 0, 4419697344) is not a valid value for a PDF's rotation. Given that the PDF loads without problems when using pdfplumber.open(path, repair=True), I'm closing this issue, but feel free to continue the discussion here.
Generally any value in a PDF can be given either by a direct or indirect object with some exceptions explicitly mentioned in the spec.
For the page rotation no restriction is mentioned in the spec, so it may be indirect.
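To illustrate the point about direct versus indirect values, here is a minimal, library-free sketch of how a reader could normalize a page's rotation. It is not pdfplumber's or pdfminer's actual code, and not necessarily what the fix does: `IndirectRef`, `resolve`, `page_rotation`, and the `objects` table are hypothetical stand-ins. The idea is simply to follow the reference before applying `% 360`, and to fall back to 0 when nothing usable is found, since `None % 360` is what raises the TypeError in the traceback.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class IndirectRef:
    """Hypothetical stand-in for a PDF indirect reference such as `12 0 R`."""
    objid: int
    generation: int = 0


# Hypothetical object table: (object id, generation) -> stored value.
ObjectTable = Dict[Tuple[int, int], Any]


def resolve(value: Any, objects: ObjectTable, max_depth: int = 16) -> Any:
    """Follow indirect references until a direct value (or None) is reached."""
    depth = 0
    while isinstance(value, IndirectRef):
        if depth >= max_depth:
            raise ValueError("indirect reference chain is too deep")
        # A dangling reference yields None here.
        value = objects.get((value.objid, value.generation))
        depth += 1
    return value


def page_rotation(page_attrs: Dict[str, Any], objects: ObjectTable) -> int:
    """Return the page rotation in degrees, normalized to the range 0..359."""
    raw = resolve(page_attrs.get("Rotate", 0), objects)
    if raw is None:
        # Assumption for this sketch: treat a missing or dangling value as 0.
        raw = 0
    return int(raw) % 360


# Object 12 0 holds the rotation value, as in the reported file's layout.
objects: ObjectTable = {(12, 0): 90}

direct_page = {"Rotate": 270}
indirect_page = {"Rotate": IndirectRef(12, 0)}
dangling_page = {"Rotate": IndirectRef(99, 0)}

print(page_rotation(direct_page, objects))
print(page_rotation(indirect_page, objects))
print(page_rotation(dangling_page, objects))
```

Whether a dangling reference should silently become 0 or raise an error is a design choice; the sketch picks the lenient option so that a single malformed page does not abort loading the whole document.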
Thank you, @mkl-public and my apologies @zzhangyun; I misunderstood the issue, thinking that the value was literally set to IndirectObject(12, 0, 4419697344). This should now be fixed in c20cd3b.
Have you tried repairing the PDF?
No
PDF file
Split_Part_1.pdf.zip