Parsr Client

Provides a Python interface to the Parsr tool via its API. Parsr transforms PDFs, documents, and images into enriched, structured data.

For more about Parsr, including how to download it, see https://github.com/axa-group/Parsr.

1 Installation

pip install parsr-client

2 Usage

Make sure that the Parsr server is already running. The examples below assume it is reachable at localhost:3001.

2.1 Connect to the Parsr server

from parsr_client import ParsrClient
parsr = ParsrClient('localhost:3001')
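If the server address differs between environments, it can be read from an environment variable rather than hard-coded. The following is a minimal sketch: the variable name PARSR_SERVER and the helper parsr_address are hypothetical and not part of parsr_client, and stripping a URL scheme is an assumption based on the scheme-less address used above.

```python
import os


def parsr_address(default="localhost:3001", env_var="PARSR_SERVER"):
    """Return a 'host:port' string for the Parsr server.

    PARSR_SERVER is a hypothetical variable name chosen for this sketch.
    """
    # Fall back to the default when the variable is unset or blank.
    address = os.environ.get(env_var, "").strip() or default
    # Assumption: the client expects 'host:port' without a scheme,
    # as in the example above, so drop one if it was supplied.
    for scheme in ("http://", "https://"):
        if address.startswith(scheme):
            address = address[len(scheme):]
    return address.rstrip("/")
```

The result can then be passed to the constructor, for example ParsrClient(parsr_address()).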

2.2 Send the document

parsr.send_document(
    file_path='README.pdf',
    config_path='defaultConfig.json',
    document_name='The Readme',
    save_request_id=True,
)
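Both paths in the call above refer to local files, so it can help to check them before contacting the server. The sketch below is one possible pre-flight check; check_inputs is a hypothetical helper, not something the client provides, and it only assumes that the config file is JSON, as its extension suggests.

```python
import json
from pathlib import Path


def check_inputs(file_path, config_path):
    """Raise early if the document or the config file is unusable."""
    document = Path(file_path)
    config = Path(config_path)
    if not document.is_file():
        raise FileNotFoundError(f"Document not found: {document}")
    if not config.is_file():
        raise FileNotFoundError(f"Config not found: {config}")
    # The config is a .json file, so it should parse as JSON.
    # json.load raises json.JSONDecodeError (a ValueError) if it does not.
    with config.open(encoding="utf-8") as handle:
        json.load(handle)
    return document, config
```

Calling check_inputs('README.pdf', 'defaultConfig.json') right before parsr.send_document(...) surfaces a missing file or malformed config as a local error.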

2.3 Retrieve results

Get the full result as JSON:
```
parsr.get_json()
```
As Markdown:
```
parsr.get_markdown()
```
As text:
```
parsr.get_text()
```
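To keep the three outputs for later use, they can be written to disk in one go. This is a sketch under assumptions: save_outputs is a hypothetical helper, and because the return types of get_json(), get_markdown() and get_text() are not stated here, it accepts either parsed data or a string for the JSON and converts the rest with str().

```python
import json
from pathlib import Path


def save_outputs(client, out_dir, stem="result"):
    """Write the JSON, Markdown and text outputs of `client` into `out_dir`."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    data = client.get_json()
    # Assumption: get_json() may return parsed data or a raw string.
    json_text = data if isinstance(data, str) else json.dumps(data, indent=2)

    written = {}
    for suffix, content in (
        (".json", json_text),
        (".md", client.get_markdown()),
        (".txt", client.get_text()),
    ):
        path = out / f"{stem}{suffix}"
        path.write_text(str(content), encoding="utf-8")
        written[suffix] = path
    return written
```

With the client from above, save_outputs(parsr, 'out', stem='the-readme') would produce out/the-readme.json, out/the-readme.md and out/the-readme.txt.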

Get the first table on the first page:

parsr.get_table(
    page=1,
    table=1,
)

Get all the versions of the document:
```
parsr.get_revisions('The Readme')
```
Get pretty diffs between each successive pair of a document's revisions:
```
parsr.compare_revisions('The Readme', pretty_html=True)
```

3 Interpreting the whole JSON output locally

The supplied ParsrOutputInterpreter class can be used to interpret the downloaded JSON output and generate higher-level structures such as the text body.

The following example extracts the text body of the first page, reusing the parsr client created above.

from parsr_client import ParsrOutputInterpreter

parsr_interpreter = ParsrOutputInterpreter(
    parsr.get_json()
)

t = parsr_interpreter.get_text(
    page_number=1
)
print(t)
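The same call can be repeated for several pages. The sketch below collects the text of each requested page into a dictionary; text_by_page is a hypothetical helper, and the caller supplies the page numbers because no way of querying the page count is shown here.

```python
def text_by_page(interpreter, page_numbers):
    """Map each requested page number to the text returned for that page."""
    return {
        number: interpreter.get_text(page_number=number)
        for number in page_numbers
    }
```

For a document assumed to have three pages, text_by_page(parsr_interpreter, range(1, 4)) would return a dictionary keyed by 1, 2 and 3.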
