Tags: TimonKai/unstructured

0.5.4

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
feat: add `partition_epub` function (Unstructured-IO#364)

* add pypandoc dependency

* added epub partitioner and file conversion

* test for partition_epub

* tests for file conversion

* add epub to filetype detection

* added epub to auto partition

* update bricks docs

* updated installing docs

* changelog and version

* add pandoc to dependencies

* add pandoc to debian dependencies

* linting, linting, linting

* typo fix

* typo fix

* file conversion type hints

* more type hints

---------

Co-authored-by: qued <64741807+qued@users.noreply.github.com>
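The "add epub to filetype detection" step can be illustrated with a minimal sketch. It is not the detection code from this commit; `looks_like_epub` and `EPUB_MIMETYPE` are hypothetical names. It relies only on the EPUB container convention that the archive is a ZIP file with a `mimetype` entry containing `application/epub+zip`.

```python
import zipfile

# Media type that an EPUB container stores in its "mimetype" entry.
EPUB_MIMETYPE = b"application/epub+zip"


def looks_like_epub(path: str) -> bool:
    """Return True if `path` is a ZIP archive whose `mimetype` entry marks it as EPUB.

    Hypothetical helper for illustration only; the library's own detection
    may rely on other signals such as the file extension.
    """
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        try:
            # ZipFile.read raises KeyError when the member does not exist.
            return archive.read("mimetype").strip() == EPUB_MIMETYPE
        except KeyError:
            return False
```

A check like this is cheap because it opens the archive without extracting the book content, so it could run before handing the file to a converter such as pandoc.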

0.5.3

feat: amazon linux 2 setup script (Unstructured-IO#350)

Added Amazon Linux 2 setup script. Also updated Ubuntu setup script to keep the scripts as aligned as possible.

Co-authored-by: cragwolfe <crag@unstructured.io>

0.5.2

fix: ensure all text is maintained in html output (Unstructured-IO#335)

* fix: ensure all text is maintained in html pages

* add back in replace unicode quotes

* changelog and version bump

* apt-get update in ci

* white space differences in output

0.5.1

bump: release commit (Unstructured-IO#317)

* update github ingest outputs

* CHANGELOG, test github ingest more often in CI

* more changelog detail

0.5.0

fix: track narrative text and figure captions in HTML documents (Unstructured-IO#309)

* fix for missing narrative text in partition_html

* fixes so existing tests pass

* tests for figure caption and narrative text

* bump version; changelog

0.4.16

build: Release commit for version 0.4.16 (Unstructured-IO#305)

0.4.15

fix: preserve all elements when serialized; feat: helper functions for serialization (Unstructured-IO#273)

* added type to text element map

* add element_id and coordinates

* added test for serialization

* added serialization for check boxes

* add dict_to_elements and convert_to_dict aliases

* helpers for serializing and deserializing elements

* bump version; changelog

* add Text to tests

* aliases for isd functions

* remove test elements json

* changelog updates

* make indent a kwarg

* update expected structured output

* docs update

* use new function in ingest code

* pop coordinates due to floating point differences

* pop coordinates
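The serialization round trip described in this release can be pictured with a small stand-in. The sketch below does not use the library's own element classes or helpers: `Element`, `elements_to_dicts`, `dicts_to_elements`, `elements_to_json`, and `elements_from_json` are hypothetical names, and the fields (`type`, `text`, `element_id`, `coordinates`) are assumptions based loosely on the bullet points above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Element:
    """Hypothetical stand-in for a document element (not the library's class)."""

    type: str
    text: str
    element_id: Optional[str] = None
    coordinates: Optional[Tuple[Point, ...]] = None


def elements_to_dicts(elements: List[Element]) -> List[Dict[str, Any]]:
    """Serialize every element, keeping its type so nothing is lost on reload."""
    return [asdict(element) for element in elements]


def dicts_to_elements(dicts: List[Dict[str, Any]]) -> List[Element]:
    """Rebuild elements from dictionaries produced by `elements_to_dicts`."""
    elements = []
    for item in dicts:
        coords = item.get("coordinates")
        if coords is not None:
            # JSON turns tuples into lists, so restore the tuple form.
            coords = tuple(tuple(point) for point in coords)
        elements.append(
            Element(
                type=item["type"],
                text=item["text"],
                element_id=item.get("element_id"),
                coordinates=coords,
            )
        )
    return elements


def elements_to_json(elements: List[Element], indent: int = 4) -> str:
    """Dump elements as JSON; `indent` is a keyword argument, as in the commit notes."""
    return json.dumps(elements_to_dicts(elements), indent=indent)


def elements_from_json(text: str) -> List[Element]:
    """Load elements back from a JSON string written by `elements_to_json`."""
    return dicts_to_elements(json.loads(text))
```

When comparing serialized output across machines, dropping the `coordinates` key before the comparison avoids spurious mismatches from floating point differences, which appears to be the motivation for the last two bullets.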

0.4.14

build(deps): automatically download `nltk` models when required (Unstructured-IO#246)

* code for downloading nltk packages

* don't run nltk make command in ci

* test for model downloads

* remove nltk install from docs

* update changelog and bump version
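The download-on-demand idea in this release can be sketched generically. This is not the code from the commit: `ensure_resource` is a hypothetical helper, and the availability check and downloader are passed in as callables so the example runs without `nltk` installed. With `nltk`, the check would typically wrap `nltk.data.find` (which raises `LookupError` for a missing resource) and the downloader would wrap `nltk.download`.

```python
from typing import Callable


def ensure_resource(
    name: str,
    is_available: Callable[[str], bool],
    download: Callable[[str], None],
) -> bool:
    """Download `name` only when it is missing.

    Returns True if a download was triggered, False if the resource
    was already present. Hypothetical helper for illustration.
    """
    if is_available(name):
        return False
    download(name)
    return True
```

Calling a helper like this right before the first use of a model removes the need for a separate install step in the docs or in CI, at the cost of a network request the first time the model is needed.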

0.4.13

fix: Adds missing __init__.py (Unstructured-IO#259)

0.4.12

build: new release (Unstructured-IO#249)

Cut a release that has the unstructured-ingest command line included in the unstructured package.

Bonus tweak to the Ingest checklist.