Commit

Sparrow Parse refactoring and cleanup from dependencies
abaranovskis-redsamurai committed Nov 8, 2024
1 parent 98a7ec9 commit 2fb649f
Showing 10 changed files with 8 additions and 959 deletions.
5 changes: 1 addition & 4 deletions sparrow-data/parse/requirements.txt
@@ -1,14 +1,11 @@
torch==2.2.2
unstructured[all-docs]==0.14.5
unstructured-inference==0.7.33
rich
pymupdf4llm==0.0.9
transformers==4.41.2
sentence-transformers==3.0.1
numpy==1.26.4
pypdf==4.3.0
easyocr==1.7.1
gradio_client
pdf2image


# Force reinstall:
6 changes: 3 additions & 3 deletions sparrow-data/parse/setup.py
@@ -8,10 +8,10 @@

setup(
name="sparrow-parse",
-version="0.3.4",
+version="0.3.5",
author="Andrej Baranovskij",
author_email="[email protected]",
-description="Sparrow Parse is a Python package for parsing and extracting information from documents.",
+description="Sparrow Parse is a Python package (part of Sparrow) for parsing and extracting information from documents.",
long_description=long_description,
long_description_content_type="text/markdown",
url="https://github.com/katanaml/sparrow/tree/main/sparrow-data/parse",
@@ -30,7 +30,7 @@
'sparrow-parse=sparrow_parse:main',
],
},
-keywords="llm, rag, vision",
+keywords="llm, vllm, ocr, vision",
packages=find_packages(),
python_requires='>=3.10',
install_requires=requirements,
2 changes: 1 addition & 1 deletion sparrow-data/parse/sparrow_parse/__init__.py
@@ -1 +1 @@
-__version__ = '0.3.4'
+__version__ = '0.3.5'
9 changes: 0 additions & 9 deletions sparrow-data/parse/sparrow_parse/data/invoice_1_table.txt

This file was deleted.

251 changes: 0 additions & 251 deletions sparrow-data/parse/sparrow_parse/extractors/html_extractor.py

This file was deleted.
