Improved file parsing for LLMs
Updated Nov 13, 2024 - Python
A lightweight Python library for metadata-rich document chunking in Retrieval-Augmented Generation (RAG) workflows. It leverages Azure AI Document Intelligence to enhance chunking by retaining hierarchical structure, page numbers, and bounding boxes for seamless integration with PDF viewers.
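To make the idea of metadata-rich chunking concrete, here is a minimal sketch in plain Python. It is not this library's API and makes no Azure calls. The names `BoundingBox`, `Chunk`, and `chunk_elements`, and the input shape (dicts with `role`, `level`, `text`, and `box` keys), are all hypothetical and chosen only for illustration. It assumes a layout service has already produced an ordered list of headings and paragraphs.

```python
from dataclasses import dataclass, field


@dataclass
class BoundingBox:
    # Hypothetical shape: page number plus a rectangle in page coordinates.
    page: int
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass
class Chunk:
    text: str
    heading_path: list[str]  # e.g. ["Intro", "Details"]
    boxes: list[BoundingBox] = field(default_factory=list)

    @property
    def pages(self) -> list[int]:
        # Sorted, de-duplicated page numbers covered by this chunk.
        return sorted({b.page for b in self.boxes})


def chunk_elements(elements):
    """Group paragraphs under the heading trail that was active when they appeared.

    `elements` is an assumed input format: an ordered iterable of dicts with
    "role" ("heading" or "paragraph"), "text", plus "level" (1-based) for
    headings and "box" (a BoundingBox) for paragraphs.
    """
    chunks = []
    trail = []      # current heading hierarchy
    current = None  # chunk being filled, reset at every heading
    for el in elements:
        if el["role"] == "heading":
            current = None
            # Drop headings at the same or deeper level, then push this one.
            del trail[el["level"] - 1:]
            trail.append(el["text"])
        else:
            if current is None:
                current = Chunk(text="", heading_path=list(trail))
                chunks.append(current)
            current.text = (current.text + "\n" + el["text"]).strip()
            current.boxes.append(el["box"])
    return chunks
```

Because each chunk keeps its heading path, page numbers, and boxes, a retrieval result could be mapped back to a highlighted region in a PDF viewer. How the actual library represents this data may differ.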
--UNDER CONSTRUCTION-- (Undergrad Research) Exploring layout parsing capabilities in Python