A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
Updated Dec 29, 2025 - Python
Fast and Accurate ML in 3 Lines of Code
LimiX: Unleashing Structured-Data Modeling Capability for Generalist Intelligence https://arxiv.org/abs/2509.03505
Get clean data from tricky documents, powered by vision-language models ⚡
Extract and convert data from any document, image, PDF, Word doc, PPT, or URL into multiple formats (Markdown, JSON, CSV, HTML) with intelligent structured data extraction and advanced OCR.
DeepTables: Deep-learning Toolkit for Tabular data
PandaPy has the speed of NumPy and the usability of Pandas; 10x to 50x faster (by @firmai)
Identify hardcoded secrets in static structured text
Streamlit PDF viewer
Accurate, private and configurable document retrieval LLM
GPTuner is a manual-reading database tuning system that leverages domain knowledge automatically and extensively to enhance the knob tuning process.
Superpipe - optimized LLM pipelines for structured data
Extract structured data from any content using LLMs.
Excel to structured JSON (tables, shapes, charts) for LLM/RAG pipelines
A ready-to-use framework of state-of-the-art models for structured (tabular) data learning with PyTorch. Applications include recommendation, CTR prediction, healthcare analytics, anomaly detection, etc.
Retrieval of fully structured data made easy. Use LLMs or custom models. Specialized in PDFs and HTML files. Extensive support for tabular data extraction and multimodal queries.
Automatic machine learning for tabular data. ⚡🔥⚡
Extract structured data from local or remote LLM models
Python library for extracting entities, relationships, and schemas from documents
General template for most PyTorch projects