🖼️📄E2E Multi-modal Document Preprocessing with Azure Document Intelligence
Updated Oct 22, 2025 (Python)
Workshop for Azure OpenAI Service
A collection of document parsers and hands-on material for constructing structured data for your RAG applications.
AI-Powered Web Application for Talent Search and CV Management
An application that automatically parses bank statements to visualize current income and spending compared to budgeting and savings targets
A channel layer that turns physical documents into trustworthy digital data — OCR + Markdown + metadata + optional field extraction, exposed via REST / EventBus / MCP / Webhook to downstream RAG platforms, business systems, and AI clients. Built on ABP.
Azure Document Intelligence Result Processor: A toolset for annotating PDFs based on Azure Document Intelligence analysis results, featuring a React web application and a standalone Python script for processing and visualizing extracted data with confidence indicators.
Intelligent solution for digitizing and managing invoices. Transforms unstructured PDF documents into processable SQL data using AI, streamlining the financial workflow.
OCR-enabled PDF text extraction in Python with pypdf and Azure Document Intelligence.
Frontend and Backend Web App for Receipt Splitting with Friends
Multi-agent parallel document extraction using Gemini LLM and Azure Document Intelligence OCR, running locally in Docker.
PDF extraction samples comparing Azure Document Intelligence (layout model) 🏢 vs Markitdown ✍️ vs Apache Tika
🚀 Intelligent document extraction system powered by Azure AI & Gemini 2.5. Transform any form into structured JSON with real-time editing and enterprise-grade validation.
Scribbly - Convert your boring notes into interactive flashcards using Azure Text Analytics and Azure Document Intelligence
A simple Python Tk app to automate the process of uploading invoices to the Evelstar courier portal.
A Streamlit-based app with a FastAPI backend for extracting structured data (text, images, tables) from websites and PDFs. Processed data is stored in AWS S3 and rendered in a standardized Markdown format. The APIs are deployed on Google Cloud Run.
Enterprise AI system to classify, split, and auto-route PDFs using Azure Document Intelligence and SharePoint.
Serverless invoice extraction API using Azure Document Intelligence and Azure Functions. Upload a PDF invoice and receive normalized JSON output including line items, totals, dates, and vendor details.
Uses OCR and PII detection models to mask PII in .tiff files. Configurable to use Azure and AWS OCR and PII detection models