

Document OCR Agent
Model
Available Actions
Each successful request consumes credits as outlined below.
process_document (20 credits)
Details
Hire the OCR AI model to extract text, structured entities, and page-level data from scanned documents, receipts, invoices, PDFs, and images. Supports OCR text extraction from photos of receipts, handwritten notes, printed forms, business cards, shipping labels, contracts, and other document types. Identifies structured fields such as dates, amounts, addresses, line items, tax totals, and vendor names. Accepts input via base64 content, public URL, or AgentPMT file storage ID. Ideal for expense tracking, invoice processing, receipt scanning, document digitization, data entry automation, bookkeeping ingestion, form parsing, and archival workflows.
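The three input modes described above (base64 content, public URL, or AgentPMT file storage ID) could be prepared along the following lines. This is a minimal sketch only: the function name `build_document_input` and the keys `document_base64`, `document_url`, and `file_id` are hypothetical placeholders, not the tool's actual parameter names. Fetch the real schema with `get_schema` before calling `process_document`.

```python
import base64
from typing import Optional
from urllib.parse import urlparse


def build_document_input(
    *,
    content: Optional[bytes] = None,
    url: Optional[str] = None,
    file_id: Optional[str] = None,
) -> dict:
    """Return an argument dict that uses exactly one input mode.

    The key names below are hypothetical; the real schema may differ.
    """
    provided = [value for value in (content, url, file_id) if value is not None]
    if len(provided) != 1:
        raise ValueError("provide exactly one of content, url, or file_id")

    if content is not None:
        # Raw bytes are base64-encoded so they can travel inside JSON.
        return {"document_base64": base64.b64encode(content).decode("ascii")}

    if url is not None:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("url must be an http(s) URL")
        return {"document_url": url}

    # Remaining case: a file already stored in AgentPMT file storage.
    return {"file_id": file_id}
```

Accepting exactly one mode per call keeps the request unambiguous; whether the tool itself rejects mixed inputs is not stated in the listing.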
Workflows Using This Tool
- AI Contract Redline: Compare Signed Documents Against Originals
- Kroger Grocery Order From List Photo
- Expense Report Processor
- Invoice OCR and Booking Pipeline
- Bank Statement OCR and Expense Categorization
- Bank Statement OCR and Account Reconciliation
- Receipt OCR to Zoho Books Expense Pipeline
- Route Planner From Address Photos
Workflow
Saves ~25 min
Upload a photo of your handwritten or printed grocery list, and the agent will extract the items using OCR, search Kroger for each item to find the best-priced match, add them to your Kroger cart, then send you a notification that your order is ready for checkout.
Use Cases
Receipt OCR and text extraction, Invoice parsing and field extraction, PDF document text extraction, Scanned image OCR, Handwritten note digitization, Business card scanning, Expense report data capture, Automated bookkeeping ingestion, Contract and legal document text extraction, Shipping label and barcode text reading, Tax form field extraction, Medical record digitization, Insurance claim document processing, Bank statement parsing, Purchase order data extraction, Form field recognition, ID and passport text extraction, Utility bill parsing, Restaurant receipt itemization, Real estate document processing
Dynamic MCP Setup
Connect once through AgentPMT Dynamic MCP, then use approved tools from the same agent connection.
30 Second Setup
STDIO connector for Claude Code, Codex, Cursor, Zed, and other LLMs that require STDIO or custom connections.
npm install -g @agentpmt/mcp-router
agentpmt-setup
Hosted Streamable HTTPS
MCP endpoint for browser-based apps like ChatGPT, Claude, Grok, or any time you want a streamable connection with no local install.
https://api.agentpmt.com/mcp
Config Example
Use the hosted endpoint directly in clients that support remote MCP. Store your Bearer token in the client config or secret field.
{
  "mcpServers": {
    "agentpmt": {
      "type": "streamable-http",
      "url": "https://api.agentpmt.com/mcp",
      "headers": {
        "Authorization": "Bearer <AGENTPMT_BEARER_TOKEN>",
        "x-instance-metadata": "{\"client\":\"generic-mcp\",\"platform\":\"remote\"}"
      }
    }
  }
}
Need client videos, organization controls, audit details, and the full feature overview?
More About Dynamic MCP
Frequently Asked Questions
How do I connect this tool to an external agent?
You can install the local MCP server by opening a terminal and running:
Install commands
npm install -g @agentpmt/mcp-router
agentpmt-setup
This will connect you to local agents like Claude Code, Windsurf, Grok Build, Cursor, etc.
Alternatively, you can connect to the hosted version with this config block; no installation is required:
Hosted MCP config
{
  "mcpServers": {
    "agentpmt": {
      "type": "streamable-http",
      "url": "https://api.agentpmt.com/mcp",
      "headers": {
        "Authorization": "Bearer <AGENTPMT_BEARER_TOKEN>",
        "x-instance-metadata": "{\"client\":\"generic-mcp\",\"platform\":\"remote\"}"
      }
    }
  }
}
View MCP Connection Instructions for more details.
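The `x-instance-metadata` header in the hosted config is itself a JSON document stored as a string, which is why its quotes are escaped. Rather than hand-escaping it, the block can be generated. The sketch below is one way to do that; the function name `build_hosted_config` and the choice to read the token from an `AGENTPMT_BEARER_TOKEN` environment variable are assumptions for illustration, not something the listing prescribes.

```python
import json
import os

MCP_URL = "https://api.agentpmt.com/mcp"


def build_hosted_config(token: str) -> dict:
    """Build the hosted MCP config block for a given Bearer token."""
    metadata = {"client": "generic-mcp", "platform": "remote"}
    return {
        "mcpServers": {
            "agentpmt": {
                "type": "streamable-http",
                "url": MCP_URL,
                "headers": {
                    "Authorization": f"Bearer {token}",
                    # Serialize the metadata to a compact JSON string; the
                    # outer json.dumps call escapes its quotes automatically.
                    "x-instance-metadata": json.dumps(
                        metadata, separators=(",", ":")
                    ),
                },
            }
        }
    }


if __name__ == "__main__":
    # Assumption: the token lives in an environment variable of this name.
    token = os.environ.get("AGENTPMT_BEARER_TOKEN", "<AGENTPMT_BEARER_TOKEN>")
    print(json.dumps(build_hosted_config(token), indent=2))
```

Run as a script, this prints a config block with the same structure as the one shown above, with the placeholder replaced whenever the environment variable is set.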
How does an external agent use this tool?
After the external agent is connected to an Agent Group that can use this tool, paste this prompt into the agent:
Agent prompt
Use the AgentPMT-Tool-Search-and-Execution tool. First call action 'get_instructions' so you know how to use the tool search interface. Then call action 'get_schema' with tool_id 69858a64269243768b447d6d ("Document OCR Agent"). After reading the schema and any returned instructions, tell me what this tool can do, we are going to be using it
The agent should fetch the tool schema first, collect the required parameters for your request, and then call the tool through AgentPMT.
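For readers wiring this up without an LLM client, the first two steps of that prompt can be sketched as MCP `tools/call` requests (MCP uses JSON-RPC 2.0). The tool name, action names, and tool ID come from the prompt above; the argument keys `action` and `tool_id` are inferred from its wording and may not match the real schema, and the helper names are hypothetical.

```python
import json
from itertools import count

TOOL_NAME = "AgentPMT-Tool-Search-and-Execution"
OCR_TOOL_ID = "69858a64269243768b447d6d"  # "Document OCR Agent"

_request_ids = count(1)  # JSON-RPC requests need distinct ids


def tools_call(arguments: dict) -> dict:
    """Wrap tool arguments in an MCP tools/call JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": TOOL_NAME, "arguments": arguments},
    }


def discovery_requests() -> list:
    """Requests for the two discovery steps, in the order the prompt gives."""
    return [
        # Step 1: learn how the tool search interface works.
        tools_call({"action": "get_instructions"}),
        # Step 2: fetch the schema for the Document OCR Agent.
        # Assumption: the argument keys are "action" and "tool_id".
        tools_call({"action": "get_schema", "tool_id": OCR_TOOL_ID}),
    ]


if __name__ == "__main__":
    for request in discovery_requests():
        print(json.dumps(request))
```

Only after reading the returned schema should a client assemble the actual `process_document` call, since the required parameters are defined there rather than in this listing.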
Dependencies
3 dependencies will be automatically added when you enable this product.




