

File To JSON Parsing
Core Utility
Available Actions
Each successful request consumes credits as outlined below.
extract-csv: 5 cr
extract-html: 5 cr
extract-json: 5 cr
extract-ics: 5 cr
extract-ods: 5 cr
extract-pdf: 5 cr
extract-rtf: 5 cr
extract-text: 5 cr
extract-xls: 5 cr
extract-xlsx: 5 cr
file-to-base64: 5 cr
Details
A data extraction tool that converts common document and data file formats into structured JSON for automated workflows. It provides eleven actions: ten extraction actions plus file-to-base64, which returns a file as a base64-encoded string. The extraction actions cover CSV for tabular data parsing, HTML for text content and table structures (using BeautifulSoup), JSON for direct parsing, ICS for calendar event extraction, ODS, XLS, and XLSX for spreadsheets in LibreOffice and Microsoft Excel formats, PDF for page-by-page text and table extraction (using pdfplumber), RTF for conversion to plain text, and plain text for basic content retrieval.
Input can be supplied as base64-encoded content or as a cloud storage file ID. Files up to 100 MB are supported, and inline base64 returns are limited to 10 MB. Configurable parameters include a maximum row limit of up to 100,000 for CSV files, HTML tables, and spreadsheets, a maximum page count of up to 1,000 for PDFs, and toggles for including text and tables in HTML and PDF output. The function detects character encoding automatically and returns consistently structured JSON with a customizable output field name, so it can bridge raw file uploads and downstream data processing pipelines.
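The limits described above can be checked on the client before a request is sent. The following is a minimal sketch, not the tool's own code: the helper `build_params`, the `action` key, the rule that exactly one input source is supplied, and the reading of "MB" as 1024 * 1024 bytes are all assumptions made for illustration. Only the parameter names and numeric ranges come from this page.

```python
import base64

# Limits quoted in the description; treating "MB" as 1024 * 1024 bytes is an assumption.
MAX_FILE_BYTES = 100 * 1024 * 1024
RANGES = {"max_rows": (1, 100_000), "max_pages": (1, 1_000)}


def build_params(action, *, content=None, file_id=None, output_field="data", **options):
    """Assemble a parameter dict for one extract-* action (hypothetical helper)."""
    # Assumption: exactly one input source should be supplied per request.
    if (content is None) == (file_id is None):
        raise ValueError("provide exactly one of content or file_id")

    # Assumption: the action name travels in an "action" key.
    params = {"action": action, "output_field": output_field}
    if content is not None:
        if len(content) > MAX_FILE_BYTES:
            raise ValueError("file exceeds the documented 100 MB limit")
        params["input_base64"] = base64.b64encode(content).decode("ascii")
    else:
        params["file_id"] = file_id

    # Range-check the numeric options documented on this page; pass others through.
    for name, value in options.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        params[name] = value
    return params


csv_params = build_params("extract-csv", content=b"name,qty\nbolt,4\n", max_rows=500)
pdf_params = build_params(
    "extract-pdf", file_id="example-file-id", max_pages=10, include_tables=False
)
```

Rejecting out-of-range values locally avoids spending credits on a request that would likely fail; the actual server-side behavior for invalid values is not documented here.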
Use Cases
Parsing uploaded CSV files into structured records for database import or API submission
Extracting tabular data from HTML reports or web page snapshots for analysis
Converting calendar ICS files into event objects for scheduling integrations
Processing Excel spreadsheets from user uploads into JSON for data transformation pipelines
Extracting text and tables from PDF invoices or contracts for automated document processing
Converting legacy XLS files from enterprise systems into modern JSON formats
Parsing RTF documents from email attachments into plain text for content indexing
Scraping structured table data from HTML exports for reporting dashboards
Extracting event details from shared calendar files for synchronization workflows
Converting uploaded spreadsheet data into API-compatible payloads for third-party service integrations
Actions (11)
extract-csv (5 cr, 4 params)
Parse a CSV file into structured row data.
input_base64 (string): Base64-encoded file content.
file_id (string): File ID from cloud storage.
output_field (string): Key name for the extracted data in the response. Default: data
max_rows (integer): Maximum rows to extract. Default: 1000. Range: 1 - 100000
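To get a feel for what a row limit does before spending credits, the idea behind `max_rows` can be approximated locally with the Python standard library. This is only an illustrative sketch: the function `preview_csv_rows` is hypothetical, and the service's real response shape, and whether it counts the header row toward the limit, are not documented on this page.

```python
import csv
import io
from itertools import islice


def preview_csv_rows(text, max_rows=1000):
    """Parse CSV text into a list of dicts, stopping after max_rows data rows."""
    reader = csv.DictReader(io.StringIO(text))  # first line is used as the header
    return list(islice(reader, max_rows))


sample = "name,qty\nbolt,4\nnut,9\nwasher,2\n"
rows = preview_csv_rows(sample, max_rows=2)
print(rows)  # [{'name': 'bolt', 'qty': '4'}, {'name': 'nut', 'qty': '9'}]
```

Values come back as strings in this local version; any type conversion the service may apply is unknown.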
extract-html (5 cr, 6 params)
Parse an HTML file, extracting text content and/or table data.
input_base64 (string): Base64-encoded file content.
file_id (string): File ID from cloud storage.
output_field (string): Key name for the extracted data in the response. Default: data
include_text (boolean): Include extracted text content. Default: true
include_tables (boolean): Include extracted table data. Default: true
max_rows (integer): Maximum rows per table. Default: 1000. Range: 1 - 100000
extract-json (5 cr, 3 params)
Parse a JSON file and return its contents as structured data.
input_base64 (string): Base64-encoded file content.
file_id (string): File ID from cloud storage.
output_field (string): Key name for the extracted data in the response. Default: data
extract-ics (5 cr, 3 params)
Parse an ICS calendar file and extract events with summary, start, end, location, and description.
input_base64 (string): Base64-encoded file content.
file_id (string): File ID from cloud storage.
output_field (string): Key name for the extracted data in the response. Default: data
extract-ods (5 cr, 4 params)
Parse an OpenDocument Spreadsheet (.ods) file, returning sheets with row data.
input_base64 (string): Base64-encoded file content.
file_id (string): File ID from cloud storage.
output_field (string): Key name for the extracted data in the response. Default: data
max_rows (integer): Maximum rows per sheet. Default: 1000. Range: 1 - 100000
extract-pdf (5 cr, 6 params)
Extract text and/or tables from a PDF document, page by page.
input_base64 (string): Base64-encoded file content.
file_id (string): File ID from cloud storage.
output_field (string): Key name for the extracted data in the response. Default: data
include_text (boolean): Include text extraction per page. Default: true
include_tables (boolean): Include table extraction per page. Default: true
max_pages (integer): Maximum pages to process. Default: 50. Range: 1 - 1000
extract-rtf (5 cr, 3 params)
Parse an RTF (Rich Text Format) file and extract plain text.
input_base64 (string): Base64-encoded file content.
file_id (string): File ID from cloud storage.
output_field (string): Key name for the extracted data in the response. Default: data
extract-text (5 cr, 3 params)
Read a plain text file and return its contents.
input_base64 (string): Base64-encoded file content.
file_id (string): File ID from cloud storage.
output_field (string): Key name for the extracted data in the response. Default: data
extract-xls (5 cr, 4 params)
Parse a legacy Excel (.xls) file, returning sheets with row data.
input_base64 (string): Base64-encoded file content.
file_id (string): File ID from cloud storage.
output_field (string): Key name for the extracted data in the response. Default: data
max_rows (integer): Maximum rows per sheet. Default: 1000. Range: 1 - 100000
extract-xlsx (5 cr, 4 params)
Parse a modern Excel (.xlsx) file, returning sheets with row data.
input_base64 (string): Base64-encoded file content.
file_id (string): File ID from cloud storage.
output_field (string): Key name for the extracted data in the response. Default: data
max_rows (integer): Maximum rows per sheet. Default: 1000. Range: 1 - 100000
file-to-base64 (5 cr, 2 params)
Convert a file to a base64-encoded string. File must be 10 MB or smaller for inline return.
input_base64 (string): Base64-encoded file content.
file_id (string): File ID from cloud storage.
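Because base64 output is roughly a third larger than the raw file, it can help to check sizes locally before asking for an inline return. The sketch below assumes the 10 MB limit applies to the raw file size and that "MB" means 1024 * 1024 bytes; neither detail is stated on this page, and the helper names are hypothetical.

```python
import base64

# Assumption: the 10 MB inline limit is measured on the raw file, in units of 1024 * 1024 bytes.
INLINE_LIMIT_BYTES = 10 * 1024 * 1024


def encoded_size(raw_bytes):
    """Length of padded base64 output for raw_bytes input bytes: 4 * ceil(n / 3)."""
    return 4 * ((raw_bytes + 2) // 3)


def encode_if_inline_ok(data):
    """Return the base64 string for data, or None if it is over the inline limit."""
    if len(data) > INLINE_LIMIT_BYTES:
        return None
    return base64.b64encode(data).decode("ascii")


payload = encode_if_inline_ok(b"abcd")
print(payload, encoded_size(4))  # YWJjZA== 8
```

When `encode_if_inline_ok` returns None, passing a `file_id` instead of inline content is the alternative input the parameter list offers.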
Frequently Asked Questions
How do I connect this tool to an external agent?
You can install the local MCP server by opening a terminal and running:
Install commands
npm install -g @agentpmt/mcp-router
agentpmt-setup
This will connect you to local agents such as Claude Code, Windsurf, Grok Build, and Cursor.
Alternatively you can connect to the hosted version with this config block, no installation required:
Hosted MCP config
{
  "mcpServers": {
    "agentpmt": {
      "type": "streamable-http",
      "url": "https://api.agentpmt.com/mcp",
      "headers": {
        "Authorization": "Bearer <AGENTPMT_BEARER_TOKEN>",
        "x-instance-metadata": "{\"client\":\"generic-mcp\",\"platform\":\"remote\"}"
      }
    }
  }
}
View MCP Connection Instructions for more details.
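The `x-instance-metadata` header value is itself a JSON document stored as a string, which is why its quotes are escaped in the config. If the config is generated by a script rather than typed by hand, the escaping can be left to a JSON library. The following is a small sketch; the function name and the `AGENTPMT_BEARER_TOKEN` environment variable are assumptions for illustration, while the keys and URL are copied from the config block above.

```python
import json
import os


def hosted_mcp_config(token):
    """Build the hosted MCP config as a Python dict (hypothetical helper)."""
    metadata = {"client": "generic-mcp", "platform": "remote"}
    return {
        "mcpServers": {
            "agentpmt": {
                "type": "streamable-http",
                "url": "https://api.agentpmt.com/mcp",
                "headers": {
                    "Authorization": f"Bearer {token}",
                    # The header value is a JSON string nested inside the JSON config;
                    # compact separators reproduce the spacing used in the original.
                    "x-instance-metadata": json.dumps(metadata, separators=(",", ":")),
                },
            }
        }
    }


# Assumption: the token is read from an environment variable of this name.
token = os.environ.get("AGENTPMT_BEARER_TOKEN", "<AGENTPMT_BEARER_TOKEN>")
config_text = json.dumps(hosted_mcp_config(token), indent=2)
print(config_text)
```

Serializing the outer dict with `json.dumps` adds the backslash escapes around the inner quotes automatically.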
How does an external agent use this tool?
After the external agent is connected to an Agent Group that can use this tool, paste this prompt into the agent:
Agent prompt
Call the AgentPMT-Tool-Search-and-Execution tool with action 'get_schema' and tool_id 695c3797767df5adfd9bc872 ("File To JSON Parsing"). Then call the same tool with action 'call_tool', tool_id 695c3797767df5adfd9bc872, and the parameters needed for my request.
The agent should fetch the tool schema first, collect the required parameters for your request, and then call the tool through AgentPMT.
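The two-step flow in the prompt can be pictured as a pair of tool calls made in order. This is a hedged sketch only: the `send` callable stands in for whatever mechanism the agent uses to invoke AgentPMT-Tool-Search-and-Execution, and the `parameters` key and the inner `action` key are assumptions, since the real argument names come from the schema returned in the first step.

```python
TOOL_ID = "695c3797767df5adfd9bc872"  # "File To JSON Parsing", from the prompt above


def run_tool(send, parameters):
    """Fetch the schema first, then call the tool with the collected parameters."""
    schema = send({"action": "get_schema", "tool_id": TOOL_ID})
    # Assumption: the request parameters are passed under a "parameters" key.
    result = send({"action": "call_tool", "tool_id": TOOL_ID, "parameters": parameters})
    return schema, result


# A stand-in transport that records calls instead of contacting any service.
calls = []


def fake_send(arguments):
    calls.append(arguments)
    return {"echo": arguments["action"]}


schema, result = run_tool(
    fake_send, {"action": "extract-json", "file_id": "example-file-id"}
)
```

Keeping the schema call first lets the agent validate parameter names and ranges against the live schema before the credit-consuming call is made.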
Dependencies
1 dependency will be automatically added when you enable this product.










