File To JSON Parsing by Apoth3osis

Name: File To JSON Parsing
Brand: Apoth3osis
SKU: 695c3797767df5adfd9bc872
Price: 0.05 USD
Availability: InStock

Core Utility

Available Actions

Each successful request consumes credits as outlined below.

extract-csv - 5 credits
extract-html - 5 credits
extract-json - 5 credits
extract-ics - 5 credits
extract-ods - 5 credits
extract-pdf - 5 credits
extract-rtf - 5 credits
extract-text - 5 credits
extract-xls - 5 credits
extract-xlsx - 5 credits
file-to-base64 - 5 credits

Details

A data extraction tool that converts common document and data file formats into structured JSON for use in automated workflows. It provides eleven actions: ten extraction actions plus file-to-base64, which returns a file as a base64-encoded string. The extraction actions cover CSV (tabular data parsing), HTML (text content and table structures, using BeautifulSoup), JSON (direct parsing), ICS (calendar event extraction), ODS, XLSX, and XLS (spreadsheets in LibreOffice and Microsoft Excel formats), PDF (page-by-page text and table extraction, using pdfplumber), RTF (conversion to plain text), and plain text (content retrieval).

Input can be supplied as base64-encoded content or as a cloud storage file ID. Files up to 100 MB are supported, and inline base64 returns are limited to 10 MB. Configurable parameters include a maximum row limit of up to 100,000 for CSV, HTML tables, and spreadsheets, a maximum page count of up to 1,000 for PDFs, and toggles for including text and tables in HTML and PDF extraction. Text files are decoded as UTF-8, falling back to Latin-1 if needed, and results are returned as consistently structured JSON under a customizable output field name.
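The size limits above interact with base64 expansion, since base64 encodes every 3 raw bytes as 4 characters. The following is a minimal sketch of that arithmetic, not part of the product: it assumes "MB" means 1,048,576 bytes and that both limits apply to the raw file size, neither of which this page confirms. The names check_size and base64_length are illustrative.

```python
MB = 1024 * 1024  # assumption: the page's "MB" is treated as 1,048,576 bytes
MAX_FILE_BYTES = 100 * MB    # documented limit for input files
MAX_INLINE_BYTES = 10 * MB   # documented limit for inline base64 returns


def base64_length(raw_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


def check_size(raw_bytes: int) -> dict:
    """Report which documented limits a file of this raw size satisfies."""
    return {
        "accepted_for_extraction": raw_bytes <= MAX_FILE_BYTES,
        "inline_base64_return": raw_bytes <= MAX_INLINE_BYTES,
        "base64_chars": base64_length(raw_bytes),
    }


# A 12 MB file fits the extraction limit but not the inline-return limit.
report = check_size(12 * MB)
print(report)
```

A request body carrying input_base64 is therefore roughly one third larger than the file itself, which may matter when choosing between inline content and a file_id.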

Use Cases

Parsing uploaded CSV files into structured records for database import or API submission.
Extracting tabular data from HTML reports or web page snapshots for analysis.
Converting calendar ICS files into event objects for scheduling integrations.
Processing Excel spreadsheets from user uploads into JSON for data transformation pipelines.
Extracting text and tables from PDF invoices or contracts for automated document processing.
Converting legacy XLS files from enterprise systems into modern JSON formats.
Parsing RTF documents from email attachments into plaintext for content indexing.
Scraping structured table data from HTML exports for reporting dashboards.
Extracting event details from shared calendar files for synchronization workflows.
Converting uploaded spreadsheet data into API-compatible payloads for third-party service integrations.

Actions (11)

extract-csv (5 credits, 4 params)

Parse a CSV file into structured row data.

input_base64 (string) - Base64-encoded file content.
file_id (string) - File ID from cloud storage.
output_field (string, default "data") - Key name for the extracted data in the response.
max_rows (integer, default 1000, range 1 - 100000) - Maximum rows to extract.

extract-html (5 credits, 6 params)

Parse an HTML file, extracting text content and/or table data.

input_base64 (string) - Base64-encoded file content.
file_id (string) - File ID from cloud storage.
output_field (string, default "data") - Key name for the extracted data in the response.
include_text (boolean, default true) - Include extracted text content.
include_tables (boolean, default true) - Include extracted table data.
max_rows (integer, default 1000, range 1 - 100000) - Maximum rows per table.

extract-json (5 credits, 3 params)

Parse a JSON file and return its contents as structured data.

input_base64 (string) - Base64-encoded file content.
file_id (string) - File ID from cloud storage.
output_field (string, default "data") - Key name for the extracted data in the response.

extract-ics (5 credits, 3 params)

Parse an ICS calendar file and extract events with summary, start, end, location, and description.

input_base64 (string) - Base64-encoded file content.
file_id (string) - File ID from cloud storage.
output_field (string, default "data") - Key name for the extracted data in the response.

extract-ods (5 credits, 4 params)

Parse an OpenDocument Spreadsheet (.ods) file, returning sheets with row data.

input_base64 (string) - Base64-encoded file content.
file_id (string) - File ID from cloud storage.
output_field (string, default "data") - Key name for the extracted data in the response.
max_rows (integer, default 1000, range 1 - 100000) - Maximum rows per sheet.

extract-pdf (5 credits, 6 params)

Extract text and/or tables from a PDF document, page by page.

input_base64 (string) - Base64-encoded file content.
file_id (string) - File ID from cloud storage.
output_field (string, default "data") - Key name for the extracted data in the response.
include_text (boolean, default true) - Include text extraction per page.
include_tables (boolean, default true) - Include table extraction per page.
max_pages (integer, default 50, range 1 - 1000) - Maximum pages to process.

extract-rtf (5 credits, 3 params)

Parse an RTF (Rich Text Format) file and extract plain text.

input_base64 (string) - Base64-encoded file content.
file_id (string) - File ID from cloud storage.
output_field (string, default "data") - Key name for the extracted data in the response.

extract-text (5 credits, 3 params)

Read a plain text file and return its contents.

input_base64 (string) - Base64-encoded file content.
file_id (string) - File ID from cloud storage.
output_field (string, default "data") - Key name for the extracted data in the response.

extract-xls (5 credits, 4 params)

Parse a legacy Excel (.xls) file, returning sheets with row data.

input_base64 (string) - Base64-encoded file content.
file_id (string) - File ID from cloud storage.
output_field (string, default "data") - Key name for the extracted data in the response.
max_rows (integer, default 1000, range 1 - 100000) - Maximum rows per sheet.

extract-xlsx (5 credits, 4 params)

Parse a modern Excel (.xlsx) file, returning sheets with row data.

input_base64 (string) - Base64-encoded file content.
file_id (string) - File ID from cloud storage.
output_field (string, default "data") - Key name for the extracted data in the response.
max_rows (integer, default 1000, range 1 - 100000) - Maximum rows per sheet.

file-to-base64 (5 credits, 2 params)

Convert a file to a base64-encoded string. The file must be 10 MB or smaller for inline return.

input_base64 (string) - Base64-encoded file content.
file_id (string) - File ID from cloud storage.

Connect Your Agent In 5 Min

Use hosted MCP first, or install the local STDIO fallback when your client requires a command-based MCP server.

Hosted Streamable HTTPS

Use the hosted endpoint directly in clients that support remote MCP. Store your Bearer token in the client config or secret field.

Endpoint: https://api.agentpmt.com/mcp
Authorization: Bearer <AGENTPMT_BEARER_TOKEN>

Full connection guide

Local STDIO Fallback

Use the local router for Claude Code, Codex, Cursor, Zed, and other clients that require a command-based MCP server. The router forwards requests to the hosted endpoint and does not execute tools locally.

npm install -g @agentpmt/mcp-router
agentpmt-setup

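# cURL: parse a small inline CSV (input_base64 encodes "name,age\nAlice,30\nBob,25")
curl -X POST "https://api.agentpmt.com/products/purchase" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ********" \
  -d '{
    "product_id": "695c3797767df5adfd9bc872",
    "parameters": {
      "action": "extract-csv",
      "input_base64": "bmFtZSxhZ2UKQWxpY2UsMzAKQm9iLDI1",
      "output_field": "data",
      "max_rows": 1000
    }
  }'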

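# Python (requests): parse a small inline CSV
import requests

url = "https://api.agentpmt.com/products/purchase"

headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer ********"
}

# Every action needs input_base64 or file_id; this value encodes
# "name,age\nAlice,30\nBob,25".
data = {
    "product_id": "695c3797767df5adfd9bc872",
    "parameters": {
        "action": "extract-csv",
        "input_base64": "bmFtZSxhZ2UKQWxpY2UsMzAKQm9iLDI1",
        "output_field": "data",
        "max_rows": 1000
    }
}

# A timeout keeps the call from hanging indefinitely on network problems.
response = requests.post(url, headers=headers, json=data, timeout=30)
print(response.status_code)
print(response.json())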

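// JavaScript (built-in fetch): parse a small inline CSV
const url = "https://api.agentpmt.com/products/purchase";

const headers = {
  "Content-Type": "application/json",
  "Authorization": "Bearer ********"
};

// Every action needs input_base64 or file_id; this value encodes
// "name,age\nAlice,30\nBob,25".
const data = {
  product_id: "695c3797767df5adfd9bc872",
  parameters: {
    "action": "extract-csv",
    "input_base64": "bmFtZSxhZ2UKQWxpY2UsMzAKQm9iLDI1",
    "output_field": "data",
    "max_rows": 1000
  }
};

fetch(url, {
  method: "POST",
  headers,
  body: JSON.stringify(data)
})
  .then(response => response.json())
  // Named "result" so it does not shadow the request payload above.
  .then(result => console.log(result))
  .catch(error => console.error("Error:", error));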

// Node.js (axios): parse a small inline CSV
const axios = require('axios');

const url = "https://api.agentpmt.com/products/purchase";

const headers = {
  "Content-Type": "application/json",
  "Authorization": "Bearer ********"
};

// Every action needs input_base64 or file_id; this value encodes
// "name,age\nAlice,30\nBob,25".
const data = {
  product_id: "695c3797767df5adfd9bc872",
  parameters: {
    "action": "extract-csv",
    "input_base64": "bmFtZSxhZ2UKQWxpY2UsMzAKQm9iLDI1",
    "output_field": "data",
    "max_rows": 1000
  }
};

axios.post(url, data, { headers })
  .then(response => {
    console.log(response.status);
    console.log(response.data);
  })
  .catch(error => {
    console.error("Error:", error.message);
  });

The examples above use placeholder values. Sign in to view your API and budget keys, then replace ******** with your own bearer token.
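For environments without third-party packages, the same request can be assembled with the Python standard library. This is a sketch under assumptions: the environment variable name AGENTPMT_BEARER_TOKEN and the helper names build_request and send are illustrative, and the response is assumed to be a JSON body. Building the request is kept separate from sending it so the payload can be inspected first.

```python
import json
import os
import urllib.request

URL = "https://api.agentpmt.com/products/purchase"
PRODUCT_ID = "695c3797767df5adfd9bc872"


def build_request(parameters: dict, token: str) -> urllib.request.Request:
    """Assemble the POST request without sending it."""
    body = json.dumps({"product_id": PRODUCT_ID, "parameters": parameters})
    return urllib.request.Request(
        URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request; urlopen raises HTTPError on 4xx/5xx responses."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)  # assumption: the body is JSON


# Hypothetical variable name; falls back to the page's placeholder token.
token = os.environ.get("AGENTPMT_BEARER_TOKEN", "********")
request = build_request(
    {"action": "extract-csv", "input_base64": "bmFtZSxhZ2UKQWxpY2UsMzAKQm9iLDI1"},
    token,
)
# result = send(request)  # uncomment to perform the call (consumes credits)
```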

Usage Instructions

Usage guidance provided directly by the developer for this product.

File To JSON Parsing - Instructions

Overview

Extract structured JSON data from a wide range of file formats. Provide a file via base64-encoded content or a cloud storage file ID, and receive parsed, structured output. Supports CSV, HTML, JSON, ICS (calendar), ODS, PDF, RTF, plain text, XLS, and XLSX files. Also supports converting any file to base64.

File Input

Every action (except get_instructions) requires one of the following:

input_base64 (string) - Base64-encoded file content (up to 100 MB raw; 10 MB for file-to-base64 return)
file_id (string) - File ID from cloud storage
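A minimal sketch of preparing the file input on the client side. The helper name file_input is illustrative, and rejecting requests that supply both fields is a design choice of this sketch; the page only states that one of the two is required. The 100 MB check assumes binary megabytes.

```python
import base64
from typing import Optional

MAX_FILE_BYTES = 100 * 1024 * 1024  # documented 100 MB limit (binary MB assumed)


def file_input(content: Optional[bytes] = None,
               file_id: Optional[str] = None) -> dict:
    """Return the single file-input field to merge into the request parameters."""
    if (content is None) == (file_id is None):
        raise ValueError("provide exactly one of content or file_id")
    if file_id is not None:
        return {"file_id": file_id}
    if len(content) > MAX_FILE_BYTES:
        raise ValueError("file exceeds the documented 100 MB limit")
    return {"input_base64": base64.b64encode(content).decode("ascii")}


# Merge the file input with the action to form the parameters object.
params = {"action": "extract-csv",
          **file_input(content=b"name,age\nAlice,30\nBob,25")}
print(params)
```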

Actions

extract-csv

Parse a CSV file into structured row data.

Required: action, plus input_base64 or file_id

Optional:

max_rows (integer, default 1000, max 100000) - Maximum rows to extract
output_field (string, default "data") - Key name for the extracted data in the response

Example:

{
  "action": "extract-csv",
  "input_base64": "bmFtZSxhZ2UKQWxpY2UsMzAKQm9iLDI1"
}
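To see what the example payload contains, it can be decoded locally. The sketch below uses only the Python standard library; the row dictionaries come from csv.DictReader and are shown for orientation only, since this page does not document the exact shape of the service's response.

```python
import base64
import csv
import io

encoded = "bmFtZSxhZ2UKQWxpY2UsMzAKQm9iLDI1"  # input_base64 from the example

# Decode the base64 payload back into the original CSV text.
text = base64.b64decode(encoded).decode("utf-8")

# Parse locally, using the first line as the header row.
rows = list(csv.DictReader(io.StringIO(text)))

print(text)
print(rows)
```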

extract-html

Parse an HTML file, extracting text content and/or table data.

Required: action, plus input_base64 or file_id

Optional:

include_text (boolean, default true) - Include extracted text content
include_tables (boolean, default true) - Include extracted table data
max_rows (integer, default 1000, max 100000) - Maximum rows per table
output_field (string, default "data")

Example:

{
  "action": "extract-html",
  "file_id": "abc123",
  "include_text": true,
  "include_tables": true
}

extract-json

Parse a JSON file and return its contents as structured data.

Required: action, plus input_base64 or file_id

Optional:

output_field (string, default "data")

Example:

{
  "action": "extract-json",
  "input_base64": "eyJrZXkiOiAidmFsdWUifQ=="
}
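The example payload decodes to a one-key JSON object. The sketch below decodes it locally and illustrates how output_field names the key that holds the extracted value. The wrap helper is hypothetical, and placing the value at the top level of the response is an assumption; the real response may carry additional fields.

```python
import base64
import json

encoded = "eyJrZXkiOiAidmFsdWUifQ=="  # input_base64 from the example

# json.loads accepts the decoded bytes directly.
parsed = json.loads(base64.b64decode(encoded))


def wrap(value, output_field: str = "data") -> dict:
    """Hypothetical: place the extracted value under the chosen key."""
    return {output_field: value}


print(wrap(parsed))            # default key name
print(wrap(parsed, "config"))  # custom output_field
```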

extract-ics

Parse an ICS calendar file and extract events with summary, start, end, location, and description.

Required: action, plus input_base64 or file_id

Optional:

output_field (string, default "data")

Example:

{
  "action": "extract-ics",
  "file_id": "calendar_file_id"
}
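For orientation, the five event fields map onto standard iCalendar properties (SUMMARY, DTSTART, DTEND, LOCATION, DESCRIPTION). The following is a simplified local sketch of that mapping, not the service's implementation: it unfolds continuation lines and skips nested components such as alarms, but it does not unescape text values or handle quoted parameters containing colons. The function names are illustrative.

```python
from typing import Dict, List

FIELDS = {
    "SUMMARY": "summary",
    "DTSTART": "start",
    "DTEND": "end",
    "LOCATION": "location",
    "DESCRIPTION": "description",
}


def unfold(text: str) -> List[str]:
    """Join continuation lines, which start with a single space or tab."""
    lines: List[str] = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def parse_events(text: str) -> List[Dict[str, str]]:
    """Collect the five documented fields from each VEVENT block."""
    events: List[Dict[str, str]] = []
    current = None
    nested = 0  # depth of components nested inside the event, e.g. VALARM
    for line in unfold(text):
        upper = line.upper()
        if current is None:
            if upper == "BEGIN:VEVENT":
                current, nested = {}, 0
        elif upper == "END:VEVENT" and nested == 0:
            events.append(current)
            current = None
        elif upper.startswith("BEGIN:"):
            nested += 1
        elif upper.startswith("END:"):
            nested -= 1
        elif nested == 0 and ":" in line:
            name, value = line.split(":", 1)
            # Drop parameters such as ";TZID=..." from the property name.
            key = FIELDS.get(name.split(";", 1)[0].upper())
            if key:
                current[key] = value
    return events


sample = "\r\n".join([
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "BEGIN:VEVENT",
    "SUMMARY:Team sync",
    "DTSTART;TZID=Europe/Berlin:20240102T090000",
    "DTEND;TZID=Europe/Berlin:20240102T093000",
    "LOCATION:Room 4",
    "DESCRIPTION:Weekly planning",
    "  and review",
    "BEGIN:VALARM",
    "DESCRIPTION:Reminder",
    "END:VALARM",
    "END:VEVENT",
    "END:VCALENDAR",
])
events = parse_events(sample)
print(events)
```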

extract-ods

Parse an OpenDocument Spreadsheet (.ods) file, returning sheets with row data.

Required: action, plus input_base64 or file_id

Optional:

max_rows (integer, default 1000, max 100000) - Maximum rows per sheet
output_field (string, default "data")

Example:

{
  "action": "extract-ods",
  "file_id": "spreadsheet_file_id",
  "max_rows": 500
}

extract-pdf

Extract text and/or tables from a PDF document, page by page.

Required: action, plus input_base64 or file_id

Optional:

include_text (boolean, default true) - Include text extraction per page
include_tables (boolean, default true) - Include table extraction per page
max_pages (integer, default 50, max 1000) - Maximum pages to process
output_field (string, default "data")

Example:

{
  "action": "extract-pdf",
  "file_id": "report_pdf_id",
  "max_pages": 10,
  "include_text": true,
  "include_tables": false
}

extract-rtf

Parse an RTF (Rich Text Format) file and extract plain text.

Required: action, plus input_base64 or file_id

Optional:

output_field (string, default "data")

Example:

{
  "action": "extract-rtf",
  "input_base64": "e1xydGYxIEhlbGxvIFdvcmxkfQ=="
}

extract-text

Read a plain text file and return its contents.

Required: action, plus input_base64 or file_id

Optional:

output_field (string, default "data")

Example:

{
  "action": "extract-text",
  "file_id": "text_file_id"
}
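The Important Notes state that text files are decoded as UTF-8 with a Latin-1 fallback. A minimal sketch of that strategy follows; the function name decode_text is illustrative and this is not the service's actual code.

```python
def decode_text(raw: bytes) -> str:
    """Decode as UTF-8, falling back to Latin-1 when the bytes are not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this step cannot fail.
        return raw.decode("latin-1")


print(decode_text("café".encode("utf-8")))
print(decode_text("café".encode("latin-1")))
```

Because the Latin-1 step accepts any byte sequence, a file in some other encoding (for example Windows-1252 or Shift JIS) would still decode without an error but may contain wrong characters.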

extract-xls

Parse a legacy Excel (.xls) file, returning sheets with row data.

Required: action, plus input_base64 or file_id

Optional:

max_rows (integer, default 1000, max 100000) - Maximum rows per sheet
output_field (string, default "data")

Example:

{
  "action": "extract-xls",
  "file_id": "legacy_excel_id",
  "max_rows": 2000
}

extract-xlsx

Parse a modern Excel (.xlsx) file, returning sheets with row data.

Required: action, plus input_base64 or file_id

Optional:

max_rows (integer, default 1000, max 100000) - Maximum rows per sheet
output_field (string, default "data")

Example:

{
  "action": "extract-xlsx",
  "input_base64": "<base64_encoded_xlsx>",
  "max_rows": 5000
}

file-to-base64

Convert a cloud-stored file to base64 for inline use. The file must be 10 MB or smaller.

Required: action, plus input_base64 or file_id

Example:

{
  "action": "file-to-base64",
  "file_id": "image_file_id"
}

Common Workflows

Parse an uploaded spreadsheet: Use extract-xlsx or extract-xls with a file_id to get structured row data from each sheet.
Extract text from a PDF report: Use extract-pdf with include_text: true and include_tables: false for text-only extraction.
Convert HTML to structured data: Use extract-html to pull both readable text and any embedded tables from an HTML file.
Read calendar events: Use extract-ics to get a list of events from an ICS calendar export.
Retrieve a file as base64: Use file-to-base64 with a file_id to get the raw file content encoded for inline transfer.
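For the spreadsheet workflow, the result is described as organized by sheet, each with a name and a rows array. The sketch below walks such a result. The sample dictionary is hypothetical: placing the sheet list directly under the output_field key and representing each row as a list of cell values are both assumptions, not documented behavior.

```python
from typing import Any, Dict, Iterator, List, Tuple


def iter_sheet_rows(result: Dict[str, Any],
                    output_field: str = "data") -> Iterator[Tuple[str, List[Any]]]:
    """Yield (sheet name, row) pairs from a spreadsheet extraction result."""
    for sheet in result.get(output_field, []):
        for row in sheet.get("rows", []):
            yield sheet.get("name", ""), row


# Hypothetical response fragment, for illustration only.
sample = {
    "data": [
        {"name": "Q1", "rows": [["region", "sales"], ["north", 120]]},
        {"name": "Q2", "rows": [["region", "sales"]]},
    ]
}

pairs = list(iter_sheet_rows(sample))
for sheet_name, row in pairs:
    print(sheet_name, row)
```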

Important Notes

Every extraction action requires either input_base64 or file_id -- at least one must be provided.
Maximum file size is 100 MB. The file-to-base64 action has a stricter 10 MB limit for the returned content.
The max_rows parameter applies to CSV, HTML tables, ODS, XLS, and XLSX extractions.
The max_pages parameter applies only to PDF extraction.
The include_text and include_tables options apply to HTML and PDF extraction.
The output_field parameter lets you customize the key name in the response (default is "data").
Text files are decoded as UTF-8, falling back to Latin-1 if needed.
Spreadsheet actions (ODS, XLS, XLSX) return data organized by sheet, each with a name and rows array.
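The notes above can be turned into a client-side check that runs before a request is sent. This is a sketch only: the function name validate is illustrative, and whether the service rejects or silently ignores a parameter that does not apply to an action is not stated on this page.

```python
ROW_ACTIONS = {"extract-csv", "extract-html", "extract-ods",
               "extract-xls", "extract-xlsx"}
TOGGLE_ACTIONS = {"extract-html", "extract-pdf"}


def validate(params: dict) -> list:
    """Return a list of problems found in a parameters object (empty if none)."""
    problems = []
    action = params.get("action")
    if not params.get("input_base64") and not params.get("file_id"):
        problems.append("input_base64 or file_id is required")
    if "max_rows" in params:
        if action not in ROW_ACTIONS:
            problems.append(f"max_rows does not apply to {action}")
        elif not 1 <= params["max_rows"] <= 100000:
            problems.append("max_rows must be between 1 and 100000")
    if "max_pages" in params:
        if action != "extract-pdf":
            problems.append(f"max_pages does not apply to {action}")
        elif not 1 <= params["max_pages"] <= 1000:
            problems.append("max_pages must be between 1 and 1000")
    for toggle in ("include_text", "include_tables"):
        if toggle in params and action not in TOGGLE_ACTIONS:
            problems.append(f"{toggle} does not apply to {action}")
    return problems


print(validate({"action": "extract-csv", "max_rows": 0}))
```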

Frequently Asked Questions

How do I connect this tool to an external agent?

You can install the local MCP server by opening a terminal and running:

Install commands

npm install -g @agentpmt/mcp-router
agentpmt-setup

This will connect you to local agents like Claude Code, Windsurf, Grok Build, Cursor, etc.

Alternatively, you can connect to the hosted version with this config block; no installation is required:

Hosted MCP config

{
  "mcpServers": {
    "agentpmt": {
      "type": "streamable-http",
      "url": "https://api.agentpmt.com/mcp",
      "headers": {
        "Authorization": "Bearer <AGENTPMT_BEARER_TOKEN>",
        "x-instance-metadata": "{\"client\":\"generic-mcp\",\"platform\":\"remote\"}"
      }
    }
  }
}
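Note that the x-instance-metadata header value is itself a JSON document encoded as a string, which is why its quotes are escaped in the block above. A sketch of generating the same config programmatically, so the escaping is handled by the JSON encoder (the function name hosted_mcp_config is illustrative):

```python
import json


def hosted_mcp_config(token: str) -> dict:
    """Build the hosted MCP config shown above for a given bearer token."""
    metadata = {"client": "generic-mcp", "platform": "remote"}
    return {
        "mcpServers": {
            "agentpmt": {
                "type": "streamable-http",
                "url": "https://api.agentpmt.com/mcp",
                "headers": {
                    "Authorization": f"Bearer {token}",
                    # Header values must be strings, so the metadata object
                    # is serialized to compact JSON first.
                    "x-instance-metadata": json.dumps(
                        metadata, separators=(",", ":")
                    ),
                },
            }
        }
    }


config = hosted_mcp_config("<AGENTPMT_BEARER_TOKEN>")
print(json.dumps(config, indent=2))
```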

View MCP Connection Instructions for more details.

How does an external agent use this tool?

After the external agent is connected to an Agent Group that can use this tool, paste this prompt into the agent:

Agent prompt

Call the AgentPMT-Tool-Search-and-Execution tool with action 'get_schema' and tool_id 695c3797767df5adfd9bc872 ("File To JSON Parsing"). Then call the same tool with action 'call_tool', tool_id 695c3797767df5adfd9bc872, and the parameters needed for my request.

The agent should fetch the tool schema first, collect the required parameters for your request, and then call the tool through AgentPMT.
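The two-step flow can be sketched as a pair of argument objects for the AgentPMT-Tool-Search-and-Execution tool. The action and tool_id values come from the prompt above; passing the tool's inputs under a "parameters" key is an assumption of this sketch, and the schema returned by get_schema should be treated as authoritative.

```python
TOOL_ID = "695c3797767df5adfd9bc872"  # "File To JSON Parsing"


def schema_request() -> dict:
    """Step 1: ask for the tool's schema."""
    return {"action": "get_schema", "tool_id": TOOL_ID}


def call_request(parameters: dict) -> dict:
    """Step 2: call the tool; the 'parameters' key name is assumed."""
    return {"action": "call_tool", "tool_id": TOOL_ID, "parameters": parameters}


first = schema_request()
second = call_request({"action": "extract-ics", "file_id": "calendar_file_id"})
print(first)
print(second)
```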

Dependencies

1 dependency will be automatically added when you enable this product.

File Management

Workflows Using This Tool

Appointment Scheduling and Route Planner