Welcome to LDX hub
LDX hub is a document AI gateway built on 5 years of production infrastructure and 100+ integrated AI vendors. It was originally developed at Kawamura International, a Tokyo-based language services company with 40 years of enterprise expertise.
The APIs in this portal are ready-to-use endpoints that wrap the core LDX hub engine. No infrastructure setup required — bring your API key and start processing.
Try it in 60 seconds
Get a free API key, then run two curl commands: the first submits a job, and
the second takes the returned job_id and waits for the result.
1. Get your API key — Sign up free and grab your API key from the dashboard. No credit card required, 25,000 credits included.
2. Submit and fetch — extract name, email, and intent from a message:
Requires jq for parsing JSON in the shell. Install via your package manager (brew, winget, apt, etc.) — or just copy the job_id manually from the response.
Code
Returns clean, structured JSON:
Code
That's it. Two calls, piped — submit, then wait. The wait=10 parameter
holds the connection open for up to 10 seconds, so the result returns as
soon as the job is done. For longer-running jobs, just call the same GET
again — each call holds for another 10 seconds. Now try it with your own
data, your own schema, your own scale.
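The submit-then-wait pattern can also be sketched in Python. This is a minimal sketch rather than an official client: `fetch_job` is a hypothetical callable standing in for the GET request with `wait=10`, and the `status` key with the values `"completed"` and `"failed"` is an assumption about the response shape, not the documented schema.

```python
from typing import Any, Callable, Dict


def wait_for_result(
    fetch_job: Callable[[str, int], Dict[str, Any]],
    job_id: str,
    wait: int = 10,
    max_calls: int = 30,
) -> Dict[str, Any]:
    """Repeat the long-polling GET until the job reaches a final state.

    fetch_job(job_id, wait) is a hypothetical stand-in for the GET request:
    it should block for up to `wait` seconds and return the job as a dict.
    The "status" key and its values are assumptions made for this sketch.
    """
    for _ in range(max_calls):
        job = fetch_job(job_id, wait)
        if job.get("status") in ("completed", "failed"):
            return job
    raise TimeoutError(f"job {job_id} not finished after {max_calls} calls")
```

Because each GET already holds the connection open on the server side, the loop needs no `sleep` between calls; `max_calls` only bounds the total waiting time.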
Built for scale, by design
That same endpoint accepts thousands of inputs in a single call. Drop in 10,000 records — StructFlow runs them in parallel server-side and returns results when the job completes.
Code
For larger workloads, upload a JSONL file and pass file_id instead of inputs. No rate limit juggling, no client-side concurrency, no retries to write yourself. The async-first design is what makes this scale possible — every endpoint in LDX hub works the same way.
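Preparing a JSONL file for upload is plain serialization: one JSON object per line. The sketch below shows only that step. The `text` field in the usage note is a hypothetical record shape, and the upload call is omitted because its endpoint is not described on this page.

```python
import json
from typing import Any, Dict, Iterable


def to_jsonl(records: Iterable[Dict[str, Any]]) -> str:
    """Serialize records as JSONL: one JSON object per line."""
    # json.dumps escapes newlines inside strings, so each record stays on one line.
    # ensure_ascii=False keeps non-ASCII text (e.g. Japanese) readable in the file.
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)
```

For example, `to_jsonl([{"text": "first message"}, {"text": "second message"}])` yields two lines that can be written to a UTF-8 file before uploading.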
What You Can Do Here
| Service | What it does |
|---|---|
| StructFlow | Unstructured text → structured JSON |
| RefineLoop | Machine translation → refined translation (XLIFF) |
| RenderOCR | PDF / images → editable Office files (with OCR) |
| CastDoc | Text-based PDF → editable Office files (no OCR) |
| ExtractDoc | Documents → plain text (preprocessing for AI pipelines) |
StructFlow
Turn unstructured text into structured data. At scale.
Free-form text — medical records, contracts, reviews, HR notes — is everywhere. StructFlow takes that text, applies your schema and instructions, and returns clean, validated JSON. Fast. In parallel. With output structure guaranteed by Structured Outputs technology.
- Works with any major AI engine (Claude, Gemini, GPT, Amazon Nova, Grok, and more)
- Define your output schema from a single sample JSON line
- Built for batch: drop a JSONL file, get a JSONL file back
- 5 years of production-hardened parallel processing under the hood
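To illustrate what deriving a schema from a single sample JSON line can look like, here is a minimal sketch that infers a JSON Schema fragment from one example object. It is an illustration of the idea only, not StructFlow's actual inference logic; the function name `infer_schema` and the choice to mark every sampled key as required are assumptions.

```python
import json
from typing import Any, Dict


def infer_schema(sample: Any) -> Dict[str, Any]:
    """Derive a JSON Schema fragment from one sample value."""
    # bool must be checked before int: in Python, bool is a subclass of int.
    if isinstance(sample, bool):
        return {"type": "boolean"}
    if isinstance(sample, int):
        return {"type": "integer"}
    if isinstance(sample, float):
        return {"type": "number"}
    if isinstance(sample, str):
        return {"type": "string"}
    if sample is None:
        return {"type": "null"}
    if isinstance(sample, list):
        # The first element stands in for the item type; empty lists stay untyped.
        return {"type": "array", "items": infer_schema(sample[0]) if sample else {}}
    if isinstance(sample, dict):
        return {
            "type": "object",
            "properties": {k: infer_schema(v) for k, v in sample.items()},
            "required": list(sample.keys()),
            "additionalProperties": False,
        }
    raise TypeError(f"unsupported sample type: {type(sample).__name__}")


# One sample line with the fields from the quick-start example.
sample_line = '{"name": "Ana", "email": "ana@example.com", "intent": "refund"}'
schema = infer_schema(json.loads(sample_line))
```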
RefineLoop
Machine translation quality, refined until it converges.
RefineLoop takes MT output in XLIFF format and runs it through an AI review loop — checking for mistranslations, omissions, and additions across three axes. It keeps revising until the AI agrees with itself. That's the convergence.
- Drops into any existing translation pipeline as a single XLIFF step
- Compatible with XTM, memoQ, Trados, and other CAT tools
- Revision history recorded in `alt-trans` tags, fully auditable
- Powered by the same parallel processing engine as StructFlow
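The convergence idea described above can be sketched as a fixed-point loop: keep applying a review pass until it returns its input unchanged. This is a minimal sketch under stated assumptions, not RefineLoop's implementation: `review` is a hypothetical stand-in for one AI review pass over a segment, and `max_rounds` is an assumed safety cap.

```python
from typing import Callable, List, Tuple


def refine_until_converged(
    translation: str,
    review: Callable[[str], str],
    max_rounds: int = 5,
) -> Tuple[str, List[str]]:
    """Apply `review` repeatedly until it returns its input unchanged.

    `review` is a hypothetical stand-in for one AI review pass.
    Returns the final text and the list of superseded versions, oldest first.
    """
    history: List[str] = []
    current = translation
    for _ in range(max_rounds):
        revised = review(current)
        if revised == current:
            break  # the reviewer accepts the text as-is: converged
        history.append(current)
        current = revised
    return current, history
```

The returned history plays the role that the revision trail plays in the XLIFF output: each superseded version is kept so the changes can be audited.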
RenderOCR
Scanned documents, converted to editable Office files — with layout intact.
PDFs and images sitting in your archive aren't searchable, editable, or pipeline-ready. RenderOCR runs high-fidelity OCR across your documents and returns Word, Excel, or PowerPoint files that preserve the original layout and formatting — not just raw text dumps.
- Supports PDF, TIFF, JPEG, PNG, and BMP as input
- Output to DOCX, XLSX, or PPTX — your choice
- 120+ languages supported, including Japanese, Chinese, Korean, Arabic, and more
- Powered by KI OCR, a battle-tested enterprise OCR engine
CastDoc
PDF files, converted to editable Office formats — without OCR.
Already have a text-based PDF? CastDoc converts it directly to Word, Excel, or PowerPoint while preserving the original layout and formatting — no optical recognition required. Faster and more accurate than OCR for born-digital documents.
- Input: text-based PDF (not scanned)
- Output to DOCX, XLSX, or PPTX — your choice
- Powered by KI Cast
- Ideal for contracts, reports, and any PDF created from digital sources
ExtractDoc
Documents to plain text — fast, deterministic, and pipeline-ready.
Need just the text from a document? ExtractDoc pulls plain text from PDFs and Office files in reading order, with no AI overhead and no layout assumptions. Designed as the entry point for StructFlow and other downstream AI pipelines.
- Supports PDF, DOCX, XLSX, and PPTX as input
- Output as plain text or as JSONL, directly compatible with StructFlow's `file_id` input
- No AI billing, no language settings, no parameters to tune
- Built to chain: pair with StructFlow for extraction, RefineLoop for translation review
Getting Started
Sign up free to get your API key instantly — no credit card required, 25,000 credits included. Then head to the API Reference to explore all available endpoints, request/response formats, and authentication details.
Credits & Pricing
LDX hub uses a credit-based billing system. Credits are consumed based on the engine you choose and the volume of data processed.
Use Anywhere
LDX hub works as a backend for your existing tools:
- Dify — LDX hub plugin brings StructFlow and RefineLoop into your Dify workflows
- n8n — n8n-nodes-ldxhub exposes all 5 services in n8n automations
- Claude Desktop / MCP-compatible AI — see MCP Setup below
- Direct API — start with the curl examples above
Use with AI Assistants (MCP)
LDX hub supports the Model Context Protocol (MCP), allowing you to use StructFlow, RefineLoop, RenderOCR, CastDoc, and ExtractDoc directly from AI assistants like Claude Desktop.

