List ExtractDoc job history
Returns a list of jobs for the authenticated user, ordered by creation date descending.
Use GET /extractdoc/jobs/{job_id} to retrieve full details for a specific job.
Headers
Authorization: Authenticates the request with your API key. The value has the format Bearer YOUR_KEY_HERE.
Responses
A list of jobs
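A minimal sketch of the request for this endpoint. It assumes the list is served from GET /extractdoc/jobs (implied by the job-detail path, not stated explicitly) and uses a placeholder host. `BASE_URL` and `build_list_jobs_request` are illustrative names, not part of the API. The request is only constructed here, not sent.

```python
import urllib.request

# Hypothetical host: replace with the real API base URL.
BASE_URL = "https://api.example.com"


def build_list_jobs_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the job history."""
    return urllib.request.Request(
        f"{BASE_URL}/extractdoc/jobs",  # assumed list path
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


req = build_list_jobs_request("YOUR_KEY_HERE")
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it. The shape of the response body is not documented in this section, so parsing is left out.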
Create an ExtractDoc job
Enqueues an asynchronous text extraction job.
Upload your file via POST /files first, then submit the file_id here.
Returns 202 Accepted immediately with a job_id.
Poll GET /extractdoc/jobs/{job_id} to check completion.
When completed, download the extracted output via GET /files/{output_file_id}/content.
Supported input formats: PDF, DOCX, XLSX, PPTX. The input format is automatically detected from the uploaded file.
Headers
Authorization: Authenticates the request with your API key. The value has the format Bearer YOUR_KEY_HERE.
Request Body
engine: Engine ID obtained from GET /extractdoc/engines
file_id: File ID of the input file to process, obtained from POST /files. Supported input formats: PDF, DOCX, XLSX, PPTX. The input format is automatically detected from the uploaded file.
output_format: Output format. Use 'text' for plain text, or 'jsonl' for a single-line JSON object compatible with StructFlow input.
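The request body above could be assembled as in the sketch below. Several details are assumptions, not statements from this reference: the path POST /extractdoc/jobs, the JSON encoding of the body, the placeholder host, and the helper name `build_create_job_request`. The engine and file IDs in the example call are placeholders.

```python
import json
import urllib.request

# Hypothetical host: replace with the real API base URL.
BASE_URL = "https://api.example.com"
ALLOWED_OUTPUT_FORMATS = ("text", "jsonl")


def build_create_job_request(
    api_key: str, engine: str, file_id: str, output_format: str
) -> urllib.request.Request:
    """Build (but do not send) the request that enqueues an extraction job."""
    if output_format not in ALLOWED_OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {ALLOWED_OUTPUT_FORMATS}")
    payload = {"engine": engine, "file_id": file_id, "output_format": output_format}
    return urllib.request.Request(
        f"{BASE_URL}/extractdoc/jobs",  # assumed create path
        data=json.dumps(payload).encode("utf-8"),  # assumed JSON body
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_job_request("YOUR_KEY_HERE", "engine-id", "file-id", "jsonl")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Validating output_format on the client side avoids a round trip for an obviously invalid value. The server remains the authority on which values it accepts.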
Responses
Job accepted
job_id
file_id: Input file ID
engine
status
progress
output_file_id: File ID of the extracted output. Present only when status is completed. Download via GET /files/{output_file_id}/content.
Job-level error. Present only when job status is failed.
created_at
updated_at
completed_at
expires_at
Get ExtractDoc job status and results
Returns the current status of an ExtractDoc job.
Poll this endpoint until status is completed or failed.
Recommended polling interval: 1-5 seconds.
When status is completed, use output_file_id to download the extracted output via GET /files/{output_file_id}/content.
Results are retained for a limited period after completion (expires_at).
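The polling behaviour described above can be sketched as a small loop. This is an illustration under stated assumptions, not a client shipped with the API: `poll_job`, its `fetch_job` callback, and the timeout are hypothetical, and the demo uses a stand-in fetcher with a made-up non-terminal status ("processing") instead of a real GET /extractdoc/jobs/{job_id} call.

```python
import time
from typing import Any, Callable, Dict

# The two statuses the reference names as end states.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 2.0,   # within the recommended 1-5 second range
    timeout: float = 300.0,  # assumed client-side limit, not an API value
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_job(job_id) until status is completed or failed."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        sleep(interval)


# Stand-in for the real status call: reports a made-up non-terminal
# status twice, then "completed".
_statuses = iter(["processing", "processing", "completed"])


def fake_fetch(job_id: str) -> Dict[str, Any]:
    return {"job_id": job_id, "status": next(_statuses)}


final = poll_job(fake_fetch, "job-123", interval=0.0)
print(final["status"])  # prints: completed
```

Injecting the fetch function keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.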
Path Parameters
job_id: The unique identifier of the job
Headers
Authorization: The Authorization header is used to authenticate with the API using your API key. The value has the format Bearer YOUR_KEY_HERE.
Responses
Job details
job_id
file_id: Input file ID
engine
status
progress
output_file_id: File ID of the extracted output. Present only when status is completed. Download via GET /files/{output_file_id}/content.
Job-level error. Present only when job status is failed.
created_at
updated_at
completed_at
expires_at
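Because output_file_id is present only when status is completed, a client needs to branch on status before downloading. The sketch below shows one way to do that. The helper name `output_content_path` is hypothetical, the example IDs are invented, and the response is assumed to decode to a flat dictionary with the fields listed above.

```python
from typing import Any, Dict, Optional


def output_content_path(job: Dict[str, Any]) -> Optional[str]:
    """Return the download path for a completed job, or None if still running."""
    status = job.get("status")
    if status == "completed":
        # output_file_id is documented as present only in this state.
        return f"/files/{job['output_file_id']}/content"
    if status == "failed":
        raise RuntimeError(f"job {job.get('job_id')} failed")
    return None  # any other status: not finished yet


# Example payload with invented IDs.
job = {"job_id": "job-123", "status": "completed", "output_file_id": "file-456"}
print(output_content_path(job))  # prints: /files/file-456/content
```

The returned path is relative. Prepend the API base URL and send it with the same Authorization header to fetch the extracted output.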
