/api/v1/sec/document?url={sec_url}
Fetch any SEC EDGAR document by its full URL. Works with any file type — HTML exhibits, XML, XBRL, images, PDFs, etc. Use this when you need a specific document (not the primary one) from a filing's document list.
Why use this
Common use case
Fetches any SEC EDGAR document by URL: exhibits (EX-10.1, EX-99.1), XBRL data files (.xml, .xsd), inline XBRL HTML, plain-text filings, images, and PDFs. The hostname allowlist (sec.gov, www.sec.gov, efts.sec.gov, edgar.sec.gov) prevents this endpoint from being used as a generic URL-fetcher proxy; non-SEC URLs return 400 BAD_REQUEST.

Two response modes:
- JSON-wrapped (default): content is returned in data.content, with an encoding field telling you whether to treat it as UTF-8 text or base64-encoded binary. Useful for LLM ingestion or programmatic parsing.
- Raw passthrough (?raw=true): the body is returned directly with the original Content-Type header. Useful for <iframe src=...> or <img src=...> embedding.

Get URLs from the all_links array on GET /api/v1/sec/filings/{accession_number}. For the primary document of a filing, the shortcut endpoint GET /api/v1/sec/filings/{accession_number}/html avoids the metadata round-trip. For surgical item-level extraction from 10-K/20-F (e.g. just Item 1A Risk Factors), use GET /api/v1/scrapping/extractor.
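Because the document URL travels inside a query parameter, it needs to be percent-encoded, and checking the hostname locally avoids spending a request on a guaranteed 400. The following is a minimal sketch of both steps. The helper name `build_document_request_url` is hypothetical, and the base URL is assumed from the curl sample on this page; authentication is left out.

```python
from urllib.parse import urlencode, urlsplit

# Assumption: base URL taken from the curl sample on this page.
API_BASE = "https://api.finradar.ai"

# Hostnames the endpoint accepts, as listed in the description above.
ALLOWED_HOSTS = {"sec.gov", "www.sec.gov", "efts.sec.gov", "edgar.sec.gov"}


def build_document_request_url(sec_url: str, raw: bool = False) -> str:
    """Return the request URL for /api/v1/sec/document (hypothetical helper)."""
    host = (urlsplit(sec_url).hostname or "").lower()
    if host not in ALLOWED_HOSTS:
        # Mirrors the documented server-side rule so the caller fails early.
        raise ValueError(f"not an SEC EDGAR host: {host!r}")
    params = {"url": sec_url}
    if raw:
        # raw=true switches from the JSON envelope to raw passthrough.
        params["raw"] = "true"
    # urlencode percent-encodes the nested URL so its own "?" and "/" survive.
    return f"{API_BASE}/api/v1/sec/document?{urlencode(params)}"
```

The returned string can be passed to any HTTP client, or, with `raw=True`, used as the `src` of an `<iframe>` or `<img>` if your authentication setup allows it.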
Parameters
| Name | In | Required | Default | Allowed | Description | Example |
|---|---|---|---|---|---|---|
| url | query | required | — | — | Full SEC EDGAR document URL — must be on `sec.gov`, `www.sec.gov`, `efts.sec.gov`, or `edgar.sec.gov`. Returns 400 BAD_REQUEST when the hostname is anything else. Get these URLs from the `all_links` array in `GET /api/v1/sec/filings/{accession_number}`'s response. The `/ix?doc=` inline-XBRL viewer wrapper is auto-stripped — pass either form. | https://www.sec.gov/Archives/edgar/data/320193/000032019325000123/aapl-20250928.htm |
| raw | query | optional | false | — | If `true`, returns raw content with the original `Content-Type` header (for iframe rendering or direct binary download). If `false` or omitted, returns a JSON envelope with content in `data.content` (UTF-8 for text, base64 for binary). For LLM ingestion of HTML/XML use the default JSON mode; for `<iframe>` or `<img>` embedding use `raw=true`. | false |
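The server strips the `/ix?doc=` viewer wrapper for you, so no client-side work is required. To predict what `data.url` will look like, or to normalize URLs before caching, the rewrite can be approximated as below. This is an illustrative sketch that assumes the wrapper has the form `https://www.sec.gov/ix?doc=/Archives/...`; the server's actual logic is not documented here, and `strip_ix_wrapper` is a hypothetical name.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def strip_ix_wrapper(sec_url: str) -> str:
    """Illustrative only: map an /ix?doc=<path> viewer URL to the bare document URL."""
    parts = urlsplit(sec_url)
    if parts.path != "/ix":
        return sec_url  # not a viewer URL; leave untouched
    doc_values = parse_qs(parts.query).get("doc")
    if not doc_values:
        return sec_url  # viewer path without a doc parameter; nothing to unwrap
    # Keep scheme and host, replace the path with the wrapped document path.
    return urlunsplit((parts.scheme, parts.netloc, doc_values[0], "", ""))
```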
Response schema
| Field | Type | Nullable | Description |
|---|---|---|---|
| status | string | no | Always `"success"` on 2xx in JSON-wrapped mode. ApiResponse envelope marker. NOT present when `raw=true` (raw mode returns the document body directly, no JSON envelope). |
| request_id | string | yes | Per-request UUID generated server-side. JSON-wrapped mode only. |
| timestamp | string | no | ISO-8601 UTC timestamp the response was generated. JSON-wrapped mode only. |
| data | object | no | Document content envelope (JSON-wrapped mode only — see `data.*`). Absent in raw mode (`?raw=true`). |
| data.url | string | no | Echoed (and `/ix?doc=`-stripped) URL of the document fetched. May differ from the input URL when SEC's viewer wrapper was present. |
| data.original_url | string | yes | Original URL as passed by the caller, ONLY present when the server stripped a `/ix?doc=` wrapper. Null when no rewrite occurred. |
| data.content_type | string | no | Detected MIME type based on URL extension: `text/html` (.htm, .html), `application/xml` (.xml, .xsd), `application/json` (.json), `text/plain` (.txt), `image/jpeg` / `image/png` / `image/gif` / `application/pdf` (binary types), `application/octet-stream` (unknown extensions). Use to dispatch the correct decoder client-side. |
| data.size | integer | no | Decoded content size in bytes. For text content this is the UTF-8 byte length; for binary content this is the post-base64-decode byte length (the actual file size, not the inflated base64-string length). |
| data.encoding | string | no | `utf-8` for text content (`text/*`, `application/xml`, `application/json`) — `data.content` is decoded text. `base64` for binary content (PDFs, images) — `data.content` is base64-encoded; decode client-side before consuming. |
| data.content | string | no | Document content. UTF-8 plain text for text content types; base64-encoded string for binary types (decode via `Buffer.from(content, 'base64')` in Node, `atob(content)` in browsers, or `base64.b64decode(content)` in Python). HTML content is post-processed: `<noscript>` blocks stripped, relative `href`/`src` URLs rewritten to absolute `https://www.sec.gov/...` so embedded rendering works outside sec.gov. |
| (raw mode body) | string | no | When `?raw=true`: the document body is returned directly with the original `Content-Type` header (no JSON envelope). For binary types (PDFs, images), the body is the raw bytes. For HTML, the body is the post-processed UTF-8 HTML. Use raw mode for `<iframe src=...>` or `<img src=...>` embedding patterns. |
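In JSON-wrapped mode the client has to branch on `data.encoding` before using `data.content`. A minimal sketch of that dispatch follows, assuming `data` is the already-parsed `data` object from the response; the function names are hypothetical.

```python
import base64
from typing import Union


def decode_document(data: dict) -> Union[str, bytes]:
    """Return text for utf-8 payloads and raw bytes for base64 payloads."""
    encoding = data["encoding"]
    content = data["content"]
    if encoding == "utf-8":
        return content  # already decoded text (HTML, XML, JSON, plain text)
    if encoding == "base64":
        return base64.b64decode(content)  # PDFs, images
    raise ValueError(f"unexpected encoding: {encoding!r}")


def byte_length(data: dict) -> int:
    """Byte length of the decoded payload, for comparison against data.size."""
    decoded = decode_document(data)
    return len(decoded.encode("utf-8")) if isinstance(decoded, str) else len(decoded)
```

Comparing `byte_length(data)` with `data["size"]` is a cheap sanity check that a payload was not truncated in transit, under the assumption that `data.size` is computed as the table describes.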
Sample response
    {
      "status": "success",
      "request_id": "0f14ed05-3a2e-4b76-9c11-1a7c8b3f6de2",
      "timestamp": "2026-05-02T16:30:14.122Z",
      "data": {
        "url": "https://www.sec.gov/Archives/edgar/data/320193/000032019325000123/aapl-20250928.htm",
        "original_url": null,
        "content_type": "text/html",
        "size": 12943821,
        "encoding": "utf-8",
        "content": "<html>\n<head><title>Apple Inc. Form 10-K</title></head>\n<body>\n<h1>UNITED STATES SECURITIES AND EXCHANGE COMMISSION</h1>\n<h2>FORM 10-K</h2>\n<p>Annual report pursuant to Section 13 or 15(d) of the Securities Exchange Act of 1934. For the fiscal year ended September 28, 2025.</p>\n<h2>PART I — Item 1. Business</h2>\n<p>The Company designs, manufactures and markets smartphones, personal computers, tablets, wearables and accessories...</p>\n..."
      }
    }
Errors
| Status | Label | Description |
|---|---|---|
| 200 | OK | Request succeeded. |
| 400 | Bad Request | Invalid query, body, or path parameter, including a `url` whose hostname is not on the SEC allowlist. |
| 401 | Unauthorized | Missing or invalid Authorization header / api_Token. |
| 402 | Payment Required | Insufficient token balance for this call. |
| 429 | Too Many Requests | Rate limit exceeded for your tier (see /pricing for tier limits). |
| 500 | Server Error | Unexpected server-side failure. Retry with backoff; report if persistent. |
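The table asks for retry with backoff on 500. One way to structure that on the client is sketched below: exponential backoff with full jitter. Treating 429 as retryable as well is an assumption, as are the default delays and the names `should_retry` and `backoff_delays`; none of them come from the API itself.

```python
import random

# Assumption: 429 and 500 are worth retrying; 400, 401 and 402 need a caller-side fix first.
RETRYABLE_STATUSES = {429, 500}


def should_retry(status: int) -> bool:
    """True when repeating the same request unchanged might succeed."""
    return status in RETRYABLE_STATUSES


def backoff_delays(max_attempts: int = 5, base: float = 0.5, cap: float = 30.0):
    """Yield one sleep duration in seconds per retry: exponential growth with full jitter."""
    for attempt in range(max_attempts):
        # The upper bound doubles each attempt and is clamped to `cap`.
        yield random.uniform(0, min(cap, base * 2 ** attempt))
```

A caller would loop over `backoff_delays()`, sleep for each yielded value, and stop as soon as a response arrives whose status fails `should_retry`.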
Code samples
curl "https://api.finradar.ai/api/v1/sec/document?url={sec_url}&api_Token=YOUR_API_KEY" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
Generate an API key in /account/credentials to run live queries; until then the literal YOUR_API_KEY placeholder is shown.