/api/v1/sec/document?url={sec_url}

Fetch any SEC EDGAR document by its full URL. Works with any file type — HTML exhibits, XML, XBRL, images, PDFs, etc. Use this when you need a specific document (not the primary one) from a filing's document list.

5 tokens · Since v1.0.0

Why use this

Universal SEC EDGAR document fetcher. Pass any document URL from a filing's `all_links` array (HTML exhibits, XML data files, raw XBRL instances, images, PDFs, plain-text filings) and receive the content either in a JSON envelope (the default: UTF-8 for text, base64 for binary) or as a raw passthrough (`?raw=true`, which sets the original `Content-Type` for iframe/img embedding). A server-side guard rejects any URL whose hostname is not one of `www.sec.gov`, `sec.gov`, `efts.sec.gov`, or `edgar.sec.gov`, so the passthrough is restricted to SEC domains. SEC's `/ix?doc=` inline-XBRL viewer wrapper is stripped automatically, so a URL copied directly from a browser's address bar (which often has `/ix?doc=` prepended) works without manual cleanup. HTML responses get the same `_fix_sec_html` post-processing as `/{accession_number}/html` (`<noscript>` blocks stripped, relative URLs made absolute).
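
The allowlist and wrapper-stripping rules described above can be mirrored client-side to catch an invalid URL before making a request. A minimal Python sketch, assuming the wrapper always has the form `https://<host>/ix?doc=<path>`; the helper names `strip_ix_wrapper` and `is_sec_url` are hypothetical, and the server's actual implementation may differ:

```python
from urllib.parse import urlsplit, parse_qs

# Hostnames listed in this page's allowlist.
ALLOWED_HOSTS = {"www.sec.gov", "sec.gov", "efts.sec.gov", "edgar.sec.gov"}


def strip_ix_wrapper(url: str) -> str:
    """Turn https://host/ix?doc=/Archives/... into https://host/Archives/..."""
    parts = urlsplit(url)
    if parts.path == "/ix":
        doc = parse_qs(parts.query).get("doc")
        if doc:
            return f"{parts.scheme}://{parts.netloc}{doc[0]}"
    return url  # no wrapper found: return the URL unchanged


def is_sec_url(url: str) -> bool:
    """True only when the hostname exactly matches an allowlisted SEC host."""
    return urlsplit(url).hostname in ALLOWED_HOSTS
```

Matching on the exact hostname rather than on a substring means a lookalike host such as `sec.gov.example.com` fails the check.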

Common use case

Chain: GET /filings/{accession} → pick a URL from all_links → GET /document?url={url}. Example: fetch an exhibit agreement or an XBRL data file from a 10-K filing package.
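
The chain above might look like the following in Python. This is a sketch under assumptions: `get_json` is a hypothetical caller-supplied function that performs an authenticated GET and returns the parsed JSON body, the base URL is taken from the curl sample on this page, and `all_links` is assumed to be a list of URL strings under `data` in the filing response (check the filings endpoint's schema for its real shape).

```python
from urllib.parse import urlencode

BASE = "https://api.finradar.ai/api/v1/sec"  # host taken from the curl sample


def fetch_linked_document(get_json, accession_number, suffix):
    """Fetch the first document in a filing whose URL ends with `suffix`."""
    # Step 1: filing metadata.
    # ASSUMPTION: all_links is a list of URL strings under "data".
    filing = get_json(f"{BASE}/filings/{accession_number}")
    links = filing["data"]["all_links"]

    # Step 2: pick a URL from all_links.
    matches = [u for u in links if u.lower().endswith(suffix.lower())]
    if not matches:
        raise LookupError(f"no document ending in {suffix!r} in this filing")

    # Step 3: fetch it via /document; urlencode escapes the nested URL.
    return get_json(f"{BASE}/document?" + urlencode({"url": matches[0]}))
```

Injecting `get_json` keeps authentication and HTTP details out of the chaining logic, so the same function works with any HTTP client.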

Fetches any SEC EDGAR document by URL — exhibits (EX-10.1, EX-99.1), XBRL data files (.xml, .xsd), inline XBRL HTML, plain-text filings, images, PDFs. The hostname allowlist (sec.gov, www.sec.gov, efts.sec.gov, edgar.sec.gov) prevents this endpoint from being used as a generic URL-fetcher proxy — non-SEC URLs return 400 BAD_REQUEST. Two response modes: (1) JSON-wrapped (default) — content in data.content with encoding field telling you whether to treat it as UTF-8 text or base64-encoded binary; useful for LLM ingestion or programmatic parsing; (2) raw passthrough (?raw=true) — body returned directly with original Content-Type header; useful for <iframe src=...> or <img src=...> embedding. Get URLs from the all_links array on GET /api/v1/sec/filings/{accession_number}. For the primary document of a filing, the shortcut endpoint GET /api/v1/sec/filings/{accession_number}/html avoids the metadata round-trip. For surgical item-level extraction from 10-K/20-F (e.g. just Item 1A Risk Factors) use GET /api/v1/scrapping/extractor.

Parameters

| Name | In | Required | Default | Allowed | Description | Example |
| --- | --- | --- | --- | --- | --- | --- |
| url | query | required | | | Full SEC EDGAR document URL. Must be on `sec.gov`, `www.sec.gov`, `efts.sec.gov`, or `edgar.sec.gov`; returns 400 BAD_REQUEST when the hostname is anything else. Get these URLs from the `all_links` array in the response of `GET /api/v1/sec/filings/{accession_number}`. The `/ix?doc=` inline-XBRL viewer wrapper is auto-stripped, so either form can be passed. | https://www.sec.gov/Archives/edgar/data/320193/000032019325000123/aapl-20250928.htm |
| raw | query | optional | false | | If `true`, returns raw content with the original `Content-Type` header (for iframe rendering or direct binary download). If `false` or omitted, returns a JSON envelope with content in `data.content` (UTF-8 for text, base64 for binary). For LLM ingestion of HTML/XML use the default JSON mode; for `<iframe>` or `<img>` embedding use `raw=true`. | false |
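
Because the `url` parameter is itself a URL, it should be percent-encoded when placed in the query string; otherwise a `?` or `&` inside it (as in the `/ix?doc=` form) would be read as part of the API's own query. A minimal Python sketch, where `build_document_url` is a hypothetical helper, the host comes from the curl sample on this page, and authentication is omitted:

```python
from urllib.parse import urlencode

ENDPOINT = "https://api.finradar.ai/api/v1/sec/document"  # host from the curl sample


def build_document_url(sec_url: str, raw: bool = False) -> str:
    """Build the request URL, percent-encoding the nested SEC URL."""
    params = {"url": sec_url}
    if raw:
        # Only send raw when requested; omitting it selects the JSON envelope.
        params["raw"] = "true"
    return f"{ENDPOINT}?{urlencode(params)}"
```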

Response schema

| Field | Type | Nullable | Description |
| --- | --- | --- | --- |
| status | string | no | Always `"success"` on 2xx in JSON-wrapped mode. ApiResponse envelope marker. NOT present when `raw=true` (raw mode returns the document body directly, no JSON envelope). |
| request_id | string | yes | Per-request UUID generated server-side. JSON-wrapped mode only. |
| timestamp | string | no | ISO-8601 UTC timestamp at which the response was generated. JSON-wrapped mode only. |
| data | object | no | Document content envelope (JSON-wrapped mode only; see `data.*`). Absent in raw mode (`?raw=true`). |
| data.url | string | no | Echoed (and `/ix?doc=`-stripped) URL of the document fetched. May differ from the input URL when SEC's viewer wrapper was present. |
| data.original_url | string | yes | Original URL as passed by the caller, ONLY present when the server stripped a `/ix?doc=` wrapper. Null when no rewrite occurred. |
| data.content_type | string | no | Detected MIME type based on URL extension: `text/html` (.htm, .html), `application/xml` (.xml, .xsd), `application/json` (.json), `text/plain` (.txt), `image/jpeg` / `image/png` / `image/gif` / `application/pdf` (binary types), `application/octet-stream` (unknown extensions). Use to dispatch the correct decoder client-side. |
| data.size | integer | no | Decoded content size in bytes. For text content this is the UTF-8 byte length; for binary content this is the post-base64-decode byte length (the actual file size, not the inflated base64-string length). |
| data.encoding | string | no | `utf-8` for text content (`text/*`, `application/xml`, `application/json`): `data.content` is decoded text. `base64` for binary content (PDFs, images): `data.content` is base64-encoded; decode client-side before consuming. |
| data.content | string | no | Document content. UTF-8 plain text for text content types; base64-encoded string for binary types (decode via `Buffer.from(content, 'base64')` in Node, `atob(content)` in browsers, or `base64.b64decode(content)` in Python). HTML content is post-processed: `<noscript>` blocks stripped, relative `href`/`src` URLs rewritten to absolute `https://www.sec.gov/...` so embedded rendering works outside sec.gov. |
| (raw mode body) | string | no | When `?raw=true`: the document body is returned directly with the original `Content-Type` header (no JSON envelope). For binary types (PDFs, images), the body is the raw bytes. For HTML, the body is the post-processed UTF-8 HTML. Use raw mode for `<iframe src=...>` or `<img src=...>` embedding patterns. |
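
In JSON-wrapped mode, the `data.encoding` field decides how `data.content` should be handled. A minimal Python sketch of that dispatch; `decode_content` and `byte_length` are hypothetical helper names, and `data` is assumed to be the already-parsed `data` object from the response:

```python
import base64


def decode_content(data: dict):
    """Return str for utf-8 content and bytes for base64 content."""
    encoding = data["encoding"]
    if encoding == "utf-8":
        return data["content"]  # already decoded text
    if encoding == "base64":
        return base64.b64decode(data["content"])  # raw file bytes
    raise ValueError(f"unexpected encoding: {encoding!r}")


def byte_length(data: dict) -> int:
    """Decoded size in bytes, for comparison with data["size"]."""
    decoded = decode_content(data)
    if isinstance(decoded, str):
        return len(decoded.encode("utf-8"))
    return len(decoded)
```

Comparing `byte_length(data)` with `data["size"]` is one way to detect a truncated payload, given that the schema describes `size` as the decoded byte length.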

Sample response

{
  "status": "success",
  "request_id": "0f14ed05-3a2e-4b76-9c11-1a7c8b3f6de2",
  "timestamp": "2026-05-02T16:30:14.122Z",
  "data": {
    "url": "https://www.sec.gov/Archives/edgar/data/320193/000032019325000123/aapl-20250928.htm",
    "original_url": null,
    "content_type": "text/html",
    "size": 12943821,
    "encoding": "utf-8",
    "content": "<html>\n<head><title>Apple Inc. Form 10-K</title></head>\n<body>\n<h1>UNITED STATES SECURITIES AND EXCHANGE COMMISSION</h1>\n<h2>FORM 10-K</h2>\n<p>Annual report pursuant to Section 13 or 15(d) of the Securities Exchange Act of 1934. For the fiscal year ended September 28, 2025.</p>\n<h2>PART I — Item 1. Business</h2>\n<p>The Company designs, manufactures and markets smartphones, personal computers, tablets, wearables and accessories...</p>\n..."
  }
}

Errors

| Status | Label | Description |
| --- | --- | --- |
| 200 | OK | Request succeeded. |
| 400 | Bad Request | Invalid query, body, or path parameter. |
| 401 | Unauthorized | Missing or invalid Authorization header / api_Token. |
| 402 | Payment Required | Insufficient token balance for this call. |
| 429 | Too Many Requests | Rate limit exceeded for your tier (see /pricing for tier limits). |
| 500 | Server Error | Unexpected server-side failure. Retry with backoff; report if persistent. |
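
The table recommends retrying with backoff on 500. A minimal Python sketch of exponential backoff follows. Treating 429 as retryable as well is an assumption (a common convention, not something this page states), the delay values are illustrative, and `send` is a hypothetical caller-supplied function that performs one request and returns a `(status, body)` tuple:

```python
import time

# ASSUMPTION: 429 is retried alongside 500; 400/401/402 are not,
# since repeating the same request would not fix them.
RETRYABLE = {429, 500}


def call_with_backoff(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns 200, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        if status not in RETRYABLE or attempt == max_attempts - 1:
            raise RuntimeError(f"request failed with HTTP {status}")
        sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ... by default
```

Passing `sleep` as a parameter makes the function easy to exercise in tests without real delays.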

Code samples

curl "https://api.finradar.ai/api/v1/sec/document?url={sec_url}&api_Token=YOUR_API_KEY" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Generate an API key in /account/credentials to run live queries (literal YOUR_API_KEY placeholder shown until then).
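
A rough Python equivalent of the curl sample, using only the standard library. It mirrors the sample in sending both the `api_Token` query parameter and the `Authorization` header; whether both are required is not stated here. `build_request` is a hypothetical helper name:

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(sec_url: str, api_key: str, jwt_token: str) -> Request:
    """Build a GET request for /document in the default JSON-wrapped mode."""
    # urlencode percent-encodes the nested SEC URL and joins parameters with "&".
    query = urlencode({"url": sec_url, "api_Token": api_key})
    return Request(
        f"https://api.finradar.ai/api/v1/sec/document?{query}",
        headers={"Authorization": f"Bearer {jwt_token}"},
    )
```

To send it, pass the result to `urllib.request.urlopen` and parse the body with `json.load`.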