POST /api/v1/scrapping/pdf
Download filing as PDF, HTML, or plain text. Auto-detects CIK from the SEC URL (or accepts an explicit `cik` body field as fallback). iXBRL viewer links (containing `/ix?doc=`) are normalized server-side.
25 tokens
Since v1.0.0
Why use this
Render any SEC filing or exhibit URL to PDF, HTML, or plain text for archival, investor memos, regulatory submissions, or human-readable distribution. The endpoint runs a headless Chrome render under the hood and supports the full SEC filing surface: 10-K/10-Q reports (often 100-400 pages), 8-Ks with embedded press-release exhibits, proxy statements with vote tabulations, and S-1 prospectuses with financial-statement tables. iXBRL viewer URLs (`https://www.sec.gov/ix?doc=/...`) are automatically rewritten to point at the underlying document.

The response contains a signed S3 URL valid for 24 hours; fetch the actual file from there. The 25-token cost reflects the heavy headless-Chrome render. For raw cached HTML/PDF without re-rendering, use `GET /api/v1/scrapping/public/{filepath}`; for surgical item-level extraction, use `/scrapping/extractor`.
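The URL handling described above can be mirrored client-side, for example to log which document and CIK a request will resolve to. This is a minimal sketch, not the server's actual logic: the function names `normalize_sec_url` and `detect_cik` are hypothetical, and the assumption that the CIK is the numeric segment after `/Archives/edgar/data/` comes from the standard EDGAR archive path layout.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit

SEC_ORIGIN = "https://www.sec.gov"


def normalize_sec_url(url: str) -> str:
    """Turn an iXBRL viewer link (/ix?doc=/...) into the underlying document URL.

    Assumption: the server performs an equivalent rewrite; this only mirrors it.
    """
    parts = urlsplit(url)
    if parts.path.rstrip("/") == "/ix":
        doc = parse_qs(parts.query).get("doc", [""])[0]
        if doc.startswith("/"):
            return SEC_ORIGIN + doc
    # Anything that is not a viewer link is returned unchanged.
    return url


def detect_cik(url: str) -> Optional[str]:
    """Pull the CIK out of an /Archives/edgar/data/<cik>/ path, if present."""
    match = re.search(r"/Archives/edgar/data/(\d+)/", normalize_sec_url(url))
    return match.group(1) if match else None
```

When `detect_cik` returns `None`, that is a reasonable moment to supply the explicit `cik` body field mentioned in the summary.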
Common use case
Generating a PDF for an investor presentation.
Renders any filing or exhibit URL to a PDF and returns a signed download link valid for 24 hours. Use it when you need archival-quality copies of filings (e.g. saving the 10-K MD&A as a PDF for an investor memo). This is a heavy operation that runs headless Chrome under the hood, which is why it costs 25 tokens. Prefer `GET /api/v1/scrapping/public/{filepath}` when you need the raw HTML/PDF without re-rendering.
Parameters
| Name | In | Required | Default | Allowed | Description | Example |
|---|---|---|---|---|---|---|
| link | body | required | — | — | Direct SEC document URL — the filing's primary HTML document (NOT the index page; must be the actual content URL). Server-side rejected if not on the `sec.gov` domain. iXBRL viewer URLs (`https://www.sec.gov/ix?doc=/...`) are auto-normalized to the underlying document URL — pass either form. | https://www.sec.gov/Archives/edgar/data/320193/000032019324000123/aapl-20240928.htm |
| type | body | optional | `pdf` | `pdf`, `html`, `txt` | Output format. `pdf` (default): full headless-Chrome render with embedded fonts, tables, and images preserved. `html`: cleaned HTML with embedded styles inlined (lighter weight, smaller files). `txt`: plain-text strip-down for NLP pipelines and LLM grounding (loses table structure but produces tiny files). | — |
| fileName | body | optional | — | — | Custom filename for the download (extension auto-appended based on `type`). Useful for archival workflows where you want predictable filenames (e.g. `{ticker}-{form_type}-{fiscal_year}`). Defaults to a hash-based filename derived from the source URL when omitted. | AAPL-10K-2025 |
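The parameter rules in the table can be checked before spending tokens on a call. The sketch below builds the request body and rejects inputs the table says the server would refuse. `build_payload` is a hypothetical helper, and the exact host check the server applies is not documented, so the `sec.gov` test here is an assumption.

```python
from typing import Optional
from urllib.parse import urlsplit

ALLOWED_TYPES = {"pdf", "html", "txt"}


def build_payload(link: str, output_type: str = "pdf",
                  file_name: Optional[str] = None) -> dict:
    """Assemble the body fields for /api/v1/scrapping/pdf after basic validation."""
    host = (urlsplit(link).hostname or "").lower()
    # Assumption: "on the sec.gov domain" means sec.gov or any subdomain of it.
    if host != "sec.gov" and not host.endswith(".sec.gov"):
        raise ValueError(f"link must be on sec.gov, got host {host!r}")
    if output_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}")
    payload = {"link": link, "type": output_type}
    if file_name:
        # The extension is appended server-side, so pass the bare name.
        payload["fileName"] = file_name
    return payload
```

Omitting `file_name` leaves `fileName` out of the body entirely, so the hash-based default filename applies.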
Response schema
| Field | Type | Nullable | Description |
|---|---|---|---|
| pdf_url | string | no | Signed S3 URL for the rendered output (PDF, HTML, or TXT — name is `pdf_url` for legacy reasons regardless of `type`). Valid for 24 hours from `rendered_at` — fetch the file before `expires_at`. Direct download — no auth required for the signed URL itself, the signature handles authorization. |
| page_count | integer | no | Number of pages in the rendered output. For PDF: actual page count (10-K outputs typically 100-400, proxy statements 30-150, 8-K with exhibits 5-50). For HTML/TXT: synthetic page count assuming ~3000 chars per page. Useful for client-side cost estimation when feeding the document into LLMs. |
| rendered_at | string | no | ISO-8601 UTC timestamp the headless-Chrome render completed. Renders are cached server-side for 7 days keyed by `(link, type)`; values older than 7 days indicate a fresh re-render was triggered (cold path). |
| expires_at | string | no | ISO-8601 UTC timestamp the signed S3 URL expires (always `rendered_at + 24h`). After expiry the URL returns 403 — re-call this endpoint to mint a new signed URL (cached render path; 25-token cost still applies but render itself is reused for 7 days). |
Sample response
{
  "pdf_url": "https://finradar-pdf.s3.amazonaws.com/AAPL-10K-2025.pdf?X-Amz-Signature=...",
  "page_count": 124,
  "rendered_at": "2026-05-01T20:55:12.000Z",
  "expires_at": "2026-05-02T20:55:12.000Z"
}
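A client will usually want to know how long the signed URL remains usable before fetching it. The sketch below parses the timestamps from the sample response and computes the remaining lifetime. The helper names are hypothetical, and `approx_chars` simply inverts the documented ~3000 chars-per-page figure, so it is only meaningful for `html` and `txt` outputs.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse the ISO-8601 UTC timestamps used in the response."""
    # Older Python versions reject a trailing "Z" in fromisoformat(),
    # so replace it with an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def seconds_until_expiry(response: dict, now: datetime) -> float:
    """Seconds left before the signed URL starts returning 403 (negative if expired)."""
    return (parse_ts(response["expires_at"]) - now).total_seconds()


def approx_chars(response: dict, chars_per_page: int = 3000) -> int:
    """Rough character count implied by a synthetic (HTML/TXT) page_count."""
    return response["page_count"] * chars_per_page


sample = {
    "pdf_url": "https://finradar-pdf.s3.amazonaws.com/AAPL-10K-2025.pdf?X-Amz-Signature=...",
    "page_count": 124,
    "rendered_at": "2026-05-01T20:55:12.000Z",
    "expires_at": "2026-05-02T20:55:12.000Z",
}

if __name__ == "__main__":
    remaining = seconds_until_expiry(sample, datetime.now(timezone.utc))
    print("expired" if remaining <= 0 else f"{remaining:.0f}s left")
```

Passing `now` explicitly keeps the check deterministic, which makes it easier to test than reading the clock inside the function.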
Errors
| Status | Label | Description |
|---|---|---|
| 200 | OK | Request succeeded. |
| 400 | Bad Request | Invalid query, body, or path parameter. |
| 401 | Unauthorized | Missing or invalid Authorization header / api_Token. |
| 402 | Payment Required | Insufficient token balance for this call. |
| 429 | Too Many Requests | Rate limit exceeded for your tier (see /pricing for tier limits). |
| 500 | Server Error | Unexpected server-side failure. Retry with backoff; report if persistent. |
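The status table suggests a simple client policy: retry 500 with backoff, and fail fast on 400, 401, and 402. A minimal sketch follows. Treating 429 as retryable and the exponential delay schedule are assumptions, not documented behavior, and `send` is a hypothetical callable that performs one request and returns `(status, body)`.

```python
import time
from typing import Callable, Tuple

# 500 is documented as "retry with backoff"; retrying 429 is an assumption.
RETRYABLE = {429, 500}


def call_with_backoff(send: Callable[[], Tuple[int, dict]],
                      max_attempts: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call send() until it returns 200, backing off exponentially on retryable errors."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        # Client errors (400/401/402) will not succeed on retry, so stop at once.
        if status not in RETRYABLE or attempt == max_attempts - 1:
            raise RuntimeError(f"request failed with HTTP {status}")
        sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... with the defaults
    raise RuntimeError("no attempts were made")
```

Injecting `sleep` lets tests run without real delays. A production client might also add jitter to the delay.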
Code samples
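# The JSON body format is an assumption; the parameter table only states that the fields go in the body.
curl -X POST "https://api.finradar.ai/api/v1/scrapping/pdf?api_Token=YOUR_API_KEY" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"link": "https://www.sec.gov/Archives/edgar/data/320193/000032019324000123/aapl-20240928.htm", "type": "pdf"}'
Generate an API key in /account/credentials to run live queries (the literal YOUR_API_KEY placeholder is shown until then).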
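For Python clients, the same request can be sketched with the standard library alone. This mirrors the curl sample's URL, `api_Token` query parameter, and Bearer header. Sending the parameters as a JSON body is an assumption, and `build_request` and `render_and_download` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_URL = "https://api.finradar.ai/api/v1/scrapping/pdf"


def build_request(link: str, api_key: str, jwt: str,
                  output_type: str = "pdf") -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl sample."""
    # Assumption: the body is JSON; the docs only say the fields are body fields.
    body = json.dumps({"link": link, "type": output_type}).encode("utf-8")
    return urllib.request.Request(
        API_URL + "?" + urlencode({"api_Token": api_key}),
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {jwt}",
            "Content-Type": "application/json",
        },
    )


def render_and_download(link: str, api_key: str, jwt: str, dest_path: str) -> dict:
    """Call the endpoint, then fetch the rendered file from the signed URL."""
    with urllib.request.urlopen(build_request(link, api_key, jwt)) as resp:
        result = json.load(resp)
    # The signed S3 URL carries its own authorization, so no headers are needed.
    urllib.request.urlretrieve(result["pdf_url"], dest_path)
    return result
```

Each call to `render_and_download` costs 25 tokens, so it is worth storing the downloaded file rather than re-requesting it.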