/api/v1/scrapping/pdf

Download filing as PDF, HTML, or plain text.

Auto-detects the CIK from the SEC URL, or accepts an explicit `cik` body field as a fallback. iXBRL viewer links (those containing `/ix?doc=`) are normalized server-side.

25 tokens · Since v1.0.0

Why use this

Render any SEC filing or exhibit URL to PDF, HTML, or plain text for archival, investor memos, regulatory submissions, or human-readable distribution. The endpoint runs a headless Chrome render under the hood and supports the full SEC filing surface: 10-K/10-Q reports (often 100-400 pages), 8-Ks with embedded press-release exhibits, proxy statements with vote tabulations, and S-1 prospectuses with financial-statement tables. iXBRL viewer URLs (`https://www.sec.gov/ix?doc=/...`) are auto-cleaned to point at the underlying document. The response carries a signed S3 URL valid for 24 hours; fetch the actual file from there. The 25-token cost reflects the heavy headless-Chrome render. For raw cached HTML/PDF without re-rendering, use `GET /api/v1/scrapping/public/{filepath}`; for surgical item-level extraction, use `/scrapping/extractor`.
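The URL handling described above happens server-side, but it can help to see what it amounts to. The following is a minimal client-side sketch, not the service's actual implementation: `normalize_ix_url` and `cik_from_url` are hypothetical helper names, and the assumption that the CIK is the numeric segment after `/Archives/edgar/data/` is based on the example URL in this page.

```python
import re
from urllib.parse import parse_qs, urlsplit


def normalize_ix_url(url: str) -> str:
    """Rewrite an iXBRL viewer URL (/ix?doc=/...) to the underlying document URL."""
    parts = urlsplit(url)
    if parts.path == "/ix":
        doc = parse_qs(parts.query).get("doc", [""])[0]
        if doc.startswith("/"):
            return f"{parts.scheme}://{parts.netloc}{doc}"
    # Anything that is not a viewer link is returned unchanged.
    return url


def cik_from_url(url: str):
    """Return the CIK embedded in an EDGAR archive path, or None if absent."""
    match = re.search(r"/Archives/edgar/data/(\d+)/", url)
    return match.group(1) if match else None
```

Since the endpoint accepts either URL form, client-side normalization is optional; it is mainly useful for deduplicating links before spending tokens on a render.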

Common use case

Generating a PDF for an investor presentation.

Renders any filing or exhibit URL to a PDF and returns a signed download link valid for 24 hours. Use it when you need archival-quality copies of filings (e.g. saving the 10-K MD&A as a PDF for an investor memo). This is a heavy operation that runs headless Chrome under the hood, which is why it costs 25 tokens. Prefer `GET /api/v1/scrapping/public/{filepath}` when you need the raw HTML/PDF without re-rendering.

Parameters

Name	In	Required	Default	Allowed	Description	Example
link	body	required	—	—	Direct SEC document URL: the filing's primary HTML document (the actual content URL, NOT the index page). Rejected server-side if not on the `sec.gov` domain. iXBRL viewer URLs (`https://www.sec.gov/ix?doc=/...`) are auto-normalized to the underlying document URL, so either form is accepted.	https://www.sec.gov/Archives/edgar/data/320193/000032019324000123/aapl-20240928.htm
type	body	optional	pdf	pdf, html, txt	Output format. `pdf` (default): full headless-Chrome render with embedded fonts, tables, and images preserved. `html`: cleaned HTML with styles inlined (lighter weight, smaller files). `txt`: plain-text strip-down for NLP pipelines and LLM grounding (loses table structure but produces tiny files).	pdf
fileName	body	optional	—	—	Custom filename for the download (extension auto-appended based on `type`). Useful for archival workflows where you want predictable filenames (e.g. `{ticker}-{form_type}-{fiscal_year}`). Defaults to a hash-based filename derived from the source URL when omitted.	AAPL-10K-2025
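Because every call costs tokens, it may be worth validating the body locally before sending it. The sketch below assembles the request body from the parameters in the table; `build_payload` is a hypothetical helper, and the local domain check only approximates the server-side `sec.gov` rule, whose exact behavior is not documented here.

```python
from urllib.parse import urlsplit

ALLOWED_TYPES = {"pdf", "html", "txt"}


def build_payload(link, output_type="pdf", file_name=None, cik=None):
    """Build the JSON body for the render call, rejecting obviously bad input."""
    host = urlsplit(link).hostname or ""
    # Assumption: "on the sec.gov domain" means sec.gov or any subdomain of it.
    if host != "sec.gov" and not host.endswith(".sec.gov"):
        raise ValueError(f"link must be on the sec.gov domain, got {host!r}")
    if output_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}")
    payload = {"link": link, "type": output_type}
    if file_name:
        # The extension is appended server-side based on `type`.
        payload["fileName"] = file_name
    if cik:
        # Explicit fallback for when the CIK cannot be detected from the URL.
        payload["cik"] = str(cik)
    return payload
```

For example, `build_payload(url, file_name="AAPL-10K-2025")` yields a body with `link`, `type`, and `fileName`, leaving `cik` out unless it is passed explicitly.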

Response schema

Field	Type	Nullable	Description
pdf_url	string	no	Signed S3 URL for the rendered output (PDF, HTML, or TXT; the field is named `pdf_url` for legacy reasons regardless of `type`). Valid for 24 hours from `rendered_at`, so fetch the file before `expires_at`. The file downloads directly: the signed URL needs no additional auth because the signature itself handles authorization.
page_count	integer	no	Number of pages in the rendered output. For PDF: actual page count (10-K outputs typically 100-400, proxy statements 30-150, 8-K with exhibits 5-50). For HTML/TXT: synthetic page count assuming ~3000 chars per page. Useful for client-side cost estimation when feeding the document into LLMs.
rendered_at	string	no	ISO-8601 UTC timestamp at which the headless-Chrome render completed. Renders are cached server-side for 7 days, keyed by `(link, type)`; once a cached render is older than 7 days, the next call triggers a fresh re-render (cold path).
expires_at	string	no	ISO-8601 UTC timestamp at which the signed S3 URL expires (always `rendered_at + 24h`). After expiry the URL returns 403; re-call this endpoint to mint a new signed URL (cached render path: the 25-token cost still applies, but the render itself is reused for 7 days).
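The timestamp and page-count fields lend themselves to a couple of small client-side checks. The following sketch assumes the response has already been decoded into a dict with the fields above; `parse_ts`, `is_expired`, and `approx_chars` are hypothetical helper names, and the character estimate only makes sense for `html`/`txt` output, where the page count is synthetic at roughly 3000 characters per page.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value: str) -> datetime:
    """Parse an ISO-8601 UTC timestamp such as 2026-05-01T20:55:12.000Z."""
    # Older Python versions reject a trailing "Z", so use an explicit offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_expired(response: dict, now=None, margin=timedelta(0)) -> bool:
    """True if the signed URL has expired (or will within `margin`)."""
    now = now or datetime.now(timezone.utc)
    return now + margin >= parse_ts(response["expires_at"])


def approx_chars(response: dict, chars_per_page: int = 3000) -> int:
    """Rough character count for html/txt output, from the synthetic page count."""
    return response["page_count"] * chars_per_page
```

Checking `is_expired` before downloading avoids a 403 from S3; when it returns True, the endpoint has to be called again to mint a new signed URL.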

Sample response

{
  "pdf_url": "https://finradar-pdf.s3.amazonaws.com/AAPL-10K-2025.pdf?X-Amz-Signature=...",
  "page_count": 124,
  "rendered_at": "2026-05-01T20:55:12.000Z",
  "expires_at": "2026-05-02T20:55:12.000Z"
}

Errors

Status	Label	Description
200	OK	Request succeeded.
400	Bad Request	Invalid query, body, or path parameter.
401	Unauthorized	Missing or invalid Authorization header / api_Token.
402	Payment Required	Insufficient token balance for this call; top up to continue.
429	Too Many Requests	Rate limit exceeded for your tier (see /pricing for tier limits).
500	Server Error	Unexpected server-side failure. Retry with backoff; report if persistent.
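The table recommends retrying 500 responses with backoff. One possible shape for that logic is sketched below; treating 429 as retryable too is an assumption, as are the delay values, and `send` stands in for whatever function actually performs the HTTP call and returns a `(status, body)` pair.

```python
import time

# Assumption: 429 and 500 are transient; 400, 401, and 402 need a fix, not a retry.
RETRYABLE_STATUSES = {429, 500}


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def call_with_retry(send, max_attempts: int = 4, sleep=time.sleep):
    """Call `send()` until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt == max_attempts - 1:
            return status, body
        sleep(backoff_delay(attempt))
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting. Whether failed attempts are billed is not stated in this page, so a low `max_attempts` seems a prudent default.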

Code samples

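curl -X POST "https://api.finradar.ai/api/v1/scrapping/pdf?api_Token=YOUR_API_KEY" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"link": "https://www.sec.gov/Archives/edgar/data/320193/000032019324000123/aapl-20240928.htm", "type": "pdf"}'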

Generate an API key in /account/credentials to run live queries; until then, the samples show the literal YOUR_API_KEY placeholder.
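For callers working in Python rather than the shell, the same request might be assembled with the standard library as sketched below. This mirrors the curl sample by sending both the `api_Token` query parameter and the bearer header; the JSON content type is an assumption, and `build_request` is a hypothetical helper that only constructs the request without sending it.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_URL = "https://api.finradar.ai/api/v1/scrapping/pdf"


def build_request(api_key, jwt_token, link, output_type="pdf", file_name=None):
    """Build (but do not send) the POST request for the render endpoint."""
    body = {"link": link, "type": output_type}
    if file_name:
        body["fileName"] = file_name
    return urllib.request.Request(
        f"{API_URL}?{urlencode({'api_Token': api_key})}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {jwt_token}",
            # Assumption: the endpoint expects a JSON-encoded body.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(req)` and decode the response body with `json.load`; the `pdf_url` field then points at the file to download.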

Related endpoints

/api/v1/scrapping/public/{filepath}	10 tokens
/api/v1/scrapping/extractor	25 tokens
/api/v1/scrapping/search/	25 tokens
/api/v1/scrapping/query/	25 tokens