/api/v1/scrapping/search/
Full-Text Search
25 tokens · Since v1.0.0
Why use this
Full-text search across the actual extracted text content of every SEC filing — 10-K Risk Factors, 10-Q MD&A, 8-K material event narratives, exhibit text, and proxy statements. The differentiator vs metadata-only `/scrapping/query/`: this scans the words IN the filings, not just the filing headers. Drives keyword surveillance workflows (e.g. 'every 8-K mentioning bankruptcy filed this week', 'every 10-K with going-concern doubt language', 'every proxy mentioning a specific named executive'). Returns hit windows with the matching phrase wrapped in `<em>` tags for direct highlight rendering. Use this to locate the filings, then call `/scrapping/extractor` to retrieve the full item once you've found the right hit.
Common use case
Searching for specific keywords like 'bankruptcy' or 'merger' across millions of filings.
Full-text search across extracted filing text (10-K/10-Q/8-K/20-F items, exhibits, MD&A), in contrast to the metadata-only query endpoints. Returns hit windows with the matching phrase highlighted. The 25-token cost reflects the heavy Elasticsearch query against the extracted text. Pair with GET /api/v1/scrapping/extractor to retrieve a full item once you have located the right filing.
Parameters
| Name | In | Required | Default | Allowed | Description | Example |
|---|---|---|---|---|---|---|
| query | body | required | — | — | Search term — single word, multi-word phrase (auto-quoted as a phrase search), or quoted exact-match string. Boolean operators NOT supported on this endpoint (use `/scrapping/query/` for boolean syntax). Multi-word queries match the phrase exactly, not the bag-of-words. Stemming and case-folding apply (`bankruptcy` matches `bankrupt` and `BANKRUPTCY`). | going concern |
| limit | body | optional | 10 | — | Maximum hit windows returned, capped at 50 server-side. Each hit is a 200-300 character text window centered on the match — for narrower scrolls, drop limit to 5-10; for sweeping audits, set to 50 and paginate via `meta.next_cursor`. | 20 |
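The two parameters above can be validated client-side before spending tokens on a call. The following is a minimal sketch, not part of the API itself: the helper name `build_search_payload` is hypothetical, and it assumes the body is sent as a JSON object with the keys `query` and `limit`.

```python
import json

MAX_LIMIT = 50      # server-side cap stated in the parameter table
DEFAULT_LIMIT = 10  # documented default


def build_search_payload(query: str, limit: int = DEFAULT_LIMIT) -> dict:
    """Return a request body dict (hypothetical helper, JSON body assumed)."""
    phrase = query.strip()
    if not phrase:
        raise ValueError("query must be a non-empty string")
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # Clamp locally so the caller sees the same limit the server would apply.
    return {"query": phrase, "limit": min(limit, MAX_LIMIT)}


if __name__ == "__main__":
    print(json.dumps(build_search_payload("going concern", limit=20)))
```

Clamping locally is a design choice: it keeps the page size in client code in step with the documented server cap rather than silently receiving fewer hits than requested.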
Response schema
| Field | Type | Nullable | Description |
|---|---|---|---|
| hits | array | no | Array of text-match windows, sorted by relevance (BM25-style scoring) then `filed_date DESC` as a tiebreaker. A single filing can contribute multiple hits if the query phrase appears in multiple sections. Empty array on no match — never null. |
| hits[].accession_number | string | no | SEC accession number of the filing containing this hit. Pass to `GET /api/v1/sec/filings/{accession_number}` for filing metadata, or to `GET /api/v1/scrapping/extractor` (with the matching `section`) to retrieve the full item text. |
| hits[].ticker | string | yes | Resolved ticker for the issuer (canonical hyphen form). Null when the filer has no public-equity ticker (private funds, individuals, foreign issuers without ADRs). Useful for drilling: filter hits to S&P 500 issuers client-side via the resolved ticker. |
| hits[].form_type | string | no | Canonical SEC form type containing the hit (e.g. `10-K`, `10-Q`, `8-K`, `DEF 14A`, `S-1`, `20-F`). Filter client-side to narrow the corpus by form (e.g. only `8-K`s for material-event surveillance). |
| hits[].filed_date | string | no | ISO `YYYY-MM-DD` filing acceptance date in ET. Sort hit lists by this descending to surface most-recent occurrences. Newer hits typically score lower than older hits in BM25 unless the recency-boost flag is on (off by default). |
| hits[].section | string | no | Item or exhibit identifier where the match was located (e.g. `Item 7` for MD&A, `Item 1A` for Risk Factors, `EX-99.1` for press-release exhibits, `EX-10.1` for material contracts). Pass this directly to `/scrapping/extractor`'s `item` parameter to retrieve the full section. |
| hits[].snippet | string | no | 200-300 character text window centered on the match, with the query phrase wrapped in HTML `<em>` tags for direct rendering. Edge characters may be word-broken — for clean rendering use a CSS truncation/ellipsis pattern. Plain text otherwise (no other HTML). |
| meta | object | no | Result metadata: `{ total: integer, next_cursor: string|null }`. `total` is the full-corpus match count (cap-aware — values > 10000 surface as `gte`-style estimates). `next_cursor` is the opaque pagination cursor for the next page; null on last page. |
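The schema above suggests client-side filtering by `form_type` and re-sorting by `filed_date`. A minimal sketch of that post-processing follows; the function names are hypothetical and the `SAMPLE` data is invented purely for illustration (it is not real API output).

```python
import re
from datetime import date

EM_TAG = re.compile(r"</?em>")  # the only HTML the snippet field is documented to contain


def plain_snippet(snippet: str) -> str:
    """Strip the <em> highlight tags for plain-text display."""
    return EM_TAG.sub("", snippet)


def hits_for_form(response: dict, form_type: str) -> list:
    """Keep hits of one form type, most recently filed first."""
    matches = [h for h in response.get("hits", []) if h["form_type"] == form_type]
    return sorted(matches, key=lambda h: date.fromisoformat(h["filed_date"]), reverse=True)


# Invented example data shaped like the documented schema.
SAMPLE = {
    "hits": [
        {"accession_number": "0000000000-24-000001", "ticker": None, "form_type": "8-K",
         "filed_date": "2024-03-15", "section": "EX-99.1",
         "snippet": "substantial doubt about <em>going concern</em> status"},
        {"accession_number": "0000000000-24-000002", "ticker": "ABC", "form_type": "10-K",
         "filed_date": "2024-04-01", "section": "Item 1A",
         "snippet": "raise <em>going concern</em> doubt"},
        {"accession_number": "0000000000-24-000003", "ticker": "XYZ", "form_type": "8-K",
         "filed_date": "2024-05-02", "section": "EX-99.1",
         "snippet": "a <em>going concern</em> qualification"},
    ],
    "meta": {"total": 3, "next_cursor": None},
}

if __name__ == "__main__":
    for hit in hits_for_form(SAMPLE, "8-K"):
        print(hit["filed_date"], plain_snippet(hit["snippet"]))
```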
Sample response
{
  "hits": [...],
  "meta": {
    "total": 412,
    "next_cursor": "eyJpZCI6..."
  }
}
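Pagination via `meta.next_cursor` can be sketched as a loop that stops when the cursor comes back null. This page does not document how the cursor is sent back to the server, so the sketch below leaves that to a caller-supplied `fetch_page` function (hypothetical); `max_pages` is also an assumption, added because every page costs 25 tokens.

```python
from typing import Callable, Iterator, Optional


def iter_hits(fetch_page: Callable[[Optional[str]], dict],
              max_pages: int = 10) -> Iterator[dict]:
    """Yield hits page by page until next_cursor is null or max_pages is reached."""
    cursor = None  # first page: no cursor yet
    for _ in range(max_pages):
        page = fetch_page(cursor)
        yield from page.get("hits", [])
        cursor = page.get("meta", {}).get("next_cursor")
        if cursor is None:  # documented as null on the last page
            break


if __name__ == "__main__":
    # Fake two-page backend standing in for real API calls.
    pages = {
        None: {"hits": [{"id": 1}, {"id": 2}], "meta": {"total": 3, "next_cursor": "c1"}},
        "c1": {"hits": [{"id": 3}], "meta": {"total": 3, "next_cursor": None}},
    }
    print([h["id"] for h in iter_hits(lambda c: pages[c])])
```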
Errors
| Status | Label | Description |
|---|---|---|
| 200 | OK | Request succeeded. |
| 400 | Bad Request | Invalid query, body, or path parameter. |
| 401 | Unauthorized | Missing or invalid Authorization header / api_Token. |
| 402 | Payment Required | Insufficient token balance for this call. |
| 429 | Too Many Requests | Rate limit exceeded for your tier (see /pricing for tier limits). |
| 500 | Server Error | Unexpected server-side failure. Retry with backoff; report if persistent. |
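One way a client might act on the status codes above is sketched below. The table only recommends retry with backoff for 500; treating 429 as retryable as well, and the specific delay schedule, are assumptions of this sketch rather than documented behavior. Function names are hypothetical.

```python
RETRYABLE = {429, 500}  # assumption: rate-limit responses are also worth retrying


def next_action(status: int) -> str:
    """Map an HTTP status to 'ok', 'retry', or 'fail'."""
    if status == 200:
        return "ok"
    if status in RETRYABLE:
        return "retry"
    # 400, 401, 402 and anything unexpected: retrying will not help.
    return "fail"


def backoff_delays(max_retries: int = 4, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential backoff schedule in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


if __name__ == "__main__":
    for code in (200, 402, 429, 500):
        print(code, next_action(code))
    print(backoff_delays())
```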
Code samples
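curl -X POST "https://api.finradar.ai/api/v1/scrapping/search/?api_Token=YOUR_API_KEY" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{"query": "going concern", "limit": 20}'
Generate an API key in /account/credentials to run live queries, then replace the YOUR_API_KEY placeholder. The JSON body shown assumes the endpoint accepts `application/json`.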
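For Python clients, the same request can be assembled with the standard library. This is a sketch under the same assumptions as the curl call: the `api_Token` query parameter and Bearer header are taken from the sample above, while the JSON body encoding and the helper name `build_request` are assumptions. The network call itself is left commented out so the block runs without credentials.

```python
import json
import urllib.request

BASE_URL = "https://api.finradar.ai/api/v1/scrapping/search/"


def build_request(api_key: str, jwt: str, query: str,
                  limit: int = 10) -> urllib.request.Request:
    """Build (but do not send) the POST request for a full-text search."""
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}?api_Token={api_key}",
        data=body,
        headers={
            "Authorization": f"Bearer {jwt}",
            "Content-Type": "application/json",  # assumption: JSON body
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_request("YOUR_API_KEY", "YOUR_JWT_TOKEN", "going concern", limit=20)
    # To send it (costs 25 tokens per call):
    # with urllib.request.urlopen(req, timeout=30) as resp:
    #     data = json.load(resp)
    print(req.get_method(), req.full_url)
```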