---
title: Server-side reporting
description: Capture crawlers and AI bots with POST /v1/report from your origin
---

The JS beacon only sees clients that run JavaScript, so it never observes pure crawlers — most importantly the declared AI agents (GPTBot, ClaudeBot, PerplexityBot, Bytespider) that fetch your HTML without running scripts. To capture them, report each request **server-side** from your origin: a Cloudflare Worker, edge middleware, or any backend.

This guide shows the request shape, the fire-and-forget pattern that adds zero latency, and a complete Cloudflare Worker example.

## The endpoint

Send `POST https://api.formshield.dev/v1/report` with your publishable key as a Bearer token in the `Authorization` header and a JSON body of the **visitor's** signals. FormShield classifies and scores the request server-side and stores the observation.

```bash
curl -X POST https://api.formshield.dev/v1/report \
  -H "Authorization: Bearer fs_pub_live_…" \
  -H "Content-Type: application/json" \
  -d '{
    "ua": "Mozilla/5.0 (compatible; GPTBot/1.1; +https://openai.com/gptbot)",
    "ip": "203.0.113.42",
    "hostname": "example.com",
    "path": "/pricing",
    "action": "pageview"
  }'
```

Response:

```json
{ "ok": true, "request_id": "rpt_a1b2c3d4e5f6" }
```

<Warning>
  Send the **visitor's** UA and IP from the incoming request, not your server's. On Cloudflare, read the IP from the `CF-Connecting-IP` header and the UA from the `User-Agent` header of the request your origin received.
</Warning>

## Request body

<ParamField body="ua" type="string">
  The visitor's `User-Agent`. Drives user-agent classification, which names AI and search crawlers. Falls back to the request's `User-Agent` header if omitted.
</ParamField>

<ParamField body="ip" type="string">
  The visitor's IP address. Drives IP reputation (VPN, proxy, datacenter, scanner, country, ASN).
</ParamField>

<ParamField body="hostname" type="string">
  The host the visitor requested, e.g. `example.com`. Falls back to the `Origin` header.
</ParamField>

<ParamField body="path" type="string">
  The path the visitor requested, e.g. `/pricing`.
</ParamField>

<ParamField body="action" type="string" default="pageview">
  A label for the hit, stored on the observation.
</ParamField>

<ParamField body="referrer" type="string">
  Optional. The visitor's referrer.
</ParamField>
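
The fields above map directly onto an incoming request. As a minimal sketch, assuming a runtime with the standard `Headers` and `URL` globals (Workers, Node 18+), a helper could assemble the body as follows. `ReportBody` and `buildReportBody` are illustrative names rather than part of any FormShield SDK, and the `ipHeader` default is an assumption that you should change to whichever header your proxy populates.

```ts
// Hypothetical helper: the type and function names are illustrative only.
interface ReportBody {
  ua?: string
  ip?: string
  hostname: string
  path: string
  action: string
  referrer?: string
}

function buildReportBody(
  headers: Headers,
  url: URL,
  ipHeader = "CF-Connecting-IP", // assumption: adjust for your proxy
  action = "pageview",
): ReportBody {
  return {
    ua: headers.get("User-Agent") ?? undefined,
    ip: headers.get(ipHeader) ?? undefined,
    hostname: url.hostname,
    path: url.pathname, // the query string is not included
    action,
    referrer: headers.get("Referer") ?? undefined,
  }
}

// Example input mirroring the curl request shown earlier in this guide.
const body = buildReportBody(
  new Headers({
    "User-Agent": "Mozilla/5.0 (compatible; GPTBot/1.1; +https://openai.com/gptbot)",
    "CF-Connecting-IP": "203.0.113.42",
  }),
  new URL("https://example.com/pricing?plan=pro"),
)
console.log(JSON.stringify(body))
```

Fields left `undefined` are dropped by `JSON.stringify`, so a missing header simply results in an omitted field.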

## Response

<ResponseField name="ok" type="boolean">
  Always `true` on a `200`. The report was accepted for scoring.
</ResponseField>

<ResponseField name="request_id" type="string">
  An identifier for the stored observation, prefixed `rpt_`.
</ResponseField>

<Note>
  `/v1/report` does **not** return a verdict. Scoring happens server-side and the score, decision, and reasons are stored on the observation. View them in the dashboard Logs — never gate your response on this call.
</Note>
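
Because the response carries no verdict, about the only thing worth doing with it is recording `request_id`, for example next to your own request logs. The following is a small sketch of a type guard for the documented shape; `ReportAck` and `isReportAck` are hypothetical names introduced here for illustration.

```ts
// Hypothetical names: a guard for the documented { ok, request_id } shape.
interface ReportAck {
  ok: true
  request_id: string
}

function isReportAck(value: unknown): value is ReportAck {
  if (typeof value !== "object" || value === null) return false
  const v = value as Record<string, unknown>
  return v.ok === true && typeof v.request_id === "string" && v.request_id.startsWith("rpt_")
}

const sample: unknown = JSON.parse('{ "ok": true, "request_id": "rpt_a1b2c3d4e5f6" }')
if (isReportAck(sample)) {
  console.log(`report stored as ${sample.request_id}`)
}
```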

## Make it fire-and-forget

Reporting must add **zero latency** to your response, and your site must never depend on FormShield being up. The rule is: send the report in the background and return your real response immediately.

On Cloudflare Workers, `ctx.waitUntil(fetch(...))` keeps the worker invocation alive until the background fetch settles, while your response returns right away. Wrap the fetch in `try/catch` and swallow errors so a FormShield outage can never break your page.

```ts
interface Env {
  FORMSHIELD_KEY: string // set with `wrangler secret put FORMSHIELD_KEY`
}

// Builds the report from the incoming request and sends it. Never throws.
async function reportToFormShield(request: Request, url: URL, env: Env): Promise<void> {
  try {
    await fetch("https://api.formshield.dev/v1/report", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // In module Workers, secrets live on `env`, not in global scope.
        Authorization: `Bearer ${env.FORMSHIELD_KEY}`,
      },
      body: JSON.stringify({
        // JSON.stringify drops undefined fields, so a missing header is omitted
        ua: request.headers.get("User-Agent") ?? undefined,
        ip: request.headers.get("CF-Connecting-IP") ?? undefined,
        hostname: url.hostname,
        path: url.pathname,
        action: "pageview",
      }),
    })
  } catch {
    // never let the page depend on FormShield being up
  }
}

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const url = new URL(request.url)
    // Starts the report now; waitUntil lets it finish after the response is sent.
    ctx.waitUntil(reportToFormShield(request, url, env))
    return handleRequest(request, env, ctx) // your own handler; not blocked by the report
  },
}
```

Store `FORMSHIELD_KEY` as a Worker secret (`wrangler secret put FORMSHIELD_KEY`) — it is your publishable key, but keeping it out of source is good hygiene.

## Report from Node, Express, or Next.js

The pattern is identical on any backend: build the same body from the incoming request and **do not await it in the request path**. Start the fetch, attach a `.catch` so a failed report cannot surface as an unhandled rejection, and return your response.

```ts
function reportToFormShield(req: { headers: Record<string, string | undefined>; hostname: string; path: string }) {
  // Do not await — let it run in the background.
  void fetch("https://api.formshield.dev/v1/report", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.FORMSHIELD_KEY}`,
    },
    body: JSON.stringify({
      ua: req.headers["user-agent"],
      // x-forwarded-for is only trustworthy behind a proxy you control; the first entry is the client
      ip: req.headers["x-forwarded-for"]?.split(",")[0]?.trim(),
      hostname: req.hostname,
      path: req.path,
      action: "pageview",
    }),
  }).catch(() => {
    // swallow — reporting must never break the request
  })
}
```

<Warning>
  On a serverless platform, an unawaited fetch can be killed when the function returns. If your reports go missing, await the fetch but cap it with a short timeout (for example `AbortSignal.timeout(500)`) so a slow report never stalls the response. Cloudflare's `ctx.waitUntil` avoids this trade-off entirely.
</Warning>
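
The awaited-with-timeout variant described in the warning could look roughly like this. It is a sketch, assuming Node 18+ (global `fetch` and `AbortSignal.timeout`); `reportWithTimeout` and `FetchLike` are hypothetical names, and the injectable `fetchImpl` parameter exists only to make the helper easy to test.

```ts
type FetchLike = (input: string, init?: RequestInit) => Promise<Response>

// Hypothetical helper: awaits the report but gives up after timeoutMs.
// Resolves to true if the API answered with a 2xx status, false otherwise.
// Never throws, so it is safe to await in the request path.
async function reportWithTimeout(
  body: Record<string, unknown>,
  key: string,
  timeoutMs = 500,
  fetchImpl: FetchLike = fetch,
): Promise<boolean> {
  try {
    const res = await fetchImpl("https://api.formshield.dev/v1/report", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${key}`,
      },
      body: JSON.stringify(body),
      signal: AbortSignal.timeout(timeoutMs), // aborts the request after timeoutMs
    })
    return res.ok
  } catch {
    // timeout, network error, or outage: reporting must never break the request
    return false
  }
}
```

In a handler you would `await reportWithTimeout(body, process.env.FORMSHIELD_KEY ?? "")` just before returning the response. The worst case is then a delay of `timeoutMs`, so pick the smallest value that still lets most reports through.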

## How it behaves

- **Scoring skips the missing-client penalty.** A server-side request legitimately has no browser token, so its absence is expected, not a bot tell. The score rests on user-agent classification and IP reputation.
- **Crawlers are named and verified.** A GPTBot or ClaudeBot request is classified, scored, and labeled with a `bot:ai_crawler` reason and the agent name. For operators that publish IP ranges (Google, Microsoft, OpenAI, DuckDuckGo), the report is also IP-verified — a real Googlebot is confirmed, a forged one from the wrong IP is flagged as spoofed. See [bot detection](/guides/bot-detection).
- **Humans who run JS are counted twice.** A real visitor who runs the beacon produces both a server report and a beacon observation. That is acceptable when traffic is mostly crawlers; deduplication is on the roadmap. If you run both the beacon and server reporting, server reporting is most valuable on routes where crawlers dominate.
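
IP verification of the kind described in the list above comes down to checking whether the reported IP falls inside a range the operator publishes. The sketch below illustrates that idea for IPv4 only. It is not FormShield's implementation, all function names are hypothetical, and the range shown is a documentation placeholder (`192.0.2.0/24`), not a real crawler range.

```ts
// Converts a dotted-quad IPv4 address to an unsigned integer, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".")
  if (parts.length !== 4) return null
  let value = 0
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null
    const octet = Number(part)
    if (octet > 255) return null
    value = value * 256 + octet
  }
  return value
}

// True if ip lies inside the CIDR block, e.g. "192.0.2.0/24".
function inCidr(ip: string, cidr: string): boolean {
  const [base, bitsText] = cidr.split("/")
  if (!/^\d{1,2}$/.test(bitsText ?? "")) return false
  const bits = Number(bitsText)
  if (bits > 32) return false
  const ipInt = ipv4ToInt(ip)
  const baseInt = ipv4ToInt(base ?? "")
  if (ipInt === null || baseInt === null) return false
  const size = 2 ** (32 - bits) // number of addresses in the block
  return Math.floor(ipInt / size) === Math.floor(baseInt / size)
}

// Placeholder range for illustration only (TEST-NET-1), not a real crawler range.
const publishedRanges = ["192.0.2.0/24"]

function isFromPublishedRange(ip: string): boolean {
  return publishedRanges.some((range) => inCidr(ip, range))
}
```

A request whose user agent claims to be a given crawler but whose IP fails this kind of check is what the guide calls spoofed.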

## Next steps

<CardGroup cols={2}>
  <Card title="Pageview tracking" icon="chart-line" href="/guides/pageview-tracking">
    The beacon reference for client-side pageviews.
  </Card>
  <Card title="Next.js" icon="atom" href="/guides/nextjs">
    Report from middleware or a route handler.
  </Card>
</CardGroup>
