Most webhook documentation focuses on the sender. The receiver gets less attention, which is unfortunate because reliable delivery is only half the job. A perfectly delivered webhook that’s poorly handled on the receiving end is still a failure.
If you’re building an endpoint that receives webhooks, the practices below apply whether you’re consuming events from Stripe, GitHub, Hookbridge, or anywhere else.
## Respond Fast, Process Later
The most common cause of duplicate webhooks is a slow handler. Most senders have a timeout in the 5 to 30 second range. If your endpoint doesn’t respond in time, the sender treats the request as failed and retries. A handler that takes 10 seconds while the sender times out at 5 will see every event twice.
The fix is to do as little as possible inside the request handler:
```python
@app.route("/webhooks", methods=["POST"])
def handle_webhook():
    # Verify before doing anything else with the request.
    if not verify_signature(request):
        return "Unauthorized", 401
    event = request.get_json()
    task_queue.enqueue(process_event, event)
    return "OK", 200
```
The pattern:
- Verify the signature so you reject invalid requests immediately
- Enqueue the event for background processing
- Return 200
Keep the handler under one or two seconds. Database writes, downstream API calls, and any actual business logic should happen in a worker, not in the request path.
A few useful guideposts for handler latency:
- Under 500ms is comfortable. No realistic risk of a sender timing out.
- One to three seconds is fine for most senders.
- Five seconds and above is risky. Some senders will start timing out.
- Anything past 30 seconds will produce duplicates regularly.
## Verify Every Request
Treat unsigned webhooks as untrusted input, because that’s exactly what they are. Don’t skip verification “just for development” or “just for this one source.” Build it into the handler from day one.
```python
@app.route("/webhooks", methods=["POST"])
def handle_webhook():
    signature = request.headers.get("X-Webhook-Signature")
    if not signature:
        return "Missing signature", 401
    if not verify_hmac(request.data, signature, WEBHOOK_SECRET):
        return "Invalid signature", 401
    event = request.get_json()
    task_queue.enqueue(process_event, event)
    return "OK", 200
```
Three things to watch for when implementing verification:
- Verify against the raw request bytes, not the parsed JSON. JSON parsers normalize whitespace, key ordering, and number formatting, which breaks the hash.
- Use a timing-safe comparison like `hmac.compare_digest()` rather than `==`. A naive string comparison can leak information about the secret.
- Keep the secret out of source code. Environment variables or a secrets manager are both reasonable choices.
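Putting those three points together, a minimal `verify_hmac` could look like the sketch below. It assumes the sender signs the raw body with HMAC-SHA256 and transmits the digest as lowercase hex; check your sender's documentation for the exact scheme.

```python
import hashlib
import hmac

def verify_hmac(payload: bytes, signature: str, secret: str) -> bool:
    # Compute the expected digest over the raw request bytes,
    # never over re-serialized JSON.
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time with respect to the
    # contents, so the comparison cannot leak the secret via timing.
    return hmac.compare_digest(expected, signature)
```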
Our earlier post on HMAC signatures walks through the implementation in more detail.
## Handle Duplicates
Any at-least-once delivery system will deliver some events more than once. Your endpoint needs to be safe to call repeatedly with the same payload. The minimum viable approach uses the database as the deduplication store:
```python
def process_event(event):
    event_id = event["id"]
    try:
        db.execute(
            "INSERT INTO processed_events (event_id) VALUES (%s)",
            (event_id,),
        )
    except UniqueViolationError:
        return
    handle_payment_completed(event["data"])
```
If you skip this step, a single retry can charge a customer twice, send a duplicate confirmation email, or create a second copy of an order. Our longer post on idempotency keys covers race conditions and high-throughput patterns.
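The same pattern in a fully self-contained form, using SQLite's `INSERT OR IGNORE` as the atomic deduplication primitive (the table name and return values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")

def process_once(event):
    # The insert succeeds exactly once per event_id. A duplicate
    # changes zero rows, so we bail out before any side effects run.
    cur = conn.execute(
        "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
        (event["id"],),
    )
    if cur.rowcount == 0:
        return "duplicate"
    # Real work (charging, emailing, creating the order) goes here.
    return "processed"
```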
## Handle Events Out of Order
Events do not arrive in chronological order. Retries, network delays, and parallel workers all contribute to events landing in a different order than the one they were sent in.
Consider a sequence like this arriving at your endpoint:
- `subscription.updated` (plan changed to premium), timestamp 10:01
- `subscription.updated` (plan changed to basic), timestamp 10:00
Apply each update in arrival order and you’ll end up with the wrong plan. The earlier event arrived second and overwrote the correct state.
There are three reasonable approaches.
### Compare Timestamps
Reject any event older than the state you already have:
```python
def handle_subscription_update(data):
    current = db.get_subscription(data["subscription_id"])
    if current and current.last_updated >= data["timestamp"]:
        return
    db.update_subscription(
        id=data["subscription_id"],
        plan=data["plan"],
        last_updated=data["timestamp"],
    )
```
### Use Sequence Numbers
If the sender includes a monotonically increasing sequence number, prefer that over timestamps. Sequence numbers don’t suffer from clock skew:
```python
def handle_event(data):
    current_seq = db.get_last_sequence(data["resource_id"])
    if data["sequence"] <= current_seq:
        return
    db.update_resource(data["resource_id"], data, data["sequence"])
```
### Treat the Webhook as a Notification
When you really can’t reason about ordering, use the webhook as a trigger to refetch the authoritative state from the sender:
```python
def handle_order_update(data):
    current_order = api_client.get_order(data["order_id"])
    db.update_order(current_order)
```
This is the most robust option. It also adds latency and an API call per event, so reach for it when ordering really matters.
## Return Meaningful Status Codes
Your status code is how you tell the sender what happened. Use it correctly.
| Status Code | Meaning | Sender Behavior |
|---|---|---|
| 200, 201, 204 | Event received and accepted | No retry |
| 400 | Malformed payload | No retry (not retryable) |
| 401 | Signature verification failed | No retry |
| 404 | Endpoint not found | No retry |
| 429 | Rate limited | Retry with backoff |
| 500 | Server error | Retry |
| 502, 503, 504 | Infrastructure error | Retry |
Two failure modes are worth calling out. Returning 200 when processing actually failed tells the sender the event was handled, so it won’t retry and the event is lost. Returning 500 for a malformed payload makes the sender keep retrying a request that will never succeed, wasting capacity on both sides.
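That distinction can be made concrete in the handler itself. A sketch, with the `signature_ok` and `queue_up` flags standing in for real checks:

```python
import json

def receive(raw_body: bytes, signature_ok: bool, queue_up: bool = True):
    # Permanent failures get a 4xx so the sender stops retrying.
    if not signature_ok:
        return "Invalid signature", 401
    try:
        event = json.loads(raw_body)
    except json.JSONDecodeError:
        return "Malformed payload", 400
    # Transient failures get a 5xx so the sender retries with backoff.
    if not queue_up:
        return "Queue unavailable", 503
    return "OK", 200
```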
## Use a Dedicated Endpoint Per Source
Don’t multiplex webhook handling onto a general-purpose route. Create a dedicated path for each source:
```
/webhooks/stripe
/webhooks/github
/webhooks/hookbridge
```
This keeps signature verification logic separate per source, makes per-source rate limiting and monitoring straightforward, simplifies access control, and contains the blast radius if one integration breaks.
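One way to sketch the separation is a plain per-source lookup; the secret names below are illustrative, and in a real app each source would also carry its own verification scheme and handler:

```python
# One entry per webhook source; unknown paths fall through to None
# so they can be rejected with a 404 instead of hitting a shared handler.
SOURCE_SECRETS = {
    "stripe": "STRIPE_WEBHOOK_SECRET",
    "github": "GITHUB_WEBHOOK_SECRET",
    "hookbridge": "HOOKBRIDGE_WEBHOOK_SECRET",
}

def secret_for(path: str):
    # Maps /webhooks/<source> to that source's secret name.
    prefix = "/webhooks/"
    if not path.startswith(prefix):
        return None
    return SOURCE_SECRETS.get(path[len(prefix):])
```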
## Log Everything
For every webhook you receive, log:
- The event type and ID
- Whether signature verification passed
- Whether the event was a duplicate
- Whether async processing succeeded or failed
- Processing time
```python
def handle_webhook(event):
    logger.info(
        "webhook_received",
        event_id=event["id"],
        event_type=event["event"],
        duplicate=is_duplicate,
        processing_result=result,
        processing_time_ms=elapsed,
    )
```
These logs pay for themselves the first time you have to debug an integration. When the sender claims they delivered event X at time Y, you can check whether you actually received it, whether it passed verification, and what happened next.
## Monitor the Endpoint Like Any Other Service
Webhook endpoints fail the same ways the rest of your stack does. Watch the same things:
- Uptime. If your endpoint is down, you’re missing webhooks until it returns and the sender finishes its retry budget.
- Error rate. A jump in 500s or 401s usually points to a real problem.
- Queue depth. If your async queue is growing, events are arriving faster than you can process them.
- Processing failure rate in the worker. Things that fail after acknowledgement need attention too.
## Have a Recovery Plan
Failures will happen. The question is whether you’ve thought about them in advance. Some questions worth having answers to:
- If your endpoint was down for hours, does the sender retry long enough for you to catch up? If not, can you request a replay?
- If your processing queue fails, do you have dead letter handling for events that couldn’t be processed?
- If you ship a bug that ACKs events but processes them incorrectly, can you replay events from the sender for a specific time window?
Many webhook senders, including Hookbridge, support event replay. Knowing how to use it before you need it is worth ten minutes of reading.
## Quick Reference
If you only remember a few things from this post:
- Return 200 within a second or two; do everything else asynchronously
- Verify HMAC signatures on every request, against the raw body, with a timing-safe comparison
- Deduplicate by event ID
- Handle out-of-order delivery with timestamps, sequence numbers, or refetched state
- Return accurate HTTP status codes
- One endpoint per source
- Log enough information to debug an integration six months from now
- Watch uptime, error rate, and queue depth
- Know how to replay events when something goes wrong
- Keep the webhook secret in environment variables or a secrets manager
Get those right and your integrations will hold up under real production load.
Next up: Webhooks at Scale: Lessons from Delivering Millions of Events. What changes when you move from hundreds of deliveries per day to millions.