Most webhook documentation focuses on the sender. The receiver gets less attention, which is unfortunate because reliable delivery is only half the job. A perfectly delivered webhook that’s poorly handled on the receiving end is still a failure.
If you’re building an endpoint that receives webhooks, the practices below apply whether you’re consuming events from Stripe, GitHub, Hookbridge, or anywhere else.
## Respond Fast, Process Later
The most common cause of duplicate webhooks is a slow handler. Most senders have a timeout in the 5 to 30 second range. If your endpoint doesn’t respond in time, the sender treats the request as failed and retries. A handler that takes 10 seconds while the sender times out at 5 will see every event twice.
The fix is to do as little as possible inside the request handler:
```python
@app.route("/webhooks", methods=["POST"])
def handle_webhook():
    # Verify before doing anything else with the request.
    if not verify_signature(request):
        return "Unauthorized", 401
    event = request.get_json()
    task_queue.enqueue(process_event, event)
    return "OK", 200
```
The pattern:
- Verify the signature so you reject invalid requests immediately
- Enqueue the event for background processing
- Return 200
Keep the handler under one or two seconds. Database writes, downstream API calls, and any actual business logic should happen in a worker, not in the request path.
A few useful guideposts for handler latency:
- Under 500ms is comfortable. No realistic risk of a sender timing out.
- One to three seconds is fine for most senders.
- Five seconds and above is risky. Some senders will start timing out.
- Anything past 30 seconds will produce duplicates regularly.
## Verify Every Request
Treat unsigned webhooks as untrusted input, because that’s exactly what they are. Don’t skip verification “just for development” or “just for this one source.” Build it into the handler from day one.
```python
@app.route("/webhooks", methods=["POST"])
def handle_webhook():
    signature = request.headers.get("X-Webhook-Signature")
    if not signature:
        return "Missing signature", 401
    if not verify_hmac(request.data, signature, WEBHOOK_SECRET):
        return "Invalid signature", 401
    event = request.get_json()
    task_queue.enqueue(process_event, event)
    return "OK", 200
```
Three things to watch for when implementing verification:
- Verify against the raw request bytes, not the parsed JSON. JSON parsers normalize whitespace, key ordering, and number formatting, which breaks the hash.
- Use a timing-safe comparison like `hmac.compare_digest()` rather than `==`. A naive string comparison can leak information about the secret.
- Keep the secret out of source code. Environment variables or a secrets manager are both reasonable choices.
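Putting those three points together, a minimal `verify_hmac` could look like the sketch below. It assumes the sender signs the raw body with HMAC-SHA256 and transmits the digest as lowercase hex; check your sender's documentation for the exact scheme.

```python
import hashlib
import hmac

def verify_hmac(payload: bytes, signature: str, secret: str) -> bool:
    # Compute the expected digest over the raw request bytes,
    # never over re-serialized JSON.
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time with respect to the
    # contents, so the comparison cannot leak the secret via timing.
    return hmac.compare_digest(expected, signature)
```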
Our earlier post on HMAC signatures walks through the implementation in more detail.
## Handle Duplicates
Any at-least-once delivery system will deliver some events more than once. Your endpoint needs to be safe to call repeatedly with the same payload. The minimum viable approach uses the database as the deduplication store:
```python
def process_event(event):
    event_id = event["id"]
    try:
        db.execute(
            "INSERT INTO processed_events (event_id) VALUES (%s)",
            (event_id,),
        )
    except UniqueViolationError:
        return
    handle_payment_completed(event["data"])
```
If you skip this step, a single retry can charge a customer twice, send a duplicate confirmation email, or create a second copy of an order. Our longer post on idempotency keys covers race conditions and high-throughput patterns.
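The same pattern in a fully self-contained form, using SQLite's `INSERT OR IGNORE` as the atomic deduplication primitive (the table name and return values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")

def process_once(event):
    # The insert succeeds exactly once per event_id. A duplicate
    # changes zero rows, so we bail out before any side effects run.
    cur = conn.execute(
        "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
        (event["id"],),
    )
    if cur.rowcount == 0:
        return "duplicate"
    # Real work (charging, emailing, creating the order) goes here.
    return "processed"
```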
## Handle Events Out of Order
Events do not arrive in chronological order. Retries, network delays, and parallel workers all contribute to events landing in a different order than the one they were sent in.
Consider a sequence like this arriving at your endpoint:
- `subscription.updated` (plan changed to premium), timestamp 10:01
- `subscription.updated` (plan changed to basic), timestamp 10:00
Apply each update in arrival order and you’ll end up with the wrong plan. The earlier event arrived second and overwrote the correct state.
There are three reasonable approaches.
### Compare Timestamps
Reject any event older than the state you already have:
```python
def handle_subscription_update(data):
    current = db.get_subscription(data["subscription_id"])
    if current and current.last_updated >= data["timestamp"]:
        return
    db.update_subscription(
        id=data["subscription_id"],
        plan=data["plan"],
        last_updated=data["timestamp"],
    )
```
### Use Sequence Numbers
If the sender includes a monotonically increasing sequence number, prefer that over timestamps. Sequence numbers don’t suffer from clock skew:
```python
def handle_event(data):
    current_seq = db.get_last_sequence(data["resource_id"])
    if data["sequence"] <= current_seq:
        return
    db.update_resource(data["resource_id"], data, data["sequence"])
```
### Treat the Webhook as a Notification
When you really can’t reason about ordering, use the webhook as a trigger to refetch the authoritative state from the sender:
```python
def handle_order_update(data):
    current_order = api_client.get_order(data["order_id"])
    db.update_order(current_order)
```
This is the most robust option. It also adds latency and an API call per event, so reach for it when ordering really matters.
## Return Meaningful Status Codes
Your status code is how you tell the sender what happened. Use it correctly.
| Status Code | Meaning | Sender Behavior |
|---|---|---|
| 200, 201, 204 | Event received and accepted | No retry |
| 400 | Malformed payload | No retry (not retryable) |
| 401 | Signature verification failed | No retry |
| 404 | Endpoint not found | No retry |
| 429 | Rate limited | Retry with backoff |
| 500 | Server error | Retry |
| 502, 503, 504 | Infrastructure error | Retry |
Two failure modes are worth calling out. Returning 200 when processing actually failed tells the sender the event was handled, so it won’t retry and the event is lost. Returning 500 for a malformed payload makes the sender keep retrying a request that will never succeed, wasting capacity on both sides.
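That distinction can be made concrete in the handler itself. A sketch, with the `signature_ok` and `queue_up` flags standing in for real checks:

```python
import json

def receive(raw_body: bytes, signature_ok: bool, queue_up: bool = True):
    # Permanent failures get a 4xx so the sender stops retrying.
    if not signature_ok:
        return "Invalid signature", 401
    try:
        event = json.loads(raw_body)
    except json.JSONDecodeError:
        return "Malformed payload", 400
    # Transient failures get a 5xx so the sender retries with backoff.
    if not queue_up:
        return "Queue unavailable", 503
    return "OK", 200
```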
## Use a Dedicated Endpoint Per Source
Don’t multiplex webhook handling onto a general-purpose route. Create a dedicated path for each source:
```
/webhooks/stripe
/webhooks/github
/webhooks/hookbridge
```
This keeps signature verification logic separate per source, makes per-source rate limiting and monitoring straightforward, simplifies access control, and contains the blast radius if one integration breaks.
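One way to sketch the separation is a plain per-source lookup; the secret names below are illustrative, and in a real app each source would also carry its own verification scheme and handler:

```python
# One entry per webhook source; unknown paths fall through to None
# so they can be rejected with a 404 instead of hitting a shared handler.
SOURCE_SECRETS = {
    "stripe": "STRIPE_WEBHOOK_SECRET",
    "github": "GITHUB_WEBHOOK_SECRET",
    "hookbridge": "HOOKBRIDGE_WEBHOOK_SECRET",
}

def secret_for(path: str):
    # Maps /webhooks/<source> to that source's secret name.
    prefix = "/webhooks/"
    if not path.startswith(prefix):
        return None
    return SOURCE_SECRETS.get(path[len(prefix):])
```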
## Log Everything
For every webhook you receive, log:
- The event type and ID
- Whether signature verification passed
- Whether the event was a duplicate
- Whether async processing succeeded or failed
- Processing time
```python
def handle_webhook(event):
    logger.info(
        "webhook_received",
        event_id=event["id"],
        event_type=event["event"],
        duplicate=is_duplicate,
        processing_result=result,
        processing_time_ms=elapsed,
    )
```
These logs pay for themselves the first time you have to debug an integration. When the sender claims they delivered event X at time Y, you can check whether you actually received it, whether it passed verification, and what happened next.
## Monitor the Endpoint Like Any Other Service
Webhook endpoints fail the same ways the rest of your stack does. Watch the same things:
- Uptime. If your endpoint is down, you’re missing webhooks until it returns and the sender finishes its retry budget.
- Error rate. A jump in 500s or 401s usually points to a real problem.
- Queue depth. If your async queue is growing, events are arriving faster than you can process them.
- Processing failure rate in the worker. Things that fail after acknowledgement need attention too.
## Have a Recovery Plan
Failures will happen. The question is whether you’ve thought about them in advance. Some questions worth having answers to:
- If your endpoint was down for hours, does the sender retry long enough for you to catch up? If not, can you request a replay?
- If your processing queue fails, do you have dead letter handling for events that couldn’t be processed?
- If you ship a bug that ACKs events but processes them incorrectly, can you replay events from the sender for a specific time window?
Many webhook senders, including Hookbridge, support event replay. Knowing how to use it before you need it is worth ten minutes of reading.
## Quick Reference
If you only remember a few things from this post:
- Return 200 within a second or two; do everything else asynchronously
- Verify HMAC signatures on every request, against the raw body, with a timing-safe comparison
- Deduplicate by event ID
- Handle out-of-order delivery with timestamps, sequence numbers, or refetched state
- Return accurate HTTP status codes
- One endpoint per source
- Log enough information to debug an integration six months from now
- Watch uptime, error rate, and queue depth
- Know how to replay events when something goes wrong
- Keep the webhook secret in environment variables or a secrets manager
Get those right and your integrations will hold up under real production load.
Next up: Webhooks at Scale: Lessons from Delivering Millions of Events. What changes when you move from hundreds of deliveries per day to millions.