The first problem was not architecture. It was quota. The team had been running automated PR review routines in Claude Routines, which worked well enough to become load-bearing. Then the daily limit became visible: fifteen routine slots disappear quickly when pull request volume grows, some PRs go through two or three review rounds, and weekday-night jobs already consume three or four slots before the workday starts.

That makes PR review the wrong thing to keep spending scarce scheduled-routine capacity on. PR review is event-driven. It should fire when GitHub reports that a PR was opened, reopened, marked ready for review, or had a review requested. The nightly work is cron-shaped. It should keep its scheduled lane. GitHub Events Routines exist to split those two workloads cleanly: leave cron-scheduled routines in Claude Routines, and move GitHub-triggered PR review routines into OpenClaw.

Key Takeaway

The system is a pressure valve for Claude Routines quota, not just a new webhook handler. The product boundary is: GitHub events wake OpenClaw; scheduled jobs stay where they already fit.

Why this exists

Claude Routines gave the team an automation surface before OpenClaw had this particular event bridge. The obvious use case was automated PR review: a PR changes state, a routine reads the diff and context, and the result lands where the team already discusses code. That is valuable precisely because it runs often.

Often is the problem. A fixed daily routine budget is a poor fit for activity that scales with the number of engineers, repositories, PR rounds, and review requests. If ten PRs open on the same day and three of them get a second pass, PR review alone uses thirteen of the fifteen daily slots. Add the three or four slots that weekday-night routines already take and demand exceeds the budget. The team has converted a useful default into a contention point.

GitHub Events Routines move the contention boundary. GitHub-triggered review work becomes an OpenClaw event lane. Claude Routines remains useful for recurring scheduled work. The split is more important than the implementation detail because it matches the work’s natural shape.

The path from GitHub to OpenClaw

The pipeline has two halves. The AWS half receives the public webhook and turns it into durable work. The OpenClaw half polls for that work and runs the local routine.

flowchart TD
  GitHub[GitHub App webhook] --> APIGW[API Gateway REST API<br/>POST /webhook]
  APIGW --> Receiver[receiver Lambda<br/>HMAC verify]
  Receiver --> S3[S3 payload object<br/>YYYY/MM/DD/delivery-id]
  Receiver --> SQS[SQS main queue<br/>small envelope]
  SQS --> Consumer[gh-events-consumer<br/>Mac OpenClaw host]
  SQS --> DLQ[SQS DLQ]
  Consumer --> Match[match routine type + repo]
  Match --> Hook[POST /hooks/agent]
  Hook --> Routine[OpenClaw routine sandbox]
  Routine --> Slack[Slack result]

GitHub sends events to gh-events.constructured.ai/webhook. API Gateway handles the public edge and applies a resource-policy IP allowlist based on GitHub’s published hook CIDRs. The receiver Lambda then verifies X-Hub-Signature-256 against the webhook secret from Secrets Manager. If the signature is valid, the receiver writes the full request body to S3 and sends an SQS message whose body is only a claim-check envelope: bucket, key, and size. It also preserves the GitHub event name and delivery UUID as SQS message attributes.
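The claim-check step can be sketched as follows. This is a minimal illustration, not the receiver’s actual code: the function names and the SQS attribute names (`github_event`, `github_delivery`) are assumptions, and only the key layout and the bucket/key/size envelope come from the description above.

```python
import json
from datetime import datetime, timezone


def payload_key(delivery_id: str, received_at: datetime) -> str:
    """Build the S3 object key in the YYYY/MM/DD/delivery-id layout."""
    return f"{received_at:%Y/%m/%d}/{delivery_id}"


def build_sqs_message(bucket: str, key: str, body: bytes,
                      event_name: str, delivery_id: str) -> dict:
    """Return keyword arguments shaped for an SQS SendMessage call.

    The message body carries only the claim check (bucket, key, size);
    the full GitHub payload lives in S3 under `key`.
    """
    envelope = {"bucket": bucket, "key": key, "size": len(body)}
    return {
        "MessageBody": json.dumps(envelope),
        "MessageAttributes": {
            # Attribute names are hypothetical; the source only says the
            # event name and delivery UUID travel as message attributes.
            "github_event": {"DataType": "String", "StringValue": event_name},
            "github_delivery": {"DataType": "String", "StringValue": delivery_id},
        },
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    key = payload_key("example-delivery-id", now)
    print(build_sqs_message("example-bucket", key, b"{}", "pull_request",
                            "example-delivery-id"))
```

Keeping the envelope this small means the queue message stays the same size no matter how large the GitHub payload is.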

The OpenClaw host does not receive inbound internet traffic. It long-polls SQS, reads the S3 object named by the envelope, parses enough JSON to know the repo, action, PR author, and draft state, then asks the matcher which routine instances should fire. A matching instance becomes an OpenClaw POST /hooks/agent call.

Tip

The S3 envelope is doing real work. GitHub webhook payloads can be as large as 25 MB, well beyond what a single SQS message can carry, and GitHub only keeps failed deliveries for a limited window. Storing the body in S3 makes the queue message small and the payload recoverable.

The trust boundaries

There are three separate trust questions in this design: who may call the public endpoint, who may enqueue work, and who may run local routines.

The public endpoint is narrowed at the AWS edge by API Gateway’s resource policy. A daily refresh Lambda updates the allowlist from GitHub’s current hook CIDRs, which means most non-GitHub traffic is rejected before the receiver Lambda runs. That is useful, but it is not the security boundary. The security boundary is the HMAC signature. The receiver verifies the raw request body against the secret in Secrets Manager; caller errors like bad signatures return 401 without incrementing the Lambda Errors metric, while system failures like secret, S3, or SQS problems return errors that should alarm.
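A minimal sketch of the signature check, assuming the secret has already been fetched as bytes. The helper names and the 202 success status are illustrative assumptions; the `sha256=` prefix and the HMAC-SHA256 over the raw body are how GitHub documents `X-Hub-Signature-256`.

```python
import hashlib
import hmac
from typing import Callable, Optional


def signature_is_valid(secret: bytes, raw_body: bytes,
                       header: Optional[str]) -> bool:
    """Check X-Hub-Signature-256 against an HMAC-SHA256 of the raw body."""
    if not header or not header.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison; compare bytes so odd header characters
    # cannot raise inside compare_digest.
    return hmac.compare_digest(expected.encode(), header.encode("utf-8", "replace"))


def receive(load_secret: Callable[[], bytes], raw_body: bytes,
            signature_header: Optional[str]) -> dict:
    """Return a 401 for caller errors; let system failures raise."""
    # A failure here is a system error: it propagates so the Lambda
    # Errors metric (and its alarm) sees it.
    secret = load_secret()
    if not signature_is_valid(secret, raw_body, signature_header):
        # Bad input is answered, not raised, so it does not count as an error.
        return {"statusCode": 401, "body": "invalid signature"}
    # 202 is an assumed success code for this sketch.
    return {"statusCode": 202, "body": "accepted"}
```

The important detail is that verification runs over the exact bytes GitHub sent. Re-serializing parsed JSON before hashing would change the bytes and break the comparison.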

The queue boundary is IAM plus SQS delivery semantics. The receiver can write to S3 and SQS. The Mac host uses AWS credentials to read the queue and fetch the payload object. The OpenClaw machine stays behind the NAT line: it reaches out to AWS, but AWS does not call into the host.

The routine boundary is OpenClaw’s hook configuration. The consumer dispatches to agent IDs with the convention routine-<type>-<repo>. The installer patches OpenClaw’s agents.list and hooks.allowedAgentIds so only the expected routine agents are callable. That means adding a new routine instance is not just a TOML change; it changes the local hook allowlist and needs the full install path.
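The naming convention and the allowlist check can be illustrated in a few lines. This is a sketch under assumptions: the function names are hypothetical, and the real installer patches OpenClaw configuration rather than building a Python set.

```python
from typing import Iterable, Set, Tuple


def agent_id(routine_type: str, repo: str) -> str:
    """Apply the routine-<type>-<repo> convention."""
    return f"routine-{routine_type}-{repo}"


def allowed_agent_ids(instances: Iterable[Tuple[str, str]]) -> Set[str]:
    """Derive the set of agent IDs the hook should accept."""
    return {agent_id(routine_type, repo) for routine_type, repo in instances}


def may_dispatch(target: str, allowed: Set[str]) -> bool:
    """Only IDs generated at install time are callable."""
    return target in allowed


if __name__ == "__main__":
    allowed = allowed_agent_ids([("pr-review", "osprey-strike")])
    print(may_dispatch("routine-pr-review-osprey-strike", allowed))
```

Because the allowlist is derived from the configured instances, a new instance that exists only in the routines tree is not yet in the set, which is why it needs the install path.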

What the consumer actually does

The consumer is not a general GitHub automation framework. It is a narrow bridge from queued GitHub deliveries to configured OpenClaw routine instances.

On each poll loop, it requests up to ten SQS messages with twenty-second long polling. For each message, it parses the envelope, fetches the S3 object, parses the event body, evaluates routine matches, and deletes the SQS message only after processing succeeds. If processing fails, the message is left alone. SQS visibility timeout and redrive policy handle the retry path.
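One pass of that loop might look like the sketch below. It assumes `sqs` and `s3` are boto3-style clients passed in by the caller and that `process` is a hypothetical callback that matches and dispatches; error handling is simplified to show the delete-only-on-success rule.

```python
import json
from typing import Any, Callable


def drain_once(sqs: Any, s3: Any, queue_url: str,
               process: Callable[[dict, dict], None]) -> int:
    """Poll once and return how many messages were deleted after success."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,   # up to ten messages per poll
        WaitTimeSeconds=20,       # twenty-second long poll
        MessageAttributeNames=["All"],
    )
    deleted = 0
    for message in response.get("Messages", []):
        try:
            envelope = json.loads(message["Body"])
            obj = s3.get_object(Bucket=envelope["bucket"], Key=envelope["key"])
            event = json.loads(obj["Body"].read())
            process(event, message.get("MessageAttributes", {}))
        except Exception:
            # Leave the message alone: the visibility timeout makes it
            # visible again, and the redrive policy handles repeat failures.
            continue
        sqs.delete_message(QueueUrl=queue_url,
                           ReceiptHandle=message["ReceiptHandle"])
        deleted += 1
    return deleted
```

Deleting after processing, rather than before, is what makes the lane at-least-once: a crash mid-run costs a retry, not a lost event.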

The matcher is deliberately boring. It matches exact GitHub event name, optional action allowlist, and repo basename. Filters such as ignore_bots, ignore_authors, and ignore_drafts run after a candidate match so obvious no-review cases do not spend a sandbox run. Rich policy like “only review PRs touching this subsystem” belongs in the routine prompt, not in the dispatch matcher.
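A minimal sketch of that matching order, assuming a hypothetical `Matcher` dataclass. The payload fields it reads (`repository.name`, `pull_request.user.type`, `pull_request.user.login`, `pull_request.draft`) are standard GitHub `pull_request` webhook fields; the class shape and defaults are assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Matcher:
    event: str
    repo: str                                  # repo basename, not owner/name
    actions: Optional[FrozenSet[str]] = None   # None means any action
    ignore_bots: bool = False
    ignore_authors: FrozenSet[str] = frozenset()
    ignore_drafts: bool = False


def matches(m: Matcher, event_name: str, payload: dict) -> bool:
    """Exact-match first, then apply the cheap skip filters."""
    repo = payload.get("repository", {}).get("name")
    if event_name != m.event or repo != m.repo:
        return False
    if m.actions is not None and payload.get("action") not in m.actions:
        return False
    # Filters run only after a candidate match.
    pr = payload.get("pull_request", {})
    author = pr.get("user", {})
    if m.ignore_bots and author.get("type") == "Bot":
        return False
    if author.get("login") in m.ignore_authors:
        return False
    if m.ignore_drafts and pr.get("draft", False):
        return False
    return True


PR_REVIEW = Matcher(
    event="pull_request",
    repo="osprey-strike",
    actions=frozenset({"opened", "reopened", "review_requested", "ready_for_review"}),
    ignore_bots=True,
    ignore_drafts=True,
)
```

Nothing here inspects the diff or changed paths. That is the point: anything that needs judgment stays in the routine prompt.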

When a routine does fire, the dispatcher does four durable things before asking OpenClaw to run anything. It checks SQLite for an existing (github_delivery, type, repo) record, creates a dispatch ID and session key, writes a dispatch directory containing the event body and context, and inserts a pending record. Only then does it call POST /hooks/agent with Deliver: false, WakeMode: now, and an idempotency key derived from the same delivery/type/repo tuple.

Key Takeaway

At-least-once delivery is assumed. GitHub can redeliver, SQS can redeliver, and DeleteMessage can fail after a successful run. The consumer’s SQLite dedupe is what keeps duplicate delivery from becoming duplicate review.
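The dedupe idea can be sketched with a composite primary key. The table and column layout below is an assumption, not the consumer’s actual schema, and the sketch folds the "check" and "insert pending" steps into a single `INSERT OR IGNORE` for brevity.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the dispatch store and create the table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS dispatches (
            github_delivery TEXT NOT NULL,
            type            TEXT NOT NULL,
            repo            TEXT NOT NULL,
            status          TEXT NOT NULL DEFAULT 'pending',
            PRIMARY KEY (github_delivery, type, repo)
        )
        """
    )
    return conn


def claim(conn: sqlite3.Connection, delivery: str,
          routine_type: str, repo: str) -> bool:
    """Return True if this call recorded the dispatch, False on a duplicate."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO dispatches (github_delivery, type, repo) "
            "VALUES (?, ?, ?)",
            (delivery, routine_type, repo),
        )
    # rowcount is 1 when the row was inserted and 0 when it was ignored.
    return cur.rowcount == 1


if __name__ == "__main__":
    db = open_db()
    print(claim(db, "example-delivery", "pr-review", "osprey-strike"))
    print(claim(db, "example-delivery", "pr-review", "osprey-strike"))
```

The same delivery can still fire a different routine type or repo, because the key is the full tuple rather than the delivery ID alone.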

How PR review is configured

The current routine type is pr-review. Its matcher listens for GitHub’s pull_request event and the actions opened, reopened, review_requested, and ready_for_review. The ready_for_review action matters because the routine skips drafts. A PR opened as a draft is not asking for review yet; when it leaves draft, GitHub fires the event that should trigger the review.

The routine also skips bot-authored PRs. That catches GitHub App bot identities such as Dependabot and Renovate. It does not automatically skip every machine-looking user account, which is why the matcher also has an explicit ignore_authors list for user accounts the team decides should not trigger review.

Repository opt-in is filesystem-shaped. A repo settings file under routines/_repos/ names Slack output channels and optional overrides. A workspace under routines/pr-review/<repo>/workspace/ supplies the prompt and local agent instructions. Together those files register a concrete routine instance, such as routine-pr-review-osprey-strike.

The routine-base Docker image is pinned by digest in _defaults.toml. Install refuses the placeholder digest and refuses latest-style looseness. That is the right tradeoff for a routine that can run code-reading agents from webhook input: the event may be dynamic, but the sandbox toolbox should be a known artifact.
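A digest-pin check along those lines could look like this sketch. The placeholder value (all zeros) and the function name are assumptions; the real installer’s rules may be stricter.

```python
import re

# name@sha256:<64 lowercase hex chars>; tags such as :latest do not match.
_PINNED = re.compile(r"^[^@\s]+@sha256:([0-9a-f]{64})$")

# Hypothetical placeholder digest; the real placeholder is not shown in the source.
PLACEHOLDER_DIGEST = "0" * 64


def check_image_pin(image: str) -> str:
    """Return the digest if the reference is pinned, else raise ValueError."""
    match = _PINNED.match(image)
    if match is None:
        raise ValueError(f"image must be pinned by digest, got {image!r}")
    digest = match.group(1)
    if digest == PLACEHOLDER_DIGEST:
        raise ValueError("image digest is still the placeholder")
    return digest
```

Failing at install time, rather than at dispatch time, keeps an unpinned image from ever reaching a webhook-triggered run.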

How operators change it

There are four common changes, and they land in different layers.

To change what GitHub emits, update the GitHub App installation or subscribed events. The AWS receiver does not care which Constructured repo emitted the event as long as the signature is valid and the event reaches the endpoint.

To change AWS behavior, edit infrastructure/opentofu/gh-events-ingest: queue visibility timeout, max receive count, receiver latency alarm, oldest-message alarm, secret cache TTL, lifecycle settings, and the IP refresh machinery all live there. The receiver Lambda code is in the same module, with scripts/build.sh producing the deployment zip.

To change which local routines fire, edit the consumer’s routines tree. Matcher changes, prompt changes, and repo Slack routing can hot-reload with SIGHUP because the consumer re-scans the tree and keeps the last good snapshot if a reload fails.

To add or remove a routine instance, run the installer, because OpenClaw’s allowed agent list must be regenerated.
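The keep-the-last-good-snapshot behavior on SIGHUP can be sketched as below. The class and function names are hypothetical, and `load` stands in for whatever re-scans the routines tree; `signal.SIGHUP` is only available on Unix-like hosts.

```python
import signal
from typing import Any, Callable


class RoutineSnapshot:
    """Hold the most recent successfully loaded routines configuration."""

    def __init__(self, load: Callable[[], Any]):
        self._load = load
        self._current = load()  # the initial load must succeed

    def get(self) -> Any:
        return self._current

    def reload(self) -> bool:
        """Try to re-scan; on failure keep serving the last good snapshot."""
        try:
            fresh = self._load()
        except Exception:
            return False
        # A single attribute assignment swaps the snapshot in one step.
        self._current = fresh
        return True


def install_sighup(snapshot: RoutineSnapshot) -> None:
    """Reload the snapshot whenever the process receives SIGHUP."""
    signal.signal(signal.SIGHUP, lambda signum, frame: snapshot.reload())
```

A broken edit to a matcher or prompt therefore degrades to "nothing changed" instead of taking the consumer down.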

How operators ask about it

The installed gh-events-status skill is part of the product surface, not just a debugging convenience. The install path copies SKILL.md, command scripts, and rendered config into ~/.openclaw/skills/gh-events-status/, where OpenClaw can discover it. That lets the main Q agent answer operator questions from Slack in the channels where it listens, without requiring the operator to SSH into the host or remember the consumer’s file layout.

The skill has six commands: list recent dispatches, show a full run transcript, tail a running or recent transcript, inspect queue and DLQ depth, list configured routines, and re-publish a delivery envelope when appropriate. The routines view is especially important for “what are we listening to?” questions because it reports the configured routine instances, their GitHub event and action filters, the OpenClaw agent ID, the Slack destination, and the prompt file an operator would edit.

The rerun path is intentionally cautious. It is the skill’s only mutating command and should be confirmed before use. The current dedupe model also means re-publishing a delivery that already recorded a run is a no-op, not a force-review button.

How it fails and recovers

Most of the design is about preserving events long enough for a human or agent to fix the broken layer.

If the receiver cannot load the secret, write S3, send SQS, or otherwise complete a system operation, the receiver returns a system error and CloudWatch should alarm. If the caller sends a bad signature or omits the delivery header, the receiver returns a 4xx response and does not page the operator through Lambda Errors; that is bad input, not an unhealthy receiver.

If the consumer is down, messages accumulate in SQS. With the configured retention window, OpenClaw can be down for a meaningful period and still drain the backlog when it returns. The oldest-message-age alarm is the signal that the queue is no longer flowing. If the consumer repeatedly fails a message, the redrive policy moves it to the DLQ after the receive limit, and the AWS-side replay script can inspect or move DLQ messages back to the main queue after the root cause is fixed.

If a routine is dispatched but never reports completion, the reaper handles the audit trail. Routines write a small report.json into their dispatch directory when they finish; the reaper sweeps pending records, picks up self-reports, and marks records terminal. If there is no report after the routine’s timeout plus grace period, the reaper inspects the OpenClaw session and can post a failure notification to the repo’s failure channel.
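The reaper’s per-record decision can be sketched as a small classifier. The state names and the function signature are assumptions for illustration; only the `report.json` file name and the timeout-plus-grace rule come from the description above.

```python
from pathlib import Path
from typing import Union


def classify(dispatch_dir: Union[str, Path], dispatched_at: float,
             timeout_s: float, grace_s: float, now: float) -> str:
    """Classify a pending dispatch record for one sweep.

    Returns "reported" when the routine left a self-report, "overdue" when
    the timeout plus grace period has passed without one, else "pending".
    """
    if (Path(dispatch_dir) / "report.json").is_file():
        return "reported"   # pick up the self-report and mark the record terminal
    if now - dispatched_at > timeout_s + grace_s:
        return "overdue"    # inspect the session and consider notifying
    return "pending"        # still within its window; check again next sweep
```

Passing `now` in explicitly, for example from `time.time()`, keeps the decision easy to test and keeps the sweep free of hidden clock reads.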

Warning

The recovery model is durable, not magic. SQS buys time, S3 preserves payloads briefly, GitHub retains recent deliveries briefly, and SQLite dedupes dispatches. Operators still need to watch the alarms and status surface when those buffers start filling.

What to remember

GitHub Events Routines are the event-driven lane for agent work. They exist because PR review is valuable, frequent, and tied to GitHub state, which makes it a poor fit for a scarce daily scheduled-routine budget.

The design keeps each system in its natural role. GitHub emits signed facts. AWS verifies, stores, queues, alarms, and buffers. The OpenClaw host polls outbound and runs local sandboxed agents. Slack gets the human result. The source repo remains the place to change infrastructure, matchers, prompts, and operator tooling.

  1. Use OpenClaw for GitHub-event-triggered PR review work, especially when review volume grows with team activity.
  2. Keep cron-shaped weekday and nightly routines in Claude Routines unless there is a separate reason to move them.
  3. Treat HMAC verification as the main public-edge security boundary; treat the GitHub IP allowlist as defense in depth.
  4. Expect duplicates and retries. The consumer's `(github_delivery, type, repo)` record is what makes the lane idempotent.
  5. Change matchers and prompts in the routines tree; change queue, alarm, and receiver behavior in the OpenTofu module; use the installer when OpenClaw hook allowlists change.

Discussion Prompts

  • Which GitHub events besides PR review deserve this lane: issue triage, release notes, security alerts, failed CI, or something else?
  • When should operators be allowed to force-rerun a delivery that already has a dedupe record, and what audit entry should that leave?
  • Which status questions are common enough that `gh-events-status` should grow a dashboard or scheduled summary instead of staying purely on-demand?
