A reply engine that drives real signups and a reply engine that gets you banned look identical on paper. Both scan conversations. Both draft responses. Both post to platforms. The difference is the guardrails. This piece is the list of guardrails that actually work, learned across twelve products running live on Reddit, Hacker News, and Indie Hackers.
The failure mode we are avoiding
The worst thing a reply engine can do is post a reply that is not actually relevant to the thread. Off-topic replies are the single largest reason automated accounts get banned. Rate limits are secondary. Tone is secondary. Relevance is primary.
An engine that always produces a draft (because it was asked to) will eventually produce a draft that does not belong on the thread. The fix is not better prompting. The fix is letting the engine refuse to draft.
The SKIP signal
The LLM call that drafts replies needs to be able to return one of three things:
- A draft reply, if the thread genuinely fits.
- A sentinel token (we use SKIP_NOT_RELEVANT) if the thread is not a fit.
- An error, if something upstream broke.
The SKIP signal is the most important part of the system. It lets the model say "I would not post here" and the engine respect it. Without SKIP, every scan produces a draft, and some fraction of those drafts are off-topic.
When the model returns SKIP, the conversation is marked as engaged (we read it) but no reply is drafted. It does not re-enter the queue. The reply engine has done its job.
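The three-outcome draft step can be sketched as a small interpreter over the raw LLM completion. The sentinel token SKIP_NOT_RELEVANT is from the system above; the function and result shape are hypothetical, assumed names for illustration.

```python
from dataclasses import dataclass
from typing import Optional

SKIP_SENTINEL = "SKIP_NOT_RELEVANT"

@dataclass
class DraftResult:
    status: str                  # "draft" | "skip" | "error"
    reply: Optional[str] = None
    error: Optional[str] = None

def interpret_llm_output(
    raw: Optional[str],
    upstream_error: Optional[Exception] = None,
) -> DraftResult:
    """Map a raw LLM completion onto the engine's three outcomes."""
    if upstream_error is not None:
        # Something broke before or during the LLM call: surface it, draft nothing.
        return DraftResult(status="error", error=str(upstream_error))
    text = (raw or "").strip()
    if SKIP_SENTINEL in text:
        # The model declined to draft. Mark the thread engaged; do not re-queue it.
        return DraftResult(status="skip")
    return DraftResult(status="draft", reply=text)
```

The key design point is that "skip" is a first-class status, not an error: the caller records the conversation as engaged and moves on.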
Confidence thresholds
Not every platform has the same risk profile. Here are the thresholds we use:
- Twitter/X, Bluesky: confidence above 0.8 can auto-send.
- LinkedIn: confidence above 0.85 can auto-send.
- Reddit, Hacker News, Indie Hackers: human approval always.
- Dev.to, Hashnode: human approval always (small communities notice).
The confidence threshold is a founder-facing knob. Start conservative and raise as you build trust with the drafts the engine produces.
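The routing rule above reduces to a per-platform lookup table. A minimal sketch, assuming the platform keys and a two-way route (auto-send versus approval queue); a threshold of None encodes "human approval always":

```python
from typing import Optional

# None means the platform never auto-sends, regardless of confidence.
AUTO_SEND_THRESHOLD: dict[str, Optional[float]] = {
    "twitter": 0.8,
    "bluesky": 0.8,
    "linkedin": 0.85,
    "reddit": None,
    "hackernews": None,
    "indiehackers": None,
    "devto": None,
    "hashnode": None,
}

def route(platform: str, confidence: float) -> str:
    """Return 'auto_send' or 'approval_queue' for a drafted reply."""
    threshold = AUTO_SEND_THRESHOLD.get(platform)  # unknown platforms default to approval
    if threshold is not None and confidence > threshold:
        return "auto_send"
    return "approval_queue"
```

Making unknown platforms fall through to the approval queue keeps the conservative default the text recommends.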
Human approval routing
Approval routing is the user experience that makes or breaks founder adoption. If approving a reply takes more than 20 seconds, founders stop approving and the queue grows. The approval flow has to work from mobile, with one-tap approve and one-tap reject.
What the approval card needs
- The original thread link and the 100-word summary.
- The draft reply as it will post.
- Confidence score with a one-line reason.
- One tap: approve, edit, or reject.
The Telegram bot and PWA approval flow exist specifically so founders can clear the queue from a phone without opening a laptop.
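The approval card itself is a small, fixed payload. A sketch of its shape as a dataclass, with a plain-text rendering of the kind a Telegram bot might send; the class and method names are assumptions, not the product's actual API:

```python
from dataclasses import dataclass

@dataclass
class ApprovalCard:
    thread_url: str          # link to the original thread
    summary: str             # the 100-word summary
    draft: str               # the reply exactly as it will post
    confidence: float
    confidence_reason: str   # one-line reason for the score

    def to_message(self) -> str:
        # Text-only rendering; the real bot would attach
        # approve / edit / reject buttons alongside it.
        return (
            f"{self.summary}\n\n"
            f"Draft:\n{self.draft}\n\n"
            f"Confidence {self.confidence:.2f}: {self.confidence_reason}\n"
            f"{self.thread_url}"
        )
```

Keeping everything needed for a decision in one message is what makes the 20-second approval budget achievable on a phone.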
Cooldowns and per-subreddit caps
Even with relevance filtering and approval gating, a reply engine can still get an account banned if it posts too often in one place. The structural guardrails:
- Per-subreddit daily cap: no more than two replies a day in any single subreddit, regardless of approval queue depth.
- Cross-subreddit cap: no more than 10 Reddit replies per account per day.
- Cooldown between replies on the same thread: at least two hours.
- Global platform cap: 50 engagements per day across all platforms.
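The four caps above can be expressed as pure checks over an engagement log. A sketch, assuming a log of dicts with platform, community, thread_id, and timestamp fields (the log shape is an assumption, not the product's schema):

```python
from datetime import datetime, timedelta

SUBREDDIT_DAILY_CAP = 2            # replies per subreddit per day
REDDIT_DAILY_CAP = 10              # Reddit replies per account per day
THREAD_COOLDOWN = timedelta(hours=2)  # between replies on the same thread
GLOBAL_DAILY_CAP = 50              # engagements per day, all platforms

def may_post(log, platform, community, thread_id, now: datetime) -> bool:
    """Return True only if every structural cap allows one more reply."""
    day_ago = now - timedelta(days=1)
    recent = [e for e in log if e["ts"] > day_ago]
    if len(recent) >= GLOBAL_DAILY_CAP:
        return False
    if platform == "reddit":
        reddit = [e for e in recent if e["platform"] == "reddit"]
        if len(reddit) >= REDDIT_DAILY_CAP:
            return False
        in_sub = [e for e in reddit if e["community"] == community]
        if len(in_sub) >= SUBREDDIT_DAILY_CAP:
            return False
    # Cooldown applies to the whole log, not just the last day.
    same_thread = [e for e in log if e["thread_id"] == thread_id]
    if any(now - e["ts"] < THREAD_COOLDOWN for e in same_thread):
        return False
    return True
```

Running these checks at post time, after approval, is what makes them structural: an approved reply that would breach a cap simply waits.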
The measurement loop
The last piece is measurement. Every reply that posts gets scored at 6, 24, and 72 hours on: upvotes, replies, link clicks, attributed signups. After 30 days, the tuning engine surfaces two categories of suggestion:
- Drop a keyword: 10-plus conversations scanned, zero that converted to an engaged reply. The keyword is no longer pulling relevant threads.
- Retreat from a subreddit: 10-plus posts, average engagement score below 2.0. The community is not receptive.
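The two tuning rules are simple enough to write as filters over 30-day aggregates. A sketch, assuming per-keyword and per-subreddit stat dicts (the stat shapes are assumptions):

```python
def keywords_to_drop(keyword_stats: dict) -> list:
    """Keywords with 10+ scanned conversations and zero engaged replies."""
    return [
        kw for kw, s in keyword_stats.items()
        if s["scanned"] >= 10 and s["engaged"] == 0
    ]

def subreddits_to_retreat(subreddit_stats: dict) -> list:
    """Subreddits with 10+ posts and average engagement score below 2.0."""
    return [
        sub for sub, s in subreddit_stats.items()
        if s["posts"] >= 10 and s["total_engagement"] / s["posts"] < 2.0
    ]
```

Both rules only surface suggestions; the founder decides whether to act on them, which keeps the tuning loop inside the same approval culture as the replies themselves.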
Without the measurement loop, the engine produces output forever without learning. With it, the set of subreddits and keywords narrows to the ones that actually pay off. The autonomous marketing guide covers how this fits into the broader system; the Reddit marketing guide covers the community-side guardrails.