AI moderation: check posts before they publish
AI moderation reads each new post against a policy you write, before the post goes live. When a post breaks the policy, the action you chose handles it: flag it for review, hide it, or remove it. You set it up at /admin/ai/automod.
This is your own policy, written in plain language and checked by AI using your own Anthropic key. It runs on posts as they are submitted, so it catches problems before other members see them. It is separate from Automation rules, which run a fixed action that you define whenever a given event happens.
What you need first
Three things must be true for AI moderation to act:
- AI moderation is switched on.
- Your plan is Pro, Creator Plus, or Sovereign.
- Your Anthropic API key is set on the mobieusAI page.
Until all three are true, AI moderation does nothing. You can still write and test your policy first.
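The three requirements above can be read as one combined check. This is a minimal sketch for illustration only, not the product's actual code. The names can_act and ELIGIBLE_PLANS are hypothetical, and the plan names are written as they appear on this page.

```python
# Hypothetical sketch: plan names are taken from this page, everything else is assumed.
ELIGIBLE_PLANS = {"Pro", "Creator Plus", "Sovereign"}


def can_act(enabled: bool, plan: str, has_api_key: bool) -> bool:
    """Return True only when all three prerequisites hold."""
    return enabled and plan in ELIGIBLE_PLANS and has_api_key
```

For example, can_act(True, "Pro", False) is False: with no Anthropic key set, AI moderation does nothing even though it is switched on and the plan qualifies.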
Writing a policy
A policy is a short, plain-language description of what to catch. Be specific. For example: "Hide spam, crypto and airdrop promotion, scams, and posts that harass or threaten members. Normal criticism and off-topic chat are fine."
The AI is told to be conservative. It acts only when a post clearly breaks your policy, not for ordinary disagreement or an off-topic post.
What happens on a match
You choose one action for posts that break the policy:
- Flag. The post stays live and a report is filed for a moderator to review. This is the safest choice, and the default.
- Hide. The post is hidden from everyone except its author and moderators, and a report is filed.
- Remove. The post is hidden and a report is filed.
Because AI can be wrong, start with Flag. Move to Hide or Remove once you trust how your policy behaves.
Who it checks
You choose who AI moderation applies to:
- New and untrusted members only. Recommended. This is where most problems come from, and it keeps the cost and the brief posting delay small.
- Every member.
How sure it has to be
The AI rates its own confidence as low, medium, or high. You set the bar:
- Low. Acts on any suspected problem. Catches more, with more false positives.
- Medium. Acts when the check is reasonably sure. Recommended.
- High. Acts only on clear problems. Fewest false positives.
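One way to picture the bar is as a comparison between the confidence the AI reports and the level you chose. The sketch below assumes the levels are ordered low, medium, high, and that the action runs when the reported confidence meets or exceeds the bar. The names LEVELS and should_act are hypothetical and are not part of the product.

```python
# Hypothetical sketch: assumes confidence levels are ordered low < medium < high.
LEVELS = {"low": 0, "medium": 1, "high": 2}


def should_act(breaks_policy: bool, confidence: str, bar: str = "medium") -> bool:
    """Act only if the post breaks the policy and the confidence meets the bar."""
    return breaks_policy and LEVELS[confidence] >= LEVELS[bar]
```

Under this reading, a post rated as breaking the policy with medium confidence triggers the action at a Low or Medium bar, but not at a High bar.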
Test before you turn it on
The settings page has a test box. Paste a sample post and see how the AI would rate it against your current policy. Testing changes nothing. Use it to tune your policy before you switch the feature on.
If the AI is unavailable
AI moderation fails open. If the check cannot run, for any reason, the post publishes as normal. A moderation outage never blocks members from posting.
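Failing open can be sketched as a wrapper that treats any error in the check as "publish". This is an illustrative sketch under assumptions, not the actual implementation. The function names and the string results ("publish", "flag") are hypothetical.

```python
from typing import Callable


def moderate_or_publish(post: str, check: Callable[[str], str]) -> str:
    """Return the check's decision, or "publish" if the check cannot run."""
    try:
        return check(post)
    except Exception:
        # Fail open: an error in the check never blocks the post.
        return "publish"


def unavailable_check(post: str) -> str:
    """Hypothetical check that simulates an outage."""
    raise TimeoutError("moderation service unavailable")


def always_flag(post: str) -> str:
    """Hypothetical check that always decides to flag."""
    return "flag"
```

Here moderate_or_publish("hello", unavailable_check) returns "publish", while moderate_or_publish("hello", always_flag) returns "flag". The trade-off of failing open is that a post that breaks the policy can get through during an outage, in exchange for members never being blocked from posting.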
What it costs
AI moderation runs on your own Anthropic API key, so you pay Anthropic directly for each check. Keeping the scope to new and untrusted members keeps that cost small.
Where to go
- Set up AI moderation: /admin/ai/automod
- Add your Anthropic key: /admin/ai