In detail
- Internal tests since March: language models make 13 percent fewer errors than humans, catch 10 percent more violations.
- Meta is switching from Google's Gemini to its own model, Muse Spark, for moderation; the models are trained on historical decisions by human reviewers.
- Employees report that the models still remove harmless content and that oversight is insufficient for such a rapid rollout; the transition is already causing layoffs, especially among external contractors.
Why it matters
Automating content moderation saves billions, but quality risks are real—moderation errors can damage trust and user experience.
For you
If you operate on Meta platforms, document content removals carefully; error rates may spike temporarily.