Roblox, the online gaming platform wildly popular with children and teenagers, is rolling out an open-source version of an artificial intelligence system it says can help preemptively detect predatory language in game chats.
The move comes as the company faces lawsuits and criticism accusing it of not doing enough to protect children from predators. For instance, a lawsuit filed last month in Iowa alleges that a 13-year-old girl was introduced to an adult predator on Roblox, then kidnapped, trafficked across multiple states and raped. The suit, filed in Iowa District Court in Polk County, claims that Roblox's design features make children who use it “easy prey for pedophiles.”
Roblox says it strives to make its systems as safe as possible by default but notes that “no system is perfect, and one of the biggest challenges in the industry is to detect critical harms like potential child endangerment.”
The AI system, called Sentinel, helps detect early signs of possible child endangerment, such as sexually exploitative language. Roblox says the system has led the company to submit 1,200 reports of potential attempts at child exploitation to the National Center for Missing and Exploited Children in the first half of 2025. The company is now in the process of open-sourcing it so other platforms can use it too.
Preemptively detecting possible dangers to kids can be tricky for AI systems — and humans, too — because conversations can seem innocuous at first. Questions like “how old are you?” or “where are you from?” wouldn't necessarily raise red flags on their own, but when put in context over the course of a longer conversation, they can take on a different meaning.
Roblox, which has more than 111 million monthly users, doesn't allow users to share videos or images in chats and tries to block any personal information such as phone numbers, though — as with most moderation rules — people constantly find ways to get around such safeguards.
It also doesn't allow kids under 13 to chat with other users outside of games unless they have explicit parental permission — and unlike many other platforms, it does not encrypt private chat conversations, so it can monitor and moderate them.
“We’ve had filters in place all along, but those filters tend to focus on what is said in a single line of text or within just a few lines of text. And that’s really good for doing things like blocking profanity and blocking different types of abusive language and things like that,” said Matt Kaufman, chief safety officer at Roblox. “But when you’re thinking about things related to child endangerment or grooming, the types of behaviors you’re looking at manifest over a very long period of time.”
Sentinel captures one-minute snapshots of chats across Roblox — about 6 billion messages per day — and analyzes them for potential harms. To do this, Roblox says it developed two indexes: one made up of benign messages and the other of chats that were determined to contain child endangerment violations. Roblox says this lets the system recognize harmful patterns that go beyond simply flagging certain words or phrases, taking the entire conversation into context.
“That index gets better as we detect more bad actors; we just continuously update that index. Then we have another sample of what does a normal, regular user do?” said Naren Koneru, vice president of engineering for trust and safety at Roblox.
As users are chatting, the system keeps score — are they closer to the positive cluster or the negative cluster?
“It doesn’t happen on one message because you just send one message, but it happens because of all of your days' interactions are leading towards one of these two,” Koneru said. “Then we say, okay, maybe this user is somebody who we need to take a much closer look at, and then we go pull all of their other conversations, other friends, and the games that they played, and all of those things.”
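The scoring approach Koneru describes can be sketched in code. The toy example below compares a user's recent message representations against two reference cluster centroids (benign vs. harmful) and keeps a running score; the vectors, centroids and cosine-similarity scoring are illustrative assumptions, since Roblox has not published Sentinel's actual model or features.

```python
import numpy as np

def risk_score(message_vectors, benign_centroid, harmful_centroid):
    """Score a user's recent messages against two reference clusters.

    Returns a value in roughly [-1, 1]: positive means the conversation
    sits closer to the harmful cluster, negative means closer to benign.
    (Hypothetical stand-in for Sentinel's real scoring, which is unpublished.)
    """
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Average per-message similarity to each cluster centroid,
    # then take the difference as the running score.
    harmful = np.mean([cos(v, harmful_centroid) for v in message_vectors])
    benign = np.mean([cos(v, benign_centroid) for v in message_vectors])
    return harmful - benign

# Toy 2-D example: two separated centroids and a user whose messages
# drift toward the harmful cluster over the course of the day.
benign_centroid = np.array([1.0, 0.0])
harmful_centroid = np.array([0.0, 1.0])
messages = [np.array([0.5, 0.5]),   # ambiguous early on
            np.array([0.2, 0.8]),   # drifting
            np.array([0.1, 0.9])]   # close to the harmful cluster

score = risk_score(messages, benign_centroid, harmful_centroid)
print(score > 0)  # flags this user for closer review
```

No single message tips the score; it is the accumulated drift across the conversation — matching the "it doesn't happen on one message" behavior described above — that would prompt a closer look at the account.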
Human moderators review the flagged interactions and report them to law enforcement as warranted.
Barbara Ortutay, The Associated Press