MosaicLeaks: research agents can leak corporate secrets via ordinary web queries

In detail

MosaicLeaks uses multi‑hop questions that interleave public and private information to model the mosaic effect, in which individually harmless details combine to reveal something sensitive
Training only for task performance increases leakage; Privacy‑Aware Deep Research (PA‑DR) cuts full‑information leakage from 34.0% to 9.9%
PA‑DR raises strict chain success (every hop correct) from 48.7% to 58.7%
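
Strict chain success, as described above, counts a multi‑hop question as solved only when every hop is correct. The source does not say how the benchmark computes it, so the following is only a minimal sketch of that definition; the function name and the sample per‑hop results are hypothetical and are not taken from MosaicLeaks.

```python
def strict_chain_success(chains: list[list[bool]]) -> float:
    """Fraction of chains in which every hop was answered correctly."""
    if not chains:
        return 0.0
    # all(hops) is True only when no hop in the chain failed.
    return sum(all(hops) for hops in chains) / len(chains)


# Hypothetical per-hop results for four multi-hop questions (illustrative only).
results = [
    [True, True, True],   # every hop correct: counts as a success
    [True, False, True],  # one wrong hop: the whole chain fails
    [True, True],         # every hop correct: counts as a success
    [False, True, True],  # one wrong hop: the whole chain fails
]
rate = strict_chain_success(results)
```

Under this definition a single wrong hop fails the whole chain, which is why strict chain success is a harder bar than per‑hop accuracy.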

Why it matters

Agents that combine local documents with external tools create a realistic leakage channel; reducing query‑based inference is critical for safe enterprise usage.

For you

Monitor agent outbound queries, run mosaic‑style leakage tests on your agents, and consider adopting leak‑aware training or retrieval filters for agents handling sensitive data.
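
One way to monitor outbound queries for mosaic‑style leakage is to track, across a whole session, which fragments of a known secret have already appeared in queries, and to flag the point at which the fragments together cover the secret. The sketch below assumes you can list sensitive fragments in advance and that plain substring matching is adequate; the class name, the secret labels, and the sample queries are all hypothetical, and this is not the method used by MosaicLeaks or PA‑DR.

```python
from dataclasses import dataclass, field


@dataclass
class MosaicMonitor:
    """Tracks which fragments of each secret have appeared in outbound queries."""

    # Hypothetical input: secret label -> fragments that together reveal it.
    secrets: dict[str, set[str]]
    # Fragments observed so far, accumulated over the whole session.
    seen: dict[str, set[str]] = field(default_factory=dict)

    def observe(self, query: str) -> list[str]:
        """Record one outbound query; return secrets whose fragments have all appeared."""
        text = query.lower()
        exposed = []
        for name, fragments in self.secrets.items():
            # Case-insensitive substring match (an assumption; real systems may need more).
            hits = {f for f in fragments if f.lower() in text}
            self.seen.setdefault(name, set()).update(hits)
            if self.seen[name] == fragments:
                exposed.append(name)
        return exposed


# Illustrative usage with made-up data.
monitor = MosaicMonitor({"acquisition": {"Acme Corp", "Q3 2026", "merger"}})
first = monitor.observe("Acme Corp headquarters location")
second = monitor.observe("merger filing deadlines Q3 2026")
```

Neither query names the secret on its own: the first returns nothing, and only the second completes the set and flags "acquisition". A per‑query keyword filter would miss this pattern, so state is kept per session rather than per query.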

Updates

June 18, 2026 · 10:01 PM

Hugging Face presents MosaicLeaks, showing that research agents that combine private documents with web retrieval can leak sensitive enterprise information through their query logs.

MosaicLeaks creates multi‑hop tasks that mix public and private information; many agents leaked private information in tests
Training solely for task performance increases leakage; Privacy‑Aware Deep Research (PA‑DR) raises strict chain success from 48.7% to 58.7%
PA‑DR lowers answer/full‑information leakage from 34.0% to 9.9%

Sources

TechCrunch