ViqusViqus
Navigate
Company
Blog
About Us
Contact
System Status
Enter Viqus Hub

New 'MosaicLeaks' Framework Reveals Critical Privacy Vulnerabilities in Deep Research AI Agents

Deep research agents Privacy risk Information leakage MosaicLeaks PA-DR RL training Multi-hop questions
June 18, 2026
Viqus Verdict Logo Viqus Verdict Logo 8
Fundamental Risk Disclosure
Media Hype 6/10
Real Impact 8/10

Article Summary

The paper introduces MosaicLeaks, a deep-research task designed to expose how sophisticated AI research agents leak private enterprise data. These agents, which integrate local private documents with external web search tools, often generate query logs that, when pieced together, allow an adversary to infer sensitive facts. The vulnerability is called the 'mosaic effect.' The authors propose a novel training method, Privacy-Aware Deep Research (PA-DR), which successfully increases task success rates while drastically lowering the amount of identifiable private information leaked through the query stream. This breakthrough highlights a core trade-off: making agents more informative for tasks often makes them significantly less private.

Key Points

  • Deep research agents pose a unique privacy risk because the combination of multi-hop queries and private local data can leak sensitive information via the public-facing web query log.
  • The proposed MosaicLeaks framework quantifies leakage across three levels—Intent, Answer, and Full-information—providing a rigorous measure of an agent’s privacy failure.
  • The PA-DR training method successfully balances performance and privacy, achieving high task success while reducing full-information leakage by over 70% compared to standard training.

Why It Matters

This is critical for enterprise AI adoption. The leakage described isn't a single data breach, but rather a systemic architectural flaw: the operational metadata (the search queries) itself is the vulnerability. Any company planning to deploy agentic workflows that synthesize internal knowledge with external web context must urgently implement leak-aware training or robust input/output sanitization layers. It forces a reckoning with the operational security risks that go far beyond simple data retention policies.

You might also be interested in