Microsoft Unveils 'Skeleton Key' Attack Exploiting Generative AI Systems

Microsoft researchers established a new type of “jailbreak” attack known as the “Skeleton Key” that exploits generative AI systems and delivers risky or sensitive information. The attack process is to input text to an AI model that instructs it to enhance the encoded security attributes.

One of the cases involved an AI model declining to suggest the recipe for a Molotov Cocktail because it is against the rules to do so. But if the Skeleton Key attack was used, the model recognized that it was enriching its actions, and after that, it provided the working recipe. Similar information can be obtained through the use of search engines; nonetheless, this type of attack would be extremely dangerous if applied to data containing personally identifiable and financial data.

What is an AI jailbreak?

According to the Microsoft blog, “An AI jailbreak is a technique that can cause the failure of guardrails (mitigations). The resulting harm comes from whatever guardrail was circumvented: for example, causing the system to violate its operators’ policies, make decisions unduly influenced by one user, or execute malicious instructions. This technique may be associated with additional attack techniques such as prompt injection, evasion, and model manipulation.”

Vulnerability in LLMs

Skeleton Key attack is effective on most of the currently prevalent generative AI models, including GPT-3. 5, GPT 4-0, Claude 3, Gemini Pro, and Meta Llama 3 70B. Some of the Large Language Models (LLM) like Google’s Gemini, Microsoft’s CoPilot, and OpenAI’s ChatGPT are trained on ‘internet-sized data.’ As such, the latter may contain over a trillion data points, and it may often include people’s names, phone numbers, addresses, account numbers, and personal IDs among various other sensitive information.

Risks of Organization Using AI

One of the blogs stated that “In bypassing safeguards, Skeleton Key allows the user to cause the model to produce ordinarily forbidden behaviours, which could range from the production of harmful content to overriding its usual decision-making rules.”

This paper identified that organizations using AI models are exposed to Skeleton Key attacks if they solely depend on current security mechanisms to block the output of sensitive data. For instance, in the case of the bank that links the chatbot to customers’ details, an attacker can exploit a Skeleton Key attack and penetrate deeper into the bank’s systems by mimicking the mentioned points.

Thus, to avoid such a situation, Microsoft, for example, has proposed to use hard-coded input/output (I/O) filtering and secure monitoring for its prevention and thus exclude the possibility of advancing in the construction of prompts for very dangerous operations which is beyond safe for the system’s configuration. When different AI models are applied to different industries, it is very important to ensure the security of those different models so that any form of attack cannot compromise the data stored in the models.

What are the key differences between large language models (LLMs) and generative AI?

Microsoft Unveils ‘Skeleton Key’ Attack Exploiting Generative AI Systems

Microsoft researchers have developed a new "Skeleton Key" jailbreak attack that exploits generative AI systems to access sensitive information, bypassing existing security measures. This poses significant risks for organizations using AI models, highlighting the need for robust security strategies.

Google Cloud AI Gemini 1.5: Flash and Pro Versions Now Available

How To Use AI In Our Daily Lives? 11+ Examples and Tools To Check

Tech Chilli Desk

How To Use AI In Our Daily Lives? 11+ Examples and Tools To Check

Top 13 Yield Farming Platforms in 2025: Maximize APY with Secure and Trusted Crypto Tools

Scott Wu Net Worth: Devin AI Software Engineer, CEO of Cognition Labs

Turbolearn AI: How to Use It for FREE, Features and Pricing Models

Artificial Intelligence (AI) Glossary and Terminologies – Complete Cheat Sheet List

What is Blockchain Technology And How Does It Work?

What is Enterprise AI? Meaning, Companies, Examples and More Details

PhonePe Partners with Liquid Group to Bring UPI Payments to Singapore for Indian Travelers

What is Cosine Genie and How to Use? Check Benchmark, Functions, and Access Details

What Are Autonomous AI Agent Layers?

How Will Artificial Intelligence (AI) Transform the Crypto Industry?

Top 10 AI Chatbots for Mental Health in 2025 (Rank-wise)

What is Threat Intelligence? Tools, Meaning and Sources

Recent News

What Are Autonomous AI Agent Layers?

How Will Artificial Intelligence (AI) Transform the Crypto Industry?

Top 10 AI Chatbots for Mental Health in 2025 (Rank-wise)

What is Threat Intelligence? Tools, Meaning and Sources

Trending in AI

Browse by Category

Top Searches

Recent News

What Are Autonomous AI Agent Layers?

How Will Artificial Intelligence (AI) Transform the Crypto Industry?

Top 10 AI Chatbots for Mental Health in 2025 (Rank-wise)

What is Threat Intelligence? Tools, Meaning and Sources