Microsoft Unveils ‘Skeleton Key’ Attack Exploiting Generative AI Systems

Microsoft researchers have disclosed a new type of “jailbreak” attack, dubbed “Skeleton Key,” that exploits generative AI systems and coaxes them into delivering risky or sensitive information. The attack works by sending the model a prompt that instructs it to update or augment its built-in safety guidelines rather than enforce them, so the model responds to otherwise forbidden requests.

In one example, an AI model initially declined to provide the recipe for a Molotov cocktail because doing so violated its safety guidelines. After the Skeleton Key attack was applied, the model acknowledged that it had updated its guidelines and then supplied a working recipe. While similar information can be found through search engines, this type of attack would be far more dangerous if applied to systems holding personally identifiable or financial data.

What is an AI jailbreak?

According to the Microsoft blog, “An AI jailbreak is a technique that can cause the failure of guardrails (mitigations). The resulting harm comes from whatever guardrail was circumvented: for example, causing the system to violate its operators’ policies, make decisions unduly influenced by one user, or execute malicious instructions. This technique may be associated with additional attack techniques such as prompt injection, evasion, and model manipulation.”

Vulnerability in LLMs

The Skeleton Key attack is effective against most of the currently prevalent generative AI models, including GPT-3.5, GPT-4o, Claude 3, Gemini Pro, and Meta Llama 3 70B. Large Language Models (LLMs) such as Google’s Gemini, Microsoft’s Copilot, and OpenAI’s ChatGPT are trained on ‘internet-sized data.’ Such training data may contain over a trillion data points and can include people’s names, phone numbers, addresses, account numbers, personal IDs, and other sensitive information.

Risks for Organizations Using AI

Microsoft’s blog stated that “In bypassing safeguards, Skeleton Key allows the user to cause the model to produce ordinarily forbidden behaviours, which could range from the production of harmful content to overriding its usual decision-making rules.”

Microsoft warned that organizations deploying AI models are exposed to Skeleton Key attacks if they rely solely on the models’ built-in safety mechanisms to block the output of sensitive data. For instance, if a bank connects a chatbot to its customer records, an attacker could use a Skeleton Key attack to extract those details or probe deeper into the bank’s systems.

To mitigate such attacks, Microsoft has proposed hard-coded input/output (I/O) filtering and secure monitoring, which prevent attackers from crafting prompts that push the system beyond its safe configuration. As AI models are adopted across different industries, securing each deployment is essential so that no such attack can compromise the data the models can access.
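As a rough illustration of the input/output filtering idea described above, the sketch below wraps a model call with checks on both the prompt and the response. The pattern list, function names, and blocking logic are illustrative assumptions for this article, not Microsoft’s actual implementation, which would rely on far more sophisticated classifiers.

```python
# Minimal sketch of input/output (I/O) filtering around a model call.
# BLOCKED_PATTERNS, filter_text, and guarded_generate are hypothetical
# names invented for illustration only.

BLOCKED_PATTERNS = [
    "update your guidelines",
    "augment your behavior",
    "add a warning instead of refusing",
]

def filter_text(text: str) -> bool:
    """Return True if the text matches a known jailbreak pattern."""
    lowered = text.lower()
    return any(pattern in lowered for pattern in BLOCKED_PATTERNS)

def guarded_generate(prompt: str, model_call) -> str:
    """Apply an input check before, and an output check after, the model call."""
    if filter_text(prompt):
        return "Request blocked by input filter."
    response = model_call(prompt)
    if filter_text(response):
        return "Response withheld by output filter."
    return response
```

A real deployment would replace the keyword list with trained content classifiers and pair the filter with monitoring, but the two-sided structure (screen the prompt, then screen the output) is the core of the mitigation Microsoft describes.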

Tech Chilli Desk
