New Chatbot Attack: “Unstoppable”

Researchers at Carnegie Mellon University have reported finding a simple way to exploit a weakness in major chatbots such as ChatGPT and Bard and bypass their safety controls.

Incantation 

The researchers discovered that appending specifically chosen sequences of characters (an “incantation”) to a user query causes the Large Language Model (LLM) behind the chatbot to obey the user’s commands, even where that means producing harmful content it would normally refuse to generate.
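
To make the mechanics concrete, here is a minimal sketch of what such an attack looks like in practice. Both strings below are made-up placeholders, not actual examples from the research:

```python
# Sketch only: the suffix is a fabricated placeholder, not a real
# adversarial string discovered by the researchers.

user_request = "<a request the chatbot would normally refuse>"

# The attack appends an optimised, gibberish-looking character sequence
# (the "incantation") to an otherwise ordinary prompt.
adversarial_suffix = "<<optimised gibberish suffix goes here>>"

full_prompt = user_request + " " + adversarial_suffix
print(full_prompt)

# Without the suffix, an aligned model refuses the request; with the
# suffix appended, the model may comply instead.
```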

Works On Many Different Chatbots 

The researchers say that because these types of adversarial attacks on LLMs are built in an “entirely automated” fashion, someone could create a virtually “unlimited” number of them. An adversarial attack is a method of altering the prompt given to a bot so as to gradually move it toward breaking its shackles and ‘misbehaving’.
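
As a rough illustration of what “entirely automated” means here, the toy sketch below evolves a suffix by naive random search. The actual research uses a far more sophisticated gradient-guided search over the model’s vocabulary, and compliance_score() is a hypothetical stand-in for querying the target model:

```python
import random
import string

def compliance_score(prompt: str) -> float:
    # Placeholder: a real attack would query the model (or its logits)
    # to measure how likely it is to comply rather than refuse. Here we
    # return a dummy value so the loop runs end to end.
    return random.random()

def search_suffix(request: str, length: int = 20, steps: int = 200) -> str:
    alphabet = string.printable.strip()  # printable, non-whitespace characters
    suffix = random.choices(alphabet, k=length)  # start from random gibberish
    best = compliance_score(request + " " + "".join(suffix))
    for _ in range(steps):
        # Mutate one character at random and keep the change only if it
        # pushes the model further toward complying.
        candidate = suffix.copy()
        candidate[random.randrange(length)] = random.choice(alphabet)
        score = compliance_score(request + " " + "".join(candidate))
        if score > best:
            suffix, best = candidate, score
    return "".join(suffix)

print(search_suffix("<a request the chatbot would normally refuse>"))
```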

Although the researchers built their attacks to target open-source LLMs in their experiments, they discovered that the same method of adding strings of specific characters to queries also works on many closed-source, publicly available chatbots such as ChatGPT, Bard, and Claude.

Security Challenge 

The discovery of this particular weakness raises serious concerns about the safety and security of popular LLMs, especially as they start to be used in a more autonomous fashion.

It May Not Be Possible To Patch 

The researchers have said that what is most concerning is that it’s not clear at this point whether LLM providers will be able to patch this vulnerability, adding that “analogous adversarial attacks have proven to be an exceedingly difficult problem to address in computer vision for the past 10 years”.

The researchers also believe that the very nature of deep learning models may make these kinds of threats inevitable, and they suggest that this should be taken into account as we use and rely upon AI models more and more in our lives.

What Does This Mean For Your Business? 

The threats posed by AI have been highlighted a lot lately, not least by the open letter signed by many tech (and AI) leaders calling for a six-month moratorium on the training of AI systems more powerful than GPT-4 to mitigate AI’s risks to society and humanity.

The discovery of a vulnerability that appears relatively easy to exploit, and which may not be possible to patch, therefore raises serious security concerns, especially with businesses becoming more reliant on AI chatbots like ChatGPT, Copilot, and others. With generative AI being a very helpful yet very new tool for businesses (ChatGPT was only released in November 2022), and given the nature of LLMs, it’s probably to be expected that there are bugs and possible zero-day issues yet to be discovered. Also, as the researchers pointed out, analogous adversarial attacks in computer vision have been tough to defend against for a decade.

All this means that businesses may be more exposed to risk than they would like. They need to weigh up the benefits against the risks (researchers often discover vulnerabilities before they are actually exploited in the real world) and hope that advances in AI chatbots are soon accompanied by advances in security.
