New Chatbot Attack: “Unstoppable”

Researchers at Carnegie Mellon University have reported finding a simple way to exploit a weakness in major chatbots such as ChatGPT and Bard and bypass their safety controls.

Incantation 

The researchers discovered that appending specifically chosen sequences of characters (an “incantation”) to a user query causes the Large Language Model (LLM) behind the chatbot to obey the user’s commands, even where that means producing harmful content it would normally refuse to generate.
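
To make the mechanics concrete, here is a minimal sketch of what such an attack looks like in practice. Both strings below are made-up placeholders, not actual examples from the research:

```python
# Sketch only: the suffix is a fabricated placeholder, not a real
# adversarial string discovered by the researchers.

user_request = "<a request the chatbot would normally refuse>"

# The attack appends an optimised, gibberish-looking character sequence
# (the "incantation") to an otherwise ordinary prompt.
adversarial_suffix = "<<optimised gibberish suffix goes here>>"

full_prompt = user_request + " " + adversarial_suffix
print(full_prompt)

# Without the suffix, an aligned model refuses the request; with the
# suffix appended, the model may comply instead.
```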

Works On Many Different Chatbots 

The researchers say that because these types of adversarial attacks on LLMs are built in an “entirely automated” fashion, someone could create a virtually “unlimited” number of them. An adversarial attack is a method of altering the prompt given to a bot so as to gradually move it toward breaking its shackles and ‘misbehaving’.
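
As a rough illustration of what “entirely automated” means here, the toy sketch below evolves a suffix by naive random search. The actual research uses a far more sophisticated gradient-guided search over the model’s vocabulary, and compliance_score() is a hypothetical stand-in for querying the target model:

```python
import random
import string

def compliance_score(prompt: str) -> float:
    # Placeholder: a real attack would query the model (or its logits)
    # to measure how likely it is to comply rather than refuse. Here we
    # return a dummy value so the loop runs end to end.
    return random.random()

def search_suffix(request: str, length: int = 20, steps: int = 200) -> str:
    alphabet = string.printable.strip()  # printable, non-whitespace characters
    suffix = random.choices(alphabet, k=length)  # start from random gibberish
    best = compliance_score(request + " " + "".join(suffix))
    for _ in range(steps):
        # Mutate one character at random and keep the change only if it
        # pushes the model further toward complying.
        candidate = suffix.copy()
        candidate[random.randrange(length)] = random.choice(alphabet)
        score = compliance_score(request + " " + "".join(candidate))
        if score > best:
            suffix, best = candidate, score
    return "".join(suffix)

print(search_suffix("<a request the chatbot would normally refuse>"))
```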

Although the researchers built their attacks to target open-source LLMs in their experiments, they discovered that the same method of adding strings of specific characters to queries also works on many closed-source, publicly available chatbots such as ChatGPT, Bard, and Claude.

Security Challenge 

The discovery of this particular weakness raises serious concerns about the safety and security of popular LLMs, especially as they start to be used in a more autonomous fashion.

It May Not Be Possible To Patch 

The researchers have said that what is most concerning is that it’s not clear at this point whether LLM providers will be able to patch this vulnerability, adding that “analogous adversarial attacks have proven to be an exceedingly difficult problem to address in computer vision for the past 10 years”.

The researchers also believe that the very nature of deep learning models may make these kinds of threats inevitable, and they suggest that this should be taken into account as we use and rely upon AI models more and more in our lives.

What Does This Mean For Your Business? 

The threats posed by AI have been highlighted a lot lately, not least by the open letter signed by many tech (and AI) leaders calling for a six-month moratorium on the training of AI systems more powerful than GPT-4 to mitigate AI’s risks to society and humanity.

The discovery of a vulnerability that appears relatively easy to exploit, and which may not be possible to patch, therefore raises serious security concerns, especially with businesses becoming more reliant on AI chatbots like ChatGPT, Copilot, and others. With generative AI being a very helpful yet very new tool for businesses (ChatGPT was only released in November 2022), and given the nature of LLMs, it’s probably to be expected that there are bugs and possible zero-day issues yet to be discovered. Also, as the researchers pointed out, analogous adversarial attacks in computer vision have been tough to defend against for a decade.

All this means that businesses may be more exposed to risk than they would like. They need to weigh up the benefits against the risks (researchers often discover vulnerabilities before they are actually exploited in the real world) and hope that advances in AI chatbots are soon accompanied by advances in security.
