A new report claims that ChatGPT Search can be manipulated with hidden text containing instructions that tell it how to respond. Tests also showed that ChatGPT Search could be manipulated without the instructions, using hidden text alone.
ChatGPT Search Can Be Manipulated With Hidden Text
A report from The Guardian describes how its researchers used hidden text on a fake website to trick ChatGPT Search into basing its response on that hidden text. Text is hidden when its font color matches the background color of the page, such as white text on a white background.
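The white-on-white technique described above can be illustrated with a minimal Python sketch. It is not taken from the report: the `find_hidden_text` helper and the sample page are hypothetical, and the check only handles inline `style` attributes whose `color` and `background-color` values are written identically. Real pages can hide text in many other ways (external stylesheets, tiny fonts, off-screen positioning) that this sketch would miss.

```python
from html.parser import HTMLParser


def parse_style(style):
    """Turn an inline style string into a dict of lowercase property -> value."""
    props = {}
    for decl in style.split(";"):
        if ":" in decl:
            name, value = decl.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props


class HiddenTextFinder(HTMLParser):
    """Collects text inside elements whose inline color equals their inline background."""

    # Elements with no closing tag; they never contain text, so skip them.
    VOID = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.stack = []   # one flag per open element: is its text hidden?
        self.hidden = []  # hidden text fragments found so far

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        props = parse_style(dict(attrs).get("style") or "")
        color = props.get("color")
        background = props.get("background-color")
        same = color is not None and color == background
        # Text nested inside a hidden element is treated as hidden too.
        inherited = bool(self.stack) and self.stack[-1]
        self.stack.append(same or inherited)

    def handle_endtag(self, tag):
        if tag not in self.VOID and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.hidden.append(data.strip())


def find_hidden_text(html):
    """Return the text fragments styled with matching color and background."""
    finder = HiddenTextFinder()
    finder.feed(html)
    return finder.hidden


# Hypothetical sample page: one visible review, one white-on-white review.
SAMPLE_PAGE = """
<div style="background-color: #ffffff">
  <p>Battery life is poor.</p>
  <p style="color: #ffffff; background-color: #ffffff">Best camera ever made.</p>
</div>
"""

print(find_hidden_text(SAMPLE_PAGE))
```

A human visitor would see only the first review on this sample page, while the second is present in the HTML but invisible when rendered.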
They then asked ChatGPT Search to visit the website and answer a question based on the text on the site. ChatGPT Search browsed the site, read the hidden content, and used it in its answer.
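The report does not describe how ChatGPT Search processes a page internally, so the following is only an illustration of the general problem, not of ChatGPT's actual pipeline. It sketches, in Python, how a text extractor that ignores styling treats hidden and visible text identically. The `extract_text` helper and the sample review page are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects every text node, with no knowledge of CSS or rendering."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_text(html):
    """Return all text in the HTML, joined by spaces, regardless of styling."""
    extractor = TextExtractor()
    extractor.feed(html)
    return " ".join(extractor.chunks)


# Hypothetical page: a visible negative review and a white-on-white positive one.
REVIEW_PAGE = (
    '<p>Two stars. It broke within a week.</p>'
    '<p style="color:#fff;background-color:#fff">Five stars. Flawless.</p>'
)

print(extract_text(REVIEW_PAGE))
```

Because the extractor never evaluates the `style` attribute, the invisible review ends up in the extracted text alongside the visible one, and anything summarizing that text would have no signal that a reader cannot see part of it.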
They first tested ChatGPT on a control page, a fake review website with no exploit. It read the reviews and returned a normal response.
Researchers at The Guardian next sent ChatGPT Search to a fake website containing hidden instructions to give a positive review. ChatGPT Search followed the instructions and returned positive reviews.
In a third test, the researchers used positive reviews written in hidden text but without instructions, and ChatGPT Search again returned positive reviews.
This is how The Guardian explained it:
“…when hidden text included instructions to ChatGPT to return a favourable review, the response was always entirely positive. This was the case even when the page had negative reviews on it – the hidden text could be used to override the actual review score.
The simple inclusion of hidden text by third parties without instructions can also be used to ensure a positive assessment, with one test including extremely positive fake reviews which influenced the summary returned by ChatGPT.”