Did you know that cats have been to the moon? That it is safe to stare at the sun for 15 minutes, or even longer, as long as you have dark skin? Or that, to stay healthy, you should eat a small rock every day?
These are some of the latest pearls of wisdom that Google has served up to its US users (we’re not so lucky here in the UK yet). “Let Google do the searching for you,” the search giant promised when it introduced a feature called AI Overviews earlier this month. This integrates Google’s generative-AI model, Gemini, into its search engine. The answers it generates are displayed above the traditional list of search results. And you can’t get rid of them.
AI Overviews hasn’t had the effect Google was hoping for, to say the least. It certainly went viral instantly, with people sharing their favorite answers online. Not because these are useful, but because they are so funny. For example, when you ask AI Overviews for a list of fruits ending in ‘um’, it returns: ‘Applum, Strawberrum and Coconut’. This is what, in AI parlance, is called a ‘hallucination’.
Despite having a market capitalization of $2 trillion and the ability to employ the greatest brains on the planet, Google continues to stumble on AI. Its first attempt to join the generative-AI gold rush, in February last year, was the ill-fated chatbot Bard, which had similar problems with producing factual inaccuracies. In its first live demonstration, Bard mistakenly claimed that the James Webb Space Telescope, launched only in 2021, had taken the ‘first pictures’ of a planet outside our solar system. The mistake wiped $100 billion off Google’s market value.
This February, Google made another foray into AI, this time with Gemini, an image and text generator. The problem was that its diversity guardrails were far too heavy-handed. When asked to produce historically accurate images, it would instead generate black Nazi soldiers, black American founding fathers and a female South Asian pope.
This was ‘a well-intentioned mistake’, said The Economist. But Google was not unaware of the inherent problems of generative AI. It will have known all about its capabilities and its pitfalls.
Before the current AI craze really took off, analysts had already realized that generative AI was unlikely to improve the user experience and could actually degrade it. That caution was abandoned once investors started piling in.
So why is Google’s AI producing such rotten results? In fact, it is working exactly as you would expect. Don’t be fooled by the ‘artificial intelligence’ branding. Fundamentally, AI Overviews is simply trying to guess the next word it should use, according to statistical probability, but without any anchoring to reality. The algorithm cannot say ‘I don’t know’ when asked a difficult question, because it doesn’t ‘know’ anything. It cannot even perform simple math, as users have demonstrated, because it has no underlying concept of numbers or of valid arithmetic operations. Hence the hallucinations and omissions.
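To make that concrete, here is a minimal, purely illustrative sketch (nothing to do with Google’s actual system) of what ‘guessing the next word according to statistical probability’ amounts to: a toy model that counts which word follows which in its training text, then continues a prompt with statistically likely words. There is no fact-checking step and no way for it to say ‘I don’t know’.

```python
import random
from collections import defaultdict

# Toy next-word predictor: count which word follows which in the training
# text, then keep sampling a statistically likely next word. The training
# text here is invented for illustration only.
training_text = (
    "geologists say you should eat one small rock per day "
    "doctors say you should eat one apple per day"
)

counts = defaultdict(lambda: defaultdict(int))
words = training_text.split()
for prev, nxt in zip(words, words[1:]):
    counts[prev][nxt] += 1  # raw co-occurrence statistics, nothing more

def next_word(prev):
    options = counts[prev]
    if not options:
        return None
    choices, weights = zip(*options.items())
    return random.choices(choices, weights=weights)[0]  # sample by probability

prompt = "you should eat one"
output = prompt.split()
word = output[-1]
for _ in range(4):
    word = next_word(word)
    if word is None:
        break
    output.append(word)

print(" ".join(output))  # may cheerfully recommend eating a small rock
```

Scaled up to billions of parameters and trained on most of the web, the same mechanics produce far more fluent text, but the absence of any grounding step is the same.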
This is less of a problem when the output doesn’t matter that much, like when the AI is processing an image and creates a small glitch. Our phones use machine learning every day to process our photos, and we don’t notice or care much about most glitches. But for Google to advise us all to start eating rocks is no small glitch.
Such mistakes are more or less inevitable because of the way the AI is trained. Instead of learning from a curated dataset of accurate information, AI models are trained on a vast set of largely unfiltered data. Google’s AI and ChatGPT have already scraped the web and, needless to say, a lot of what’s on the web isn’t true. Forums like Reddit are filled with sarcasm and jokes, but the AI treats these as credible, sincere and correct explanations of problems. Programmers have long used the phrase ‘GIGO’ to describe what’s going on here: garbage in, garbage out.
The problem of AI hallucinations is consistent across the board. It all but precludes generative AI from being practically useful in commercial and business applications, where you might expect it to save a lot of time. A new study of generative AI in legal work finds that the extra verification steps now required to ensure the AI isn’t hallucinating negate the time saved by deploying it in the first place.
‘[Programmers] are still making the same boneheaded mistakes as before. No one has actually solved hallucinations with large language models, and I don’t think we can,’ cognitive scientist and veteran AI skeptic Professor Gary Marcus noted last week.
Another problem is now emerging. AI is making a bad situation even worse by generating false information, which then pollutes the rest of the internet. “Google learns from every piece of garbage it sees on the web, and nothing generates garbage better than AI,” as one X user put it.
Last year, major AI companies admitted that, after running out of content to scrape from the web, they had started using synthetic training data – that is, data generated by the generative AI itself. A year ago, OpenAI’s Sam Altman said he was ‘pretty sure that soon all data will be synthetic data’, created by other AIs.
This is a big problem. It essentially causes models to ‘collapse’ and stop producing useful results. ‘Model collapse is when generative AI becomes unstable, unreliable or stops working. It can happen when generative-AI models are trained on AI-generated content, rather than on human-generated content,’ warned Professor Nigel Shadbolt of the Open Data Institute last December. One researcher, Jathan Sadowski, has called this phenomenon ‘Habsburg AI’, after the Spanish Habsburg dynasty, which died out in 1700 as a result of diseases caused by inbreeding.
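A crude way to see why that feedback loop matters, assuming nothing about any particular company’s models: fit a simple statistical model to some real data, then repeatedly retrain it only on its own synthetic output. In the toy simulation below, the ‘model’ is just a fitted normal distribution, and its spread tends to shrivel once fresh real data stops arriving.

```python
import random
import statistics

random.seed(0)

# Generation 0: a small pool of "real" data, drawn from a normal
# distribution with mean 0 and spread 1.
pool = [random.gauss(0, 1) for _ in range(20)]

for generation in range(1, 31):
    mu = statistics.fmean(pool)
    sigma = statistics.pstdev(pool)
    # The next generation is "trained" only on samples from the fitted
    # model -- no fresh real data ever enters the loop.
    pool = [random.gauss(mu, sigma) for _ in range(20)]
    if generation % 5 == 0:
        print(f"generation {generation:2d}: spread = {sigma:.3f}")

# With each refit the estimated spread wobbles, and on average it shrinks;
# over many generations the distribution narrows towards a single point --
# a crude analogue of the loss of diversity described as model collapse.
```

Real generative models are vastly more complicated, but the underlying logic of the loop is the same: without a steady supply of fresh human-made data, the errors of one generation become the training material of the next.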
You could argue that something like this already happens without the help of AI, such as when a false fact is entered into Wikipedia, cited in the media, and then the media citations become the justification for its continued inclusion in Wikipedia.
AI simply automates and accelerates this process of generating falsehoods. This week, the Telegraph gave the following example: ‘When Google claimed that there were no African countries beginning with the letter K, its response appeared to be based on an online discussion about ChatGPT incorrectly answering the same question. In other words, AI is now taking other AIs’ fabrications as gospel.’
The most accurate description of this phenomenon comes from some American researchers, who last year coined the term ‘Model Autophagy Disorder’, or MAD. They chose it to evoke the practice of feeding cattle with the processed remains of other cattle, a practice that caused bovine spongiform encephalopathy, or mad-cow disease. ‘Our primary conclusion across all scenarios is that without enough fresh real data in each generation of an autophagous loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease,’ they wrote.
Very few people warned about the downsides of generative AI when OpenAI released its ChatGPT tool in November 2022. Now, ChatGPT has polluted the web and poisoned itself and other AI tools. Cleaning this up will be a huge challenge. While the promised benefits of AI remain elusive, the costs are clearly starting to rise.
Andrew Orlowski is a weekly columnist at the Telegraph. Visit his website here. Follow him on X: @AndrewOrlowski.