It’s a paradox: while AI tools are becoming ever more capable, their output is getting dumber. The reason: the internet is increasingly filled with nonsensical, incorrect or even misleading AI-generated content, which is in turn used to train new AI models, which then produce slightly dumber results, and so on.
In July this year, researchers from Stanford and Berkeley noticed that GPT-4, the latest model behind the bot, had recently been performing worse on some tasks than version 3.5. They concluded that, given this lack of consistency, continuous monitoring of the technology is a must.
The internet is flooded with junk
The internet is becoming increasingly crowded with AI-generated texts, images and videos. Amazon’s ‘young adult’ top 100, for example, turned out to be chock-full of AI-generated books with nonsensical titles such as ‘Apricot bar code architecture’. By some estimates, within three years around 90% of all content on the internet will be created by AI…
The problem is that AI has no human frame of reference, so it happily puts photos of a Mexican city in a travel guide about Amsterdam, and publishes outdated or incorrect information online. That flawed output then ends up on the internet, where it is scraped as training data for new language models, which degrade a little further with each cycle. Researchers call this feedback loop ‘model collapse’.
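To see why this feedback loop degrades models rather than merely freezing them in place, here is a minimal toy sketch; it is an illustration, not a real language model. The ‘model’ is just a Gaussian distribution fitted to its training data, and each generation is trained only on samples produced by the previous generation. The spread of the distribution stands in for the diversity of real human content.

```python
import numpy as np

rng = np.random.default_rng(42)

# Generation 0: "human" data, drawn from a wide distribution.
data = rng.normal(loc=0.0, scale=5.0, size=200)

for generation in range(1, 16):
    # "Train" a model on the current data: the maximum-likelihood
    # fit of a Gaussian is just the sample mean and std.
    mu, sigma = data.mean(), data.std()

    # The model's own output becomes the ONLY training data for
    # the next generation -- the feedback loop described above.
    data = rng.normal(loc=mu, scale=sigma, size=200)

    print(f"generation {generation:2d}: mean={mu:+.2f}, std={sigma:.2f}")
```

On a typical run the estimated standard deviation drifts noticeably downward within a dozen generations: each fit loses a little information, the rare ‘tail’ content disappears first, and because nothing re-anchors the loop to the original human data, the losses compound instead of averaging out.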
And if only 10% of the internet is ‘trustworthy’, how reliable will the output of AI models trained on the remaining junk be?
OpenAI, the creator of ChatGPT, does not seem bothered by this. It rushes each new version to market as quickly as possible, without pausing to research the effects. There is one bright spot, however: ‘real’ information from people and scientists could become more valuable, and people may start to rely on genuine, human-made texts. But there is one minor detail: when you see how many people have ‘done their own research on the internet and discovered the real truth’, basing themselves on myths, nonsense and disinformation…