AI is not the end of the world as we know it

  • Artificial intelligence isn’t intelligent and it isn’t artificial
  • Everything shaping AI comprises the scraped and analysed appropriations of the work of human beings
  • AI companies access, annex, manipulate and monetise data provided by people
  • Consensus is growing that all AI companies should be legally required to be fully and permanently transparent about all data they hold and use – or else

Elon Musk (or “Big X” as he wants to be called this week) believes that the whole world could enjoy an “AI summer” of indeterminate duration – provided that AI is constrained, regulated and subject to legislation internationally. Back in March this year, in ‘Pause Giant AI Experiments: An Open Letter’, Musk, Steve Wozniak (the co-founder of Apple) and more than 33,000 other signatory technologists, business leaders and academics called for an immediate moratorium on the training of AI systems “more powerful than GPT-4”.

The open letter read: “AI developers must work with policymakers to dramatically accelerate development of robust AI governance systems. These should at a minimum include: New and capable regulatory authorities dedicated to AI; oversight and tracking of highly capable AI systems and large pools of computational capability; provenance and watermarking systems to help distinguish real from synthetic and to track model leaks; a robust auditing and certification ecosystem; liability for AI-caused harm; robust public funding for technical AI safety research; and well-resourced institutions for coping with the dramatic economic and political disruptions (especially to democracy) that AI will cause.”

The list of must-dos makes eminent sense and there can be no doubt that new laws and regulations are required. However, popular media has gone completely over the top in demonising AI, to the extent that many readers, listeners and viewers can be forgiven for thinking that humankind is facing the end of the world as we know it – even though artificial intelligence isn’t intelligent any more than it is artificial.

What is popularly and erroneously referred to as AI is, in fact, a mass of scraped and analysed appropriations of the work of human beings: mere mortals – the scientists, artists, musicians, writers and myriad other individuals and groups who had the original thoughts and did the original work. That work now lies accumulated in files in datacentres across the planet, waiting for some so-called, self-identifying and self-ordained AI company to access, annex, manipulate and monetise it.

The basis of AI apps and services such as ChatGPT, Google Bard and others of their ilk is the large language model (LLM). LLMs are deep-learning algorithms that are ‘trained’ on massive sets of scraped data to predict, when prompted, what might come next in a sequence of words or pictures, and then to generate the results. Everything used for that purpose comes from what is already available and archived on the world wide web. The output can be convincingly fluent, and that fluency, together with a seeming depth of knowledge, is often taken as evidence of sentience. It isn’t. It is, in fact, yet another example of humanity’s innate and ancient capability (and determination) to make something meaningful out of observed peculiarities: taking patterns of light and shade and transforming them in our imagination into people or monsters, seeing landscapes and castles in clouds, or looking up at our planet’s satellite and finding the ‘Man in the Moon’.
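To make that mechanism concrete, here is a deliberately tiny, illustrative sketch in Python – not an LLM, and nothing like one in scale or sophistication – of the same underlying idea: a model that can only recombine sequences it has already seen. The training snippet and function names are invented purely for the example.

```python
# Toy next-word predictor: everything it can ever "generate" is recombined
# from the text it was trained on. Illustrative only; real LLMs use neural
# networks trained on billions of documents, but the principle of predicting
# the next item from prior text is the same.
import random
from collections import defaultdict

training_text = (
    "the cat sat on the mat and the dog sat on the rug "
    "and the cat chased the dog across the rug"
)

# Record which word has been observed to follow each word.
next_words = defaultdict(list)
words = training_text.split()
for current, following in zip(words, words[1:]):
    next_words[current].append(following)

def generate(prompt_word: str, length: int = 8) -> str:
    """Extend a prompt by repeatedly sampling a word seen after the current one."""
    output = [prompt_word]
    for _ in range(length):
        candidates = next_words.get(output[-1])
        if not candidates:  # nothing ever followed this word in the training text
            break
        output.append(random.choice(candidates))
    return " ".join(output)

print(generate("the"))
# e.g. "the cat sat on the rug and the dog" – fluent-looking, yet purely a
# recombination of the training text, with no understanding behind it.
```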

In humans, this tendency is called pareidolia, but the trouble today is that similar behaviour is also being observed in AI. The bigger the scraped data sets and the more questions asked, the more LLMs will interrogate those data sets to embroider answers to the prompts they are given. The result, as Google says of Bard, is that it “can sometimes generate responses that contain inaccurate or misleading information while presenting it confidently and convincingly.” And then some!

And there it is: LLM-based AI systems have no understanding of what they are producing; they are one-dimensional and can perform only specific tasks. They have no sense (and sense is the operative word here) of context, history, emotion, nostalgia, friendship, fiction, artistic creativity, love, or physical or mental injury or hurt. Unlike humans, they cannot reason by analogy or generalise from the particular. They are incapable of abstract reasoning. They operate only within the confines of pre-determined parameters that they don’t understand and are ultimately limited by the data on which they have been trained. They are machines that operate as a sort of quick-access dictionary-cum-encyclopaedia with no idea of what they are cobbling together in their outputs and answers. These machines can rapidly identify a cat, a dog, a whale or an elephant in a picture, but they do not understand what those animals are or what their significance is to humanity, to one another or to the planet.

They also have neither emotional intelligence nor common sense; they cannot infer, they are not nuanced, and they cannot reason about or interpret situations or human feelings. Yet, depending on the quality and biases of the data on which they have been trained, they can (and do) make wrong decisions and perpetuate distortion and discrimination.

C-3PO, say hello to your AI cousin, C2PA

In the five months since the open “Pause Giant AI Experiments” letter was published, there has been considerable reaction (and even some action) in some parts of the world, such as the US, the European Union (EU) and the UK, and absolutely none from others, including China, Russia, Iran, Saudi Arabia and North Korea. Par for the course, of course.

In the US, the Biden administration wants the big-tech AI companies to report any and all content that has been created via AI. Meanwhile, within weeks, the EU will require some, as yet unnamed, technology platforms to permanently display “prominent markings” to warn users which images, video and audio have been generated by AI.

The UK is following suit but finds itself in something of a cleft stick, with the British intelligence agencies pushing the government to water down extant surveillance legislation (the Investigatory Powers Act, passed in 2016) which, it is claimed, places a ‘disproportionately burdensome’ limit on the ability of the likes of MI5, MI6 and GCHQ (Government Communications Headquarters) to train new and developing AI models on the massive amounts of data they hold on UK citizens (and others) in bulk personal datasets (BPDs).

The EU categorises AI technology into four risk groups of unacceptable, high, limited and minimal. The unacceptable and high-risk categories will be the first to be subject to stringent new legislation, while the other two categories will receive closer attention once the riskiest ones are more rigorously regulated and policed.

The emergent transatlantic consensus seems to be that the most important immediate need is to ensure that all AI companies, from small to large, are legally required to be fully and permanently transparent about all the data they hold in general, and about AI training data in particular, given that training data, if not properly vetted and filtered, can result in algorithms riddled with the prejudices contained in the original scraped content. Indeed, the aim is that all key algorithms will, by law, have to be open sourced. Severe penalties, both financial and custodial, will apply in cases where transparency is deliberately obscured to hide risks and divert attention away from malign manipulation.

One initiative that has been gathering momentum since its introduction in 2021 is C2PA, the Coalition for Content Provenance and Authenticity. It operates under the aegis of the not-for-profit Joint Development Foundation and is an alliance between Adobe, Arm, Intel, Microsoft and Truepic (a platform for requesting and reviewing authenticated photos and videos).

The C2PA specification is an open internet standard that uses cryptography to encode details of the origin (provenance) of a piece of content. From the outset, it was designed to be globally applicable and to work across the entire internet, and the base computer code is available, free of charge, to any individual or organisation.
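As a rough illustration of the principle – and emphatically not the actual C2PA manifest format, which defines its own data structures and signing workflow – the sketch below binds a provenance claim to a piece of content using a hash and a digital signature. The field names, the sample content and the use of the Python cryptography library are assumptions made purely for this example.

```python
# Simplified sketch of cryptographic content provenance: hash the content,
# describe its origin, sign the claim, and let anyone verify it later.
# This is NOT the C2PA specification itself, only the general idea behind it.
import hashlib
import json
from cryptography.hazmat.primitives.asymmetric import ed25519

# Hypothetical content and provenance claim.
content = b"...bytes of an image or video..."
claim = {
    "creator": "Example News Ltd",        # illustrative fields, not C2PA schema
    "tool": "camera-firmware-1.2",
    "content_sha256": hashlib.sha256(content).hexdigest(),
}

# The publisher signs the claim with its private key.
private_key = ed25519.Ed25519PrivateKey.generate()
claim_bytes = json.dumps(claim, sort_keys=True).encode()
signature = private_key.sign(claim_bytes)

# Anyone holding the matching public key can verify the claim and the content.
public_key = private_key.public_key()
public_key.verify(signature, claim_bytes)  # raises InvalidSignature if tampered with
assert hashlib.sha256(content).hexdigest() == claim["content_sha256"]
print("provenance claim verified")
```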

There is hope that AI regulation will evolve into global standards and enforcement but, as the EU’s competition commissioner Margrethe Vestager recently said: “Let’s start working on a UN approach. But we shouldn’t hold our breath. We should do what we can here and now.”

Meanwhile, anger is mounting over the way AI companies are scraping copyrighted content without consent and using it for AI training, without either acknowledging that they have simply taken the content without the originators’ permission or paying those originators any compensation. The mother and father of all class-action lawsuits are limbering up in the wings, and when they move on stage and into litigation, AI companies will have some serious music to face.

The AI genie is out of the bottle and won’t be going back in again. However, we humans created it, we remain in charge, and we can make it do our bidding and not its own. That control will be all the more certain if governments, legislators, standards organisations and regulators work together and co-operate to introduce meaningful controls “at the speed of data”, as the saying has it.

And there’s the problem: as we know from examples such as mobile telephony, the growth of the internet and the insatiable global demand for bandwidth, technology advances very quickly, while legislators and regulators are always playing catch-up. Where AI is concerned, delay could be dangerous – not because indestructible terminator robots and evil computers will combine and escape to destroy humanity in a machines-versus-mankind war, but because nations and people will be overwhelmed by misinformation and manipulation to the point that democracy is endangered and the Earth eventually becomes a global dystopia of our own making.

- Martyn Warwick, Editor in Chief, TelecomTV