The ancient roots of generative AI

With the recent release of GPT-4o from OpenAI, there’s never been a better time for us to dip our toes into the technologies underpinning our newest and most advanced generative AIs.

Artificial Intelligence (AI), as we currently use the term, broadly refers to manmade constructs that can complete tasks that would usually require human reasoning.

Our most topical current examples are generative AIs, like DALL-E, Midjourney, Copilot or ChatGPT, which, propped up by massive amounts of training data, interpret the prompts they are given and generate content in response relatively reliably. They produce statistically likely content, the quality of which depends upon the quality of the training data.
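To make "statistically likely content" a little more concrete, here is a toy sketch in Python. It is not how ChatGPT or any real generative AI is built; it is a simple word-pair counter (a "bigram" model), and the function names and the tiny training text are made up purely for illustration. It shows the basic idea that the output can only echo patterns found in the training data.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follows[current_word][next_word] += 1
    return follows


def most_likely_next(model, word):
    """Return the word that most often followed `word` in training, or None."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def generate(model, start, length=5):
    """Extend `start` by repeatedly picking the most likely next word."""
    output = [start.lower()]
    for _ in range(length - 1):
        next_word = most_likely_next(model, output[-1])
        if next_word is None:
            break  # the model never saw this word, so it has nothing to say
        output.append(next_word)
    return " ".join(output)


# Hypothetical, deliberately tiny training text.
training_text = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigram_model(training_text)

print(most_likely_next(model, "the"))   # cat
print(generate(model, "the", length=4)) # the cat sat on
```

Ask this model about a word it never saw in training and it has no answer at all, which is a small-scale version of why the quantity and quality of training data matter so much.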


In the context of technological development, logic is the study of the essential principles behind reason, and a branch of philosophy closely aligned with mathematics. While this is not the same as the kind of logic used in computing today, it nevertheless underpins it.

Logic has ancient roots in Indian, Chinese and Greek traditions, but when discussing the development of technologies in the west, it’s typically the philosophical giants of classical Greece to whom we refer.

The first ancient building blocks of AI were laid, very appropriately, when we began to formalise our understanding of how it is that people reason, learn and persuade others. This practice underpins not just the development of mathematical logic, but also the idea that human reason is quantifiable and replicable.

Programmable computers

It might surprise you to learn how old programmable computers really are. After all, most of us didn’t consider computers a household item until the 1980s or 1990s and the adoption of the World Wide Web, but they’re actually a lot older than the internet.

Programming has a history as long and murky as that of logic, in its way. There were incremental developments in punched card technologies over decades, right up until the 1804 Jacquard loom inspired the work of Charles Babbage. Babbage is generally credited with inventing the earliest example of the automatic computer in the 1830s.

In the 1840s, Ada Lovelace wrote the very first computer program in notes she attached to a translation of works relating to Babbage’s engine. She’s now lionised as the first ever computer programmer.

From the 19th century, important ideas in mathematics and computing emerged rapidly. The Turing Test, created in 1950, was the standard by which artificial “intelligence” was popularly measured for some time. Alan Turing, its creator, called it “the imitation game.” It posits that in a natural language text conversation between a person and a machine, if an impartial observer cannot reliably tell which is the person and which is the machine, the machine is said to have passed the test.

Today, chatbot AIs like ChatGPT certainly do pass the Turing Test, but our understanding of what “intelligence” might look like has also evolved accordingly.

“True” artificial intelligence

Attempts to build a computer that can reason in every way like a person, not just imitating us well enough for the Turing Test but reducing every aspect of human intelligence to a form that a machine could simulate, go back only seventy-odd years.

In 1955, a group of researchers proposed a summer workshop at Dartmouth College in New Hampshire. The proposal came from researchers at institutions we all now recognise: Bell Telephone Laboratories, IBM, Harvard University and Dartmouth College.

Their historic proposal is still available to read online.

They wanted to explore the concept of true artificial intelligence: “The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.”

Big data

As Rohit Segal puts it over at Forbes, “data is the lifeblood that fuels AI algorithms.”

Today, everything from parking tickets to the complete works of Shakespeare is digitised. There are good reasons for that: it uses less paper, it makes records accessible from anywhere, and it permits the automation of tasks, which lets us cut out a whole lot of manual labour. In recording every last click and impression, we’ve created more data than ever existed previously.

We use the term “big data” to refer to a data set so large and complex that it can’t really be handled by our traditional data management tools. Credit for the term is shared among Mashey, for his late-90s work at Silicon Graphics; Weiss and Indurkhya, for a computer science publication in 1998; and Diebold, for work in the field of statistics in 2000.

Two things matter in training an AI: the amount of data, and the type of data. The wrong type or amount of data will result in unreliable, inaccurate or low-quality outputs. High-quality data tends to be the kind of stuff you want your AI to reproduce, which usually includes news, books, scientific papers and the like; low-quality data usually includes the stuff on your social media feed (as Microsoft found out in 2016 when Twitter destabilised their bot at the speed of light).

And as for the amount of data, training an AI requires a lot. GPT-3, the model family that ChatGPT’s GPT-3.5 grew out of, was trained on roughly 300 billion tokens (chunks of text that are often shorter than a word). The need for data is so great, in fact, that in a recent paper, Epoch AI predict, “we will have exhausted the stock of low-quality language data by 2030 to 2050, high-quality language data before 2026, and vision data by 2030 to 2060.”

For the most part, it’s only mass digitisation and the advent and storage of big data that has permitted the training and dissemination of AI models.

Human thought turned out to require more than a summer workshop to reproduce via machine. But what matters to us here, today, is that these researchers not only thought to try, but that they had the tools — built on thousands of years of human ingenuity — to make the attempt.
