Picture yourself standing before a weathered clay tablet covered in symbols that haven’t been read for thousands of years. The script looks tantalizingly familiar, yet remains completely incomprehensible. This is the daily reality for archaeologists and linguists working with humanity’s earliest written records.
For over a century, scholars have struggled to decode these ancient mysteries using traditional methods of comparison and analysis. However, artificial intelligence is transforming the field of ancient history by providing researchers with innovative tools to analyze ancient texts. It is estimated that we have lost more than 75 percent of all languages ever spoken by humans, making this technological breakthrough particularly significant.
From crumbling cuneiform tablets to charred Roman scrolls, AI is breathing new life into humanity’s written heritage in ways previously unimaginable. Here is what these digital archaeologists are uncovering.
Machine Learning Takes on the World’s Oldest Writing System

Cuneiform, the wedge-shaped script invented by the Sumerians around 3200 BCE, represents humanity’s first attempt at written communication. To reliably decipher cuneiform is to interpret a vast proportion of humanity’s written history. “People say the first half of human history is only recorded in these cuneiform tablets,” says Enrique Jiménez at Ludwig Maximilians University in Munich, Germany.
However, only a handful of experts can read and decipher cuneiform, and translating it into modern languages is difficult and extremely time-consuming. New Scientist reports that just 75 people can read cuneiform fluently. Meanwhile, there are some 130,000 tablets in the British Museum and many more in Iraq, and most have probably not been touched, let alone translated.
Researchers trained a machine learning system called DeepScribe on 6,000 hand-annotated images from the Persepolis Fortification Archive, which together contain some 100,000 identified signs. The model achieved around 80% accuracy and greatly accelerated the researchers’ translation efforts. ProtoSnap, an AI-driven system developed by researchers at Cornell and Tel Aviv University, can recognize and reconstruct cuneiform characters with remarkable precision, even accounting for variations in writing styles across different regions and time periods. This breakthrough is already expanding the number of deciphered texts, shedding new light on the economic and social history of ancient Mesopotamia.
How MIT’s Revolutionary Algorithm Deciphers Lost Languages

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have found a way to use machine learning to help us decode dead, “undeciphered” languages – which means we would finally be able to understand the grammar, vocabulary, and syntax underlying the written versions of these lost languages. Developed under the guidance of Professor Regina Barzilay and PhD student Jiaming Luo, this innovative system is designed to decipher languages without requiring prior knowledge of their relationship to other languages. This approach differs fundamentally from traditional methods, which often depend on finding a known language as a comparative reference.
The researchers taught a “decipherment algorithm” various linguistic constraints that reflect the predictable ways in which languages evolve. It then used these constraints to find patterns in the text. As a result, the algorithm can categorize words in an ancient language and link them to their equivalents in related languages. “Our work is about automatic decipherment of lost languages written in an undersegmented or unsegmented script – apparently for some ancient languages, word dividers had not been invented, or not consistently applied,” said Jiaming Luo, co-author of the study. “The significance of our work lies in the fact that ours is the first attempt to do such decipherment automatically using machine learning in such challenging situations.”
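To make the idea of unsegmented text concrete, here is a minimal Python sketch of one ingredient: splitting a run of characters into words by matching each piece against the vocabulary of a known related language. It is a toy built on plain edit distance and dynamic programming, not the MIT team’s model. The function names (edit_distance, best_match, segment_text), the max_len parameter, and the three-word lexicon are all assumptions made for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character edits."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def best_match(segment: str, lexicon: list[str]) -> tuple[int, str]:
    """Closest known-language word to a candidate segment."""
    return min((edit_distance(segment, word), word) for word in lexicon)


def segment_text(text: str, lexicon: list[str], max_len: int = 6):
    """Split `text` so that the summed edit distance to the lexicon is minimal.

    best[i] holds (total_cost, [(segment, matched_word), ...]) for text[:i].
    """
    best = [(0, [])]
    for end in range(1, len(text) + 1):
        candidates = []
        for start in range(max(0, end - max_len), end):
            seg = text[start:end]
            dist, word = best_match(seg, lexicon)
            prev_cost, prev_pairs = best[start]
            candidates.append((prev_cost + dist, prev_pairs + [(seg, word)]))
        best.append(min(candidates, key=lambda c: c[0]))
    return best[-1]


# Toy data: an invented "lost language" string with no word dividers,
# and an invented lexicon standing in for the known related language.
lexicon = ["malku", "baytu", "kalbu"]
cost, pairs = segment_text("malkbetukalbu", lexicon)
print(cost, pairs)
# 3 [('malk', 'malku'), ('betu', 'baytu'), ('kalbu', 'kalbu')]
```

A real system would replace raw edit distance with learned sound correspondences, since related languages differ in regular ways rather than by arbitrary edits, but the sketch shows why word boundaries and word matches have to be solved together.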
Success Stories: From Ugaritic to Linear B

In a landmark 2010 experiment, researchers Benjamin Snyder, Regina Barzilay, and Kevin Knight successfully used an AI to decipher Ugaritic. Ugaritic is an ancient language that had already been deciphered by humans, making it a perfect test case. The AI was first given the grammatical and phonetic patterns of a known related language: Hebrew. It was then set loose on the Ugaritic corpus. Using pattern recognition algorithms, the AI successfully treated Ugaritic as a Semitic language closely related to Hebrew. It then mapped the Ugaritic symbols to their Hebrew sounds and recovered a significant part of the language with a high degree of accuracy.
Linear B, the script of the Mycenaean Greeks, was deciphered by Michael Ventris, an architect, in 1952. While this was a triumph for humanity, since that time, AI has been invaluable for improving the decipherment. AI has analyzed the whole corpus of Linear B tablets to confirm and improve upon Ventris’ original translations. AI has identified grammatical patterns and vocabulary that could not be seen before, resulting in a more sophisticated understanding of the Mycenaean language.
The Ugaritic experiment was a watershed moment. It proved that an AI, given a sensible starting point (a known related language), could systematically decipher a lost tongue. These case studies demonstrate how AI serves as a powerful tool for historical linguists, enabling analysis at scales previously impossible and providing validation for existing decipherments.
The Vesuvius Challenge: Unrolling 2,000-Year-Old Secrets

Vesuvius Challenge was launched in 2023 to bring the world together to read the Herculaneum scrolls. Along with smaller progress prizes, a Grand Prize was issued for the first team to recover 4 passages of 140 characters from a Herculaneum scroll. Following a year of remarkable progress, the prize was claimed. A team of three students won $700,000 in early 2024 for using artificial intelligence to read passages from an ancient papyrus scroll. The document is one of the more than 800 scrolls known as the Herculaneum papyri that were carbonized by the eruption of Mount Vesuvius in 79 C.E.
This “virtual unrolling” technique starts by scanning the carbonised scroll in a particle accelerator at super-high resolution. Then, the complex structure of the scroll is analysed and virtually flattened – but no text can yet be seen. Brent Seales, the University of Kentucky computer scientist who pioneered the technique, recognised that AI was the key to discerning the carbon ink from the carbonised papyrus background. Initial experiments confirmed his hypothesis that AI could help recover the elusive ink from the CT scans, but significant challenges remained to unlock the secrets of the scrolls.
Building on the work each had done individually, the three winners’ AI models revealed 2,000 characters in four full columns – far outstripping the Grand Prize’s criterion of four passages of 140 characters. In February 2024 the Vesuvius Challenge awarded them the $700,000 Grand Prize. The readable text comprises around 5 percent of the first scroll, and it is from the same text as the earlier discoveries. It is a previously unread tract, probably by Philodemus, about pleasure.
Transformer Models: The New Digital Archaeologists

So-called transformer models, the same kind of algorithm that underlies well-known platforms such as ChatGPT, can restore damaged ancient texts with up to 62 percent accuracy, compared to just 25 percent accuracy for human experts working alone. This accuracy increases to 72 percent when the model is combined with human input, an encouraging sign that a human touch will still be necessary in the future of archaeology.
DeepMind trained a model on a similar problem: restoring damaged ancient Greek inscriptions at scale. The model helped historians restore the texts with 72% accuracy and could predict the date they were written to within 30 years of their actual age. It could even predict the region where the texts were written with 71% accuracy. The model, dubbed Ithaca, was trained using around 60,000 ancient Greek texts from across the Mediterranean written between 700 BC and AD 500.
AI models have successfully restored damaged Greek inscriptions, translated ancient Akkadian tablets, and even predicted the origins and dates of previously undated texts. The Ithaca model, which DeepMind developed with academic partners including Oxford University, has already helped historians resolve long-standing debates in classical studies. What is most startling for papyrologists, though, is the speed at which the AI pipeline is now finding identifiable letters.
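For a feel of what “restoring” a damaged text means computationally, here is a minimal Python sketch that fills single missing characters by looking at their neighbours. Ithaca itself is a deep neural network; this toy swaps in simple trigram counting so that it stays readable. The function names, the ? gap marker, and the two-line English “corpus” are assumptions made purely for illustration.

```python
import string
from collections import Counter


def train_trigram_counts(corpus: list[str]) -> Counter:
    """Count every run of three consecutive characters in the corpus."""
    counts = Counter()
    for line in corpus:
        padded = f" {line} "  # pad so first and last letters have context
        for i in range(len(padded) - 2):
            counts[padded[i:i + 3]] += 1
    return counts


def restore_gaps(damaged: str, counts: Counter,
                 alphabet: str = string.ascii_lowercase,
                 gap: str = "?") -> str:
    """Fill each isolated gap with the letter seen most often in that context.

    Limitation of this sketch: it only handles gaps whose two neighbours
    are intact; runs of adjacent gaps would need a search over sequences.
    """
    chars = list(f" {damaged} ")
    for i, ch in enumerate(chars):
        if ch != gap:
            continue
        left, right = chars[i - 1], chars[i + 1]
        chars[i] = max(alphabet, key=lambda c: counts[left + c + right])
    return "".join(chars).strip()


# Toy training texts standing in for a corpus of intact inscriptions.
corpus = [
    "the council and the people decided",
    "the people honoured the council",
]
counts = train_trigram_counts(corpus)
print(restore_gaps("the pe?ple and the coun?il", counts))
# the people and the council
```

The gap between this and a real restoration model is context: a trigram counter sees one character on each side, while a transformer can weigh the whole inscription, which is what makes dating and regional attribution possible as well.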
The Persistent Mysteries: Linear A and the Indus Valley Script

As of 2025, Linear A remains undeciphered. Despite advances in machine learning and pattern recognition, its decipherment is still an unsolved puzzle, its language unknown and its Bronze Age secrets still locked away. AI offers fresh computational approaches to this ancient mystery, but Linear A decipherment faces three formidable barriers: an extremely small corpus, no bilingual texts, and a language that appears to be an isolate sharing no connections with documented ancient or modern tongues.
Researchers are applying AI algorithms to examine the patterns and frequencies of symbols in the Indus Valley script, aiming to uncover insights into its structure and meaning. A new generation of scholars and researchers is leveraging AI and advanced techniques to unravel the mysteries of the Indus Valley script, sparking optimism that the ancient language may soon be deciphered through these cutting-edge technologies.
The biggest challenge is data. For many undeciphered scripts, we have very few inscriptions for AI models to use for training. For instance, the corpus of the Indus Valley script consists mostly of short inscriptions on seals, while AI models require large amounts of data to find reliable patterns. Even so, when one such program was applied to the Indus script, it reportedly showed statistical similarities to established languages such as Tamil and Sanskrit, suggesting that the script encodes a linguistic structure.
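The kind of pattern-and-frequency analysis described here can be sketched in a few lines of Python. The example below counts how often each sign occurs and measures how predictable the next sign is given the current one (conditional entropy, in bits). It is a generic illustration rather than any research group’s actual pipeline, and the sign labels and the three short “inscriptions” are invented placeholders.

```python
import math
from collections import Counter


def sign_statistics(inscriptions: list[list[str]]) -> tuple[Counter, Counter]:
    """Count single signs and adjacent sign pairs across all inscriptions."""
    unigrams = Counter(sign for ins in inscriptions for sign in ins)
    bigrams = Counter(
        (a, b) for ins in inscriptions for a, b in zip(ins, ins[1:])
    )
    return unigrams, bigrams


def conditional_entropy(bigrams: Counter) -> float:
    """H(next sign | current sign) in bits, estimated from pair counts.

    0 means the next sign is fully determined by the current one;
    higher values mean the sequence is closer to random.
    """
    total = sum(bigrams.values())
    first_totals = Counter()
    for (a, _), n in bigrams.items():
        first_totals[a] += n
    entropy = 0.0
    for (a, _), n in bigrams.items():
        entropy -= (n / total) * math.log2(n / first_totals[a])
    return entropy


# Invented placeholder data: each inner list is one short seal inscription.
inscriptions = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "B", "C"],
]
unigrams, bigrams = sign_statistics(inscriptions)
print(unigrams["A"], bigrams[("A", "B")])         # 3 3
print(round(conditional_entropy(bigrams), 4))     # 0.4591
```

With inscriptions this short there are very few sign pairs to count, which is exactly why small corpora make such estimates unreliable.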
Breaking Through the Technology Barriers

Artificial intelligence can draw on many more languages than any human brain can hold, and it can analyze the patterns in a text more precisely than is possible manually. The result is a paradigm shift in the deciphering of ancient texts, in terms of both speed and accuracy. Researchers collaborating with companies like IBM and Google’s DeepMind are leveraging AI to analyze and interpret texts faster than human capabilities allow.
A combination of technologies is also being used to resurrect words from the past. At Herculaneum, buried by the same eruption that destroyed Pompeii, CT scans have revealed the contents of burned, rolled-up texts, and AI has been used to decipher the words on the papyrus. One of the first words read was the one for “purple”, with each new word deciphered unlocking further translations for whole phrases and meanings.
The Vesuvius Challenge was so successful that it attracted major supporters. In January 2025, for example, the Musk Foundation donated $2 million, thereby accelerating the development of scanning technologies. Since the scroll was scanned in July 2024 at the Diamond Light Source in Harwell, the UK’s national synchrotron science facility, the Vesuvius Challenge team have worked with AI to piece together the images and enhance the clarity of the text.
The Future of Digital Archaeology

AI will never work in isolation. As of now, its most significant contribution is to confirm hypotheses and broaden knowledge of partially known scripts (like Ugaritic, based on Hebrew). The decipherment of any document written in a substantially unknown script can only happen through a combination of AI and human scholars, who contribute essential context such as culture and history.
With the right tools, historians may one day reconstruct lost libraries, decode languages that have been silent for millennia, and ask new questions about civilizations long gone. “We believe machine learning could support historians to expand and deepen our understanding of ancient history, just as microscopes and telescopes have extended the realm of science,” says Yannis Assael, a research scientist at DeepMind.
Thanks in large part to a group of amateur AI builders, we now have tools for reading the unopened Herculaneum papyri. The organisers’ stated goal for 2024 was to build on the winning team’s approach to read 90 percent of the four scrolls that had by then been scanned using high-energy physics. If successful, this would unlock the hundreds of unopened Herculaneum scrolls. There is also the possibility that the virtual unrolling may prompt new excavations at the Villa of the Papyri (phase four of the Vesuvius Challenge). Many researchers believe that another Greek and Latin library, with all its lost masterpieces, has yet to be unearthed.
Conclusion

The convergence of artificial intelligence and archaeology represents more than just a technological advancement. It’s a renaissance of human curiosity, allowing us to finally hear voices that have been silent for millennia. The ancient world is still speaking. We’re just now learning how to listen.
From cuneiform tablets revealing ancient Mesopotamian business transactions to Roman scrolls discussing Epicurean philosophy, AI is opening doors to knowledge we thought was lost forever. While the technology cannot work in isolation and still requires human expertise for context and interpretation, it has accelerated the pace of discovery in ways that would have seemed impossible just a few years ago.
The potential to recover lost works by ancient authors, understand forgotten civilizations, and piece together the missing chapters of human history has never been more tangible. What ancient secrets do you think AI will unlock next? Tell us in the comments.


