Modern Techs Combine to Scan Ancient Vatican Documents

April 30, 2018

A combination of two kinds of modern technology could reveal many secrets hidden in the Vatican Secret Archives.

The Secret Archives are filled with a very large number of handwritten documents going back more than 1,200 years. The space needed to store all of those documents exceeds 53 linear miles; some documents are stored in a two-story underground vault. Access to all but a tiny fraction of those is heavily restricted.

Some of the documents have been scanned and made available online, but the vast majority have not. The main reason for this, authorities said, is that the technology has not been able to cope with, for lack of better phrasing, bad handwriting.

Optical character recognition (OCR) software is the most common way to turn scanned documents into electronic text, but that software, as advanced as it is, struggles at times to make sense of handwritten letters and spaces. If the software can't work out how to "translate" the letters of a word, the result is a string of individual letters, often with spaces in between. Add to that the fact that putting spaces between words is a relatively recent practice, so older handwritten documents often lack them, and the task of creating a reliable electronic representation of a document written by hand many hundreds of years ago is considered quite daunting.

Now, however, the Vatican is on board with a melding of two modern technologies: OCR and artificial intelligence. The Vatican has termed the project In Codice Ratio, and its goal is to create legible and understandable electronic versions of many documents that have not seen the light of day for many, many years.

The key is a brand-new way of reading handwriting.

Scientists at the Vatican and at Roma Tre University have devised an AI-augmented OCR routine that looks at words not so much as letters but as individual pen strokes. This approach can surmount the difficulty of, for example, determining whether the writer meant "dear" or "clear," if it isn't entirely clear at first glance.

First, though, the scientists had to build the new software; and for that, they recruited high school students from 24 schools around Italy. The students logged onto a website where a series of initial OCR scans of handwritten documents had been uploaded, then "graded" the OCR on its performance. The software "learned" as it went.

The scientists then incorporated a group of 1.5 million already digitized Latin words, in order to build a list of common combinations of letters. Initial tests revealed that the OCR had room for improvement, but the software continued to learn as it went.
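
To give a rough idea of how a list of common letter combinations can help, here is a small sketch in Python. It is not the project's actual software. The tiny word list, the function names, and the scoring method (counting pairs of neighboring letters) are all assumptions made for illustration. The idea is that when a pen stroke could be read as "d" or as "cl," the reading whose letter pairs show up more often in known Latin words is the more likely one.

```python
from collections import Counter
import math

def count_letter_pairs(words):
    """Count neighboring letter pairs; ^ marks a word's start and $ its end."""
    counts = Counter()
    for word in words:
        padded = "^" + word.lower() + "$"
        for first, second in zip(padded, padded[1:]):
            counts[first + second] += 1
    return counts

def score(candidate, counts, total):
    """Add up log-probabilities of each letter pair (higher means more plausible).

    Add-one smoothing keeps a never-seen pair from producing log(0).
    """
    padded = "^" + candidate.lower() + "$"
    vocab = len(counts) + 1
    return sum(
        math.log((counts[first + second] + 1) / (total + vocab))
        for first, second in zip(padded, padded[1:])
    )

def best_reading(candidates, counts):
    """Return the candidate word whose letter pairs are most common."""
    total = sum(counts.values())
    return max(candidates, key=lambda word: score(word, counts, total))

# Hypothetical toy word list standing in for the 1.5 million digitized words.
toy_words = ["dominus", "deus", "domus", "dei", "dona", "anno", "annus"]
pair_counts = count_letter_pairs(toy_words)

# Two possible readings of the same pen strokes: "d" versus "cl".
print(best_reading(["domini", "clomini"], pair_counts))
```

With this toy list, the sketch prefers "domini," because pairs such as "do" appear in the known words while "cl" never does. A real system would need a far larger word list and would combine this kind of score with what the image itself shows.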

The scientists hope to be able to perfect the technology enough to make much more of the Secret Archives (which are really just the property of the pope and not so much secret as inaccessible) available electronically and to make the technology useful for scanning of other kinds of writing.

The Vatican is no stranger to cutting-edge technology, having in recent years employed a high-tech antidote to Sistine Chapel pollution and completed a five-year digital photo album of the famous place of worship and tourism.

Social Studies for Kids
copyright 2002–2017
David White