Screenshot of the Index Thomisticus Treebank project as of September 2020.

Publication of Roberto Busa's Index Thomisticus: Forty Years of Data Processing in the Humanities

1974 to 1980

In 1974, the Italian Jesuit priest Roberto Busa of Gallarate and Milan, Italy, published the first volume of his Index Thomisticus, a massive index verborum, or concordance, of the writings of Thomas Aquinas. The work was completed in 56 printed volumes in 1980. This concordance, which Busa began to conceptualize in 1946 and started developing in 1949, was the pioneering large-scale humanities computing, or digital humanities, project, though it began before electronic computers were available. Writing in 1951, Busa believed that electric punched-card tabulating technology, the technology then available, would enable completion in four years of a work that would otherwise have taken "half a century." In spite of this optimism, the project required further advances in computing and 40 years to complete.

"A purely mechanical concordance program, where words are alphabetized according to their graphic forms (sequences of letters), could have produced a result in much less time, but Busa would not be satisfied with this. He wanted to produce a 'lemmatized' concordance where words are listed under their dictionary headings, not under their simple forms. His team attempted to write some computer software to deal with this and, eventually, the lemmatization of all 11 million words was completed in a semiautomatic way with human beings dealing with word forms that the program could not handle. Busa set very high standards for his work. His volumes are elegantly typeset and he would not compromise on any levels of scholarship in order to get the work done faster. He has continued to have a profound influence on humanities computing, with a vision and imagination that reach beyond the horizons of many of the current generation of practitioners who have been brought up with the Internet. A CD-ROM of the Aquinas material appeared in 1992 that incorporated some hypertextual features ('cum hypertextibus') and was accompanied by a user guide in Latin, English, and Italian. Father Busa himself was the first recipient of the Busa award in recognition of outstanding achievements in the application of information technology to humanistic research, and in his award lecture in Debrecen, Hungary, in 1998 he reflected on the potential of the World Wide Web to deliver multimedia scholarly material accompanied by sophisticated analysis tools" (Hockey, "The History of Humanities Computing," A Companion to Digital Humanities, Schreibman, Siemens, and Unsworth [eds.] [2004] 4).

In 2005 a web-based version of the Index Thomisticus made its debut, designed and programmed by E. Alarcón and E. Bernot, in collaboration with Busa. In 2006 the Index Thomisticus Treebank project (directed by Marco Passarotti) started the syntactic annotation of the entire corpus.
