From Pages to Algorithms: Exploring the World of Digital Humanities

Some years ago, I was writing a lot about the proliferation of artists and writers. To be honest, I was kind of obsessed with the concept of information overload. In a fit of Asimovian enthusiasm, I invented the word Logoinformatica to refer to a discipline that would analyze the immense amount of literature and art produced by mankind, thus providing the means to evaluate such a mass of work. And guess what? It already existed, and it is called Digital Humanities.

Before we start, let me tell you that:

  • I am not a Digital Humanities scholar, only a fan;
  • I am an expert in computational stuff. Some of the methods I have applied for my research are similar to those of the Digital Humanities;
  • I was so excited about it that I attended the Digital Humanities annual meeting in Hamburg, Germany, in 2012;
  • I am the happy creator of #Biotext: using computational tools, I have extracted DNA/RNA structures from famous books. Here is an example:
The Trial by Kafka as a 2D RNA structure – #BIOTEXT

Now that we got that out of the way, let’s talk about Melville’s masterpiece Moby Dick.

Let’s say that I want to review it and compare it to the books of its time.

First, I need to read Moby Dick’s 600 pages. Take some notes. Read those 600 pages again, maybe three, four, five times even. If I really want to do a great job, I should probably learn that book by heart.

Then, I need to identify other relevant books of Melville’s time, so I can compare them to Moby Dick. How do I do this? How many do I choose? How do I choose them? Well, let’s say that in some way I identify 5-10 relevant books. I need to read these too, of course. In fact, if I really want to write a great essay, I need to study them in depth.

This is the old way. 

Another way is the Digital Humanities way.

Matthew Jockers ran a computational analysis of 3500 texts published between 1700 and 1900. In doing so, he showed that Moby Dick is statistically unique: the Melville cluster is an outlier in the literature map that emerged from his analysis.

Science fiction? No, that is some impressive Digital Humanities work.
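
To give a rough flavor of how a computer can flag a book as an outlier, here is a toy sketch in Python. To be clear, this is not Jockers’s method: it simply compares word frequencies with cosine distance and picks the text that is, on average, farthest from all the others. The four “books” and every function name are invented for illustration.

```python
import math
import re
from collections import Counter


def word_frequencies(text):
    """Return relative word frequencies for a text (lowercased)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def cosine_distance(freq_a, freq_b):
    """1 minus the cosine similarity of two frequency dictionaries."""
    dot = sum(value * freq_b.get(word, 0.0) for word, value in freq_a.items())
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    return 1.0 - dot / (norm_a * norm_b)


def most_unusual(corpus):
    """Return the title whose average distance to the other texts is largest."""
    freqs = {title: word_frequencies(text) for title, text in corpus.items()}

    def average_distance(title):
        others = [t for t in freqs if t != title]
        return sum(cosine_distance(freqs[title], freqs[t]) for t in others) / len(others)

    return max(freqs, key=average_distance)


# Invented toy "books" (hypothetical data, not real excerpts).
toy_corpus = {
    "parlor_novel_a": "the lady took tea in the parlor and the lady spoke of marriage",
    "parlor_novel_b": "the gentleman took tea in the parlor and spoke of the marriage",
    "parlor_novel_c": "the lady and the gentleman spoke of tea and of marriage in the parlor",
    "whale_novel": "the whale sank the ship and the captain cursed the white whale at sea",
}

print(most_unusual(toy_corpus))
```

A real study would work with thousands of full texts and far richer features than raw word counts, but the basic idea of measuring how far each book sits from the rest is the same.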

What in the world is Digital Humanities?

Simple questions are usually the most difficult ones to answer. Let me try: Digital Humanities means doing the job of a humanist with computational tools.

A more comprehensive definition is that of the Digital Humanities Manifesto 2.0:

Digital humanities is a diverse and still emerging field that encompasses the practice of humanities research in and through information technology, and the exploration of how the humanities may evolve through their engagement with technology, media, and computational methods.

The thing is, if you ask 800 different Digital Humanists, they will come up with 800 different definitions. I am not joking! But perhaps we are spending too much time on definitions since, according to Dr. Amanda Visconti, the Digital Humanities is overly defined.

This is not a revolution, or is it?

Let’s see what we have so far.

Digital Humanities is an advanced branch of the humanistic disciplines that uses computational tools to study old and new topics, with the aim of renewing the research approaches of the humanities.

For example, you could:

  • write a computer algorithm that generates a critical review of a book,
  • run a statistical analysis of Shakespeare’s work,
  • build a mathematical model of the evolution of literature,
  • provide a formal representation of a screenplay.

Pretty cool, huh?
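
Just to show how low the entry barrier can be, here is a minimal sketch of the second idea in Python. It computes a few basic statistics on a single famous line from Hamlet: the number of words, the number of distinct words, their ratio, and the most frequent words. The function name and the choice of statistics are my own assumptions, and a real analysis would of course run on complete plays.

```python
import re
from collections import Counter


def text_statistics(text, top_n=3):
    """Compute simple word statistics for a text (hypothetical helper)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {
        "tokens": len(words),                          # total number of words
        "types": len(counts),                          # number of distinct words
        "type_token_ratio": len(counts) / len(words),  # crude measure of lexical variety
        "most_common": counts.most_common(top_n),      # most frequent words
    }


sample = "To be, or not to be, that is the question"
stats = text_statistics(sample)
print(stats)
```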

In his article, The Revolutionary Implications of the Digital Humanities, Jim Leach states the following:

It could be that the development of a New Digital Class and the knowledge base made globally available through the digital humanities will provide impetus to civilizing human relations. Knowledge, after all, inoculates against intolerance and serves as a powerful antidote to despotism.

Jim Leach

Now, I do not know if this is really a revolution. But, if there is one, it will probably end with a massacre and the first victim will be the word “digital”. In the end, Digital Humanities will simply be Humanities.

But I digress… Let’s get to the most interesting question:

Do we need the Digital Humanities at all?

Let me ask you a question. Do you know how many books are published every day worldwide?

The answer is: SIX THOUSAND!

It must drive readers and literary critics nuts!

Do they read them all? How do they choose which book to read or review? How do they keep up with trends in literature?

It could be tempting to say that humans publish too many books. But that is like saying: We have too many brands of coffee, I cannot choose! You know what, let’s go Stasi style and make only three brands legal. Problem solved.

I do not believe that the number of books is the issue.

Think about it: A book is a bit of information, a human input. It is data. Data is never good or bad; in any kind of scientific research, data is gold. It is when the amount of data goes to infinity that we can grasp a perfect image of reality.

In order to digitize, archive, and analyze the incredible mass of text produced by mankind, we need a revolutionary approach. That approach may be the Digital Humanities.

Final Thought

Perhaps I have read too many books by Isaac Asimov, and I am certainly biased by my scientific background. But I do share the view of Franco Moretti, author of Distant Reading and Professor at Stanford University: Literature cannot be comprehended solely by understanding one text at a time. Literature is not a sum of single pieces, but rather a collective system that needs to be understood as a whole.

With this purpose in mind, the Digital Humanities represents an incredibly precious set of tools that can be of use for the review, analysis, and comprehension of a text within the quasi-infinity of texts produced by mankind.


I took the “Nefertiti with sunglasses” cover photo here in Berlin.