With every industrial revolution, a cultural revolution is imminent. I sat down with my colleagues Pekka Ahtonen and Teemu Vartiainen from DAIN Studios, a Finnish-German start-up trailblazing the AI scene in Northern Europe, to discuss their latest artificial intelligence (AI) collaboration project – Kalevala AI.
Majella: How did you come up with the idea to train a machine to generate new text based on the Finnish Kalevala epic written by Elias Lönnrot and published in 1835?
Pekka: My connection to what started as a small internal project at DAIN Studios is quite a personal journey. It actually started with a family story in which my great-great-great-grandfather, Eljas (also known as Uljaska) Ahtonen, founded the village of Rimpi near the Russian border in the mid-1800s. The story goes that in 1890, the Finnish artist Akseli Gallen-Kallela met my ancestor Uljaska Ahtonen when visiting Rimpi, and that Uljaska became the model for Väinämöinen in Gallen-Kallela’s illustration of the Kalevala.
The script used to train the machine to generate the Kalevala text was based on the script I had used a year ago, when I was generating text for the TV series The Simpsons. I thought it would be a novel idea to transfer the use of this script to the Kalevala.
Majella: How does Kalevala AI promote Finnish culture and further the knowledge of the Finnish national epic?
I think it is important to know one's cultural roots, as it supports personal identity. Our education system has gone through many reforms over the years, and the teaching of the Kalevala has also been influenced by educational reform. Many decades ago, the Kalevala was taught in schools through folk song and memorization. These days, it would be hard to find a school student who has memorized the more than 20,000 lines of the Kalevala. Also, I think it would be quite hard to find anyone in Finland who could write in the poetic form of the Kalevala; it is a talent that has been lost as our educational priorities changed over the years. The paradox is that, on one hand, technology has driven many changes in the Finnish educational system, to the point that we diminish the purpose of poetry. On the other hand, it is with technology that we can elevate the purpose of poetry, culture and creative learning. Novel, engaging learning methods can be used to create a stronger awareness of our cultural foundations, and I hope that Kalevala AI can be used for this purpose.
Majella: Are there any features of the Kalevala text that make it more appropriate or tractable for AI text generation?
Pekka: Features such as alliteration, parallelism and the poetic meter (Kalevala meter, a form of trochaic tetrameter) support text prediction. Also, having thousands of words of text gives us enough data to train the machine.
Majella: How complex is the technology to generate Kalevala AI text?
Pekka: Five years ago, say, we did not have the technology or the knowledge to do this type of AI text generation properly. We are using a highly complex set of algorithms and neural networks to train the machine. And I think it is important to also point out that using AI for text generation is a fairly new science. We are just at the beginning of our knowledge journey into what may be possible in the coming years.
Majella: What are the limitations and possibilities of generating Kalevala text and other texts with AI?
Teemu: As with all machine learning and artificial intelligence, the text generation is fundamentally based on recognizing patterns in the data. The neural network is simply predicting the next word based on the previous context, and this process is repeated over and over again. This means that although our neural network can generate text that looks poetic and convincing, the algorithm has no understanding of concepts such as the plot of the Kalevala or the relationships between characters in the story. Currently, there is no way for an algorithm to generate a story with a meaningful plot and characters. This sort of creativity cannot be reduced to a machine learning task, at least not yet.
Although creating truly creative and insightful stories is outside the reach of AI at the moment, the recent advances in natural language processing have been astounding. Consider the work of OpenAI, for instance (https://blog.openai.com/better-language-models/). The text generated by their machine learning model seems so authentic that the researchers are not releasing the work fully to the public due to “concerns about large language models being used to generate deceptive, biased, or abusive language at scale”. OpenAI’s model works the same way as ours: it is trained to predict the next word that occurs in the text. Given enough training data, it can learn to generate very convincing text for contexts that are represented well in the training data. Still, it is limited in the sense that it has no semantic data model or ontologies.
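Editor's note: the loop Teemu describes, predicting the next word from the previous context and then repeating, can be illustrated with a very small sketch. This is not the Kalevala AI code and not a neural network; it swaps in simple word-pair counts so that the predict-and-repeat idea is visible in a few lines. The function names and the placeholder corpus are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it has none."""
    counts = model.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def generate(model, seed, max_words=8):
    """Repeat the prediction step, feeding each predicted word back in as context."""
    output = [seed]
    for _ in range(max_words - 1):
        nxt = predict_next(model, output[-1])
        if nxt is None:  # the last word never had a follower in the training text
            break
        output.append(nxt)
    return " ".join(output)


# Hypothetical placeholder corpus (an assumption for this sketch, not Kalevala text).
corpus = "the old singer sang the old song and the old singer sang the tune"
model = train_bigram_model(corpus)
print(generate(model, "the", max_words=5))
```

The sketch only ever looks at which word tended to follow which, so it can produce plausible-looking sequences without any notion of plot or character. A neural network conditions on a much longer context, but the limitation Teemu points out is of the same kind.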