Neural Networks of Power: AI Unravels Knots and Tangles in Relationships between Humans, Elves and Hobbits
One of the most popular writers of the last century, John Ronald Reuel Tolkien, was born on January 3rd. Researchers from HSE University, AIRI and MISSIS have used machine learning to explore the social connections between the characters of his Middle-earth universe. The algorithm managed to create an accurate picture of the social structures and dynamics of the characters' relationships, providing a unique map of interactions in the epic world. The researchers believe that this approach can be applied in many areas beyond literature. The results of the work were published in IEEE Xplore.
The analysis of literary works is a complex and time-consuming process. When reading any text, the researcher needs to capture numerous nuances and features — from the author's style and word choice to the relationships between characters and their role in the plot. Most often, this work is done manually by literary critics. Ilya Makarov, Senior Research Fellow at the School of Data Analysis and Artificial Intelligence at the HSE Faculty of Computer Science, head of the ‘AI in Industry’ group at the Artificial Intelligence Research Institute (AIRI), and Anastasia Yaschenko, HSE University graduate, applied computational linguistics and machine learning tools to a series of books by John Ronald Reuel Tolkien about Middle-earth. The AI ‘read’ the books, isolating the key elements: the characters, their belonging to a particular race and their social ties. It demonstrated the results in the form of a graph, which allows us to not only trace the relationship between the characters, but also to see more clearly the structure of their social network.
Ilya Makarov, Senior Research Fellow at the School of Data Analysis and Artificial Intelligence
‘We chose the world of Middle-earth as the basis for our analysis for a number of key reasons. Firstly, J. R. R. Tolkien's texts are widely known and loved by readers around the world, which makes the study universal and global. Secondly, the system of characters in Tolkien's books is very rich and diverse, which creates optimal conditions for such an analysis. Finally, thanks to the long history of studying Tolkien's world, a large set of metadata is available, including detailed descriptions of characters and their race, which facilitates the process of automatic clustering and verification of results.’
The main goal was to create a program that could ‘understand’ human language, analyse literary texts, identify the characters of a book and determine the relationships between them. The work is based on the concept of social networks, an approach widely used in sociology, psychology and, more recently, computer science. In the context of literary analysis, each character is treated as a node, and the interactions between characters are the edges connecting these nodes. When two characters interact with each other in the text, a connection, or edge, is established between their nodes. The more interactions occur between the characters, the stronger this edge is.
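The node-and-edge idea can be illustrated with a minimal Python sketch. It is not the researchers' code: it assumes that two characters ‘interact’ whenever their names appear in the same sentence, and the function name `build_cooccurrence_graph`, the sample sentences and the character list are all invented for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(sentences, characters):
    """Map each unordered pair of characters to an edge weight.

    Assumption for this sketch: sharing a sentence counts as one interaction.
    """
    edges = Counter()
    for sentence in sentences:
        # Split the sentence into alphabetic tokens.
        tokens = set(re.findall(r"[A-Za-z]+", sentence))
        # Characters mentioned in this sentence, sorted so pairs are canonical.
        present = sorted(name for name in characters if name in tokens)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1  # each shared sentence strengthens the edge
    return edges


# Invented sample text, not taken from the books.
sentences = [
    "Frodo and Sam left the Shire.",
    "Sam carried Frodo up the mountain.",
    "Gandalf advised Frodo.",
    "Legolas and Gimli rode together.",
]
characters = {"Frodo", "Sam", "Gandalf", "Legolas", "Gimli"}

graph = build_cooccurrence_graph(sentences, characters)
for (a, b), weight in sorted(graph.items()):
    print(a, b, weight)
```

Here the edge between Frodo and Sam receives a weight of 2 because they share two sentences, while the other pairs receive a weight of 1.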
The use of machine learning algorithms has made it possible to automatically analyse texts and identify such interactions between characters, turning literary works into simulated social networks. Named Entity Recognition (NER), a natural language processing technology, was used to automatically identify and classify entities in the text, such as names, places and organisations.
This technology helped the scientists compile a list of every unique character mentioned in the books. Further semantic analysis allowed them to determine the race of each character: the algorithm examined the context and linked each character to a specific race based on the words and phrases that accompany their mentions. For example, if a character is often referred to in context with the words ‘elf’ or ‘elvish’, the algorithm classifies them as an elf. Given the large amount of metadata available on J. R. R. Tolkien's characters (races, family ties, belonging to a certain kingdom, etc.), the researchers chose race as the characteristic for interpreting communities, as every character in the universe belongs to a certain race.
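The context-based race assignment can be sketched in the same spirit. This simplified Python illustration uses keyword counting rather than the semantic analysis used in the study; the `RACE_KEYWORDS` table, the `classify_race` function and the sample sentences are hypothetical.

```python
import re
from collections import Counter

# Hypothetical keyword lists; a real system would need far richer context cues.
RACE_KEYWORDS = {
    "elf": {"elf", "elves", "elvish", "elven"},
    "hobbit": {"hobbit", "hobbits", "halfling"},
    "dwarf": {"dwarf", "dwarves", "dwarvish"},
}


def classify_race(name, sentences):
    """Guess a character's race from keywords in sentences that mention them."""
    votes = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        if name.lower() not in tokens:
            continue  # only sentences mentioning the character count
        for race, keywords in RACE_KEYWORDS.items():
            votes[race] += sum(1 for token in tokens if token in keywords)
    if not votes:
        return None  # the character is never mentioned
    race, count = votes.most_common(1)[0]
    return race if count > 0 else None  # no race keyword seen nearby


# Invented sample text, not taken from the books.
sample = [
    "Legolas the elf spoke in an elvish tongue.",
    "Gimli the dwarf laughed.",
    "Frodo was a hobbit of the Shire.",
    "Legolas and Gimli ran on.",
]

for character in ("Legolas", "Gimli", "Frodo", "Aragorn"):
    print(character, classify_race(character, sample))
```

With this sample, Legolas is labelled an elf, Gimli a dwarf and Frodo a hobbit, while Aragorn, who is never mentioned, gets no label. A sentence that names two characters of different races would count towards both, which is one reason a production system needs more careful context analysis.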
In addition, the use of named entities and semantic analysis of the text allowed researchers to determine not only the connection between the characters, but also the nature of these relationships — friendship, enmity or neutral relations. Artificial intelligence managed to identify complex social relationships between the characters and divide the characters into groups.
Importantly, this approach is not limited to The Lord of the Rings: it can be applied to any text, opening up new opportunities for automated research in literature.
‘Our study contains a sequence of steps that can be used to extract named entities and their relationships based on other texts. For example, to identify the relationship between the motives of works by different authors or to analyse complex legal documents,’ said Ilya Makarov.