Taking fiction from broadsheet to broadband
TALK LIKE A … COMPUTATIONAL LITERARY RESEARCHER
Colonial — relating to colonisation: the act of a foreign government claiming control over an area of land and its people
First Nations peoples — culturally-distinct groups who descend from the earliest inhabitants of a place
Oral tradition — communication of culture and ideas through speech and song
Computational — using computers, usually for quantitative, statistical, or algorithmic tasks, or for curating and investigating large amounts of data
Collaborative digital editing (CDE) — process in which multiple people work together creating and editing literary works on computers
Optical character recognition (OCR) — computerised way of turning printed or handwritten material into digital text
Think of the last time you delved into a story: how did you interact with it? Maybe you turned the pages of a paperback, scrolled through an ebook, or plugged in your headphones to listen to an audiobook. If you were living in 19th-century Australia, the chances are that the last chapter you read would have been in a newspaper.
Exploring the history of Australian literature, therefore, means digging through a lot of old newspapers. Thankfully, for literary researchers like Professor Katherine Bode at the Australian National University, computational tools are transforming our ability to collect, organise, analyse and even re-publish such stories. By creating new digital archives, Katherine is not only shedding light on her country’s cultural history, but also allowing the public to engage with thousands of unrecorded novels in new and surprising ways.
Why were newspapers so important in Australia?
Before the British colonised Australia in 1788, the continent was home to many different and distinct groups, with diverse cultures and languages. First Nations languages were passed on orally, and stories were not written but told with rich repertoires of signs, sand-drawings, songs and dance. It was, therefore, the English-speaking colonisers who were the first to demand written publications.
In the 19th and early 20th century, newspapers were the main source of reading material because they were economical and accessible. By combining news, advice, opinion, advertising and fiction into a single publication, they appealed to a wide audience in a country that did not have many book publishers. Newspapers provided a connection to Europe and Western culture, for example, through reporting on overseas wars, articles on international scientific developments, or stories about snowy Christmas celebrations published to coincide with Australian summer Christmases. Alongside this, they included local information and community discussion, making them important for day-to-day communication. Katherine says that this interplay between global and local can also be found in the fiction that was included in almost all newspapers at the time. “It was often the case that stories written by overseas authors were adapted to seem local, or that stories advertised as being written especially for that newspaper were published in scores of far-flung newspapers at the same time! Fiction participated in this mapping out of the world,” she explains.
Were newspaper stories dominated by British authors?
The popular understanding of Australian literature in the colonial era is that almost all fiction initially came from Britain, because that is what readers demanded. Katherine has discovered otherwise. The authors she has identified in newspaper fiction came from all over the world, including the US, Canada, Europe, Japan, Russia and South Africa. British authors accounted for only half of publications.
Furthermore, about a quarter of the stories came from local – though not Indigenous – writers, and ‘Australian’ was one of the most common words in story titles. There is still uncertainty, as many stories were published anonymously, but the evidence suggests that locally-authored fiction played a big part in developing Australian identity and culture. Katherine stresses that this new identity “cannot – and should not – be separated from the dispossession of Australia’s First Nations People: an issue commonly explored in local fiction in both horribly racist and surprisingly critical ways”.
How are digital platforms transforming literary studies?
Digital technology allows researchers to store large amounts of data, for example the text of millions of books, and to search quickly for interesting patterns, such as the frequency of different words and phrases over time. This is just the start, though. Katherine goes as far as to say that “data and computation are reforming what literature, literacy and literate practices are and can be”.
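To make the idea of tracking word frequency over time concrete, here is a minimal Python sketch. It is only an illustration, not Katherine's method: the three-story mini-corpus and the function name relative_frequency_by_decade are invented for this example, and real projects work with far larger collections and more careful text cleaning.

```python
import re
from collections import Counter, defaultdict

# Hypothetical mini-corpus: (year, text) pairs invented for illustration.
stories = [
    (1865, "The bush was silent and the bush was wide."),
    (1868, "A letter from London reached the station."),
    (1892, "The bush fire swept past the station."),
]

def relative_frequency_by_decade(stories, word):
    """Return {decade: share of all words in that decade that equal `word`}."""
    hits = defaultdict(int)    # occurrences of `word` per decade
    totals = defaultdict(int)  # all words per decade
    target = word.lower()
    for year, text in stories:
        decade = (year // 10) * 10
        # Very simple tokeniser: lower-case runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        totals[decade] += len(tokens)
        hits[decade] += Counter(tokens)[target]
    return {d: hits[d] / totals[d] for d in sorted(totals) if totals[d]}

bush_trend = relative_frequency_by_decade(stories, "bush")
for decade, share in bush_trend.items():
    print(f"{decade}s: {share:.3f}")
```

Dividing by the total number of words in each decade matters because decades with more surviving text would otherwise always appear to use every word more often.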
Katherine’s vision is to move on from seeing texts as data points to be computed. Instead, she is setting up platforms that allow a wide range of people to interact with fiction and with the institutions that curate our cultural archive. This means enabling scholars and the public not just to view stories, but also to edit existing ones, combine historical stories with their own writing, or download data for their own analysis. An example of such a platform is the “To be Continued” database of Australian newspaper fiction.
At “To be Continued”, users can search for stories by title, author, newspaper and date. Each record has a scanned image of the newspaper story as well as a digital text generated by a machine learning technique called optical character recognition (OCR). Because old newspapers are often missing parts or are damaged, and because OCR guesses at the letters on the page, the digital text often contains many errors, which users can correct themselves on the database. One of Katherine’s proudest achievements is that over 2,000 titles and many more thousands of corrections in “To be Continued” were added by members of the public and have been saved by the National Library of Australia. In some cases, people have even rediscovered fiction that was unknown to literary researchers. This collective approach is something she hopes to build on.
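The gap between OCR output and a reader's correction can be measured with a few lines of Python. The sketch below is an assumption-laden illustration rather than how the database itself works: the sample sentences and the function names word_similarity and word_corrections are invented, and it relies only on SequenceMatcher from Python's standard difflib module.

```python
from difflib import SequenceMatcher

def word_similarity(ocr_text, corrected_text):
    """Similarity between 0.0 and 1.0, comparing the texts word by word."""
    return SequenceMatcher(None, ocr_text.split(), corrected_text.split()).ratio()

def word_corrections(ocr_text, corrected_text):
    """List (ocr_words, corrected_words) pairs wherever the two texts differ."""
    ocr_words = ocr_text.split()
    fixed_words = corrected_text.split()
    matcher = SequenceMatcher(None, ocr_words, fixed_words)
    return [
        (" ".join(ocr_words[i1:i2]), " ".join(fixed_words[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only replaced, inserted or deleted words
    ]

# Invented example of an OCR misreading and a reader's correction.
ocr_line = "Tbe sun rose ovcr the hills"
fixed_line = "The sun rose over the hills"

similarity = word_similarity(ocr_line, fixed_line)
corrections = word_corrections(ocr_line, fixed_line)
print(f"similarity: {similarity:.2f}")
print(corrections)
```

Comparing whole words rather than single characters keeps the list of corrections readable, at the cost of treating a one-letter slip and a completely wrong word as the same kind of error.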
What are the next steps?
Katherine plans to add a collaborative digital editing (CDE) tool to the online database. The CDE tool will let readers create their own collections and even publish them as new ebooks. “We’re calling this participatory literary history, and we are interested in what types of works readers will engage with,” she explains. She expects some readers will even choose to rewrite some texts, perhaps to remove the racism and sexism that were common in writing at the time.
Tools such as the CDE will allow anybody with an internet connection to get involved. This could turn literary history from a private discussion between professional writers and academics into an open forum where a wide range of people have a voice. Katherine hopes this is how her research will have a positive impact on today’s society. For example, by allowing the public to look back at – and write back to – the racist and sexist themes of 19th century fiction, digital archives provide a chance to reflect on the past and reimagine the present and future.
“The digital age holds out the possibility of more distributed and democratic participation in literature,” says Katherine. “If we think of archives as our cultural memory, then participation in them gives us an opportunity to challenge past and ongoing discriminations and oppressions. Enabling people to curate, write, rewrite and remix texts provides the potential for humanities research to make a difference to the way we understand the world.”
PROFESSOR KATHERINE BODE
Professor of Literary and Textual Studies, College of Arts and Social Sciences, Australian National University, Canberra, Australia
Field of research: Computational Literary Studies
Research project: Curating digital platforms for people to interact with 19th and 20th century Australian newspaper fiction
Funder: Australian Research Council (ARC)
ABOUT COMPUTATIONAL LITERARY STUDIES
What words are most commonly used to describe women in 20th century American fiction? Can an algorithm predict the outcome of an Agatha Christie murder-mystery? Could a computer write a new novel in the style of Charles Dickens? These are just some questions that researchers in computational literary studies might be interested in. The field encompasses the study of written prose, poetry and drama, through digitisation and the application of techniques from computer science.
What does a typical day look like?
Researchers in computational literary studies spend time reading literature and writing about it, while also creating datasets and exploring them by writing code or using specialist programs. As university workers, their job also includes meetings with students and colleagues to discuss common interests and plan research programmes.
For Katherine, the combination of literature and computing is an exciting challenge. “I love combining the critical, conceptual, technological, and infrastructural questions raised by literature and computation. It’s fascinating to me to think about how these supposedly very different systems work together and the characteristics they share, as well as the ways each is transforming the other”.
What will the next generation study?
The potential of computational literary studies is huge because it combines a relatively new discipline with a much older one. On the one hand, literary studies is a discipline with centuries of history and a deep expertise in language and its infrastructures (for instance, how class relates to print publishing). On the other hand, developments such as the internet and machine learning have transformed computing in just the last few decades and even years.
Computation used to be relevant mostly to science, but it can now be applied across the humanities. So far, though, very few researchers have bridged the gap between the disciplines of computing and literature. The next generation of researchers will have the chance to apply the latest algorithms to learn more than ever before and will likely have a skills advantage, having grown up in the digital age. Katherine highlights that the relationship between literature and computing goes both ways: “Critical understandings of language and culture also have the potential to transform computation”.
Pathway from school to computational literary studies
• At school and post-16, studying English or literature, maths, history, philosophy or computer science will provide a good foundation.
• Although computer science and literature are traditionally separate subjects, it is possible to study both after school; look for universities that offer joint degrees or courses in computational literary studies or digital humanities.
• To become a university researcher, you will need to study for a PhD after your undergraduate degree.
Explore careers in computational literary studies
• A good way to start exploring a field is by seeing what support and guidance professional societies offer. The main society for digital humanities, which includes Katherine’s field of computational literary studies, is the Alliance of Digital Humanities Organizations (ADHO).
• The Journal of Cultural Analytics is an open-access journal where lots of work in computational literary studies is being published. Visit the website to discover the range of research being carried out.
• Katherine says, “I have been involved in curating a series of workshops on the ethics of data curation and data ontologies with Professor Lauren Goodlad at the University of Rutgers, through the Critical AI (artificial intelligence) initiative. Copies of the readings, videos and blogs are all freely available and make a good introduction to someone interested in how AI relates to humanities.”
• Katherine is an academic who conducts research projects, but expertise in her field also leads to careers in industry. She explains, “You could study computational literary studies and use the capabilities gained there for multiple careers at the intersection of language-rhetoric and computation-technology, from media communications to policy analysis.”
What were your interests when you were growing up?
I liked reading, movies and solitary sports (like ice skating and swimming).
Who or what inspired you to become a literary researcher?
When I finished my undergraduate degree, I realised that I’d enjoyed university and wanted to study more. At the time (it is less so now) the scholarship for a PhD was only a bit less than what I would have been paid for an entry level job, so I thought it would be a good way to keep studying.
I didn’t explore computational approaches to literature (or literary approaches to computation) in my PhD, which was on how contemporary Australian women writers represent men’s bodies. But when I finished that work, I was frustrated that I had spent a lot of time learning so much about a handful of texts and authors, yet knew little about how they fitted in with other literary trends.
My brother was doing his PhD in mathematical ecology, and I wondered if the dynamics he was exploring and the way he was studying them would be relevant to literary ‘ecosystems’. I would now say that the analogy is too scientific, but I proposed a project that went in this direction and was awarded funding for a postdoctoral research project. As I was finishing my postdoc, jobs were becoming available in this area that was being called digital humanities, so I was able to get a job as a literary researcher and continue pursuing those questions with my own students and colleagues, for which I count myself very lucky.
You have travelled and worked in Europe and Asia. How have these experiences impacted you?
I value the opportunity to experience new cultures and take a break from routine, and both are combined in travel. It has helped me appreciate that things I thought of as normal or natural growing up, can be and often are done differently in different places. That knowledge has enabled me to ask myself, on various occasions, if I’m doing things on autopilot or because I think that’s the way they should be done, and that’s helped me in both my professional and personal life.
What are your proudest career achievements, so far?
My proudest achievement is closing the loop in the digital humanities data cycle. Too often, data is drawn out of cultural institutions and other collections and improved by researchers without there being mechanisms for returning the improved data.
Working with the National Library of Australia to set up a process whereby the work we do with their digitised newspapers – to find, index and edit fiction – is returned to their collection, in the form of unique records linked to the “To be continued” database, ensures that this knowledge is re-embedded in the collections that enable it. This allows others – academic researchers and members of the public – to benefit from (to enjoy, learn from, engage with, experience and, of course, add to and expand) the work we’ve done.
Katherine’s top tip
Don’t silo your interests into separate disciplines – or believe in the division of science and humanities, or social science and creative arts – but, instead, follow them where they go. The disciplines and organisations of knowledge that emerged in the 19th century are inadequate for how the world works today.