NODE guest researcher Helle Sjøvaag and Raul Ferrer, PhD candidate affiliated with NODE, participated in a Dagstuhl seminar on “Analysis, Interpretation and Benefit of User-Generated Data: Computer Science meets Communication Studies” in April. Sjøvaag co-organized the seminar together with Professors Thorsten Quandt and Gottfried Vossen from the University of Münster and Gera Shegalov from Twitter.
Dagstuhl is an established meeting place for informatics scholars and computer scientists. The venue, an old castle in the south of Germany close to the French border, features a crypt as well as its very own ghost.
Around 20 computer and communication scientists from Europe and the U.S. met for a week to discuss the methodological benefits of cross-disciplinary cooperation in research on news and journalism. The digital modes of media production, dissemination and reception pose new challenges for researchers, who must collect, analyse and interpret data in new digital formats. The tools with which we harness online forms of journalism, for instance, require new skillsets; these not only give scholars access to unprecedented amounts of data for analysis, but the digital reality of journalism also raises epistemological and ontological questions about the objects we study. Sjøvaag and NODE researcher Michael Karlsson recently published on this very topic in a special issue of Digital Journalism they edited together, published in January 2016.
One of the major outputs of the Dagstuhl seminar was new insight into how to research data journalism. Data journalism uses social science methods, databases, ‘big data’ and digital tools to do journalism. It is primarily practiced in large newsrooms with the resources to allocate staff to such projects, and telling data-driven stories often combines design, data science and statistics with traditional journalistic methods. The data typically comes from government sources, open data initiatives and leaks. As this data frequently arrives in unmanageable forms, data journalism requires scraping and visual analytics, and often teamwork. Scholars therefore need new methods and perspectives to analyse these novel forms of journalism.
As most data journalism projects are ad hoc, few reproducible workflows exist to guide the work process. Contingent data storage and the lack of scalable workflow models are further problems. For journalists, the main challenge is turning unstructured documents into structured data. For researchers, the challenge is to look beyond the text as the object of study. Data journalism is more than text; in fact it often involves very little text. Its nature thus challenges the way we look at societal communication. Studying data journalism requires a mixture of social science and computer science approaches, not only because the content differs from traditional journalism formats: as data journalism is largely about application development, the empirical focus must encompass both practice (workflow) and ‘text’ (data).
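To give a concrete sense of what “turning unstructured documents into structured documents” can mean in practice, here is a minimal sketch in Python. The report lines, field names and figures are invented for illustration; real projects typically face far messier inputs, but the principle of extracting tabular records from free text is the same.

```python
import re

# Invented example: raw lines as they might appear in a scraped
# government report (all names and figures are hypothetical).
raw_report = """\
Budget allocation 2015: Health 1,200,000 NOK
Budget allocation 2015: Education 950,000 NOK
Budget allocation 2015: Transport 430,000 NOK
"""

# A single regular expression captures year, category and amount
# from each free-text line.
LINE_RE = re.compile(r"Budget allocation (\d{4}): (\w+) ([\d,]+) NOK")

def parse_report(text):
    """Extract structured records from semi-structured report lines."""
    records = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            year, category, amount = m.groups()
            records.append({
                "year": int(year),
                "category": category,
                # Strip thousands separators so the figure is numeric.
                "amount": int(amount.replace(",", "")),
            })
    return records

records = parse_report(raw_report)
# `records` is now tabular data that can be written to CSV, loaded
# into a statistics package, or fed to a visualization tool.
```

The resulting list of dictionaries is the kind of structured intermediate that both newsroom tools and research pipelines can build on, which is one reason reproducible workflows matter so much in this area.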
What we take from the Dagstuhl seminar is insight into the benefits of research collaboration between communication scholars and computer scientists. Data journalism is an emerging area of inquiry in journalism studies, to which the Ander-funded research project “Algorithms and Media Organizations” provides a clear contribution.