CHRISTIAN RIEGEL, KATHERINE M. ROBINSON & ASHLEY HERMAN
Harnessing Quantitative Eye Tracking Data to Create Art: Interdisciplinary Collaboration and Data Visualization
Interdisciplinary collaboration in the use of digital tools serves to illuminate new means to employ humanities and social science technology to produce aesthetic objects. Eye tracking technology permits volumes and types of data that were hitherto unimaginable in cognitive science-based methodologies, and the software tools of an eye tracker such as ours allow for interesting and useful empirically-based understandings of the data. Yet, in our explorations of the data we conclude that a purely empirically-based output has limitations: the data can be put to further uses, pushing into the realms of data visualization, art, as well as into epistemological considerations for the processes involved in managing and exploring data. How can eye-tracking data serve both objectively-based aims and artistic ones? Our specific focus in this paper is to document our deliberate move to shift the data toward 'data art', 'mind art', and other aesthetically-oriented modes as we develop interventions with the large volumes of data that the advanced technology of our eye-tracker produces.
In this paper, we illuminate methodological approaches that facilitate interdisciplinary work and/or collaboration. We do so by examining how the availability of a set of digital tools allowed a group of researchers from different disciplines to develop new research questions, reframe methodologies, and consider what the means and ends of interdisciplinary collaborative research in a technological context can be. Our research originates in the core concerns of the lab we run: the IMPACT Lab [1], or Interactive Media, Poetics, Aesthetics, Cognition, and Technology Lab. Our work straddles disciplinary boundaries, connecting disparate domains such as digital humanities, cognitive science, visual art, literary studies, social science methodology, and creativity, which become enmeshed in our practice. We are interested in how the eye-tracking technology we use facilitates research questions, catalyses new directions, and serves to bridge philosophical and methodological differences across the disciplines. In this sense we are concerned with the challenges that are posed by our interdisciplinary and technological approach, and how we can marry technology, the brain and its functions, with the production of aesthetic objects. In the current phase of our research project on measuring eye movements as individuals read conventional and postmodern poetry, our focus has shifted from empirical data collection and data outputs determined by the technology and software of our eye tracker to epistemological concerns. In particular, we ask: How does the availability of the type and volume of data that an eye tracker produces challenge established epistemologies across our core disciplines? How might our collaborative and collective interdisciplinary approaches allow us to redefine how we think of our data and what its potential is? How might our data be visualised in ways that connect brain functions and art?
A key piece of equipment in the IMPACT Lab is a Tobii XT Eye Tracker, which was conceived as a tool for cognitive-measure-based research involving the reading of various kinds of literary language. It has been used, for example, to study language that is formally organised as poetry, which employs poetic language deliberately and frequently, and it has also been used for mathematics research. In practice, the particular configuration of researchers from divergent disciplinary practices, along with their undergraduate and graduate students, discovered that the technology and methodology of the lab opened up interdisciplinary and collaborative possibilities that were not imaginable at the outset of the lab's planning. Undoubtedly, measurement technologies such as eye trackers present challenges to how we conceive of and understand the data the technology enables. We have for the better part of a decade been interested in measuring through eye tracking and other more conventional methods what happens cognitively when people read different kinds of literary language. In our studies of literary language we have found that people read literary or poetic language differently from non-literary/poetic language, sometimes reading the various types of language more quickly or more slowly and sometimes retaining more or less information, depending on what language features we have exposed them to. For example, reading text with more alliteration results in better retention.
Figure 1: Tobii XT Eye Tracker, showing a sample poem on screen with microprojectors activated, creating a reflection pattern of NIR (Near-Infrared) light on the eyes
Our eye tracker and its supplied software package provide useful and interesting data outputs such as heat maps and bee swarms [2]. However, it also became apparent to us over time that more understanding of the underlying raw data was needed. Furthermore, we felt that this would also pose challenges to existing notions of what empirical data might mean in research contexts. How might we consider the data from the multiple perspectives we each brought to our project, and how might an exploration of the raw data and its potential open up new avenues for presenting, considering, and understanding what the uses of data might be? Recent discussions relating to Big Data are useful to understand the theoretical concerns that inform our investigation. The advent of Big Data has spurred paradigm shifts with the availability of new kinds of data (including large volumes of data) and new data analytics to challenge conventional epistemologies (Kitchin, 2014: 1). Sinan Aral notes that: 'Revolutions in science have often been preceded by revolutions in measurement' (Cukier, 2010, quoted in Kitchin, 2014: 1). Boyd and Crawford remark that: 'Big Data creates a radical shift in how we think about research [...] Big Data reframes key questions about the constitution of knowledge, the processes of research, how we should engage with information, and the nature and categorization of reality' (cited in Kitchin, 2014: 1). Big Data and new technologies for generating data have afforded ways of considering a new era of empiricism, 'wherein [it is said] the volume of data, accompanied by techniques that can reveal their inherent truth, enables data to speak for themselves free of theory' (Kitchin, 2014: 3). Big Data, it can be argued, however, does not arise free from context:
The [research] process is guided in the sense that existing theory is used to direct the process of knowledge discovery, rather than simply hoping to identify all relationships within a dataset and assuming they are meaningful in some way (Kitchin, 2014: 6).
The generation and use of data is modulated by assumptions that are supported by 'theoretical and practical knowledge and experience' regarding the ability of 'technologies and their configurations' to produce data that is of use in research contexts. Choices relating to data generation and use are made strategically and '[H]ow these data are processed, managed and analysed is guided by assumptions as to which techniques might provide meaningful insights' (Kitchin, 2014: 6).
Figure 2: Eye Tracker Raw Data in Excel Format
Figure 3: Gaze Plot Output generated with Tobii Studio software; gaze plots display gaze data as individual gaze points, fixations and scan paths represented in the coloured dots and lines: each colour represents a different reader
These conclusions about new modes of research are focused on how Big Data changes how research is conducted. However, they also apply to research methods that are enabled by emerging and rapidly changing technologies such as eye tracking, making available new measures, significant volumes of data, and challenges to how we think of and conduct research. In our own technological context (which does not fit precisely with conventional definitions of Big Data) the role that eye tracking technology plays raises similar questions to Big Data research. Specifically, it focuses on the availability of large volumes of data, how we select the data to analyse, what purposes we might use the data for, and how our choices contribute to discussions about the conduct of research in the social sciences, humanities, and fine arts. Eye tracking technology permits volumes and types of data that were hitherto unimaginable, and the software tools of an eye tracker such as ours allow for interesting and useful empirically-based understandings of the data. Yet, in our explorations of the data we conclude that a purely empirically-based output only takes us so far: the data can be put to further uses, pushing into the realms of data visualisation and art, as well as into epistemological considerations for the processes involved in managing and exploring data.
Our paper focuses on a study that measured eye movements in participants who read conventional and postmodern poetry. Our research inherently works at the crossover of disciplines that use different ways to value and consider information. Eye tracking essentially prioritises objective measures and outputs, and it presents the challenge of dealing with the mass of data that the eye tracker produces. For example, the average gaze points along the X and Y axes from 21 participants for the reading of a single short poem produce a two-column, 79,000-line file. We needed to devise a means to extract individual participant data from the file and to work out how to visualise it in order to produce outputs that creatively reflected eye tracks and that account for individual differences between readers. We can view the visualisations that the Tobii Studio software outputs for us, but to shift the data toward 'data art', 'brain/mind art', or other aesthetically-oriented modes, we needed to be able to find interventions with the data to make it visually and aesthetically comprehensible.
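The extraction step described above can be illustrated with a minimal sketch. It is not the code used in the study. It assumes, purely for illustration, an export in which each row carries a participant label alongside the gaze coordinates; the column names `participant`, `gaze_x` and `gaze_y` and the sample rows are hypothetical and do not reflect the actual Tobii export format.

```python
import csv
from collections import defaultdict
from io import StringIO


def split_by_participant(rows):
    """Group (x, y) gaze points by participant, preserving reading order."""
    tracks = defaultdict(list)
    for row in rows:
        # Skip samples with no coordinates (for example, during a blink).
        if not row["gaze_x"] or not row["gaze_y"]:
            continue
        point = (float(row["gaze_x"]), float(row["gaze_y"]))
        tracks[row["participant"]].append(point)
    return dict(tracks)


# Invented sample standing in for a real export (hypothetical column names).
SAMPLE = """participant,gaze_x,gaze_y
P01,100,200
P01,104,201
P02,300,220
P01,,
P02,310,222
"""

tracks = split_by_participant(csv.DictReader(StringIO(SAMPLE)))
for participant, points in sorted(tracks.items()):
    print(participant, len(points), "gaze points")
```

Keeping each reader's points in a separate list is what makes it possible to draw one track per participant later, rather than a single averaged trace.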
Thus, the question became: Where could we insert the creative and the humanist perspective into our outputs? The answer lay in harnessing the data generated by the eye tracker and translating it into screen-based visualisations. Stephen Ramsay concludes that visualisations of text analysis tend to work against interpretive insight in terms of the viewer:
Most of the visualizations one sees in text analysis are there to demonstrate the facts of the case - to prove to the reader that things cluster this way or that, that there are indeed more instances of this feature than of that feature. Relatively few of them are there to offer the reader the open possibilities of interpretive insight. And this is odd, when we consider that the kinds of texts that interest humanists are solidly of the latter variety - less concerned with proving a point, and far more concerned with allowing the reader the intellectual latitude to see something new (2005: 180).
What struck us was that taking an approach to visualisation that emphasises the possibilities of interpretive insight can provide additional potential to the kinds of data, and its representations, that we were collecting in our research studies. This potential had the appeal to reach beyond the type of cognitive science-oriented representations that we typically employ (charts, tables, numbers, heat maps, and so on) to specifically allow a reader and viewer of our results to appreciate the how and why of reading literature more deeply. The move, then, is in part to find ways to visualise data as a means to move beyond narrative expressions. What if conventional social science data, which we were using to illustrate concerns related to literary study and literary history, could be presented as visually oriented data visualisations? Further, what if these visualisations could be motivated by aesthetic concerns, appearing as artistic objects in their own right as well as functioning as interpretations of a body of data?
David-Antoine Williams remarks on the rich possibilities of an interdisciplinary approach such as the one we take:
The attention given in the digital humanities to creating and improving various and diverse digital methods can be seen in one way to promote multi- or interdisciplinarity, by developing a commons of tools and techniques available to researchers in various fields (2015: 280).
He writes that what he calls
the "build it and see" approach, useful as it can be, avoids difficult and contentious questions surrounding disciplinarity and interdisciplinarity. One such difficulty is that, in taking methods for a department of knowledge in and of itself, practitioners in the digital humanities risk mistaking means for ends, approaches to questions for answers, ways of acquiring knowledge for knowledge itself (2015: 280).
Williams's critique is largely built on the conclusion that work in digital humanities can often serve as 'mere test case[s] for a digital method' (2015: 281). The objections rest upon the notion 'that digital approaches add nothing to our humanistic understanding of the subject matter, and may even blind us to the fine particularities that close and patient reading trains one to sense and attend to' (2015: 290). While our work combines a variety of disciplinary practices as a means to further understanding of the digital technology available to us, part of our goal is to comprehend how the technology can promote new knowledge in literary studies. Further, our move to data visualisation, and consequently to finding ways to transform empirical data into aesthetic objects that refer back to the data and also signal their own separate status, challenges epistemologies in our own separate disciplines and in digital humanities.
We concluded that technologically sophisticated eye-tracking methodologies provided opportunities to conduct research studies with multiple aims: to amplify literary history, provide insight into cognitive processes relating to the reading of literary texts, and to facilitate data visualisations that could serve as alternative interpretive frames to traditional narrative-driven modes of scholarly expression, with the further potential to be conceived as aesthetic objects. The visualisations could fit within the newly emerging brain art and data art realms. The study we designed and conducted thus had multiple purposes, exploring meta-critical issues as well as concrete research questions.
As early as 2004, Martyn Jessop identified the value of data visualisation in the humanities: 'The visualization and analysis of spatial data can provide insights into the nature and meaning of data throughout the humanities' (2004: 335). He noted, however, that technology had proven challenging due to the technical complexity and high cost of hardware and software: 'Humanities scholars therefore often avoid the visualization of spatial data' (2004: 335). Despite these limitations, Jessop correctly underscores the positive outcomes that visualisation can provide when he notes that 'even simple graphics of spatial data can yield valuable insights' (2004: 348). But what if, as we assert, the methodologies themselves allow entirely new modes of discourse? Kathleen Kerr, Bernice L. Hausman, and Samah Gad explored several different approaches to data visualisation in a study of vaccination practices in 1918-19, concluding that 'purposeful attention to visualization and the methodological conventions that are embedded in particular visualization practices will allow humanists to have more confidence in their interpretations of [...] data' (2013: 25). The type and mode of visualisations themselves offer multiple interpretive possibilities, opening things up rather than restricting understanding: 'different interpretations of the same data encourage different interpretations, that is, persuade differently' (2013: 31).
The notion of data and our interests in what eye movements reveal about how we read intersect usefully with brain or mind art. Perhaps the best known examples of brain art are those that involve EEG units, such as Jody Xiong's 'Mind Art' or Lisa Park's 'Eunoia I & II'. Xiong attaches an EEG unit to individuals and the brainwaves measured serve as the material to 'paint'. Park uses an EEG similarly, except brainwaves are used to create sound. Park describes her piece 'Eunoia' thus:
"Eunoia" is a performance that uses my brainwaves - collected via EEG sensor - to manipulate the motions of water. It derives from the Greek word "ey" (well) + "nous" (mind) meaning "beautiful thinking". EEG is a brainwave detecting sensor. It measures frequencies of my brain activity (Alpha, Beta, Delta, Gamma, Theta) relating to my state of consciousness while wearing it. The data collected from EEG is translated in realtime to modulate vibrations of sound with using software programs. EEG sends the information of my brain activity to Processing, which is linked with Max/MSP to receive data and generate sound from Reaktor. (Mindfully 2015)
A measurable process of the mind, which conventionally only finds expression in the medical pathography, is shifted into the aesthetic realm, simultaneously shifting the interpretive frame of the data and the methodology of its capture (EEG sensor).
Similarly, Xiong uses wearable EEG sensors to capture brainwaves which are employed for purposes other than their original conception. She collaborated with 16 physically disabled people to create a mind-art installation. Participants were asked to choose paint colours, which were then placed in balloons equipped with detonators. The balloons were surrounded by blank canvases and the detonators were triggered by the brainwaves outputted through the EEG sensors, resulting in abstract paintings (Designboom, 2014). Clifford E. Wulfman notes about this newly emerging field that
easy access to enormous quantities of raw data and the ready availability of inexpensive digital sensors have made possible a form of artistic expression called "data art", which draws data from some process or source and passes it through algorithmic filters in order to create an artwork that is functionally, and often dynamically, connected to its source (2014: 96).
Lev Manovich argues that 'data visualization artists transform the informational chaos of data packets moving through the network into clear and orderly forms' (quoted in Wulfman, 2014: 97). Wulfman and Manovich point to the potential of the kinds of data that contemporary technological implements and methodologies provide, which we recognised in the data outputs from our eye tracker, and hence we designed our study taking into account this potential.
Working with the Processing 2 language, we took the raw data from the eye tracker to produce a series of visualisations that operate on a number of levels: as media/data art, as knowledge translation, and as mind art (in the manner of Jody Xiong, for example, who harnesses brain waves with an EEG machine to create painted canvases). For the purposes of this phase of our research project, we emphasised data visualisation as a key concept for how to approach the questions we faced. Furthermore, we wanted to push our data so that it moved beyond pure empiricism to open new interpretive and appreciative modes of understanding. We defined our task as one that should produce visualisations that could stand as aesthetic objects and also serve to provide additional insight into what the empirical data tells us about reading different kinds of poems. We also wondered whether or not the data and processes of eye tracking could drive a turn to aesthetics for its own sake to create visual objects beyond the purely knowledge-driven aims of conventional research.
We set out, thus, to address a series of concerns. We asked whether or not individuals read poetry from distinctively different eras and traditions differently. Did their eye movements reflect varied experiences of the poetry? Did these eye movements relate to our literary historical understanding of how poetry in these eras and traditions is constituted? How could this data be presented as narrative and visually so that it enhances interpretive insight? How could we represent eye movements in ways that can be understood as art?
In our study, we had participants read 10 poems. Five ranged from Shakespeare to Frost, and the other five were from post-1960 poets who can be identified as postmodern in orientation. Participants read the poems one after another and then completed a short questionnaire. Our Tobii XT Eye Tracker was used to track eye movements. A number of data outputs are possible with the Tobii Studio software, including heat maps and bee swarms. Tobii also allows the raw data to be exported into various programs and formats, including SPSS and Excel. It can also be used in visualisation software such as RAW and Tableau or by writing one's own code to create visualisations.
The first set of poems we chose are what most readers would identify as conventional poetry - it is formally restricted by having consistent stanza schemes, line lengths, and syllabic patterns. In addition, its content is relatively easy to access and to understand, and readers can relate content to the world they know - for example, they can relate its perspective to its externals, which are mostly mimetic rather than ideas oriented. The poems range in period from Elizabethan (Shakespeare) to the beginning of the twentieth century (Robert Frost).
The second set we chose can be identified as unconventional: it works against the norms established in earlier periods and movements and is postmodern in orientation. The poems date from 1960 onward to clearly establish them within the movement, though earlier examples of the type of poetry we were interested in can be found. These poems are marked by their break from earlier modes. They focus on language as a self-referential subject, use popular forms and language, are iconoclastic, and, while form is important, it is not used in terms of tradition. For example, stanza breaks, line lengths, syllabic patterns, and so on, are irregular, following no pre-established formal constraints.
Of most use to us in a more conventional analysis are the heat maps and bee swarms (bee swarms are not reproduced in the print version due to the large size of the files), which offer some interesting interpretive possibilities. The heat maps visually represent areas of highest focus, and bee swarms visually represent each individual's eye tracks as a coloured dot on the screen: the bee swarm video overlays dots for each individual tested and plays them back simultaneously, allowing for a real-time simulation of the readers' eyes as they move across the screen. We found that with the conventional poems readers tended to focus evenly and consistently on the same areas; they read in a measured manner and attention was generally spread across the poems. With the postmodern poems, we found that readers tended to focus very little on specific spots in the poems, especially in poems - or places in the poems - where the departures from convention are the greatest. In this sense we can say that there is no focus or attention in the poems as such.
Figure 4: Heat Map: Shakespeare Sonnet
Figure 5: Heat Map: John Mack Low
We can conclude that readers become de-familiarised with the postmodern poems, which is perhaps not too surprising. What is surprising is that readers seemed also to disconnect from the text; thus, rather than focus more on what was complicating their reading experience, they seemed to move on and lose interest. This might lead to conclusions about the ability of avant-garde poetry to reach an audience that is not itself knowledgeable about what it is experiencing, which as a consequence also suggests that as a communicative medium conventional poetry is more effective than avant-garde work is. These conclusions are not intended to be taken as critical of modes of poetry as such, but rather are observations on what the data tells us.
We thus worked to visualise various views of the data to see what it could reveal, writing code to filter and arrange the raw data. As we explored the raw data we discovered that it provides richer information than we had hitherto thought, yet we also concluded that the eye tracker data posed a number of problems for interpretation. The heat maps are illustrative of the issues we noted. They provide good information about areas of focus and give us insight into what people are engaging with. However, the raw data is more fine-grained than the heat maps lead us to believe, yet we struggle to make direct relations between various pieces of information and reliable conclusions. An interpretive, and thus theoretical, lens is required. What might our end goals be, we asked? By working with visualisations, we pushed for an aesthetic perspective, wanting to see how shifting data to an artistic realm might provide us with different insights into the nature of our research, technology, and inter-relationships of humans to research and technology.
So, where did we get with this exercise? We began with the idea that we would take our set of poems and write some code using Processing 2 to strip out the same data that is used to produce the heat maps. Instead of reproducing a heat map, however, we wanted to have the code make larger or smaller the words and phrases that readers focused on most. So, we have a mirror of the heat map, but it works more closely with the core text - the poem itself. We might term this a translation of the heat map data, perhaps, or we might consider it a way of moving away from the empirical emphasis that the heat map conveys.
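The idea of sizing words by attention can be sketched as follows. This is an illustration in Python rather than the Processing 2 code written for the study, and it rests on stated assumptions: each word has a known bounding box on screen (the `word_boxes` layout below is invented), attention is approximated by counting the gaze points that fall inside each box, and the count is mapped linearly onto a hypothetical font-size range.

```python
def gaze_counts(word_boxes, gaze_points):
    """Count the gaze points that fall inside each word's bounding box."""
    counts = [0] * len(word_boxes)
    for x, y in gaze_points:
        for i, (_, left, top, width, height) in enumerate(word_boxes):
            if left <= x < left + width and top <= y < top + height:
                counts[i] += 1
                break  # boxes are assumed not to overlap
    return counts


def font_sizes(counts, min_size=12.0, max_size=48.0):
    """Map counts linearly onto a font-size range (sizes are assumptions)."""
    lo, hi = min(counts), max(counts)
    if hi == lo:
        return [min_size] * len(counts)
    scale = (max_size - min_size) / (hi - lo)
    return [min_size + (c - lo) * scale for c in counts]


# Invented layout: (word, left, top, width, height) in screen pixels.
word_boxes = [("Shall", 0, 0, 50, 20), ("I", 60, 0, 10, 20), ("compare", 80, 0, 70, 20)]
gaze_points = [(10, 5), (20, 8), (65, 10), (90, 5), (100, 6), (110, 7), (120, 9), (500, 500)]

counts = gaze_counts(word_boxes, gaze_points)
sizes = font_sizes(counts)
for (word, *_), size in zip(word_boxes, sizes):
    print(f"{word}: {size:.0f} pt")
```

A linear mapping is only one option; a logarithmic or square-root scale would keep a few heavily viewed words from dwarfing the rest of the poem.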
Figure 6: Shakespeare Sonnet using heat map data coded in Processing 2 to represent focus areas visually by word size
Figure 7: Birk Sproxton: from 'Headframe:' using the same method as in Figure 6
We also worked with the data differently. We were interested in moving away from the shape of the poem and the way that it was read to create images that could stand on their own. We were still working with essentially the same data filtered from the larger set - strict average gaze points for X and Y axes. As we were working more abstractly - for example, where the image itself doesn't really tell us much about poetry as such - we wanted to work more closely with the notion of the process that was involved in collecting the data where eye movements are involved. We have an element of perception at work, so we shaped our images to reflect that element as a fundamental concept.
Figure 8: Shakespeare Sonnet using heat map data coded in Processing 2 to emphasize vision as being at the core of eye tracking technology
Figure 9: John Mack Low poem, using the same methodology as in Figure 8
The longer lines in our images represent longer points of focus. So, the two poems look quite different, but still reflect the essence of attention. What is not evident in the images is a sense of time. The raw data reflects the linear process of reading: we read poems across time, from beginning to end. We did extract data relating to time, but we struggled with how to represent it, grappling with what exactly the measures of time and eye movements were telling us (for example, the images we produced seem more random and less open to interpretation than the ones shown here). So, we have something to work on in the next phase of our project.
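One way to turn points of focus into line lengths is sketched below. It is not the method used to produce the figures. It assumes that gaze samples arrive at a constant rate, so that the number of consecutive samples staying near one spot can stand in for how long the reader dwelt there; the 30-pixel `radius`, the `unit` length, and the radial fan layout are all hypothetical choices made for this illustration.

```python
import math


def dwell_runs(points, radius=30.0):
    """Count consecutive samples that stay within `radius` of a run's first sample."""
    runs = []
    anchor = None
    for x, y in points:
        if anchor is not None and math.hypot(x - anchor[0], y - anchor[1]) <= radius:
            runs[-1] += 1  # still dwelling on the same spot
        else:
            anchor = (x, y)  # gaze jumped: start a new run
            runs.append(1)
    return runs


def radial_lines(runs, centre=(0.0, 0.0), unit=10.0):
    """Fan one line per run around `centre`; a longer dwell gives a longer line."""
    lines = []
    for i, count in enumerate(runs):
        angle = 2 * math.pi * i / len(runs)
        length = count * unit
        end = (centre[0] + length * math.cos(angle),
               centre[1] + length * math.sin(angle))
        lines.append((centre, end))
    return lines


# Invented gaze samples: three near one spot, two near another, one elsewhere.
points = [(100, 100), (102, 101), (105, 99), (300, 100), (301, 102), (500, 300)]
runs = dwell_runs(points)
lines = radial_lines(runs)
lengths = [math.hypot(ex - sx, ey - sy) for (sx, sy), (ex, ey) in lines]
print(runs)
```

Because the runs are kept in reading order, the angle of each line also encodes sequence, which is one possible route toward representing time in the same image.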
What does this exercise tell us about working with eye tracking technology and paradigms of research? Quite a bit in fact, and we conclude by touching briefly on some of our findings. Firstly, we confirmed something that we suspected over the first set of studies we conducted: that the eye tracker's software gives us interesting views of the data collected, but that these views are mediated by the manufacturer's choices for how to output and visualise the data. Secondly, these views of the data obscure the richness of the data to a degree: there are things that we cannot easily determine just by looking at the heat maps and bee swarms, for example. Thirdly, the software does allow us access to the raw data, so this data can be used for various interesting purposes. Fourthly, one great potential of the data is to allow us new views of the data but also to participate in discussions about technology, empiricism, the goals and conduct of research, and what sort of outputs or results might be authentic or legitimate. Empirical approaches must be married to theoretical and contextual concerns, in the end, and using empirical approaches in the creation of art is a legitimate enterprise.
Where do we go from here? There are a number of avenues to pursue moving forward. We have access to a software developer package for our eye tracker, so we would like to code the data collection process itself so it emphasises our interests. We would also like to use some of the newly emerging and quite cheap consumer eye trackers ($500-$1000) to move the whole process of data collection and the conduct of research into an installation space. If we can formulate an aim to create more artistic outputs with our data, can we make the move completely into art production? We envision an installation where we have participants come into a room or gallery, sit at a table with a computer and eye tracker and read poetry. The software we write will collect the data and instantly translate it into a visual output on a screen to create mind art. The installation thus would signal our challenges to existing disciplinary epistemologies.
1. The work of our lab is supported by research grants from the Canada Foundation for Innovation (CFI), the Social Sciences and Humanities Research Council of Canada (SSHRC), and the University of Regina President's Research Fund.
2. Heat maps are visualisations that show areas of highest focus; bee swarms replay videos showing the gaze points of several subjects simultaneously over time.
Boyd, D. and K. Crawford (2012), 'Critical questions for big data', Information, Communication and Society 15(5): 662-79.
Cukier, Kenneth (2010), 'Data, data everywhere', The Economist 25 (accessed 10 September 2016)
Designboom (2014) http://www.designboom.com/art/artist-people-with-disabilities-mind-generated-painting-10-23-2014/ (Accessed 10 September 2016).
Jessop, M. (2004), 'The Visualization of Spatial Data in the Humanities', Literary and Linguistic Computing 19(3): 335-50.
Kerr, K., Hausman, B. L., and Gad, S. (2013), 'Visualization and Rhetoric: Key concerns for utilizing big data in humanities research', 2013 IEEE Conference on Big Data: 25-32.
Kitchin, Rob (2014), 'Big Data, New Epistemologies and Paradigm Shifts', Big Data & Society, April-June: 1-12.
Mindfully Alive (2015), http://www.mindfullyalive.com/blog/2015/1/2/artist-uses-brain-waves-of-her-emotions-to-manipulate-water (Accessed 10 September 2016).
Ramsay, Stephen (2005), 'In praise of pattern', TextTechnology: The Journal of Computer Text Processing 14(2): 177-90.
Williams, D.-A. (2015), 'Method as Tautology in the Digital Humanities', Digital Scholarship in the Humanities 30(2): 280-93.
Wulfman, C. E. (2014), 'The Plot of the Plot: Graphs and visualizations', The Journal of Modern Periodical Studies 5(1): 94-109.
Christian Riegel is Professor of English at Campion College, University of Regina. He is a literary scholar and poet who works on data visualisation and eye tracking in the Interactive Media, Poetics, Aesthetics, Cognition, and Technology (IMPACT) Lab at the University of Regina, where he is co-director. Katherine M. Robinson is Professor of Psychology at Campion College, University of Regina, working on mathematical cognition. She is co-director of the IMPACT Lab and specialises in advanced eye tracking methodologies. Ashley Herman recently completed a B.Sc. in Computer Science at the University of Regina and has expertise in data handling, coding, and data visualisation.