Recent posts on the blog have concerned the topic of psychophysiology (or biometrics) and the evaluation of player experience. Based on those posts and the comments that followed, I decided to do a thought experiment.
Imagine that I work for a big software house who want to sell as many games as possible and ensure that their product (which costs on average $3-5 million to develop per platform) is as good as it possibly can be – and one of the suits from upstairs calls and asks me “how should we be using biometrics as part of our user experience evaluation? The equipment is expensive, it’s labour-intensive to analyse and nobody seems to understand what the data means.” (This sentiment is not exaggerated: I once presented a set of fairly ambiguous psychophysiological data to a fellow researcher who nodded purposefully and said “So the physiology stuff is voodoo.”)
Here’s a list of 10 things I would push for by way of a response.
Just read a very interesting and provocative paper entitled “How emotion is made and measured” by Kirsten Boehner and colleagues. The paper provides a counter-argument to the perspective that emotion should be measured/quantified/objectified in HCI and used as part of an input to an affective computing system or evaluation methodology. Instead they propose that emotion is a dynamic interaction that is socially constructed and culturally mediated. In other words, the experience of anger is not a score of 7 on a 10-point scale that is fixed in time, but an unfolding iterative process based upon beliefs, social norms, expectations etc.
This argument seems fine in theory (to me) but difficult in practice. I get the distinct impression the authors are addressing the way emotion may be captured as part of an HCI evaluation methodology. But they go on to question the empirical approach in affective computing. In this part of the paper, they choose their examples carefully. Specifically, they focus on the category of ‘mirroring’ (see earlier post) technology wherein representations of affective states are conveyed to other humans via technology. The really interesting idea here is that emotional categories are not given by a machine intelligence (e.g. happy vs. sad vs. angry) but generated via an interactive process. For example, friends and colleagues provide the semantic categories used to classify the emotional state of the person. Or literal representations of facial expression (a web-cam shot for instance) are provided alongside a text or email to give the receiver an emotional context that can be freely interpreted. This is a very interesting approach to how an affective computing system may provide feedback to its users. Furthermore, I think once affective computing systems are widely available, the interpretive element of the software may be adapted or adjusted via an interactive process of personalisation.
So, the system provides an affective diagnosis as a first step, which is refined and developed by the person – or even by others as time goes by. Much like the way Amazon makes a series of recommendations based on your buying patterns that you can edit and tweak (if you have the time).
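To make the Amazon analogy concrete, here is a minimal sketch of that kind of personalisation. Everything in it is hypothetical: the class name, the idea of keying on a named physiological "pattern", and the default labels are all my assumptions for illustration, not a description of any real affective computing system. The system starts from a generic label and simply defers to whatever label the user (or their friends) has most often supplied for the same pattern.

```python
from collections import Counter, defaultdict


class PersonalisedLabeller:
    """Toy affective labeller: generic first guess, refined by user edits."""

    def __init__(self, default_labels):
        # Generic, out-of-the-box mapping from a pattern name to a label
        # (hypothetical; stands in for the system's first-step diagnosis).
        self.default_labels = dict(default_labels)
        # Per-pattern tally of labels supplied by the user or by others.
        self.corrections = defaultdict(Counter)

    def diagnose(self, pattern):
        """Return the most frequently supplied label, else the default."""
        votes = self.corrections.get(pattern)
        if votes:
            return votes.most_common(1)[0][0]
        return self.default_labels.get(pattern, "unknown")

    def correct(self, pattern, label):
        """Record that someone relabelled this pattern."""
        self.corrections[pattern][label] += 1


labeller = PersonalisedLabeller({"high_arousal": "angry"})
first_guess = labeller.diagnose("high_arousal")   # the generic diagnosis
labeller.correct("high_arousal", "excited")       # the user tweaks it
refined = labeller.diagnose("high_arousal")       # the personalised label
```

The point of the sketch is only that the machine's category is a starting point rather than a verdict; the labels that stick are the ones people supply.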
My big problem with this paper was that a very interesting debate was framed as an either/or position. So, if you use psychophysiology to index emotion, you’re disregarding the experience of the individual by using objective conceptualisations of that state. If you use self-report scales to quantify emotion, you’re rationalising an unruly process by imposing a bespoke scheme of categorisation etc. The perspective of the paper reminded me of the tiresome debate in psychology between objective/quantitative data and subjective/qualitative data about which method delivers “the truth.” I say ‘tiresome’ because I tend towards the perspectivist view that both approaches provide ‘windows’ on a phenomenon, both of which have advantages and disadvantages.
But it’s an interesting and provocative paper that gave me plenty to chew over.
Just to show how out of touch I am with CHI stuff, I stumbled upon a workshop entitled “evaluating affective interfaces – innovative approaches” this afternoon. Only 4 years after the actual event. Here’s a link to the web page with details of all papers.
There’s a nice article in today’s Guardian by Charles Arthur regarding user gullibility in the face of technological systems. In this case, he’s talking about the voice risk analysis (VRA) software used by local councils and insurance companies to detect fraud (see related article by the same author), which performs fairly poorly when evaluated, but is reckoned by those bureaucrats who purchased the system to be a huge money-saver. The way it works is this – the operator receives a probability that the claimant is lying (based on “brain traces in the voice” – in reality probably changes in the fundamental frequency and pitch of the voice), and on this basis, may elect to ask more detailed questions.
Charles Arthur makes the point that we’re naive and gullible when faced with a technological diagnosis. And this is a fair point, whether it’s the voice analysis system or a physiological computing system providing feedback that you’re happy or tired or anxious. Why do we tend to yield to computerised diagnosis? In my view, you can blame science for that – in our positivist culture, cold objective numbers will always trump warm subjective introspection. The first experimental psychologist, Wilhelm Wundt (1832-1920), pointed to this dichotomy when he distinguished between mediated and unmediated consciousness. The latter is linked to introspection whereas the former demands the intervention of an instrument or technology. If you go outside on an icy day and say to yourself “it’s cold today” – your consciousness is unmediated. If you supplement this insight by reading a thermometer “wow, two degrees below zero” – that’s mediated consciousness. One is broadly true from that person’s perspective whereas the other is precise from the point of view of almost anyone.
The main point of today’s article is that we tend to trust technological diagnosis even when the scientific evidence supporting system performance is flawed (as is claimed in the case of the VRA system). Again, true enough – but in fairness, most users of the VRA didn’t get the chance to review the system evaluation data. The staff are trained to believe the system by the company rep who sold the system and trained them how to use it. From the perspective of the customers, insurance staff may have suddenly started to ask them a lot of detailed questions, which indicated their stories were not believed, which probably made the customers agitated and anxious, therefore raising the pitch of the voice and turning themselves from possibles to definites. The VRA system works very well in this context because nobody really knew how it worked or even whether it worked.
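The self-fulfilling loop described above can be sketched as a toy simulation. To be clear, this is not how the VRA product works (I have no idea of its internals); the scoring rule, the thresholds and the amount by which each extra question raises vocal pitch are all invented numbers, chosen only to show how a "possible" can be pushed into a "definite" by the questioning itself.

```python
def risk_score(pitch_hz, baseline_hz):
    """Toy score in [0, 1]: relative rise in pitch above the caller's baseline.

    Assumption: a 20% rise in pitch maps to the maximum score of 1.0.
    """
    rise = max(0.0, (pitch_hz - baseline_hz) / baseline_hz)
    return min(1.0, rise / 0.2)


def run_call(start_pitch_hz, baseline_hz, possible=0.3, definite=0.7,
             stress_hz_per_question=4.0, max_questions=10):
    """Simulate an operator who probes whenever the caller looks 'possible'.

    Each detailed question is assumed to raise the caller's pitch by a
    fixed amount because being doubted makes them anxious.
    Returns the final label and the number of extra questions asked.
    """
    pitch = start_pitch_hz
    questions = 0
    score = risk_score(pitch, baseline_hz)
    while possible <= score < definite and questions < max_questions:
        questions += 1
        pitch += stress_hz_per_question
        score = risk_score(pitch, baseline_hz)
    if score >= definite:
        label = "definite"
    elif score >= possible:
        label = "possible"
    else:
        label = "clear"
    return label, questions


# A slightly nervous but honest caller, 7% above baseline at the start.
nervous_caller = run_call(start_pitch_hz=214.0, baseline_hz=200.0)
# A relaxed caller never triggers the extra questions at all.
relaxed_caller = run_call(start_pitch_hz=205.0, baseline_hz=200.0)
```

Under these made-up numbers the nervous caller is escalated to "definite" purely by the extra questions, while the relaxed caller is never probed, which is the whole problem in miniature.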
What does all this mean for physiological computing? First of all, system designers and users must accept that psychophysiological measurement will never give a perfect, isomorphic, one-to-one model of human experience. The system builds a model of the user state, not a perfect representation. Given this restriction, system designers must be clever in terms of providing feedback to the user. Explicit and continuous feedback from the system is likely to undermine the credibility of the system in the eyes of the user. Users of physiological computing systems must be sufficiently informed to understand that feedback from the system is an educated assessment.
The construction of physiological computing systems is a bridge-building exercise in some ways – a link between the nervous system and the computer chip. Unlike similar constructions, this bridge is unlikely to ever meet in the middle. For that to happen, the user must rely on his or her gullibility to make the necessary leap of faith to close the circuit. Unrealistic expectation will lead to eventual disappointment and disillusionment; conservative cynicism and suspicion will leave the whole physiological computing concept stranded at the starting gate – it’s up to designers to build interfaces that lead the user down the middle path.