I originally coined the term ‘physiological computing’ to describe a whole class of emerging technologies constructed around closed-loop control. These technologies collected implicit measures from the brain and body of the user, which informed a process of intelligent adaptation at the user interface.
If you survey research in this field, from mental workload monitoring to applications in affective computing, there’s an overwhelming bias towards the first part of the closed-loop – the business of designing sensors, collecting data and classifying psychological states. In contrast, you see very little on what happens at the interface once target states have been detected. The dearth of work on intelligent adaptation is a problem because signal processing protocols and machine learning algorithms are being developed in a vacuum – without any context for usage. This disconnect both neglects and negates the holistic nature of closed-loop control and the direct link between classification and adaptation. We can even generate a maxim to describe the relationship between the two:
the number of states recognised by a physiological computing system should be the minimum required to support the range of adaptive options that can be delivered at the interface
This maxim minimises the number of states to enhance classification accuracy, while making an explicit link between the act of measurement at the first part of the loop with the process of adaptation that is the last link in the chain.
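The maxim can be made concrete with a minimal sketch: the set of states the classifier is asked to distinguish is derived directly from the adaptive repertoire of the interface, rather than being chosen in a vacuum. All names and states below are hypothetical, for illustration only.

```python
# The interface declares what it can actually do in response to each state.
ADAPTIVE_REPERTOIRE = {
    "high_workload": "suppress_notifications",
    "normal": "no_change",
}

# The classifier is configured to recognise only the states the interface
# can respond to -- no more -- which keeps the classification problem as
# simple (and as accurate) as possible.
TARGET_STATES = list(ADAPTIVE_REPERTOIRE)

def adapt(classified_state: str) -> str:
    """Map a classified state to the interface response it triggers."""
    if classified_state not in ADAPTIVE_REPERTOIRE:
        raise ValueError(f"state '{classified_state}' has no adaptive response")
    return ADAPTIVE_REPERTOIRE[classified_state]
```

The point of the sketch is the direction of dependency: the adaptive options come first and the recognised states are derived from them, never the reverse.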
If this kind of stuff sounds abstract or of limited relevance to the research community, it shouldn’t. If we look at research into the classic ‘active’ BCI paradigm, there is clear continuity between state classification and corresponding actions at the interface. This continuity owes its prominence to the fact that the BCI research community is dedicated to enhancing the lives of end users, and the utility of the system lies at the core of their research process. But to be fair, the link between brain activation and input control is direct and easy to conceptualise in the ‘active’ BCI paradigm. For those systems that work on an implicit basis, detection of the target state is merely the jumping-off point for a complicated process of user interface design.
Back in 2003, Lawrence Hettinger and colleagues penned this paper on the topic of neuroadaptive interface technology. This concept described a closed-loop system where fluctuations in cognitive activity or emotional state inform the functional characteristics of an interface. The core concept sits comfortably with a host of closed-loop technologies in the domain of physiological computing.
One great insight from this 2003 paper was to describe how neuroadaptive interfaces could enhance communication between person and system. They argued that human-computer interaction currently existed in an asymmetrical form. The person can access a huge amount of information about the computer system (available RAM, number of active operations) but the system is fundamentally ‘blind’ to the intentions of the user or their level of mental workload, frustration or fatigue. Neuroadaptive interfaces would enable symmetrical forms of human-computer interaction where technology can respond to implicit changes in the human nervous system, and most significantly, interpret those covert sources of data in order to inform responses at the interface.
Allowing humans to communicate implicitly with machines in this way could enormously increase the efficiency of human-computer interaction with respect to ‘bits per second’. The keyboard, mouse and touchscreen remain the dominant modes of input control by which we translate thoughts into action in the digital realm. We communicate with computers via volitional acts of explicit perceptual-motor control – the same asymmetrical/explicit model of HCI holds true for naturalistic modes of input control, such as speech and gestures. The concept of a symmetrical HCI based on implicit signals that are generated spontaneously and automatically by the user represents a significant shift from conventional modes of input control.
This recent paper published in PNAS by Thorsten Zander and colleagues provides a demonstration of a symmetrical, neuroadaptive interface in action.
Everyone who used MS Office between 1997 and 2003 remembers Clippy. He was a help avatar designed to interact with the user in a way that was both personable and predictive. He was a friendly sales assistant combined with a butler who anticipated all your needs. At least, that was the idea. In reality, Clippy fell well short of those expectations; he was probably the most loathed feature of those particular versions of Office, and he even featured in this Time Magazine list of the world’s worst inventions, a list that also includes Agent Orange and the Segway.
In an ideal world, Clippy would have responded to user behaviour in ways that were intuitive, timely and helpful. In reality, his functionality was limited, his appearance often intrusive and his intuition was way off. Clippy irritated users so thoroughly that his legacy lives on over ten years later. If you describe the concept of an intelligent adaptive interface to most people, half of them recall the dreadful experience of Clippy and the rest will probably be thinking about HAL from 2001: A Space Odyssey. With those kinds of role models, it’s not difficult to understand why users are in no great hurry to embrace intelligent adaptation at the interface.
In the years since Clippy passed, the debate around machine intelligence has placed greater emphasis on the improvisational spark that is fundamental to displays of human intellect. This recent article in MIT Technology Review makes the point that a “conversation” with Eugene Goostman (the chatter bot who won a Turing Test competition at Bletchley Park in 2012) lacks the natural “back and forth” of human-human communication. Modern expectations of machine intelligence go beyond a simple imitation game within highly-structured rules; users are looking for a level of spontaneity and nuance that resonates with their human sense of what other people are.
But one of the biggest problems with Clippy was not simply intrusiveness but the fact that his repertoire of responses was very constrained: he could ask if you were writing a letter (remember those?) and precious little else.
The phrase “smart technology” has been around for a long time. We have smart phones and smart televisions with functional capability that is massively enhanced by internet connectivity. We also talk about smart homes that scale up into smart cities. This hybrid between technology and the built environment promotes connectivity but with an additional twist – smart spaces monitor activity within their confines for the purposes of intelligent adaptation: to switch off lighting and heating if a space is uninhabited, to direct music from room to room as the inhabitant wanders through the house.
If smart technology is equated with enhanced connectivity and functionality, do those things translate into an increase of machine intelligence? In his 2007 book ‘The Design Of Future Things‘, Donald Norman defined the ‘smartness’ of technology with respect to the way in which it interacted with the human user. Inspired by J.C.R. Licklider’s (1960) definition of man-computer symbiosis, he claimed that smart technology was characterised by a harmonious partnership between person and machine. Hence, the ‘smartness’ of technology is defined by the way in which it responds to the user and vice versa.
One prerequisite for a relationship between person and machine that is cooperative and compatible is to enhance the capacity of technology to monitor user behaviour. Like any good butler, the machine needs to increase its awareness and understanding of user behaviour and user needs. The knowledge gained via this process can subsequently be deployed to create intelligent forms of software adaptation, i.e. machine-initiated responses that are both timely and intuitive from a human perspective. This upgraded form of human-computer interaction is attractive to technology providers and their customers, but is it realistic and achievable, and what practical obstacles must be overcome?
Way back in 2008, I was due to go to Florence to present at a workshop on affective BCI as part of CHI. In the event, I was ill that morning and missed the trip and the workshop. As I’d prepared the presentation, I made a podcast for sharing with the workshop attendees. I dug it out of the vaults for this post because gaming and physiological computing is such an interesting topic.
The work is dated now, but basically I’m drawing a distinction between my understanding of BCI and biocybernetic adaptation. The former is an alternative means of input control within the HCI; the latter can be used to adapt the nature of the HCI. I also argue that BCI is ideally suited to certain types of game mechanics because it will not work 100% of the time. I used the TV series “Heroes” to illustrate these kinds of mechanics, which I regret in hindsight, because I totally lost all enthusiasm for that show after series 1.
The original CHI paper for this presentation is available here.
[iframe width="400" height="300" src="http://player.vimeo.com/video/32983880"]
[iframe width="400" height="300" src="http://player.vimeo.com/video/32915393"]
Last month I gave a presentation at the Annual Meeting of the Human Factors and Ergonomics Society held at Leeds University in the UK. I stood on the podium and presented the work, but really the people who deserve most of the credit are Marjolein van der Zwaag (from Philips Research Laboratories) and my own PhD student at LJMU Elena Spiridon.
You can watch a podcast of the talk above. The work was originally conducted as part of the REFLECT project at the end of 2010 and was inspired by earlier research on affective computing where the system makes an adaptation to alleviate a negative mood state. The rationale here is that any such adaptation will have beneficial effects – reducing the duration/intensity of negative mood and, in doing so, mitigating any undesirable effects on behaviour or the health of the person.
Our study was concerned with the level of anger a person might experience on the road. We know that anger causes ‘load’ on the cardiovascular system as well as undesirable behaviours associated with aggressive driving. In our study, we subjected participants to a simulated driving task that was designed to make them angry – this is a protocol that we have developed at LJMU. Marjolein was interested in the effects of different types of music on the cardiovascular system while the person is experiencing a negative mood state; for our study, she created four categories of music that varied in terms of high/low activation and positive/negative valence.
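The study design crossed the two dimensions of activation and valence to yield four music categories. A small sketch of that 2×2 structure follows; the category labels are my own shorthand, not the terminology used in the study.

```python
# The four music categories as a 2x2 design: activation x valence.
# Labels are illustrative placeholders, not the study's own terms.
MUSIC_CATEGORIES = {
    ("high", "positive"): "energetic_happy",
    ("high", "negative"): "agitated_tense",
    ("low", "positive"): "calm_pleasant",
    ("low", "negative"): "slow_sombre",
}

def select_category(activation: str, valence: str) -> str:
    """Return the music category for a given cell of the 2x2 design."""
    return MUSIC_CATEGORIES[(activation, valence)]
```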
The study does not represent an investigation into a physiological computing system per se, but is rather a validation study to explore whether an adaptation, such as selecting a certain type of music when a person is angry, can have beneficial effects. We’re working on a journal paper version at the moment.
[iframe width="400" height="300" src="http://player.vimeo.com/video/25081038"]
Some months ago, I wrote this post about the REFLECT project that we participated in for the last three years. In short, the REFLECT project was concerned with research and development of three different kinds of biocybernetic loops: (1) detection of emotion, (2) diagnosis of mental workload, and (3) assessment of physical comfort. Psychophysiological measures were used to assess (1) and (2) whilst physical movement (fidgeting) in a seated position was used for the latter. And this was integrated into the ‘cockpit’ of a Ferrari.
The idea behind the emotional loop was to have the music change in response to emotion (to alleviate negative mood states). The cognitive loop would block incoming calls if the driver was in a state of high mental workload and air-filled bladders in the seat would adjust to promote physical comfort. You can read all about the project here. Above you’ll find a promotional video that I’ve only just discovered – the reason for my delayed response in posting this is probably vanity, the filming was over before I got to the Ferrari site in Maranello. The upside of my absence is that you can watch the much more articulate and handsome Dick de Waard explain the cognitive loop in the film, which was our main involvement in the project.
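Caricatured as simple trigger rules, the three loops might look something like the sketch below. All thresholds, signal names and actions here are invented for illustration; the real REFLECT system was considerably more sophisticated than a set of one-line rules.

```python
def emotional_loop(valence: float) -> str:
    # Emotional loop: negative valence triggers a change of music
    # intended to alleviate the negative mood state.
    return "select_calming_music" if valence < 0 else "keep_current_music"

def cognitive_loop(workload: float, threshold: float = 0.8) -> str:
    # Cognitive loop: high mental workload blocks incoming calls.
    return "block_incoming_calls" if workload > threshold else "allow_calls"

def comfort_loop(fidget_rate: float, threshold: float = 5.0) -> str:
    # Comfort loop: frequent postural shifts (fidgets per minute in a
    # seated position) trigger adjustment of the air-filled seat bladders.
    return "adjust_seat_bladders" if fidget_rate > threshold else "no_change"
```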
It has been said that every cloud has a silver lining and the only positive from chronic jet lag (Kiel and I arrived in Vancouver yesterday for the CHI workshop) is that it does give you a chance to catch up with overdue tasks. This is a post I’d been meaning to write for several weeks about my involvement in the REFLECT project.
For the last three years, our group at LJMU have been working on a collaborative project called REFLECT funded by the EU Commission under the Future and Emerging Technology Initiative. This project was centred around the concept of “reflective software” that responds implicitly to changes in user needs and in real-time. A variety of physiological sensors are applied to the user in order to inform this kind of reflective adaptation. So far, this is regular fare for anyone who’s read this blog before, being a standard set-up for a biocybernetic adaptation system.
I came across an article in a Sunday newspaper a couple of weeks ago about an artist called xxxy who has created an installation using a BCI of sorts. I’m piecing this together from what I read in the paper and what I could see on his site, but the general idea is this: person wears a portable EEG rig (I don’t recognise the model) and is placed in a harness with wires reaching up and up and up into the ceiling. The person closes their eyes and relaxes – presumably as they enter a state of alpha augmentation, they begin to levitate courtesy of the wires. The more that they relax or the longer they sustain that state, the higher they go. It’s hard to tell from the video, but the person seems to be suspended around 25-30 feet in the air.
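As far as I can reconstruct it, the installation's control logic is a simple positive mapping from sustained relaxation to height. A hypothetical sketch follows; the gain, baseline comparison and ceiling (roughly 9 metres, matching the 25-30 feet visible in the video) are pure guesswork on my part, not anything documented by the artist.

```python
def update_height(height_m: float, alpha_power: float,
                  baseline: float, gain: float = 0.1,
                  max_height_m: float = 9.0) -> float:
    """Raise the participant while alpha-band power exceeds baseline.

    Called once per update cycle: each cycle spent above the relaxation
    baseline winches the harness up by `gain` metres, capped at the
    ceiling height. Sustaining the state longer means rising higher.
    """
    if alpha_power > baseline:
        height_m = min(height_m + gain, max_height_m)
    return height_m
```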
This article in New Scientist on Project Natal got me thinking about the pros and cons of monitoring overt expression via sophisticated cameras versus covert expression of psychological states via psychophysiology. The great thing about the depth-sensing cameras (summarised nicely by one commentator in the article as like having a Wii attached to each hand and foot) is that: (1) it’s wireless technology, (2) interactions are naturalistic, and (3) it’s potentially robust (provided nobody else walks into the camera view). Also, because it captures overt expression of body position/posture or changes in facial expression/voice tone (the second being mooted as a phase two development), it measures those signs and signals that people are usually happy to share with their fellow humans – so the feel of the interaction should be as naturalistic as a regular discourse.
So why bother monitoring psychophysiology in real time to represent the user? Let’s face it – there are big question marks over its reliability, it’s largely unproven in the field and normally involves attaching wires to the person – even if they are wearable.
But to view a face-off between the two approaches in terms of sensor technology is missing the point. The purpose of depth cameras is to give computer technology a set of eyes and ears to perceive & respond to overt visual or vocal cues from the user, whilst psychophysiological methods have been developed to capture covert changes that remain invisible to the eye. For example, a camera system may detect a frown in response to an annoying email whereas a facial EMG recording will often detect increased activity from the corrugator or frontalis (i.e. the frown muscles) regardless of any change on the person’s face.
One approach is geared up to the detection of visible cues whereas the physiological computing approach is concerned with invisible changes in brain activity, muscle tension and autonomic activity. That last sentence makes the physiological approach sound superior, doesn’t it? But the truth is that both approaches do different things, and the question of which one is best depends largely on what kind of system you’re trying to build. For example, if I’m building an application to detect high levels of frustration in response to shoot-em-up gameplay, perhaps overt behavioural cues (facial expression, vocal changes, postural changes) will detect that extreme state. On the other hand, if my system needed to resolve low vs. medium vs. high vs. critical levels of frustration, I’d have more confidence in psychophysiological measures to provide the necessary level of fidelity.
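Resolving those graded levels amounts to banding a continuous estimate into ordinal categories. A hedged sketch, assuming a normalised frustration score between 0 and 1; the band boundaries are arbitrary placeholders, not validated thresholds.

```python
def frustration_level(score: float) -> str:
    """Map a normalised frustration estimate (0-1) to an ordinal level.

    The finer the required resolution (low vs. medium vs. high vs.
    critical, rather than a single extreme-state trigger), the more the
    underlying measure needs the fidelity of psychophysiology.
    """
    if score < 0.25:
        return "low"
    elif score < 0.5:
        return "medium"
    elif score < 0.75:
        return "high"
    return "critical"
```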
Of course both approaches aren’t mutually exclusive and it’s easy to imagine naturalistic input control going hand-in-hand with real-time system adaptation based on psychophysiological measures.
But that’s the next step – Project Natal and similar systems will allow us to interact using naturalistic gestures, and to an extent, to construct a representation of user state based on overt behavioural cues. In hindsight, it’s logical (sort of) that we begin on this road by extending the awareness of a computer system in a way that mimics our own perceptual apparatus. If we supplement that technology by granting the system access to subtle, covert changes in physiology, who knows what technical possibilities will open up?