Jeff Bezemer

Diggit Profile: Jeff Bezemer on multimodality and ethnography

Interview
Jan Blommaert
15/04/2020

Jeff Bezemer is Professor of Communication at the UCL Institute of Education (IoE) and its Knowledge Lab. As a student and close collaborator of the late Gunther Kress, he is at the forefront of multimodality research. His incredibly rich ethnographic analyses of communication patterns in operation theaters are profoundly innovative, tying together multimodality, communication studies and social semiotics in contexts marked by complex but highly organized interactions between people, objects and technologies.

From multilingualism to multimodality

Jan Blommaert: Jeff, summarizing your trajectory isn't simple. So let me just cherry-pick a few things. You did your PhD in Tilburg, on multilingualism in schools; you then became a close collaborator of the late Gunther Kress in London, developed into a leading researcher on multimodal analysis especially in medical settings; you now work at the IoE and the UCL Knowledge Lab, and you also coordinate the very successful summer course on 'Ethnography, Language and Communication' (ELC), which has been going for - how long now? Fifteen years?

Jeff Bezemer: Yes, for my PhD I explored how primary schools accommodate multilingualism. I spent weeks in a classroom recording interactions between teachers and students and talking to teachers. At the time (around the year 2000), Dutch newspapers were full of dystopian opinion pieces fueled by survey research pointing to widening gaps between children whose parents were born in the Netherlands, and children with parents who had immigrated to the Netherlands. Educationalists, politicians, parents, they all weighed in with their own guidance to teachers: what subject areas to prioritize, whether students’ home languages should be taught and if so, for what purposes, and so on. I wanted to find out how all this played out in the everyday lives of teachers and students, in a reading lesson or on the playground.

I knew that colleagues in the UK (Gunther Kress, Carey Jewitt, Jill Bourne, Anton Franks) were doing similar research in secondary schools there. Their project was called ‘The Production of School English’, and their approach was a little different in that they video recorded classroom interactions (all I ever did in my case study was audio recording). When they started publishing from their work I was struck by the possibilities for analysis opened up by their approach. Long story short, after I had completed my PhD I received a scholarship from the Netherlands Organization for Scientific Research (‘NWO’) to work with Gunther and his team, learning how to do ‘multimodal’ analysis, i.e. how to account for gesture alongside speech, image alongside writing, and so on. As someone who was trained as an applied linguist, that was a significant step towards widening my analytical horizon.

Of course people before them had been doing some great stuff on classroom interactions using video – think of Ray McDermott’s and Fred Erickson’s work. Yet Gunther’s proposition was (and remains) different. To begin with, he wasn’t focusing on spontaneous interactions; as a semiotician he was interested in all traces of social interaction, from fleeting sounds to permanent marks set in stone. That actually chimed very well with the ethnographic training I had had from Sjaak Kroon and Jan Sturm: when doing fieldwork, grab what you can, and take it seriously as a source of information, indeed as evidence of the experiences and understandings of the people you’re observing. Gunther gave me a really powerful language to begin to talk about those artefacts and the ways in which people interact with them. We’ve done a lot of work together, looking at school textbooks, digital media texts, and clinical work.

Multimodality in an operation theater

Jan Blommaert: 'Clinical work' is to be taken literally here, isn't it?

Jeff Bezemer: It is. Just as the shift to multimodality has been a critical moment in my career, so has the shift to medical settings. Until around 2009 I had only ever looked at social interactions in school contexts, and then only mainstream primary and secondary education (no vocational education, for example). Hospitals are a different world: they’re highly complex institutions, and social action and interaction there can have immediate consequences for the safety of patients. Staff work under enormous pressures. And so you can’t walk into that space as a distant ethnographic fieldworker, do some observations, and bugger off again to write a monograph for fellow social scientists. You have to roll up your sleeves and work with staff and patients to support them.

Jan Blommaert: I remember that some of your studies showed very strange modes of interaction in operation theaters, for instance when microsurgery was used and medical staff inside the theater looked at screens rather than at the patient and colleagues. They sometimes talked about what was going on while turning their backs on each other. How did you make sense of such quite unnatural contexts for communication?

Jeff Bezemer: Laparoscopic surgeons have been using video technology for about 30 years now to look inside body cavities. Instead of looking at the part of the body that they’re operating on directly, they now watch the patient’s body and their own operative actions on a screen. Everyone else in the room has the same view of what’s happening inside the patient’s body. If you go to YouTube, and you search for ‘laparoscopy’ you’ll find plenty of examples. The outsider interpreting these videos will notice ‘strange’ features of the body, and ‘strange’ instruments making ‘strange’ movements that have ‘strange’ effects on the body. The insider - say, an experienced surgeon - sees something quite different: they see features that stand for, e.g., anatomical and pathological categories, opportunities for dissection, or a surgeon's level of dexterity. Other insiders, e.g. scrub nurses, see something different again. To them, a surgeon tying knots is a sign of an upcoming need for a pair of stitch scissors, for example. As you spend time in operating theatres, observing and participating, you learn to read patients’ bodies and medical interventional actions like a surgeon, or a nurse, or an anaesthetist.

As an ethnographer, you can gain insight into these meaning-making practices by observing and talking to the insiders, but obviously opportunities for participant observation are very limited, and the videos don’t tell us much about many of the invisible features of the patient’s body, which are often critical signifiers for the surgeon having to judge how to handle tissue and so on.

Jan Blommaert: This is a very complex example of technology-mediated communication in which people, objects and movements are more prominent than talk, correct?

Jeff Bezemer: Absolutely, it’s an example of social actors working together to accomplish practical tasks. That’s very different from everyday conversational activity, such as having a chat over a coffee in a cafe. Communication is multimodal in both types of activity, but in the cases I examine it often proceeds perfectly well without speech, which is why multimodalists like me are particularly drawn to it. How do large teams of professionals with different roles, experiences, and responsibilities, manage to synchronize their work and complete complex surgical procedures, with some intermittent use of speech at best?

The answer lies in their ability to read each other’s bodily conduct, and the way they handle implements and manipulate objects. Of course, when newcomers join the action the more experienced professionals are more likely to make explicit some of what would otherwise be taken for granted, so as to calibrate their understandings. And at some critical points in an operation even the experienced surgeon is likely to verbalise what they see and what they’re planning to do. On those occasions speech does become more prominent.

The ELC course, which we set up almost 15 years ago (with Ben Rampton, yourself, Adam Lefstein, Celia Roberts, and Carey Jewitt) covers both types of activity: we work through a job interview, classroom activity, and clinical work, applying the same micrological and ethnographic rigor. It’s an intensive, one-week course that has opened many people’s eyes to the potential of sociolinguistics, multimodality and ethnography. We start with the basics of interactional analysis, then we move on to multimodal analysis, and then we go into trans-contextual analysis, to consider the role of institutions and organisations in social interaction.

The key to new complex modes of interaction

Jan Blommaert: Can I take you back to Gunther Kress’ work? It seems to me that he prefigured a lot of the technology-mediated communication and its peculiarities in the current online-offline nexus we inhabit. The new kinds of literacy required from digital natives, I remember, were things he was deeply fascinated by, and I had frequent discussions with him on these topics while I was at the IoE.

Jeff Bezemer: Gunther was fascinated by the rise of image as a mode of representation and communication across different social domains, both informal and formal. For example, we did a project together on textbooks made for secondary schools in England. We compared Science, Maths and English textbooks from the 1930s, 1980s and 2000s, and found that in that period, image had become a central means of representing the natural and cultural worlds that students are inducted into and of positioning the student-readers in those worlds. It’s not just a matter of there being more images in the contemporary textbook, although that is indeed the case; it’s that the communicative functions served by image have expanded dramatically. It’s no longer used merely to embellish a written text; it frequently carries the main message.

Textbooks show that this is not just a feature of the ‘phatic’ communication we witness on the text messaging apps and the image and video sharing platforms that we’re all familiar with. Even learning in school has become dependent on the ability to design and read image. Remarkably, very few curricula recognize the need to develop these abilities. You’ll have to sign up to a dedicated bachelor’s or master’s degree to learn to design digital media texts, or to reflect critically on and talk about such texts.

Jan Blommaert: So we're witnessing a momentous shift in what it takes to learn?

Jeff Bezemer: Yes, and Gunther was keen to give recognition to the skills that people, not least young people, demonstrate as they build on the distinct possibilities of technologies, media, platforms, genres, and modes to make and read multimodal text. He sought to evidence these skills through semiotic analysis. What that approach presumes is that the analyst has a good understanding of the possibilities of technology, medium, platform, genre, and mode; and that the author's semiotic work can be inferred from their text.

That brings us to a productive tension in our collaboration. As an ethnographer, I put to him that while we can develop strong hypotheses on the basis of a semiotic, multimodal analysis, we need to also observe how text is made, what tools are being used, what can and cannot be done on a given platform, both technically and according to the social norms of the platform and its users, and so on. Insight into these conditions for and processes of making is best obtained through observation and interviewing.

Jan Blommaert: What I take from this is that you see ethnography as a major tool for getting beyond theoretical and methodological anachronisms. I mean: we address new modes of communication very much on the basis of criteria and issues derived from earlier modes of communication. And ethnography enables us to get to the core of things. Is that a correct summary?

Jeff Bezemer: It is indeed. When linguists turn to multimodality, there’s a risk that they describe all modes in terms of language, while overlooking some of the unique possibilities for expression of other modes. What’s exciting about social semiotics is that it tries to identify both the distinct differences and the general principles of meaning making across modes. To illustrate this, Gunther often referred to the different resources of modes for intensification: in speech, we can play with loudness and pitch movement; in writing, we can underline parts or use a larger font in bold; in gesture, we can change the speed or reach of our strokes, and so on. That’s a fairly general list; if we did some ethnographic work on the use of moving image among teenagers, we could establish how they realize intensity with the resources available to them.

There’s another ‘risk’ in multimodality that ethnography can address. Multimodalists think in terms of modes. They have lists of what they assume to be modes, which often become an analytical checklist. Looking at a concrete instance of meaning making, they ask, what modes do we have here, and how are they used? We need ethnography to check that these modal categories resonate with the meaning maker’s experiences, understandings, and practices. Did the sign maker who took a picture on a phone and sent it off to a friend indeed consider the ‘modes’ identified by the analyst, say the gaze of its represented participants? Or was the sign maker motivated by other kinds of considerations? What choices did they make in the process of production, editing and dissemination, and what was decided for them by the camera settings, the represented participants, the platforms, and so on? If we can tackle these questions we have pretty solid grounds for making reliable claims about social interaction, communication and meaning making.