Gunther Kress and Theo van Leeuwen describe the concept of multimodality. They challenge their readers to consider the varied forms of meaning making that extend beyond language and enhance the semiotic process.
For some time now, there has been, in Western culture, a distinct preference for monomodality. The most highly valued genres of writing (literary novels, academic treatises, official documents and reports, etc.) came entirely without illustration, and had graphically uniform, dense pages of print. Paintings nearly all used the same support (canvas) and the same medium (oils), whatever their style or subject. In concert performances all musicians dressed identically and only conductor and soloists were allowed a modicum of bodily expression. The specialised theoretical and critical disciplines which developed to speak of these arts became equally monomodal: one language to speak about language (linguistics), another to speak about art (art history), yet another to speak about music (musicology), and so on, each with its own methods, its own assumptions, its own technical vocabulary, its own strengths and its own blind spots.
More recently this dominance of monomodality has begun to reverse. Not only the mass media, the pages of magazines and comic strips for example, but also the documents produced by corporations, universities, government departments etc., have acquired colour illustrations and sophisticated layout and typography. And not only the cinema and the semiotically exuberant performances and videos of popular music, but also the avant-gardes of the ‘high culture’ arts have begun to use an increasing variety of materials and to cross the boundaries between the various art, design and performance disciplines, towards multimodal Gesamtkunstwerke, multimedia events, and so on.
The desire for crossing boundaries inspired twentieth-century semiotics. The main schools of semiotics all sought to develop a theoretical framework applicable to all semiotic modes, from folk costume to poetry, from traffic signs to classical music, from fashion to the theatre. Yet there was also a paradox. In our own work on visual semiotics (Reading Images, 1996), we, too, were in a sense ‘specialists’ of the image, still standing with one foot in the world of monomodal disciplines. But at the same time we aimed at a common terminology for all semiotic modes, and stressed that, within a given social-cultural domain, the ‘same’ meanings can often be expressed in different semiotic modes.
… [Now] we explore the common principles behind multimodal communication. We move away from the idea that the different modes in multimodal texts have strictly bounded and framed specialist tasks, as in a film where images may provide the action, sync sounds a sense of realism, music a layer of emotion, and so on, with the editing process supplying the ‘integration code’, the means for synchronising the elements through a common rhythm. Instead we move towards a view of multimodality in which common semiotic principles operate in and across different modes, and in which it is therefore quite possible for music to encode action, or images to encode emotion. This move comes, on our part, not because we think we had it all wrong before and have now suddenly seen the light. It is because we want to create a theory of semiotics appropriate to contemporary semiotic practice. In the past, and in many contexts still today, multimodal texts (such as films or newspapers) were organised as hierarchies of specialist modes integrated by an editing process. Moreover, they were produced in this way, with different, hierarchically organised specialists in charge of the different modes, and an editing process bringing their work together.
Today, however, in the age of digitisation, the different modes have technically become the same at some level of representation, and they can be operated by one multi-skilled person, using one interface, one mode of physical manipulation, so that he or she can ask, at every point: ‘Shall I express this with sound or music?’, ‘Shall I say this visually or verbally?’, and so on. Our approach takes its point of departure from this new development, and seeks to provide the element that has so far been missing from the equation: the semiotic rather than the technical element, the question of how this technical possibility can be made to work semiotically, of how we might have, not only a unified and unifying technology, but also a unified and unifying semiotics.
Let us give one specific example. In Reading Images (1996) we discussed ‘framing’ as specific to visual communication. By ‘framing’ we meant, in that context, the way elements of a visual composition may be disconnected, marked off from each other, for instance by framelines, pictorial framing devices (boundaries formed by the edge of a building, a tree, etc.), empty space between elements, discontinuities of colour, and so on. The concept also included the ways in which elements of a composition may be connected to each other, through the absence of disconnection devices, through vectors, and through continuities and similarities of colour, visual shape and so on. The significance is that disconnected elements will be read as, in some sense, separate and independent, perhaps even as contrasting units of meaning, whereas connected elements will be read as belonging together in some sense, as continuous or complementary. …
In an era when monomodality was an unquestioned assumption (or rather, when there simply was no such question, because it could not yet arise), all the issues clustering around the idea of design—a deliberateness about choosing the modes for representation, and the framing for that representation—were not only not in the foreground, they were not even about. Language was (seen as) the central and only full means for representation and communication, and the resources of language were available for such representation. Where now we might ask, ‘Do you mean language as speech or as writing?’, there was then simply ‘language’. Of course there was attention to ‘style’, to the manner in which the resources of ‘language’ were to be used on particular occasions. And of course there were other modes of representation, though they were usually seen as ancillary to the central mode of communication and also dealt with in a monomodal fashion. Music was the domain of the composer; photography was the domain of the photographer, etc. Even though a multiplicity of modes of representation were recognised, in each instance representation was treated as monomodal: discrete, bounded, autonomous, with its own practices, traditions, professions, habits. By contrast, in an age where the multiplicity of semiotic resources is in focus, where multimodality is moving into the centre of practical communicative action.