In: Hoffman, Christian R. (ed) (2010) Narrative Revisited. Amsterdam: Benjamins, 245-265.
FILM DISCOURSE COHESION
Richard W. Janney
Prologue
More than three decades ago, the film semiotician Christian Metz remarked that “film is hard
to explain because it is easy to understand” (1974: 69). In saying this, he was alluding to the
mysterious, almost mindless way film images seem to communicate as compared with
sentences of written language. This can be illustrated by comparing Figures 1 and 2 below.
Both represent the same information about relations between A, B, C, and D. Figure 1,
however, represents the information conceptually, in a manner analogous to that of sentences
of language; Figure 2, perceptually, in a manner analogous to that of shots in film sequences.
Figure 1. Conceptual relations. Figure 2. Perceptual relations.
We recognize at once that although the sequences in both figures are cohesive, they are not
equally coherent. That is, it takes more effort to understand relations between the parts in
Figure 1 than in Figure 2. This is confirmed if people are shown the figures separately and
asked to signal as soon as they know ‘which is smallest’. A response to Figure 1 usually takes
at least 20 to 30 seconds, while the response to Figure 2 is spontaneous. Why? And what does
this have to do with film discourse cohesion?
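The difference in effort can be made concrete with a small sketch in Python, which is not part of the original argument. Given only the four relations of Figure 1, the smallest item has to be inferred by finding the item that never appears on the 'larger' side of a relation. Given sizes directly, as in Figure 2, it can simply be read off. The function name `smallest` and the numeric box sizes are assumptions introduced for illustration; the paper gives no measurements.

```python
# Figure 1: each pair (x, y) encodes "x is larger than y".
# The relation (B < C) is rewritten as (C > B).
LARGER_THAN = [("A", "B"), ("C", "D"), ("B", "D"), ("C", "B")]


def smallest(relations):
    """Return the items that never occur on the 'larger' side of a relation."""
    items = {item for pair in relations for item in pair}
    larger_items = {larger for larger, _ in relations}
    return sorted(items - larger_items)


# Figure 2: hypothetical box sizes consistent with the relations above.
# The paper gives no measurements; these numbers are invented.
BOX_SIZES = {"A": 4, "C": 3, "B": 2, "D": 1}

print(smallest(LARGER_THAN))              # ['D'], found by inference
print(min(BOX_SIZES, key=BOX_SIZES.get))  # D, read off directly
```

With these four relations the result happens to be unique. In general the function returns every item that nothing else is known to exceed, which is one way of seeing why the conceptual route takes more work than the perceptual one.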
[Figure 1 content: (A > B) (C > D) (B > D) (B < C). Figure 2 content: four boxes labeled A, C, D, and B.]
1. Cohesion in Film Discourse
This paper addresses the notion of ‘relations between parts’ in film. It is an experiment in
approaching film discourse cohesion from a point of view that might be compatible with
approaches to language discourse in linguistics. Despite obvious differences between
conceptual representation in language and perceptual representation in film, language and
film share sequential characteristics that make it possible in principle to consider describing
principles of organization in shot sequences with concepts adapted from studies of intra- and
intersentential links in language discourse. A bridging concept in this connection is the
concept of cohesion.
Just as speech is more than just spoken writing, film is also more than audio-envisualized
script (film script made audible and visible). A glance at any film script shows what is lost in
terms of cohesion and coherence without visual information. In Figure 3, the column on the
left presents a stretch of dialogue from Jim Jarmusch’s film Ghost Dog (1999); the column on
the right describes the content of the camera shots accompanying the dialogue.
Figure 3. A sequence from Jim Jarmusch’s Ghost Dog (1999).
Script (left column):
Voice: Come in! One, two, three, four, five, six, seven...
Frank: Have some more wine, Louise.
Louise: I don't want any more wine.
Frank: What the fuck? What do you want? You want my Rolex? Whatever the fuck...
Louise: Did my father send you here to do this? It's a good book. Ancient Japan was a
pretty strange place. You can have it. I'm finished with it.
Shots (right column):
A room: Frank watching an animated cartoon on tv, Louise reading a book.
Frank turns to Louise.
Louise drinks more wine and throws her book on the floor.
A killer appears.
The killer shoots Frank, gazes down at the book, picks it up, and looks back to Louise.
Aside from the exchange between Frank and Louise about wine, the script in Figure 3 is neither
cohesive nor coherent. It can be understood only with the help of the visual content on the
right. Thus, the unifying ties making this sequence a filmic text are cinematographic. The visual
images fill out the ellipses in the written script.
Both linguistic and cinematic texts require cohesive ties between their parts. In order to make
sense, the individual units of information, whether stored mentally as concepts or percepts,
have to be related to each other in an interpretable fashion. It is in this sense, Beaugrande (1980:
17) notes, that the stability of a text depends on continuities of occurrence in its participating
systems: texts require sequential connectivity. The fact that sequential connectivity is central
to understanding all forms of discourse makes the forging of cohesive ties no less a
fundamental concern of filmmakers than it is of writers. Already in the 1930s, Sergei
Eisenstein, a pioneer of Soviet silent film, expressed “the need for connected and sequential
exposition of the theme, the material, the plot, the action, the movement within the film
sequence and within the drama as a whole,” and complained that even “the simple matter of
telling a connected story” was becoming lost in the works of some of his contemporaries
(Eisenstein, 1942: 3).
Film depends to a large extent for its perceptual connectivity on the presence of cohesive
visual ties between frames in shots, shots in sequences, and sequences in larger narrative
units. It depends on the filmmaker’s ability to ‘put the pieces together’, as it were, in ways
that make the individual shots in a narrative sequence interpretable on the basis of
interpretations of the shots and sequences preceding and following them. Here, film and
language discourse are similar. Just as writers have lexical and grammatical techniques for
maintaining discourse cohesion, filmmakers have repertoires of shot composing and editing
techniques for maintaining cohesion in film narratives; and in film, as in writing, a
prerequisite for cohesion is the co-presence and co-referentiality of the elements joined
together by these techniques (Halliday and Hasan, 1976: 3). Stretches of film discourse, like
stretches of language, can be characterized in terms of the kinds of cohesive ties that link their
units of meaning together.
2. Units of Film Discourse
In modeling cohesion in film, it is necessary to provide a description of the filmic units linked
together in cohesive discourse, a typology of cohesive relations that can exist between the
filmic units, and a framework of cinematic techniques by which units are combined into
cohesive sequences. Most film theorists since Metz agree that the main structural units of the
dramatic film are the frame, the shot, the sequence, the episode, and the narrative (see Figure
4). If these units are compared with units of language discourse, certain parallels emerge
between cohesive structures in the two modes.
Figure 4. Units of Film Discourse.
2.1 Frame
At the most basic level, films are comprised of frames – the photographic images on the film
strip. Recorded and projected at a rate of 24 frames per second, these create the illusion of
natural movement on the screen. Although frames usually are not perceived individually
(unless frozen for dramatic effect), they have important functions in film discourse. Their
most relevant function is their powerful referentiality, which has interested semioticians ever
since Roland Barthes’ influential remarks on the photograph as ‘the This’ in Camera Lucida
(1981: 5): Barthes regarded reference as the founding order of the photographic image. The
photograph, he said, “is never anything but an antiphon of ‘Look’, ‘See’, ‘Here it is’; it points
a finger at a certain vis-à-vis, and cannot escape this pure deictic language... [The camera]
points at something, and [says]: that, there it is, lo! but says nothing else.” Because of their
important indexical role in pointing to the things in the camera’s field of vision, frames can be
regarded as the smallest meaningful units of film discourse structure.1
Figure 5. Frames refer.
Consider the frame in Figure 5 from Jim Jarmusch’s Ghost Dog (1999). It denotes a man
standing in front of a truck. If we wished to describe the referents of this frame in words, we
would have a large inventory of lexical items to choose from: the man could be described as
large, unsmiling, African-American, wearing a dark hooded sweat-shirt, in a dark duffel coat;
the truck could be described as old, French, an ice-cream truck; the spatial relationship
between the man and truck could be described as in front of, behind, and so forth. But
regardless of which particular items were chosen, the description would consist largely of
attributive adjectival forms of the sort commonly found in noun phrases of English
sentences.2 Frames, it seems, have functions in shots that can be likened to the functions of
noun phrases in sentences of language.
2.2 Shot
Combined into shots, the images in the individual frames are perceived to move. The addition
of movement to the film image increases the amount of information it communicates and
expands its meaning potential.3 Through movement, shots, in addition to referring to their
referents (via frames), seem to make something analogous to visual assertions about them.4
Notice what happens in Figure 6 when the frame in Figure 5 is put together with other frames
in the shot from which it was originally taken.
Figure 6. Shots predicate.
If we were to describe the sequence in words, we might say something like ‘a man standing in
front of a truck walks into the street’. That is, in addition to describing the shot’s referents
(the man, the truck), the description would elaborate on what happens: how the man moves,
where he goes, etc; the scope of expression would expand beyond the subject into a type of
statement about the action performed by the subject. Shots are thus more than simply the
sums of their references (the individual frames); they seem also to have predicational
functions analogous to those of verb phrases in English sentences.
We perhaps also recognize two further things about the shot in Figure 6: first, that the visual
reference to the man is, in a sense, ‘carried along’ within the individual frames into the
predication. Reference is embedded within predication. There is no clear distinction in camera
shots between reference and predication – both occur simultaneously. This obviously
distinguishes shots from sentences of language, where reference and predication are separate
functions delegated to distinct structures in the sequence (NP, VP). Second, note that the
shot’s ‘predicational meaning’ is not conveyed by any individual frame in the sequence but
arises from relations between the frames. I will come back to this point later.
2.3 Sequence
The next larger unit of film discourse structure beyond the shot is the sequence: a longer,
edited segment of film in which individual shots are combined into stretches of narrative
action. When shots are combined into sequences, film begins to take on characteristics of
extended discourse or text. The structuring of smaller units into progressively larger ones is
film’s counterpart to discourse syntax (see Metz, 1974: 177 ff).
In Figure 7, the shot in Figure 6 is seen in the sequence in which it originally appeared.
Viewed in succession, the shots form a cohesive series of filmic predications: (1) Ghost Dog
(the man in front of the truck) walks into the street [medium shot of Ghost Dog]; (2) Louie is
standing in the distance on the street [long shot of Louie]; (3) Ghost Dog calls to Louie,
What is this Louie - High Noon? [long shot of Ghost Dog from Louie’s perspective]; (4)
Ghost Dog says, The final shootout scene? [long shot of Louie from Ghost Dog’s
perspective]; (5) Louie answers from the distance, Yeah, I guess it is [long shot of Louie
continues]; (6) Ghost Dog replies, Yeah, well it’s very dramatic [medium shot of Ghost Dog].
Figure 7. Sequences narrate.
Such a sequence automatically triggers narrative inferences.5 Among the inferences triggered
here are the following: independently of film co-text, Louie in the dialogue in frame 3 refers
anaphorically back to the man in frame 2. At the same time, the dialogue and the gaze of
Ghost Dog in frame 3 refer cataphorically forward to Louie in frame 4. The alternating views
of the respective ends of the street on which the men are standing (shot/reverse shots 2-3), and
the buildings in the background, delineate the spatial boundaries of the event and mark Ghost
Dog’s and Louie’s locations relative to each other. The shot/reverse shots of Ghost Dog and
Louie imply the temporal continuity of their interaction, and so on. Such juxtapositions create
cohesion.
To repeat, film narratives, like language narratives, result from the progressive integration of
smaller discourse units into larger ones. Frames are combined into moving images in shots;
shots are combined into cohesive units of action in sequences; sequences are combined into
narrative units with beginnings, middles, and ends in episodes; and episodes are combined
into the larger narrative structures telling the story.6 When the units are properly combined,
the spectator experiences a smooth progression of audio-visual filmic events.
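The nesting of units described above can be sketched as a simple data structure. The following Python fragment is only an illustration, not a model proposed in the paper: the class names mirror the units in Figure 4, the rate of 24 frames per second comes from Section 2.1, and the frame counts assigned to the two shots are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List

FRAMES_PER_SECOND = 24  # rate stated in Section 2.1


@dataclass
class Shot:
    description: str
    frame_count: int  # hypothetical length; frames are counted, not stored

    def seconds(self) -> float:
        """Running time of the shot at 24 frames per second."""
        return self.frame_count / FRAMES_PER_SECOND


@dataclass
class Sequence:
    shots: List[Shot] = field(default_factory=list)


@dataclass
class Episode:
    sequences: List[Sequence] = field(default_factory=list)


@dataclass
class Narrative:
    episodes: List[Episode] = field(default_factory=list)

    def shots(self) -> List[Shot]:
        """Flatten the hierarchy back into the linear order the viewer sees."""
        return [shot
                for episode in self.episodes
                for sequence in episode.sequences
                for shot in sequence.shots]


# Two shots from the Figure 7 sequence; the frame counts are invented.
street = Sequence([
    Shot("Ghost Dog walks into the street", 96),
    Shot("Louie stands in the distance", 48),
])
film = Narrative([Episode([street])])

print([shot.seconds() for shot in film.shots()])  # [4.0, 2.0]
```

The sketch captures only the containment of smaller units in larger ones. The cohesive ties between units, which are the subject of the next section, are not represented.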
3. Cohesive Relations in Film Discourse
As said earlier, in order for a film to be understandable, the discourse presenting it must be
cohesive. Visual cohesion is a binding force of film meaning. At the technical level, this is
achieved by editing the individual shots into longer sequences that seem to be perceptually
interconnected in space and time in the dramatic action. Barring artistic surprises, what the
spectator sees and hears at any given moment in a film has to be connectable in some fashion
back to what already has happened in the narrative, and forward to what will take place later.
Film coherence depends to a considerable extent on the construction of cohesive relations
between shots in sequences, sequences in episodes, and episodes in larger narrative structures.
In the early days of silent film, Hollywood filmmaker D.W. Griffith invented the continuity
system to solve this problem. Still relied on today, the continuity system is a repertoire of
editing techniques designed to produce coherent, cohesive successions of shots presenting a
dramatic narrative. Its purpose is to guarantee the smooth unfolding of envisualizations of
events in space, time, and action on the screen (cf. Bordwell and Thompson, 2008: 231 ff).
Continuity editing follows principles that are surprisingly similar to principles of cohesion in
language. Linguistics provides categories for describing these principles. Five classes of
cohesive ties are usually said to operate across sentence boundaries in language: reference,
substitution, ellipsis, conjunction, and lexical cohesion, which comprises reiteration and
collocation (cf. Halliday and Hasan, 1976).
Analogies to most of these are found in film.
3.1 Reference
In studies of discourse cohesion, reference is regarded as an endophoric relation between
words in texts. Continuity of reference keeps track of the identities of topics (Halliday and
Hasan, 1976: 31). In the utterance sequence John is here – He came five minutes ago, for
example, the pronoun he in the second utterance refers anaphorically back to the same
referent as the one referred to by the proper noun John in the first, creating a relation of
identity between the two referents (Lyons, 1977: 660). Reference cohesion occurs whenever
one item in a stretch of discourse points anaphorically or cataphorically to another for its
interpretation. Forms often associated with reference cohesion are personal pronouns (I, you,
her, etc), demonstratives (this, that, here, there, etc), and comparatives (similarly,
differently, more, less, larger, smaller, etc) (Halliday and Hasan, 1976: 37). These latter two
types of reference – demonstrative reference and comparative reference – play a central role
in film discourse cohesion.
Demonstrative Reference
Nominal and adverbial demonstrative forms in language locate referents on scales of spatial
nearness or distance relative to the deictic locus of discourse; they distinguish these referents
here in the text from those there. Although film lacks such forms, it compensates by signaling
proximal and distal spatial relations visually via actors’ gazes and pointing gestures, and by
near/far camera positions and the relative sizes of images in shots (close ups vs. long shots), and
so forth.
The sequence in Figure 9 from Fred Zinnemann’s High Noon (1952) is an example of how
this works. Here, spatial cohesion is realized via a combination of gaze direction (eyeline
match), camera distance (proximal/distal juxtapositions), and shifts of camera position
between the points of view of the protagonists (shot/reverse shot juxtapositions).
Figure 9. Demonstrative reference (gaze) in Zinnemann’s High Noon (1952).
After walking out of his office onto the street, the Sheriff gazes to his left (close up 1) and
sees two women approaching from the distance in a wagon (extreme long shot 2). As the
wagon passes by, the women (low angle, medium close up 3) and Sheriff (reverse shot, close
up 4) exchange glances. As the women leave town, one of them gazes back over her right
shoulder toward the Sheriff (long shot 5) and watches him recede into the distance, standing
alone in the empty street (reverse shot, extreme long shot 6). The spatial coordinates of this
sequence are plotted along an axis of action running diagonally from one end of
the street (2) to the other (6), past the Sheriff, who is the focal point of the action in the
middle (4). All shots in the sequence are made from the same side of the imaginary line traced
by the wagon’s movement along this axis.
Nearly throughout the sequence, the camera alternates between the Sheriff’s and the women’s
points of view (shot/reverse shots 3-4, 4-5, 5-6). Notice that although this technique could
conceivably cause deictic confusion (the deictic zero-point of the narrating camera eye
changes places), we tend as spectators not to find the dislocations deictically confusing.
Notice also that through this technique, in fact, a subtle form of cohesion develops between
cataphoric and anaphoric visual references in the sequence. In shot 1, our attention is directed
by the Sheriff’s gaze cataphorically forward in the sequence to the women. In shot 5, it is
directed by the woman’s gaze anaphorically backward to the Sheriff. Together, the cataphoric
and anaphoric visual references serve here to bracket the central event of the sequence – the
gazes between the Sheriff and the women in the middle. We also see in this sequence first a
spatial convergence, then a divergence, of the characters’ visual standpoints along the axis of
action (distal, proximal, distal) that moves smoothly from left to right across the screen
(match on action). By such means, filmmakers maintain cohesive spatial relations in
sequences despite changing camera positions and perspectives.7
Comparative Reference
Certain classes of adjectives and adverbs in language signal comparative relations of identity,
similarity, or difference – items like same, similar, different (general comparisons), better,
worse, larger, smaller, etc (particular comparisons) (Halliday and Hasan, 1976: 76 ff).
Likeness and difference are thus referential properties insofar as things are ‘alike’ or
‘different’ only in relation to other things – comparison requires a tertium comparationis.
Although film lacks lexical means of expressing comparative concepts, it has means of
envisualizing likenesses and differences as percepts (see Section 4). For example,
comparative reference is used as a frame compositional technique to emphasize similarities
throughout Jean-Luc Godard’s Breathless (1960) (see Figure 10), and it figures prominently in
shot/reverse shot juxtapositions emphasizing differences in Fritz Lang’s M (1931) in Figure 11.
Figure 10. Comparative reference (similarity) in Godard’s Breathless (1960).
Figure 11. Comparative reference (difference) in Lang’s M (1931).
3.2 Ellipsis
Ellipsis is a cohesive relation by which something left unexpressed is understood nonetheless,
as in the sentence sequence They have three daughters. Julie is the cleverest (Halliday and
Hasan, 1976: 142). Robert de Beaugrande (1980: 133) defines ellipsis as “the omission of ...
expressions whose conceptual content is nonetheless carried forward and expanded or
modified by means of noticeably incomplete expressions.” Although film lacks direct
equivalents of lexis or grammar, it nevertheless manages to establish cohesion through ellipsis
in the broad sense above. A common example is the shortening of movement sequences,
where a character is shown entering a room, taking a few steps, and suddenly standing on the
other side of the room without literally having been shown getting there. Showing the
beginnings, middles, and ends of actions is a standard continuity editing technique (match on
action) for eliminating parts of shots that do not contribute to the dramatic development of the
story. In the sequence from Alfred Hitchcock’s North by Northwest (1959) in Figure 12, for
example, parts of the predicational structures of the two constituent master shots – a man
watching from the window (frames 1, 3, 5) as a second man enters the room and crosses to the
other side (frames 2, 4, 6) – are cut out, creating breaks in the flow of action in both shots.
The missing frames are replaced by nothing (cf. Halliday and Hasan, 1976: 88). Edited
together in the sequence, however, the shots create the impression of a single event. The
perceptual content of the two shots is carried forward despite the absence of its direct visual
expression.
Figure 12. Ellipsis in Hitchcock’s North by Northwest (1959).
A famous ellipsis occurs at the end of Hitchcock’s North by Northwest, in a sequence that
begins with the hero reaching out his hand to his colleague (1) as she clings to the side of a
cliff on Mt. Rushmore (see Figure 13). As he pulls her up (2), the setting unexpectedly shifts
to a sleeping car in a train traveling back to New York (3). In a single, unbroken movement,
he appears to lift the woman from the face of the mountain up into bed – and from the
dialogue, we learn that during the ellipsis they have married.
Figure 13. Ellipsis in Hitchcock’s North by Northwest (1959).
3.3 Junction
Conjunctive cohesion, or simply ‘junction’, as Beaugrande and Dressler (1981) call it, is one
of the main forms of cohesion in language. Junction involves the use of markers like and, yet,
so, nevertheless, because, etc to signal conceptual relations between clauses, sentences, and
paragraphs (Halliday and Hasan, 1976). Lexical items fulfilling junctive functions do not refer
directly to other items in the text but rather mark logical, junctive relations between items.
Beaugrande and Dressler (1981) distinguish four types of junctive relations: conjunction
(‘and’ relations), subordination (‘because’ relations), disjunction (‘or’ relations), and
contrajunction (‘but’ relations). Although film does not have logical forms expressing
concepts of additivity, causality, adversativity, or alternativity, it does manage to
communicate such relations perceptually. Many analogies to linguistic junctive relations are
found in shot sequences.
Conjunction (additive - ‘and’ relations)
Conjunction is an additive relation signaled linguistically by conjuncts like and, also,
furthermore, additionally, besides, etc (Beaugrande and Dressler, 1981). In film, additive
relations are often inferred from shot juxtapositions. As said earlier, given the sequential
nature of film, shots inevitably appear to follow each other additively even if they have
nothing in common other than their simple contiguity on the film strip. Rudolf Arnheim
(1957: 21) claimed that “the succession of separate events implies a corresponding sequence
of time.” Conjunctive linkage thus almost has the status of a default perceptual expectation in
film discourse.
Continuity editing makes use of this assumption in various ways. As pointed out earlier, the
match on action technique is often used to represent continuations of events: something
begins happening in one shot and we see its continuation or completion in the next. In Orson
Welles’ Citizen Kane (1941), for example, an object falls from Kane’s hand and rolls to the
top of a stairway in one shot, then tumbles down the stairs in a second shot, and bursts in a
third (see Figure 14).
Figure 14. A conjunctive relation in Welles’ Citizen Kane (1941).
The shot/reverse shot technique is often used together with the match on action technique to
film actions from different, juxtaposed points of view. In Howard Hawks’ The Big Sleep
(1946), for example, a butler opens the door and greets an arriving guest in one shot, and the
guest is shown entering the room and returning the greeting in the next (see Figure 15).
Figure 15. A conjunctive relation in Hawks’ The Big Sleep (1946).
Disjunction (alternative - ‘or’ relations)
Disjunction is an alternative relation established via or (sometimes either/or or whether/or)
between two or more concepts in a text where only one can pertain in the textual world
(Beaugrande and Dressler, 1981: 71 ff). It often expresses uncertainty, alternatives, or
afterthoughts. In film, disjunctive relations exist when only one of two or more alternative
images can pertain in the sequence. Disjunction is often used to suggest dream states, hallucinations, or
states of confusion in which characters are not sure of what they perceive, remember, wish,
etc. Figure 16, from Charles Chaplin’s The Circus (1928), portrays a tramp’s dilemma as he
sits, watching his girlfriend speaking with a handsome new tightrope star.8 He wants to stand
up, go over to the circus star, and kick him in the behind; but instead he sits, suffers, and does
nothing. In the sequence, both alternatives are depicted simultaneously via a double exposure.
Figure 16. A disjunctive relation in Chaplin’s The Circus (1928).
There is a sequence in Roman Polanski’s The Tenant (1976) in which a psychotic man living
alone in a tenement building experiences a hallucination (see Figure 17). He goes to the
window, looks out toward a window across from his apartment, and sees himself standing in
it looking back at him. In the final shot he is ‘there’ as well as ‘here’.
Figure 17. A disjunctive relation in Polanski’s The Tenant (1976).
Contrajunction (adversative - ‘but’ relations)
Contrajunction is an adversative relation between causes and unanticipated effects. The
function of contrajunctive markers – but, yet, nevertheless, etc – is to smooth over transitions
at points where seemingly improbable combinations of events or situations occur
(Beaugrande and Dressler, 1981: 71-73). In film, contrajunctive relations are often
represented by shot combinations that seem unexpected, incongruous, or incompatible within
the sequence. In Jim Jarmusch’s Ghost Dog (1999) in Figure 18, for example, a killer is about
to shoot his victim when the screen goes black for a few seconds; the killer lowers his rifle to
discover that a small bird has landed on the muzzle, blocking his vision.
Figure 18. A contrajunctive relation in Jarmusch’s Ghost Dog (1999).
Subordination (causal - ‘because’ relations)
Subordination is a causal relation that exists whenever the interpretation of one clause or
sentence in a stretch of discourse depends on the interpretation of another (Beaugrande and
Dressler, 1981: 71-74). Subordinating junctives – because, since, as, thus, therefore, etc –
express preconditions, causes, or reasons for items being joined together in the text. In film,
causal relations occur frequently between shots. Often, the match on action technique is used
to represent them, as in the scene of the massacre of civilians in Odessa by Cossack soldiers
during the Russian Revolution in Sergei Eisenstein’s Battleship Potemkin (1925) in Figure 19.
Here, the Cossacks fire in one shot, and a woman is hit in the next.
Figure 19. A causal relation in Eisenstein’s Battleship Potemkin (1925).
3.4 Reiteration
Reiteration is a form of cohesion in which lexical items are said to enter into cohesive
relations by anaphorically or cataphorically calling attention to each other’s senses (Halliday
and Hasan, 1976; Beaugrande and Dressler, 1981). “The repeated elements,” according to
Beaugrande (1980: 133), “may have the same, different, or overlapping reference, and the
extent of conceptual content they can be used to activate varies accordingly.” Reiterative
patterns create metonymic, metaphorical, synonymic, antonymic, and other ties (Halliday and
Hasan, 1976: 278-279). Film establishes reiterative cohesion in a number of ways.
Metonymy
Metonymy is a part-for-whole relation in which reference to a part of something stands for the
whole. A famous instance of filmic metonymy occurs at the end of the ‘Dawn of Man’
sequence in Stanley Kubrick’s 2001: A Space Odyssey (1968) (see Figure 20), where a
prehistoric ape-man euphorically throws a bone into the air after realizing that it can be used
as a weapon. At the apex of its arc, the bone turns into a space craft. In this historic match cut,
both the bone and the space craft function as metonymies for the concept of tools.9
Figure 20. Metonymy in Kubrick’s 2001: A Space Odyssey (1968).
Metaphor
Metaphor is a comparative relation in which references to two or more objects combine to
signify some shared overriding concept, characteristic, or quality. Charles Chaplin’s Modern
Times (1936), for example, begins with a metaphorical juxtaposition of shots of a sheep herd
and of factory workers coming out of a subway to go to work (see Figure 21).
Figure 21. Metaphor in Chaplin’s Modern Times (1936).
Synonymy
Synonymy in language is a semantic relation between lexical items referring to the same
object, as in woman, female, lady. In film, it is created by montage sequences combining
different images with the same significance. Lang’s M (1931), for example, begins with a
scene in which a child has failed to come home to lunch after school. As her distraught
mother calls her name from the window, we see a progression of shots of ‘emptiness’ that
operate metonymically as synonyms for the child’s absence: she is not in the stairway, not in
the attic, and not at the table (see Figure 22).
Figure 22. Synonymy in Lang’s M (1931).
Antonymy
Antonymy in language is a semantic relation between lexical items referring to objects with
opposite meanings. In film, visual analogies to antonyms are created by montage sequences
combining images of opposites, as in Figure 23, where an image of a rich woman on the
Odessa stairway in Eisenstein’s Battleship Potemkin (1925) is contrasted with the image of a
legless beggar.
Figure 23. Antonymy in Eisenstein’s Battleship Potemkin (1925).
3.5 Collocation
Collocation in linguistics refers to ties between lexical items that stand to each other in some
recognizable, significant relation without being directly connected in a semantic sense. As
with synonyms, a collocational relation exists without explicit reference to the other item. The
relation is based on associations involving knowledge of the subject fields referred to by the
other items in the collocation. Knowledge of collocational relations in film is built up
co-textually through repetitions of activities, images, or motifs in connection with each other. In
Jarmusch’s Ghost Dog (1999), for example, we find patterns of reference in which the ‘bad’
characters in the narrative - the gangsters - are systematically collocated with television
cartoons, as in Figure 24, while the ‘good’ characters - Ghost Dog and his friends - are
collocated with books, as in Figure 25.
Figure 24. Cartoon watchers in Jarmusch’s Ghost Dog (1999).
Figure 25. Book readers in Jarmusch’s Ghost Dog (1999).
4. Film Discourse Cohesion Revisited
The purpose of the preceding section has not been to provide a full account of cohesive
relations in film but to illustrate some surprising similarities between cohesive relations in
language and film texts. It is instructive, I think, simply to recognize that such
correspondences exist despite the obvious formal differences between the two types of
discourse and their respective ways of generating coherent text. The ease with which
categories used to describe cohesive relations in language can be applied to describing film
cohesion is in itself intriguing; but pointing this out alone does not bring us further in
unraveling the implications of Christian Metz’s paradoxical observation, quoted at the
beginning of the paper, that “film is hard to explain because it is easy to understand” (1974:
69). Indeed, perceptions of cohesive relations in film often seem easier to understand than to
explain.
Thinking about the shot juxtapositions discussed in Section 3, we are almost tempted to ask
what there is to ‘understand’ in many of them in the first place. The cohesive relations seem
so simple and self-evident that they require no thought at all. We simply see them. For
example, reconsider the frames in Figure 26:
Figure 26. A small man looks up; a large man looks down.
With little cognitive effort, we see two men – one small, one large – looking at each other.
This perceptual judgment is spontaneous, like the judgment of the sizes of the boxes in Figure
2 at the beginning of the paper. And, indeed, it is necessary, for without it, we could not go on
to the next step and cognitively categorize the relation between the two frames as
‘comparative’. But if we look at the images themselves, we see that there is nothing in them
individually that in any way marks, points to, or operates as a cue to, the comparative relation
we have just perceived and categorized. The perceived relation is clear without the help of
explicit markers. Why?
Paul Watzlawick et al. (1967: 61 ff) would say that the answer lies in the analogical nature of
film images as opposed to the digital nature of language: words and visual images
communicate by different means and each has strengths and weaknesses in communicating
different types of information. Language reduces information to discrete, binary, digital
(either/or), conceptual units, which are combined additively according to the logical syntax of
language to yield ‘sums’ or ‘products’ of the operations performed on the individual units in
the sequence (cf Janney and Arndt, 1994: 444). Film, on the other hand, represents
information holistically, without reducing it to discrete conceptual units, and it has nothing
comparable to the logical syntax and semantics of language. Its messages are relational,
figurative, and analogical (gradient, matters of degree). As a result, according to Watzlawick
et al. (1967), while abstract concepts and logical propositions are better ‘said’, relational
orientations and affective states are better ‘shown’: “Digital language has a highly complex
and powerful logical syntax but lacks adequate semantics in the field of relations, while
analogic language possesses the [relational] semantics but has no adequate syntax for ...
unambiguous definition” (1967: 66-67). Indeed, returning to Figure 26, we can see the
relational impoverishment of ‘a small man looks up; a large man looks down’ compared with
the wealth of relational information communicated visually by the two images.
Considered in this light, perhaps the ease of understanding film is not quite as difficult to
explain as Metz suggested (1974: 69). The problem may simply be a consequence of Metz’s
presupposition that film understanding requires a linguistic explanation. Other solutions are
suggested in cognitive psychology (cf Hochberg, 1998). There is the possibility, for example,
that cohesion in film discourse is not primarily a conceptual phenomenon at all but rather
originally a perceptual one. Cognitive psychologists have argued for decades about the proper
relations between perception and cognition, but most agree that perception precedes
cognition. That is, prior to becoming ‘objects of thought’, perceptual experiences are first
stored in the mind in schemata together with other experiences following similar patterns.
These primary perceptual schemata – percepts – are precognitive mental representations of
family resemblances between perceptual experiences, and in a sense they are the stuff of
thought. The categorization and cross-networking of perceptual schemata into more complex
schemata is a task of higher cognition. “To cognize,” Harnad (2005: 20) claims, “is to
categorize.” In other words, concepts are categorizations of percepts.
Perhaps we need no concepts whatsoever to perceive the differences in size between the two
boxes in Figure 27. After all, the percept ‘A is larger than B’ precedes both the
conceptualization of this percept as ‘A > B’ and the conclusion that we are dealing here with
an instance of ‘comparative reference’. These are present in the percept of the sequential
relation between the two images prior to its cognitive categorization.
Figure 27. ‘Larger than’ as concept and percept. [Diagram: comparative reference (‘larger than’) shown as a concept, A > B, and as a percept, two boxes A and B.]
It seems that the mental processes underlying concepts of cohesive relations are somehow
different from those underlying percepts of them. If we return to the informal test in Figures 1
and 2 at the beginning of the paper, we see now that the odds against interpreters of Figure 1
were greater than those against interpreters of Figure 2. Determining ‘which is smallest’
required more cognitive effort in Figure 1 than in Figure 2. While Figure 2 required only a
perceptual judgment, Figure 1 required interpreters first to translate the conceptualized
linguistic information in the figure into relational visual terms, and then back again into
conceptual terms, before they could answer. Translation of the digital into the analogical –
and vice versa – always involves a loss of information (cf Watzlawick et al., 1967: 66). We
may conclude from this that while linguistics provides valuable means of describing cohesive
relations in film discourse, we may need to look elsewhere – perhaps to cognitive film science
– for viable ways of explaining how we recognize these relations when we see them.
References
Arnheim, Rudolf (1957) Film as art. Berkeley: University of California Press.
Barthes, Roland (1981/2000) Camera lucida. London: Vintage.
Beaugrande, Robert de (1980) Text, discourse, and process: Toward a multidisciplinary
science of texts. London: Longman.
Beaugrande, Robert de and Dressler, Wolfgang (1981) Introduction to text linguistics.
London: Longman.
Bordwell, David and Thompson, Kristin (2008) Film art: An introduction. New York:
McGraw-Hill.
Coulmas, Florian (1989) The writing systems of the world. London: Blackwell.
Deleuze, Gilles (1986) Cinema I. The movement image. H. Tomlinson and B. Habberjam
(trans). Minneapolis: University of Minnesota Press.
Eisenstein, Sergei M. (1942) The film sense. J. Leyda (trans). New York: Harcourt, Brace &
Company.
Halliday, M.A.K. and Hasan, Ruqaiya (1976) Cohesion in English. London: Longman.
Harnad, Steven (2005) “To cognize is to categorize: Cognition is categorization.” In Cohen,
H. and Lefebvre, C. (eds) Handbook of categorization in cognitive science. Amsterdam:
Elsevier: 20-45.
Hochberg, Julian (ed) (1998) Perception and cognition at century’s end: History,
philosophy, theory. London: Academic Press.
Janney, Richard W. and Arndt, Horst (1994) “Can a picture tell a thousand words?
Interpreting sequential vs. holistic graphic messages.” In Nöth, W. (ed) Origins of semiosis:
Sign evolution in nature and culture. Berlin: Mouton de Gruyter: 439-453.
Lyons, John (1977) Semantics. Vol 2. Cambridge: Cambridge University Press.
Metz, Christian (1974/1991) Film language. A semiotics of cinema. M. Taylor (trans).
Chicago: University of Chicago Press.
Monaco, James (2000) How to read a film. 3rd edition. Oxford: Oxford University Press.
Peters, Jan M. (1981) Pictorial signs and the language of film. Amsterdam: Rodopi.
Watzlawick, Paul, Beavin, Janet H. and Jackson, Don D. (1967) Pragmatics of human
communication. New York: Norton.
Endnotes
1 Here, however, it should be noted that the camera’s scope of reference is generally much
broader than the semantic scopes of indexical expressions in language. Hence the adage ‘a
picture tells a thousand words’. Frames can be full of information or relatively empty.
Regardless of this, according to Deleuze (1986: 12), “the frame teaches us that the image is
not just given to be seen. It is legible as well as visible.” That is, it is not only ‘viewable’ but
also ‘readable’.
2 For an interesting discussion of visual analogies to attributive and predicative adjuncts in
film, see Peters (1981: 29-37).
3 Strictly speaking, according to Deleuze (1986: 2), “cinema does not give us an image to
which movement is added, it ... gives us a movement-image.”
4 Many film theorists have suggested this analogy in the past. Peters (1981: 29-30), for
example, claims that a shot “expresses a ‘visual thought’, a percept, a perceptual judgment,
which may function as the predicate of a proposition.” Metz (1974: 66) states that “at a
certain point in the division into units, the shot, a ‘complete assertive statement’, as
Benveniste would call it, is equivalent to an oral sentence.” And according to Monaco (2000:
160), “we could say that a film shot is something like a sentence, since it makes a statement
and is sufficient in itself...”
5 “A film audience,” Eisenstein (1942: 7) says, “draws a definite inference from the
juxtaposition of two strips of film cemented together.”
6 Some film theorists, for example Metz (1974), distinguish a further structural unit between
the shot and the sequence called the ‘scene’ (a segment of narrative action taking place at one
time in one space), but this unit, which is derived from theatrical tradition, is sometimes
analytically problematical in modern films and will not be used.
7 It is interesting to note here that two types of reference can be imagined in film discourse:
first, direct or first hand reference made by the camera as an autonomous ‘eye’ viewing the
action; second, indirect or second hand reference made by the camera as a type of prosthetic
extension of a character’s-eye view of the action.
8 The entire sequence is not shown here. In it, we first see the tramp sitting dejectedly alone,
then the girlfriend and tightrope walker, then the tramp emerging from his body and kicking
the tightrope walker. He then returns to his body, and in the final shot he is shown once again
sitting dejectedly alone.
9 Together, they portray the evolution of tools from the bone age to the space age. This is one
of the longest ellipses in film history – a gap of approximately 2 million years.