Film discourse cohesion

In: Hoffman, Christian R. (ed) (2010) Narrative Revisited. Amsterdam: Benjamins, 245-265.

Richard W. Janney

Prologue

More than three decades ago, the film semiotician Christian Metz remarked that “film is hard

to explain because it is easy to understand” (1974: 69). In saying this, he was alluding to the

mysterious, almost mindless way film images seem to communicate as compared with

sentences of written language. This can be illustrated by comparing Figures 1 and 2 below.

Both represent the same information about relations between A, B, C, and D. Figure 1,

however, represents the information conceptually, in a manner analogous to that of sentences

of language; Figure 2, perceptually, in a manner analogous to that of shots in film sequences.

Figure 1. Conceptual relations: (A > B) (C > D) (B > D) (B < C). Figure 2. Perceptual relations (boxes A, B, C, and D drawn at different sizes).

We recognize at once that although the sequences in both figures are cohesive, they are not

equally coherent. That is, it takes more effort to understand relations between the parts in

Figure 1 than in Figure 2. This is confirmed if people are shown the figures separately and

asked to signal as soon as they know ‘which is smallest’. A response to Figure 1 usually takes

at least 20 to 30 seconds, while the response to Figure 2 is spontaneous. Why? And what does

this have to do with film discourse cohesion?
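
The extra effort demanded by Figure 1 can be made concrete with a minimal sketch, offered here only as an illustration and not as part of the original demonstration. Assuming the four relations of Figure 1 are (A > B), (C > D), (B > D), and (B < C), the answer to ‘which is smallest’ is not stated by any single relation: that A is larger than D has to be inferred by chaining two of them. The function names below are hypothetical.

```python
from itertools import product

# Pairwise relations assumed for Figure 1, stored as (larger, smaller).
# (B < C) is rewritten as (C > B).
LARGER_THAN = {("A", "B"), ("C", "D"), ("B", "D"), ("C", "B")}


def transitive_closure(pairs):
    """Return all (larger, smaller) pairs implied by chaining the given ones."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def smallest(pairs):
    """Return the item that every other item is larger than, or None."""
    items = {x for pair in pairs for x in pair}
    closure = transitive_closure(pairs)
    for candidate in sorted(items):
        if all((other, candidate) in closure for other in items - {candidate}):
            return candidate
    return None


print(smallest(LARGER_THAN))  # D
```

The pair (A, D) appears in none of the four stated relations; it exists only after inference. This may be one way of picturing why the conceptual version takes longer to process than the perceptual one, in which the smallest box is simply seen.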

1. Cohesion in Film Discourse

This paper addresses the notion of ‘relations between parts’ in film. It is an experiment in

approaching film discourse cohesion from a point of view that might be compatible with

approaches to language discourse in linguistics. Despite obvious differences between

conceptual representation in language and perceptual representation in film, language and

film share sequential characteristics that make it possible in principle to consider describing

principles of organization in shot sequences with concepts adapted from studies of intra- and

intersentential links in language discourse. A bridging concept in this connection is the

concept of cohesion.

Just as speech is more than just spoken writing, film is also more than audio-envisualized

script (film script made audible and visible). A glance at any film script shows what is lost in

terms of cohesion and coherence without visual information. In Figure 3, the column on the

left presents a stretch of dialogue from Jim Jarmusch’s film Ghost Dog (1999); the column on

the right describes the content of the camera shots accompanying the dialogue.

Figure 3. A sequence from Jim Jarmusch’s Ghost Dog (1999).

Script (left column)

Voice: Come in! One, two, three, four, five, six, seven...
Frank: Have some more wine, Louise.
Louise: I don't want any more wine.
Frank: What the fuck? What do you want? You want my Rolex? Whatever the fuck...
Louise: Did my father send you here to do this? It's a good book. Ancient Japan was a pretty strange place. You can have it. I'm finished with it.

Shots (right column)

A room: Frank watching an animated cartoon on tv, Louise reading a book.
Frank turns to Louise.
Louise drinks more wine and throws her book on the floor. A killer appears.
The killer shoots Frank, gazes down at the book, picks it up, and looks back to Louise.

Aside from the exchange between Frank and Louise about wine, the script in Figure 3 is neither

cohesive nor coherent. It can be understood only with the help of the visual content on the

right. Thus, the unifying ties making this sequence a filmic text are cinematographic. The visual

images fill out the ellipses in the written script.

Both linguistic and cinematic texts require cohesive ties between their parts. In order to make

sense, the individual units of information, whether stored mentally as concepts or percepts,

have to be related to each other in an interpretable fashion. It is in this sense, Beaugrande (1980:

17) notes, that the stability of a text depends on continuities of occurrence in its participating

systems: texts require sequential connectivity. The fact that sequential connectivity is central

to understanding all forms of discourse makes the forging of cohesive ties no less a

fundamental concern of filmmakers than it is of writers. Already in the 1930s, Sergei

Eisenstein, a pioneer of Soviet silent film, expressed “the need for connected and sequential

exposition of the theme, the material, the plot, the action, the movement within the film

sequence and within the drama as a whole,” and complained that even “the simple matter of

telling a connected story” was becoming lost in the works of some of his contemporaries

(Eisenstein, 1942: 3).

Film depends to a large extent for its perceptual connectivity on the presence of cohesive

visual ties between frames in shots, shots in sequences, and sequences in larger narrative

units. It depends on the filmmaker’s ability to ‘put the pieces together’, as it were, in ways

that make the individual shots in a narrative sequence interpretable on the basis of

interpretations of the shots and sequences preceding and following them. Here, film and

language discourse are similar. Just as writers have lexical and grammatical techniques for

maintaining discourse cohesion, filmmakers have repertoires of shot composing and editing

techniques for maintaining cohesion in film narratives; and in film, as in writing, a

prerequisite for cohesion is the co-presence and co-referentiality of the elements joined

together by these techniques (Halliday and Hasan, 1976: 3). Stretches of film discourse, like

stretches of language, can be characterized in terms of the kinds of cohesive ties that link their

units of meaning together.

2. Units of Film Discourse

In modeling cohesion in film, it is necessary to provide a description of the filmic units linked

together in cohesive discourse, a typology of cohesive relations that can exist between the

filmic units, and a framework of cinematic techniques by which units are combined into

cohesive sequences. Most film theorists since Metz agree that the main structural units of the

dramatic film are the frame, the shot, the sequence, the episode, and the narrative (see Figure

4). If these units are compared with units of language discourse, certain parallels emerge

between cohesive structures in the two modes.

Figure 4. Units of Film Discourse.

2.1 Frame

At the most basic level, films are comprised of frames – the photographic images on the film

strip. Recorded and projected at a rate of 24 frames per second, these create the illusion of

natural movement on the screen. Although frames usually are not perceived individually

(unless frozen for dramatic effect), they have important functions in film discourse. Their

most relevant function is their powerful referentiality, which has interested semioticians ever

since Roland Barthes’ influential remarks on the photograph as ‘the This’ in Camera Lucida

(1981: 5): Barthes regarded reference as the founding order of the photographic image. The

photograph, he said, “is never anything but an antiphon of ‘Look’, ‘See’, ‘Here it is’; it points

a finger at a certain vis-à-vis, and cannot escape this pure deictic language... [The camera]

points at something, and [says]: that, there it is, lo! but says nothing else.” Because of their

important indexical role in pointing to the things in the camera’s field of vision, frames can be

regarded as the smallest meaningful units of film discourse structure.1

Figure 5. Frames refer.

Consider the frame in Figure 5 from Jim Jarmusch’s Ghost Dog (1999). It denotes a man

standing in front of a truck. If we wished to describe the referents of this frame in words, we

would have a large inventory of lexical items to choose from: the man could be described as

large, unsmiling, African-American, wearing a dark hooded sweat-shirt, in a dark duffel coat;

the truck could be described as old, French, an ice-cream truck; the spatial relationship

between the man and truck could be described as in front of, behind, and so forth. But

regardless of which particular items were chosen, the description would consist largely of

attributive adjectival forms of the sort commonly found in noun phrases of English

sentences.2 Frames, it seems, have functions in shots that can be likened to the functions of

noun phrases in sentences of language.

2.2 Shot

Combined into shots, the images in the individual frames are perceived to move. The addition

of movement to the film image increases the amount of information it communicates and

expands its meaning potential.3 Through movement, shots, in addition to referring to their

referents (via frames), seem to make something analogous to visual assertions about them.4

Notice what happens in Figure 6 when the frame in Figure 5 is put together with other frames

in the shot from which it was originally taken.

Figure 6. Shots predicate.

If we were to describe the sequence in words, we might say something like ‘a man standing in

front of a truck walks into the street’. That is, in addition to describing the shot’s referents

(the man, the truck), the description would elaborate on what happens: how the man moves,

where he goes, etc; the scope of expression would expand beyond the subject into a type of

statement about the action performed by the subject. Shots are thus more than simply the

sums of their references (the individual frames); they seem also to have predicational

functions analogous to those of verb phrases in English sentences.

We perhaps also recognize two further things about the shot in Figure 6: first, that the visual

reference to the man is, in a sense, ‘carried along’ within the individual frames into the

predication. Reference is embedded within predication. There is no clear distinction in camera

shots between reference and predication – both occur simultaneously. This obviously

distinguishes shots from sentences of language, where reference and predication are separate

functions delegated to distinct structures in the sequence (NP, VP). Second, note that the

shot’s ‘predicational meaning’ is not conveyed by any individual frame in the sequence but

arises from relations between the frames. I will come back to this point later.

2.3 Sequence

The next larger unit of film discourse structure beyond the shot is the sequence: a longer,

edited segment of film in which individual shots are combined into stretches of narrative

action. When shots are combined into sequences, film begins to take on characteristics of

extended discourse or text. The structuring of smaller units into progressively larger ones is

film’s counterpart to discourse syntax (see Metz, 1974: 177 ff).

In Figure 7, the shot in Figure 6 is seen in the sequence in which it originally appeared.

Viewed in succession, the shots form a cohesive series of filmic predications: (1) Ghost Dog

(the man in front of the truck) walks into the street [medium shot of Ghost Dog]; (2) Louie is

standing in the distance on the street [long shot of Louie]; (3) Ghost Dog calls to Louie,

What is this Louie - High Noon? [long shot of Ghost Dog from Louie’s perspective]; (4)

Ghost Dog says, The final shootout scene? [long shot of Louie from Ghost Dog’s

perspective]; (5) Louie answers from the distance, Yeah, I guess it is [long shot of Louie

continues]; (6) Ghost Dog replies, Yeah, well it’s very dramatic [medium shot of Ghost Dog].

Figure 7. Sequences narrate.

Such a sequence automatically triggers narrative inferences.5 Among the inferences triggered

here are the following: independently of film co-text, Louie in the dialogue in frame 3 refers

anaphorically back to the man in frame 2. At the same time, the dialogue and the gaze of

Ghost Dog in frame 3 refer cataphorically forward to Louie in frame 4. The alternating views

of the respective ends of the street on which the men are standing (shot/reverse shots 2-3), and

the buildings in the background, delineate the spatial boundaries of the event and mark Ghost

Dog’s and Louie’s locations relative to each other. The shot/reverse shots of Ghost Dog and

Louie imply the temporal continuity of their interaction, and so on. Such juxtapositions create

cohesion.

To repeat, film narratives, like language narratives, result from the progressive integration of

smaller discourse units into larger ones. Frames are combined into moving images in shots;

shots are combined into cohesive units of action in sequences; sequences are combined into

narrative units with beginnings, middles, and ends in episodes; and episodes are combined

into the larger narrative structures telling the story.6 When the units are properly combined,

the spectator experiences a smooth progression of audio-visual filmic events.
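
The nesting of units summarized above can be sketched as a simple data structure. This is only an illustrative model under stated assumptions: the class names mirror the units named in Figure 4, the shot lengths in the usage lines are invented, and the rate of 24 frames per second is the one given in Section 2.1. Frames are represented only as a count per shot.

```python
from dataclasses import dataclass, field
from typing import List

FRAMES_PER_SECOND = 24  # recording and projection rate given in Section 2.1


@dataclass
class Shot:
    frame_count: int  # a shot is modeled as a run of consecutive frames

    def seconds(self) -> float:
        """Screen time of the shot at the standard projection rate."""
        return self.frame_count / FRAMES_PER_SECOND


@dataclass
class Sequence:
    shots: List[Shot] = field(default_factory=list)


@dataclass
class Episode:
    sequences: List[Sequence] = field(default_factory=list)


@dataclass
class Narrative:
    episodes: List[Episode] = field(default_factory=list)

    def frame_count(self) -> int:
        """Total frames, summed up through every level of the hierarchy."""
        return sum(
            shot.frame_count
            for episode in self.episodes
            for sequence in episode.sequences
            for shot in sequence.shots
        )


# Hypothetical example: one episode, one sequence, two shots of invented length.
narrative = Narrative([Episode([Sequence([Shot(48), Shot(72)])])])
print(narrative.frame_count())   # 120
print(Shot(48).seconds())        # 2.0
```

The model captures only containment. It says nothing about the cohesive ties between units, which are the subject of the next section.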

3. Cohesive Relations in Film Discourse

As said earlier, in order for a film to be understandable, the discourse presenting it must be

cohesive. Visual cohesion is a binding force of film meaning. At the technical level, this is

achieved by editing the individual shots into longer sequences that seem to be perceptually

interconnected in space and time in the dramatic action. Barring artistic surprises, what the

spectator sees and hears at any given moment in a film has to be connectable in some fashion

back to what already has happened in the narrative, and forward to what will take place later.

Film coherence depends to a considerable extent on the construction of cohesive relations

between shots in sequences, sequences in episodes, and episodes in larger narrative structures.

In the early days of silent film, Hollywood filmmaker D.W. Griffith invented the continuity

system to solve this problem. Still relied on today, the continuity system is a repertoire of

editing techniques designed to produce coherent, cohesive successions of shots presenting a

dramatic narrative. Its purpose is to guarantee the smooth unfolding of envisualizations of

events in space, time, and action on the screen (cf. Bordwell and Thompson, 2008: 231 ff).

Continuity editing follows principles that are surprisingly similar to principles of cohesion in

language. Linguistics provides categories for describing these principles. Five classes of

cohesive ties are usually said to operate across sentence boundaries in language: reference,

substitution, ellipsis, conjunction, and lexical cohesion, the last comprising reiteration and collocation (cf. Halliday and Hasan, 1976).

Analogies to most of these are found in film.

3.1 Reference

In studies of discourse cohesion, reference is regarded as an endophoric relation between

words in texts. Continuity of reference keeps track of the identities of topics (Halliday and

Hasan, 1976: 31). In the utterance sequence John is here – He came five minutes ago, for

example, the pronoun he in the second utterance refers anaphorically back to the same

referent as the one referred to by the proper noun John in the first, creating a relation of

identity between the two referents (Lyons, 1977: 660). Reference cohesion occurs whenever

one item in a stretch of discourse points anaphorically or cataphorically to another for its

interpretation. Forms often associated with reference cohesion are personal pronouns (I, you,

her, etc), demonstratives (this, that, here, there, etc), and comparatives (similarly,

differently, more, less, larger, smaller, etc) (Halliday and Hasan, 1976: 37). These latter two

types of reference – demonstrative reference and comparative reference – play a central role

in film discourse cohesion.

Demonstrative Reference

Nominal and adverbial demonstrative forms in language locate referents on scales of spatial

nearness or distance relative to the deictic locus of discourse; they distinguish these referents

here in the text from those there. Although film lacks such forms, it compensates by signaling

proximal and distal spatial relations visually via actors’ gazes and pointing gestures, and by

near/far camera positions and the relative sizes of images in shots (close ups vs. long shots), and

so forth.

The sequence in Figure 9 from Fred Zinnemann’s High Noon (1952) is an example of how

this works. Here, spatial cohesion is realized via a combination of gaze direction (eyeline

match), camera distance (proximal/distal juxtapositions), and shifts of camera position

between the points of view of the protagonists (shot/reverse shot juxtapositions).

Figure 9. Demonstrative reference (gaze) in Zinnemann’s High Noon (1952).

After walking out of his office onto the street, the Sheriff gazes to his left (close up 1) and

sees two women approaching from the distance in a wagon (extreme long shot 2). As the

wagon passes by, the women (low angle, medium close up 3) and Sheriff (reverse shot, close

up 4) exchange glances. As the women leave town, one of them gazes back over her right

shoulder toward the Sheriff (long shot 5) and watches him recede into the distance, standing

alone in the empty street (reverse shot, extreme long shot 6). The spatial coordinates of this

sequence are plotted along an axis of action running diagonally from one end of

the street (2) to the other (6), past the Sheriff, who is the focal point of the action in the

middle (4). All shots in the sequence are made from the same side of the imaginary line traced

by the wagon’s movement along this axis.

Nearly throughout the sequence, the camera alternates between the Sheriff’s and the women’s

points of view (shot/reverse shots 3-4, 4-5, 5-6). Notice that although this technique could

conceivably cause deictic confusion (the deictic zero-point of the narrating camera eye

changes places), we tend as spectators not to find the dislocations deictically confusing.

Notice also that through this technique, in fact, a subtle form of cohesion develops between

cataphoric and anaphoric visual references in the sequence. In shot 1, our attention is directed

by the Sheriff’s gaze cataphorically forward in the sequence to the women. In shot 5, it is

directed by the woman’s gaze anaphorically backward to the Sheriff. Together, the cataphoric

and anaphoric visual references serve here to bracket the central event of the sequence – the

gazes between the Sheriff and the women in the middle. We also see in this sequence first a

spatial convergence, then a divergence, of the characters’ visual standpoints along the axis of

action (distal, proximal, distal) that moves smoothly from left to right across the screen

(match on action). By such means, filmmakers maintain cohesive spatial relations in

sequences despite changing camera positions and perspectives.7
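
The distinction between cataphoric and anaphoric gazes in this sequence can be sketched as a small rule. This is a hypothetical reading, not a procedure proposed in the sources cited: a gaze is treated as anaphoric if its target has already been shown in an earlier shot, and as cataphoric if the target first appears in a later one. The table of who is visible in each shot is taken from the description of Figure 9 above; the function name and the labels are assumptions.

```python
# Who is visible in each shot of the Figure 9 sequence, as described in the text.
ON_SCREEN = {
    1: {"Sheriff"},
    2: {"women"},
    3: {"women"},
    4: {"Sheriff"},
    5: {"women"},
    6: {"Sheriff"},
}


def gaze_direction(source_shot, referent, on_screen=ON_SCREEN):
    """Classify a gaze tie by where its referent appears (hypothetical rule)."""
    shown_before = any(
        referent in visible
        for shot, visible in on_screen.items()
        if shot < source_shot
    )
    if shown_before:
        return "anaphoric"    # points back to something already established
    shown_after = any(
        referent in visible
        for shot, visible in on_screen.items()
        if shot > source_shot
    )
    if shown_after:
        return "cataphoric"   # points forward to something not yet seen
    return "unresolved"       # referent never appears in the sequence


print(gaze_direction(1, "women"))    # cataphoric
print(gaze_direction(5, "Sheriff"))  # anaphoric
```

Under this rule the Sheriff’s gaze in shot 1 and the woman’s gaze in shot 5 receive the two labels given in the analysis above, which bracket the exchange of glances in shots 3 and 4.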

Comparative Reference

Certain classes of adjectives and adverbs in language signal comparative relations of identity,

similarity, or difference – items like same, similar, different (general comparisons), better,

worse, larger, smaller, etc (particular comparisons) (Halliday and Hasan, 1976: 76 ff).

Likeness and difference are thus referential properties insofar as things are ‘alike’ or

‘different’ only in relation to other things – comparison requires a tertium comparationis.

Although film lacks lexical means of expressing comparative concepts, it has means of

envisualizing likenesses and differences as percepts (see Section 4). For example,

comparative reference is used as a frame compositional technique to emphasize similarities

throughout Jean-Luc Godard’s Breathless (1960) (see Figure 10),

Figure 10. Comparative reference (similarity) in Godard’s Breathless (1960).

and it figures prominently in shot/reverse shot juxtapositions emphasizing differences in Fritz

Lang’s M (1931) in Figure 11.

Figure 11. Comparative reference (difference) in Lang’s M (1931).

3.2 Ellipsis

Ellipsis is a cohesive relation by which something left unexpressed is understood nonetheless,

as in the sentence sequence They have three daughters. Julie is the cleverest (Halliday and

Hasan, 1976: 142). Robert de Beaugrande (1980: 133) defines ellipsis as “the omission of ...

expressions whose conceptual content is nonetheless carried forward and expanded or

modified by means of noticeably incomplete expressions.” Although film lacks direct

equivalents of lexis or grammar, it nevertheless manages to establish cohesion through ellipsis

in the broad sense above. A common example is the shortening of movement sequences,

where a character is shown entering a room, taking a few steps, and suddenly standing on the

other side of the room without literally having been shown getting there. Showing the

beginnings, middles, and ends of actions is a standard continuity editing technique (match on

action) for eliminating parts of shots that do not contribute to the dramatic development of the

story. In the sequence from Alfred Hitchcock’s North by Northwest (1959) in Figure 12, for

example, parts of the predicational structures of the two constituent master shots – a man

watching from the window (frames 1, 3, 5) as a second man enters the room and crosses to the

other side (frames 2, 4, 6) – are cut out, creating breaks in the flow of action in both shots.

The missing frames are replaced by nothing (cf. Halliday and Hasan, 1976: 88). Edited

together in the sequence, however, the shots create the impression of a single event. The

perceptual content of the two shots is carried forward despite the absence of its direct visual

expression.

Figure 12. Ellipsis in Hitchcock’s North by Northwest (1959).

A famous ellipsis occurs at the end of Hitchcock’s North by Northwest, in a sequence that

begins with the hero reaching out his hand to his colleague (1) as she clings to the side of a

cliff on Mt. Rushmore (see Figure 13). As he pulls her up (2), the setting unexpectedly shifts

to a sleeping car in a train traveling back to New York (3). In a single, unbroken movement,

he appears to lift the woman from the face of the mountain up into bed – and from the

dialogue, we learn that during the ellipsis they have married.

Figure 13. Ellipsis in Hitchcock’s North by Northwest (1959).

3.3 Junction

Conjunctive cohesion, or simply ‘junction’, as Beaugrande and Dressler (1981) call it, is one

of the main forms of cohesion in language. Junction involves the use of markers like and, yet,

so, nevertheless, because, etc to signal conceptual relations between clauses, sentences, and

paragraphs (Halliday and Hasan, 1976). Lexical items fulfilling junctive functions do not refer

directly to other items in the text but rather mark logical, junctive relations between items.

Beaugrande and Dressler (1981) distinguish four types of junctive relations: conjunction

(‘and’ relations), subordination (‘because’ relations), disjunction (‘or’ relations), and

contrajunction (‘but’ relations). Although film does not have logical forms expressing

concepts of additivity, causality, adversativity, or alternativity, it does manage to

communicate such relations perceptually. Many analogies to linguistic junctive relations are

found in shot sequences.

Conjunction (additive - ‘and’ relations)

Conjunction is an additive relation signaled linguistically by conjuncts like and, also,

furthermore, additionally, besides, etc (Beaugrande and Dressler, 1981). In film, additive

relations are often inferred from shot juxtapositions. As said earlier, given the sequential

nature of film, shots inevitably appear to follow each other additively even if they have

nothing in common other than their simple contiguity on the film strip. Rudolf Arnheim

(1957: 21) claimed that “the succession of separate events implies a corresponding sequence

of time.” Conjunctive linkage thus almost has the status of a default perceptual expectation in

film discourse.

Continuity editing makes use of this assumption in various ways. As pointed out earlier, the

match on action technique is often used to represent continuations of events: something

begins happening in one shot and we see its continuation or completion in the next. In Orson

Welles’ Citizen Kane (1941), for example, an object falls from Kane’s hand and rolls to the

top of a stairway in one shot, then tumbles down the stairs in a second shot, and bursts in a

third (see Figure 14).

Figure 14. A conjunctive relation in Welles’ Citizen Kane (1941).

The shot/reverse shot technique is often used together with the match on action technique to

film actions from different, juxtaposed points of view. In Howard Hawks’ The Big Sleep

(1946), for example, a butler opens the door and greets an arriving guest in one shot, and the

guest is shown entering the room and returning the greeting in the next (see Figure 15).

Figure 15. A conjunctive relation in Hawks’ The Big Sleep (1946).

Disjunction (alternative - ‘or’ relations)

Disjunction is an alternative relation established via or (sometimes either/or or whether/or)

between two or more concepts in a text where only one can pertain in the textual world

(Beaugrande and Dressler, 1981: 71 ff). It often expresses uncertainty, alternatives, or

afterthoughts. In film, disjunctive relations exist when only one of two or more alternative

images can pertain in the sequence. Disjunction is often used to suggest dream states, hallucinations, or

states of confusion in which characters are not sure of what they perceive, remember, wish,

etc. Figure 16, from Charles Chaplin’s The Circus (1928), portrays a tramp’s dilemma as he

sits, watching his girlfriend speaking with a handsome new tightrope star.8 He wants to stand

up, go over to the circus star, and kick him in the behind; but instead he sits, suffers, and does

nothing. In the sequence, both alternatives are depicted simultaneously via a double exposure.

Figure 16. A disjunctive relation in Chaplin’s The Circus (1928).

There is a sequence in Roman Polanski’s The Tenant (1976) in which a psychotic man living

alone in a tenement building experiences a hallucination (see Figure 17). He goes to the

window, looks out toward a window across from his apartment, and sees himself standing in

it looking back at him. In the final shot he is ‘there’ as well as ‘here’.

Figure 17. A disjunctive relation in Polanski’s The Tenant (1976).

Contrajunction (adversative - ‘but’ relations)

Contrajunction is an adversative relation between causes and unanticipated effects. The

function of contrajunctive markers – but, yet, nevertheless, etc – is to smooth over transitions

at points where seemingly improbable combinations of events or situations occur

(Beaugrande and Dressler, 1981: 71-73). In film, contrajunctive relations are often

represented by shot combinations that seem unexpected, incongruous, or incompatible within

the sequence. In Jim Jarmusch’s Ghost Dog (1999) in Figure 18, for example, a killer is about

to shoot his victim when the screen goes black for a few seconds; the killer lowers his rifle to

discover that a small bird has landed on the muzzle, blocking his vision.

Figure 18. A contrajunctive relation in Jarmusch’s Ghost Dog (1999).

Subordination (causal - ‘because’ relations)

Subordination is a causal relation that exists whenever the interpretation of one clause or

sentence in a stretch of discourse depends on the interpretation of another (Beaugrande and

Dressler, 1981: 71-74). Subordinating junctives – because, since, as, thus, therefore, etc –

express preconditions, causes, or reasons for items being joined together in the text. In film,

causal relations occur frequently between shots. Often, the match on action technique is used

to represent them, as in the scene of the massacre of civilians in Odessa by Cossack soldiers

during the Russian Revolution in Sergei Eisenstein’s Battleship Potemkin (1925) in Figure 19.

Here, the Cossacks fire in one shot, and a woman is hit in the next.

Figure 19. A causal relation in Eisenstein’s Battleship Potemkin (1925).

3.4 Reiteration

Reiteration is a form of cohesion in which lexical items are said to enter into cohesive

relations by anaphorically or cataphorically calling attention to each other’s senses (Halliday

and Hasan, 1976; Beaugrande and Dressler, 1981). “The repeated elements,” according to

Beaugrande (1980: 133), “may have the same, different, or overlapping reference, and the

extent of conceptual content they can be used to activate varies accordingly.” Reiterative

patterns create metonymic, metaphorical, synonymic, antonymic, and other ties (Halliday and

Hasan, 1976: 278-279). Film establishes reiterative cohesion in a number of ways.

Metonymy

Metonymy is a part-for-whole relation in which reference to a part of something stands for the

whole. A famous instance of filmic metonymy occurs at the end of the ‘Dawn of Man’

sequence in Stanley Kubrick’s 2001: A Space Odyssey (1968) (see Figure 20), where a

prehistoric ape-man euphorically throws a bone into the air after realizing that it can be used

as a weapon. At the apex of its arc, the bone turns into a space craft. In this historic match cut,

both the bone and the space craft function as metonymies for the concept of tools.9

Figure 20. Metonymy in Kubrick’s 2001: A Space Odyssey (1968).

Metaphor

Metaphor is a comparative relation in which references to two or more objects combine to

signify some shared overriding concept, characteristic, or quality. Charles Chaplin’s Modern

Times (1936), for example, begins with a metaphorical juxtaposition of shots of a sheep herd

and of factory workers coming out of a subway to go to work (see Figure 21).

Figure 21. Metaphor in Chaplin’s Modern Times (1936).

Synonymy

Synonymy in language is a semantic relation between lexical items referring to the same

object, as in woman, female, lady. In film, it is created by montage sequences combining

different images with the same significance. Lang’s M (1931), for example, begins with a

scene in which a child has failed to come home to lunch after school. As her distraught

mother calls her name from the window, we see a progression of shots of ‘emptiness’ that

operate metonymically as synonyms for the child’s absence: she is not in the stairway, not in

the attic, and not at the table (see Figure 22).

Figure 22. Synonymy in Lang’s M (1931).

Antonymy

Antonymy in language is a semantic relation between lexical items referring to objects with

opposite meanings. In film, visual analogies to antonyms are created by montage sequences

combining images of opposites, as in Figure 23, where an image of a rich woman on the

Odessa stairway in Eisenstein’s Battleship Potemkin (1925) is contrasted with the image of a

legless beggar.

Figure 23. Antonymy in Eisenstein’s Battleship Potemkin (1925).

3.5 Collocation

Collocation in linguistics refers to ties between lexical items that stand to each other in some

recognizable, significant relation without being directly connected in a semantic sense. As

with synonyms, a collocational relation exists without explicit reference to the other item. The

relation is based on associations involving knowledge of the subject fields referred to by the

other items in the collocation. Knowledge of collocational relations in film is built up

co-textually through repetitions of activities, images, or motifs in connection with each other. In

Jarmusch’s Ghost Dog (1999), for example, we find patterns of reference in which the ‘bad’

characters in the narrative - the gangsters - are systematically collocated with television

cartoons, as in Figure 24,

Figure 24. Cartoon watchers in Jarmusch’s Ghost Dog (1999).

while the ‘good’ characters - Ghost Dog and his friends - are collocated with books, as in

Figure 25.

Figure 25. Book readers in Jarmusch’s Ghost Dog (1999)

4. Film Discourse Cohesion Revisited

The purpose of the preceding section has not been to provide a full account of cohesive

relations in film but to illustrate some surprising similarities between cohesive relations in

language and film texts. It is instructive, I think, simply to recognize that such

correspondences exist despite the obvious formal differences between the two types of

discourse and their respective ways of generating coherent text. The ease with which

categories used to describe cohesive relations in language can be applied to describing film

cohesion is in itself intriguing; but pointing this out alone does not bring us further in

unraveling the implications of Christian Metz’s paradoxical observation, quoted at the

beginning of the paper, that “film is hard to explain because it is easy to understand” (1974:

69). Indeed, perceptions of cohesive relations in film often seem easier to understand than to

explain.

Thinking about the shot juxtapositions discussed in Section 3, we are almost tempted to ask

what there is to ‘understand’ in many of them in the first place. The cohesive relations seem

so simple and self-evident that they require no thought at all. We simply see them. For

example, reconsider the frames in Figure 26:

Figure 26. A small man looks up; a large man looks down.

With little cognitive effort, we see two men – one small, one large – looking at each other.

This perceptual judgment is spontaneous, like the judgment of the sizes of the boxes in Figure

2 at the beginning of the paper. And, indeed, it is necessary, for without it, we could not go on

to the next step and cognitively categorize the relation between the two frames as

‘comparative’. But if we look at the images themselves, we see that there is nothing in them

individually that in any way marks, points to, or operates as a cue to, the comparative relation

we have just perceived and categorized. The perceived relation is clear without the help of

explicit markers. Why?

Paul Watzlawick et al. (1967: 61 ff) would say that the answer lies in the analogical nature of

film images as opposed to the digital nature of language: words and visual images

communicate by different means and each has strengths and weaknesses in communicating

different types of information. Language reduces information to discrete, binary, digital

(either/or), conceptual units, which are combined additively according to the logical syntax of

language to yield ‘sums’ or ‘products’ of the operations performed on the individual units in

the sequence (cf Janney and Arndt, 1994: 444). Film, on the other hand, represents

information holistically, without reducing it to discrete conceptual units, and it has nothing

comparable to the logical syntax and semantics of language. Its messages are relational,

figurative, and analogical (gradient, matters of degree). As a result, according to Watzlawick

et al. (1967), while abstract concepts and logical propositions are better ‘said’, relational

orientations and affective states are better ‘shown’: “Digital language has a highly complex

and powerful logical syntax but lacks adequate semantics in the field of relations, while

analogic language possesses the [relational] semantics but has no adequate syntax for ...

unambiguous definition” (1967: 66-67). Indeed, returning to Figure 26, we can see the

relational impoverishment of a small man looks up; a large man looks down compared with

the wealth of relational information communicated visually by the two images.

Considered in this light, perhaps the ease of understanding film is not quite as difficult to

explain as Metz suggested (1974: 69). The problem may simply be a consequence of Metz’s

presupposition that film understanding requires a linguistic explanation. Other solutions are

suggested in cognitive psychology (cf Hochberg, 1998). There is the possibility, for example,

that cohesion in film discourse is not primarily a conceptual phenomenon at all but rather

originally a perceptual one. Cognitive psychologists have argued for decades about the proper

relations between perception and cognition, but most agree that perception precedes

cognition. That is, prior to becoming ‘objects of thought’, perceptual experiences are first

stored in the mind in schemata together with other experiences following similar patterns.

These primary perceptual schemata – percepts – are precognitive mental representations of

family resemblances between perceptual experiences, and in a sense they are the stuff of

thought. The categorization and cross-networking of perceptual schemata into more complex

schemata is a task of higher cognition. “To cognize,” Harnad (2005: 20) claims, “is to

categorize.” In other words, concepts are categorizations of percepts.

Perhaps we need no concepts whatsoever to perceive the differences in size between the two

boxes in Figure 27. After all, the percept ‘A is larger than B’ precedes both the

conceptualization of this percept as ‘A > B’ and the conclusion that we are dealing here with

an instance of ‘comparative reference’. These are present in the percept of the sequential

relation between the two images prior to its cognitive categorization.

[Figure 27 diagram: the comparative reference ‘larger than’ shown as a concept (A > B) and as a percept (two boxes, A and B).]

Figure 27. ‘Larger than’ as concept and percept.

It seems that the mental processes underlying concepts of cohesive relations are somehow

different from those underlying percepts of them. If we return to the informal test in Figures 1

and 2 at the beginning of the paper, we see now that the odds against interpreters of Figure 1

were greater than those against interpreters of Figure 2. Determining ‘which is smallest’

required more cognitive effort in Figure 1 than in Figure 2. While Figure 2 required only a

perceptual judgment, Figure 1 required interpreters first to translate the conceptualized

linguistic information in the figure into relational visual terms, and then back again into

conceptual terms, before they could answer. Translation of the digital into the analogical –

and vice versa – always involves a loss of information (cf Watzlawick et al., 1967: 66). We

may conclude from this that while linguistics provides valuable means of describing cohesive

relations in film discourse, we may need to look elsewhere – perhaps to cognitive film science

– for viable ways of explaining how we recognize these relations when we see them.
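
The contrast drawn in the informal test can be sketched in a few lines of Python. This is an illustrative sketch only, not a model of perception. The pairwise statements are those given in Figure 1; the numeric sizes standing in for the boxes of Figure 2 are hypothetical values chosen merely to satisfy those statements, and the function names are likewise invented for the illustration. The sketch also assumes that the statements are consistent with one another.

```python
# Informal sketch: the relations are taken from Figure 1; the numeric sizes
# standing in for Figure 2 are hypothetical, chosen to satisfy those relations.

# "Digital" representation: discrete pairwise statements, as in Figure 1.
relations = [("A", ">", "B"), ("C", ">", "D"), ("B", ">", "D"), ("B", "<", "C")]


def smallest_from_relations(rels):
    """Find the smallest item by combining the pairwise statements."""
    items = set()
    larger = set()  # items stated to be larger than some other item
    for left, op, right in rels:
        items.update((left, right))
        # Normalise 'x < y' to 'y > x' so that every statement reads the same way.
        larger.add(left if op == ">" else right)
    # The smallest item is the only one never stated to be larger than another.
    candidates = items - larger
    if len(candidates) != 1:
        raise ValueError("the statements do not single out one smallest item")
    return candidates.pop()


# "Analogical" representation: magnitudes given directly, as in Figure 2.
sizes = {"A": 4, "B": 2, "C": 3, "D": 1}  # hypothetical values


def smallest_from_sizes(sz):
    """Read the smallest item straight off the magnitudes."""
    return min(sz, key=sz.get)
```

In the first function, each statement has to be normalised and combined with the others before an answer emerges. In the second, the answer can be read directly from the magnitudes, which is loosely analogous to the difference in effort described above.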

References

Arnheim, Rudolf (1957) Film as art. Berkeley: University of California Press.

Barthes, Roland (1981/2000) Camera lucida. London: Vintage.

Beaugrande, Robert de (1980) Text, discourse, and process: Toward a multidisciplinary science of texts. London: Longman.

Beaugrande, Robert de and Dressler, Wolfgang (1981) Introduction to text linguistics. London: Longman.

Bordwell, David and Thompson, Kristin (2008) Film art: An introduction. New York: McGraw-Hill.

Coulmas, Florian (1989) The writing systems of the world. London: Blackwell.

Deleuze, Gilles (1986) Cinema I. The movement image. H. Tomlinson and B. Habberjam (trans). Minneapolis: University of Minnesota Press.

Eisenstein, Sergei M. (1942) The film sense. J. Leyda (trans). New York: Harcourt, Brace & Company.

Halliday, M.A.K. and Hasan, Ruqaiya (1976) Cohesion in English. London: Longman.

Harnad, Steven (2005) “To cognize is to categorize: Cognition is categorization.” In Cohen, H. and Lefebvre, C. (eds) Handbook of categorization in cognitive science. Amsterdam: Elsevier: 20-45.

Hochberg, Julian (ed) (1998) Perception and cognition at century’s end: History, philosophy, theory. London: Academic Press.

Janney, Richard W. and Arndt, Horst (1994) “Can a picture tell a thousand words? Interpreting sequential vs. holistic graphic messages.” In Nöth, W. (ed) Origins of semiosis: Sign evolution in nature and culture. Berlin: Mouton de Gruyter: 439-453.

Lyons, John (1977) Semantics. Vol 2. Cambridge: Cambridge University Press.

Metz, Christian (1974/1991) Film language. A semiotics of cinema. M. Taylor (trans). Chicago: University of Chicago Press.

Monaco, James (2000) How to read a film. 3rd edition. Oxford: Oxford University Press.

Peters, Jan M. (1981) Pictorial signs and the language of film. Amsterdam: Rodopi.

Watzlawick, Paul, Beavin, Janet H. and Jackson, Don D. (1967) Pragmatics of human communication. New York: Norton.

Endnotes

1 Here, however, it should be noted that the camera’s scope of reference is generally much broader than the semantic scopes of indexical expressions in language. Hence the adage ‘a picture tells a thousand words’. Frames can be full of information or relatively empty. Regardless of this, according to Deleuze (1986: 12), “the frame teaches us that the image is not just given to be seen. It is legible as well as visible.” That is, it is not only ‘viewable’ but also ‘readable’.

2 For an interesting discussion of visual analogies to attributive and predicative adjuncts in film, see Peters (1981: 29-37).

3 Strictly speaking, according to Deleuze (1986: 2), “cinema does not give us an image to which movement is added, it ... gives us a movement-image.”

4 Many film theorists have suggested this analogy in the past. Peters (1981: 29-30), for example, claims that a shot “expresses a ‘visual thought’, a percept, a perceptual judgment, which may function as the predicate of a proposition.” Metz (1974: 66) states that “at a certain point in the division into units, the shot, a ‘complete assertive statement’, as Benveniste would call it, is equivalent to an oral sentence.” And according to Monaco (2000: 160), “we could say that a film shot is something like a sentence, since it makes a statement and is sufficient in itself...”

5 “A film audience,” Eisenstein (1942: 7) says, “draws a definite inference from the juxtaposition of two strips of film cemented together.”

6 Some film theorists, for example Metz (1974), distinguish a further structural unit between the shot and the sequence called the ‘scene’ (a segment of narrative action taking place at one time in one space), but this unit, which is derived from theatrical tradition, is sometimes analytically problematical in modern films and will not be used.

7 It is interesting to note here that two types of reference can be imagined in film discourse: first, direct or first hand reference made by the camera as an autonomous ‘eye’ viewing the action; second, indirect or second hand reference made by the camera as a type of prosthetic extension of a character’s-eye view of the action.

8 The entire sequence is not shown here. In it, we first see the tramp sitting dejectedly alone, then the girlfriend and tightrope walker, then the tramp emerging from his body and kicking the tightrope walker. He then returns to his body, and in the final shot he is shown once again sitting dejectedly alone.

9 Together, they portray the evolution of tools from the bone age to the space age. This is one of the longest ellipses in film history – a gap of approximately 2 million years.
