Recent change in American and British English: a corpus-driven approach

Paul Baker

Lancaster University


Page 2: Recent change in American and British English, a corpus ...ucrel.lancs.ac.uk/crs/attachments/UCRELCRS-2016-02... · E Skills, Trades and Hobbies 36 F Popular Lore 48 G Belles Lettres,

Why British and American English?

• The “special relationship”

• Two nations divided by a common language

• To what extent is the British “lag” in evidence?

• Are the varieties getting more similar or different?

• Which forms of English are likely to influence global usage?

• Quite a bit of disagreement or uncertainty (Juola, Hebblethwaite, Leech, Finegan)

• A lot of studies have focussed on existing hypotheses or questions about specific types of language (e.g. corpus-based).


The Brown Family

• 8 matched corpora, each 1 million words, 500 samples of 2000 words each

• 15 genres represented

• American and British written published English

• Sampling points 1931, 1961, 1991/2, 2006


Sampling framework

Text category letter and description | Files
A Press: Reportage | 44
B Press: Editorial | 27
C Press: Reviews | 17
D Religion | 17
E Skills, Trades and Hobbies | 36
F Popular Lore | 48
G Belles Lettres, Biographies, Essays | 75
H Miscellaneous: Government documents, industrial reports etc | 30
J Academic prose in various disciplines | 80
K General Fiction | 29
L Mystery and Detective Fiction | 24
M Science Fiction | 6
N Adventure and Western | 29
P Romance and Love story | 29
R Humour | 9


Health warnings

• Findings relate only to standard published English (which is not at the coalface of innovation)

• 1 million words is relatively small – so I have focussed on high frequency patterns

• With four sampling points we must take care not to assume straightforward linear changes (we can only infer what was happening at other points)


Average word length

[Figure: line chart of average word length (y-axis 4.4 to 4.8) against year (x-axis 1920 to 2020), with one line each for American and British English]


Average sentence length

[Figure: line chart of average sentence length (y-axis 17.2 to 19.4) against year (x-axis 1920 to 2020), with one line each for American and British English]


Proportion of different words used (type token ratio)

[Figure: line chart of type token ratio (y-axis 43 to 46.5) against year (x-axis 1920 to 2020), with one line each for American and British English]
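The three summary measures charted so far (average word length, average sentence length and type token ratio) could be computed along the following lines. This is only a rough sketch: the tokeniser, the sentence splitter and the function name `text_stats` are illustrative assumptions, not the procedure used for the Brown Family figures. Because a ratio in the mid-40s is not plausible for a raw type token ratio over a million words, the sketch assumes a standardised ratio averaged over fixed-size chunks; the chunk size of 1000 tokens is likewise an assumption.

```python
import re

def text_stats(text, chunk_size=1000):
    """Return (mean word length, mean sentence length, standardised TTR in %)."""
    # Crude sentence split: break after ., ! or ? followed by whitespace (assumption).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Tokens are runs of letters, optionally with internal apostrophes.
    tokens = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text.lower())
    mean_word_len = sum(len(t) for t in tokens) / len(tokens)
    mean_sent_len = len(tokens) / len(sentences)
    # Standardised TTR: average the type/token ratio over full chunks only.
    chunks = [tokens[i:i + chunk_size]
              for i in range(0, len(tokens) - chunk_size + 1, chunk_size)]
    if not chunks:
        # Text shorter than one chunk: fall back to the plain ratio.
        chunks = [tokens]
    sttr = 100 * sum(len(set(c)) / len(c) for c in chunks) / len(chunks)
    return mean_word_len, mean_sent_len, sttr
```

Averaging over equal-sized chunks keeps the ratio comparable between texts of different lengths, which a raw type token ratio is not.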


Features examined

• Single words (e.g. money)

• 2, 3, 4 and 5 word clusters (on the other hand)

• Single part of speech tags (NN1 = singular common noun)

• 2, 3, 4 and 5 word sequences of POS tags (NP1 NP1 = proper noun, proper noun)

• Semantic tags via Wmatrix (G3: defence and warfare)

• 2, 3, 4 and 5 letter sequences within words (-ology, -fess-)


Two methods

• 4 sets of 2-way keyword comparisons (e.g. AE 1931 vs BE 1931)…

Tag 1931 1961 1991 2006

AT article (the, no) ✓

DDQ wh-determiner (which, what) ✓ ✓ ✓

EX existential there ✓ ✓ ✓

NNB preceding noun of title (e.g. Mr) ✓ ✓

RG degree adverb (very, so, too) ✓ ✓ ✓

RR general adverb ✓ ✓
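A single 2-way keyword comparison can be sketched as below. The slides do not say which keyness statistic was used, so log-likelihood is an assumption here, chosen because it is a common choice in corpus tools. The function name and the frequencies in the usage note are hypothetical.

```python
import math

def log_likelihood(freq_a, total_a, freq_b, total_b):
    """Log-likelihood keyness for one item in corpus A versus corpus B.

    freq_a, freq_b: occurrences of the item in each corpus.
    total_a, total_b: total number of tokens in each corpus.
    """
    combined = freq_a + freq_b
    # Expected frequencies if the item were equally common in both corpora.
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    score = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score
```

For example, an invented item occurring 200 times in one million-word corpus and 100 times in the other scores about 34, while an item with equal frequencies scores 0. Running this comparison for AE vs BE at each of the four sampling points gives one tick per year in which the item is key.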


…and the Coefficient of Variance

• CV = a word’s standard deviation divided by its mean, multiplied by 100

• around: 110, 245, 407, 630 (CV = 64)

• money: 306, 325, 306, 332 (CV = 4)
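The two worked examples can be checked with a few lines of Python. This sketch assumes the sample standard deviation (dividing by n - 1), because that is what reproduces the value of 64 for "around"; the population version would give roughly 56 instead. The function name is illustrative.

```python
from statistics import mean, stdev

def coefficient_of_variance(freqs):
    """Standard deviation divided by the mean, multiplied by 100."""
    # stdev() is the sample standard deviation (n - 1 denominator), an assumption.
    return stdev(freqs) / mean(freqs) * 100

around = [110, 245, 407, 630]  # frequencies at 1931, 1961, 1991, 2006
money = [306, 325, 306, 332]

print(round(coefficient_of_variance(around)))  # 64
print(round(coefficient_of_variance(money)))   # 4
```

A high value flags an item whose frequency moved a lot across the four sampling points, and a low value flags a stable one.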


around (high CV) and money (low CV)

[Figure: line chart of frequency (y-axis 0 to 700) against year (x-axis 1920 to 2020), with one line each for around and money]


Cut-offs

• I am interested in relatively high frequency phenomena: for most features, I considered items that occurred at least 1000 times in either the 4 American or the 4 British corpora (for single words I also went down to 100 times)

• Focus mostly on items which showed a constant increase or decrease over time.

• Rather than look at every word, I have concentrated on those with the highest and lowest CVs or keyness scores (usually the top 10, 20 or 50 cases)
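The first two cut-offs can be sketched as a simple filter. This is an illustration under stated assumptions, not the actual selection script: "constant increase or decrease" is read as strictly rising or falling at every step, the cut-off is applied to the summed frequency across the four corpora of one variety, and the function name is hypothetical.

```python
def select_trending(freqs_by_item, min_total=1000):
    """Keep items that meet the frequency cut-off and rise or fall at every step.

    freqs_by_item maps an item to its frequencies at the four sampling
    points (1931, 1961, 1991, 2006) within one variety.
    """
    selected = {}
    for item, freqs in freqs_by_item.items():
        if sum(freqs) < min_total:
            continue  # too rare to say anything reliable
        steps = [later - earlier for earlier, later in zip(freqs, freqs[1:])]
        if all(step > 0 for step in steps):
            selected[item] = "increasing"
        elif all(step < 0 for step in steps):
            selected[item] = "decreasing"
    return selected
```

With the figures quoted earlier, "around" (110, 245, 407, 630) is kept as increasing, while "money" (306, 325, 306, 332) passes the frequency cut-off but is dropped because it dips in 1991.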


Densification

American English | British English | Both

Increasing

that’s, children’s,

Mom, ensure

*ism

phone

I’m, onto, Mum

UK, NHS, BBC

it’s, didn’t, don’t,

The ‘s’-genitive

Dad, kids, TV

*ology

Et al

NN1 NN1 NN1

NN1 NN1 NN2

[Mean word length]

Decreasing

let us

automobile

upon, cannot, need not

more or less, at any rate

two or three, in view of

on the other hand

from time to time

II31 II32 II33 AT

Great Britain

The of-genitive

any one

on the part of

[Mean sentence length]


Colloquialisation

American English | British English | Both

Increasing

guys

okay

I

my

your

don’t want to

a kind of

I’m going to

VVGK

guy

a bit of

I have to

apostrophe use

Dad

kids

TV

then

you have to

Taboo language

Decreasing

I do not


Sexual and excretory swear words

[Figure: line chart of frequency (y-axis 0 to 400) against year (x-axis 1920 to 2020), with one line each for American and British English]


Profane uses of language

[Figure: line chart of frequency (y-axis 0 to 300) against year (x-axis 1920 to 2020), with one line each for American and British English]


Democratisation

American English | British English | Both

Increasing

feminist

of women in

might

gender

access to

support for

could

can

Decreasing

men

of man

Mr and Mrs

Colonel

*fess

NNB

Sir

Rev.

shall

must

Mr

Mrs

NNB NP1 NP1


Spelling

Pattern | American spelling example | British spelling example
or/our | color | colour
er/re | center | centre
ize/ise | organize | organise
ization/isation | civilization | civilisation
yze/yse | analyze | analyse
og/ogue | catalog | catalogue
e/ae | anemia | anaemia
e/oe | fetal | foetal
se/ce | defense | defence
l/ll | canceled | cancelled
ction/xion | connection | connexion
-/e | aging | ageing
toward/towards | toward | towards
-/st | while | whilst
gray/grey | gray | grey


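-ize vs -ise

[Figure: chart of -ize/-ise preference (y-axis -100 to 100, with the two directions labelled “Preference for -ise” and “Preference for -ize”) at 1931, 1961, 1991 and 2006, with series for American and British English]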
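The slides do not give the formula behind the -100 to 100 preference scale. One plausible way to obtain such a score, offered purely as an assumption, is the difference between the two variant counts as a percentage of their total. The function name and the sign convention (positive for -ize) are both hypothetical.

```python
def preference_score(ize_count, ise_count):
    """Score from -100 (only -ise forms) to +100 (only -ize forms).

    Both the formula and the sign convention are assumptions for illustration.
    """
    total = ize_count + ise_count
    if total == 0:
        return 0.0  # neither variant attested, so no preference
    return 100 * (ize_count - ise_count) / total

print(preference_score(90, 10))  # 80.0
```

A score near either extreme would correspond to the "almost 100% adherence" described in the summary of findings, and a score drifting towards 0 to a weakening preference.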


Summary of findings

• American English – almost 100% adherence in 9 out of 17 cases

• British English – almost 100% adherence in 6 out of 17 cases

• British English – weakening grasp on colour, practise as verb, travelling, queueing

• American English – switched to amoeba, weakening grasp on queuing


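Summary of all spellings

[Figure: chart of overall spelling preference (y-axis -100 to 100) at 1931, 1961, 1991 and 2006, with series for American and British English. Data labels: -88.9, -85.4, -85.8, -86.4 for one series and 89.6, 93.1, 92.3, 91.9 for the other]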


Is automation of language responsible?

• Spell-check can be set to American or British English.

• Grammar and style checks may also influence writing e.g. avoid clichéd language: on the part of, the fact that, so far as, as to the, the spirit of, for the most part

• Avoid passive sentences – big decreases in taken, given and made (in passive cases), increases in I. Also decreases in dummy pronouns it and there as well as decreases in BE, VVN


Part of Speech tags

• Decreasing modality and hedging – less use of modal verbs, gradable adverbs (AE leading)

• 96% of RG consists of: very, so, as, too, about, quite, over, rather, far and pretty

• Rather – often used to ‘understate’ a negative evaluation: rather unseemly, rather unsightly, rather disappointing

• Very – to strengthen positive evaluation: good, great, useful, important etc.

• This could be a move towards densification too?


Gradable adverbs

[Figure: line chart of frequency (y-axis 0 to 6000) against year (x-axis 1920 to 2020), with one line each for American (CV=11) and British (CV=17)]


Translation Table (Telegraph 2 September 2013)

What the British say | What the British mean | What foreigners understand
That is a very brave proposal | You are insane | He thinks I have courage
Quite good | A bit disappointing | Quite good
I would suggest | Do it or be prepared to justify yourself | Think about the idea, but do what you like
Very interesting | That is clearly nonsense | They are impressed
You must come for dinner | It’s not an invitation, I’m just being polite | I will get an invitation soon
Could we consider some other opinions | I don’t like your idea | They have not yet decided


Decline of DDQ (which/what)

[Figure: line chart of frequency (y-axis 0 to 8000) against year (x-axis 1931 to 2001), with one line each for American and British English]


Are we using relative clauses differently?

• Theory 1 – “which” is declining due to other ways of writing relative clauses (e.g. “that” or the zero clause)

• Theory 2 – “which” is declining because relative clauses are declining generally

• Theory 3 – “which” is declining despite changes to relative clauses

• Search pattern: *_AT* (_{A})? *_N* (_{N})? (that|which)
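The search pattern looks for an article, an optional adjective, a noun, an optional second noun, and then "that" or "which". A rough Python approximation over text in `word_TAG` format might look like this. It is a sketch, not the query actually run: the CLAWS-style tag prefixes (`AT` for articles, `J` for adjectives, `N` for nouns) are assumptions, as is the function name.

```python
import re

# Approximation of: article (adjective)? noun (noun)? that|which
RELATIVE = re.compile(
    r"(?<!\S)\S+_AT\S*\s+"          # article, e.g. the_AT or a_AT1
    r"(?:\S+_J\S*\s+)?"             # optional adjective
    r"\S+_N\S*\s+"                  # noun
    r"(?:\S+_N\S*\s+)?"             # optional second noun
    r"((?i:that|which))_\S+"        # the relativiser, any capitalisation
)

def count_relativisers(tagged_text):
    """Count 'that' and 'which' following a simple noun phrase."""
    counts = {"that": 0, "which": 0}
    for match in RELATIVE.finditer(tagged_text):
        counts[match.group(1).lower()] += 1
    return counts

# Invented example sentence with assumed tags.
sample = ("the_AT book_NN1 which_DDQ I_PPIS1 read_VVD and_CC "
          "the_AT old_JJ house_NN1 that_CST fell_VVD")
print(count_relativisers(sample))  # {'that': 1, 'which': 1}
```

Counting both relativisers with the same noun phrase frame is what allows the "which" and "that" trends to be compared like for like.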


The “which” relative clause

[Figure: line chart of frequency (y-axis 0 to 2500) against year (x-axis 1931 to 2001), with one line each for American and British English]


The “that” relative clause

[Figure: line chart of frequency (y-axis 0 to 2500) against year (x-axis 1931 to 2001), with lines for American that, British that, American which and British which]


Finding zero relative clauses

_A* (_{A})* (_{N})* _N* (he|she|they|you|it|we)

False positives (around 10%)

• Shake the hand of a squaddie They deserve our thanks

• To a certain extent it is horribly dangerous

The majority replace “that”, not “which”

• Apart from anything else , there was a feeling he had become a joke figure

• by the time you get this Mary will be turned three
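The zero relative search (a noun phrase followed directly by a subject pronoun) can be approximated in the same way. Again this is a sketch under assumptions: the tag prefixes are CLAWS-style guesses, the function name is hypothetical, and the tagged strings are hand-tagged versions of two examples from this slide.

```python
import re

# Approximation of: article (adjective)* (noun)* noun pronoun
ZERO_RELATIVE = re.compile(
    r"(?<!\S)\S+_A\S*\s+"                  # tag beginning with A, e.g. a_AT1
    r"(?:\S+_J\S*\s+)*"                    # any number of adjectives
    r"(?:\S+_N\S*\s+)+"                    # one or more nouns
    r"((?i:he|she|they|you|it|we))_\S+"    # subject pronoun right after the noun
)

def find_zero_relative_candidates(tagged_text):
    """Return matched spans; these are candidates and need manual checking."""
    return [m.group(0) for m in ZERO_RELATIVE.finditer(tagged_text)]

genuine = "there_EX was_VBDZ a_AT1 feeling_NN1 he_PPHS1 had_VHD become_VVN"
false_positive = "of_IO a_AT1 squaddie_NN1 They_PPHS2 deserve_VV0 our_APPGE thanks_NN2"
print(find_zero_relative_candidates(genuine))
print(find_zero_relative_candidates(false_positive))
```

Both strings match, which illustrates why the results need checking by hand: the pattern cannot tell a zero relative from a sentence boundary that happens to fall between a noun and a pronoun.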


The zero relative clause

[Figure: line chart of frequency (y-axis 0 to 2500) against year (x-axis 1931 to 2001), with one line each for American and British English]


Grammar/style checkers


Key Semantic tags

Key in American English 1931 1961 1991 2006

G2.1 Law and Order ✓ ✓ ✓ ✓

G3 Warfare, defence and army; weapons ✓ ✓ ✓ ✓

S5+ Belonging to a group ✓ ✓ ✓

Y2 Information, technology and computing ✓ ✓ ✓ ✓

Key in British English 1931 1961 1991 2006

A3+ Existing ✓ ✓ ✓ ✓

A13.3 Degree: Boosters ✓ ✓ ✓

A13.5 Degree: Compromisers ✓ ✓ ✓


Qualitative analysis

• Information, technology and computing is key in AmE due to the high frequency of program(s) tagged this way, whereas programme(s) is not.

• Warfare – common in the press – Vietnam, Gulf War, War on Terror. The firearms debate in the US.


G2.1 Law and Order

[Figure: line chart of frequency (y-axis 0 to 6000) against year (x-axis 1931 to 2001), with one line each for American and British English]


S5+ Belonging to a group

[Figure: line chart of frequency (y-axis 0 to 6000) against year (x-axis 1931 to 2001), with one line each for American and British English]


Conclusions

• General trends in densification, democratisation, colloquialisation

• More cases where American English seems to have led change, especially for grammatical tags and tag sequences

• Moves towards convergence in the latter time period.

• But spelling differences are likely to hold in the future (word processing options?)


Future predictions

• Further densification: e.g. “zero” forms (“This.”, “Because science”), missing apostrophes, acronyms, emojis

• Vanishing titles and references to gender: gender-neutral pronouns.

• Less hedging, more active forms, more first and second person pronouns, but more nominalisation.

• Taboo language more common (except for religious taboo)

• Wider vocabulary


Thank you

• Baker, P. (2011) 'Times may change but we'll always have money: a corpus driven examination of vocabulary change in four diachronic corpora.' Journal of English Linguistics 39: 65-88.

• Finegan, E. (2004) 'American English and its distinctiveness.' In E. Finegan and J. R. Rickford (eds) Language in the USA. Cambridge: Cambridge University Press, pp. 18-38.

• Leech, G. (2002) 'Recent grammatical change in English: data, description, theory.' In K. Aijmer and B. Altenberg (eds) Proceedings of the 2002 ICAME Conference. Gothenburg, pp. 61-81.

• Potts, A. and Baker, P. (2012) 'Does semantic tagging identify cultural change in British and American English?' International Journal of Corpus Linguistics 17(3): 295-324.

• Smith, N. (2002) 'Ever moving on? The progressive in recent British English.' In P. Peters, P. Collins and A. Smith (eds) New frontiers of corpus research. Amsterdam: Rodopi, pp. 317-330.