Comparing OED2 and OED3

On 13 March 2008, OED3 published its first batch of cross-alphabet revisions (see information at OED3, in our section OED Editions). Up to that point, the revision had worked its way steadily through the alphabet, beginning at the letter M and (by the previous quarter, i.e. 13 December 2007) reaching part-way through R. This meant that it was relatively easy to compare OED2’s and OED3’s treatment of selected date ranges, or large sets of individual quotation sources, by isolating the revised stretch of entries in each case and looking at them side by side.

Since this new pattern of revision began, however, it has been virtually impossible to compare OED2 and OED3, given the difficulty of picking out the revised entries, now scattered across the alphabet. The problem here is that – most unfortunately – OED3 is electronically merged with OED2. Consequently one cannot search OED3 independently of OED2, i.e. as a separate entity, in order to analyse and compare the two sets of data systematically on any large scale.

Fortunately, however, EOED searched for the respective quotation totals of some 18th-century male and female writers in the first few days of March 2008, confining its comparison to the alphabet range revised by OED3 at that stage, i.e. M-quit shilling. The results are given below.

OED3’s treatment of some female sources over the alphabet range M-quit shilling (data gathered 2-6 March 2008)
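| Author | Total OED2 quotations | OED2 quotations over M‑quit shilling | OED3 quotations over M‑quit shilling | Increase in OED3 quotations | % increase |
|---|---|---|---|---|---|
| F. Burney | 1,947 | 335 | 565 | 230 | 69 |
| M. Edgeworth | 1,129 | 235 | 349 | 114 | 49 |
| A. Radcliffe | 1,103 | 162 | 277 | 115 | 71 |
| W. Montagu | 675 | 131 | 143 | 12 | 9 |
| C. Smith | 324 | 90 | 152 | 62 | 69 |
| H. More | 202 | 43 | 72 | 29 | 67 |
| S. Fielding | 84 | 18 | 44 | 26 | 144 |
| M. Wollstonecraft | 78 | 17 | 142 | 125 | 735 |
| C. Macaulay | 2 | 1 | 4 | 3 | 300 |
| A. Bannerman | 0 | 0 | 1 | 1 | N/A |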
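
The last two columns of the table follow arithmetically from the two before them. As a quick check, the minimal Python sketch below recomputes them from the OED2 and OED3 counts. The function name `increase` and the `counts` dictionary are illustrative only and are not part of EOED's own method; the sketch also assumes that percentages are rounded to the nearest whole number and that a zero OED2 count is reported as N/A.

```python
# Counts copied from the table above: (OED2, OED3) quotations over M-quit shilling.
counts = {
    "F. Burney": (335, 565),
    "M. Edgeworth": (235, 349),
    "A. Radcliffe": (162, 277),
    "W. Montagu": (131, 143),
    "C. Smith": (90, 152),
    "H. More": (43, 72),
    "S. Fielding": (18, 44),
    "M. Wollstonecraft": (17, 142),
    "C. Macaulay": (1, 4),
    "A. Bannerman": (0, 1),
}


def increase(oed2, oed3):
    """Return (absolute increase, % increase rounded to a whole number).

    The percentage is None when the OED2 count is zero, since a
    percentage increase from zero is undefined (shown as N/A in the table).
    """
    diff = oed3 - oed2
    if oed2 == 0:
        return diff, None
    return diff, round(100 * diff / oed2)


for author, (oed2, oed3) in counts.items():
    diff, pct = increase(oed2, oed3)
    pct_text = "N/A" if pct is None else f"{pct}%"
    print(f"{author}: +{diff} ({pct_text})")
```

Recomputing the figures this way shows why the percentage column can mislead when read alone: a large percentage may rest on a very small OED2 base, which is why the absolute increases are discussed alongside it.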

In all cases, the number of quotations from these female writers had risen, ranging from significant increases (in terms of absolute numbers, if not percentages) for Burney, Edgeworth, Radcliffe, and Wollstonecraft, down to a few tens of quotations or fewer for the other writers.

At first sight, this looks to be a cheering development in the OED’s treatment of female-authored sources, although the differential rate of quotation is perplexing. One assumes that Burney, Edgeworth, Radcliffe, and Wollstonecraft were identified as sources of special importance – but why go to the trouble of reading Catharine Macaulay, Penelope Aubin, and Anne Bannerman and not quote from them more intensively, given that female-authored quotations are so few anyway? This seems an inefficient use of lexical research. And is the difference in treatment due to the linguistic characteristics of these texts or the perceived cultural (or literary) importance of the authors?

Comparison with male-authored sources

However, comparing OED3’s treatment of a handful of male-authored sources of the 18th century puts the data in a different perspective, especially given that all these male writers are already heavily quoted in OED:

OED3’s treatment of some male sources over the alphabet range M-quit shilling (data gathered 2-6 March 2008)
| Author | Total OED2 quotations | OED2 quotations over M‑quit shilling | OED3 quotations over M‑quit shilling | Increase in OED3 quotations | % increase |
|---|---|---|---|---|---|
| H. Fielding | 1,926 | 340 | 806 | 466 | 137 |

The discrepancy between the numbers of quotations from male and from female authors is very nearly as striking in OED3 as in OED2. This is mainly because OED3 is carrying over the first edition’s vast quantities of male-authored quotations into the new edition, so that – given that none of the female sources are being as intensively mined for the third edition as male sources were for the first edition – the existing male-to-female proportions are being preserved. Additionally, however, it looks as if the OED lexicographers were, at that stage (i.e. March 2008), continuing to give some male-authored sources quite significantly preferential treatment over female-authored ones: for example, both Fielding and Defoe, already handsomely cited in the first edition of the Dictionary, had been given far more attention by the revisers than any female authors of the period.

Comparing the two tables on this page, and contemplating the differences in quotation rate between different writers, it is hard to feel that linguistic considerations alone are at work here. Cultural and social values seem to be asserting themselves as well. The chief editor of OED3, John Simpson, has explained in his Preface to OED Online that the Reading Programme for OED3 is exploring a much wider range of sources than its predecessors during the course of their revision, the implication being that OED3 will correct the first edition’s biases in favour of male-authored over female-authored and literary over non-literary sources, and against the 18th century:

In addition to the ‘traditional’ canon of literary works, today’s Reading Programme covers women’s writing and non-literary texts which have been published in recent times, such as wills, probate inventories, account books, diaries, and letters. The programme also covers the eighteenth century, since studies have shown that the original Oxford English Dictionary reading in this period was less extensive than it was for the previous two centuries.

‘The Reading Programme’, OED Online (accessed July 2019; quoted more extensively at OED3 quotation sources)

Where both gender and literary bias are concerned, however, it is difficult to see how any such correction can be achieved unless the lexicographers prune, quite significantly, OED’s enormous banks of quotations from canonical male authors – and try to find new quotations from female rather than from male authors, especially the male authors already much quoted in the Dictionary.

But throwing away good lexical evidence goes against the grain for any historical linguist. And it seems particularly perverse to do so now, given that online publication would appear to remove many of the practical and financial constraints which forced the first lexicographers to restrict their account of the history of the language in the first place (Murray complained to the Philological Society in 1890 that the ruthless culling of quotations was ‘a sorrowful necessity’, required so as to keep the Dictionary’s size in check; nevertheless, ‘as the quotations are the essence of the work, it is like shearing Samson’s locks’; K. M. E. Murray 1977: 274). Just as importantly, many of OED’s users are literary scholars who would be appalled if the OED reneged on its predecessor’s function of ‘literary instrument’, i.e. acting as a tool to explain and contextualize the vocabulary of major and minor literary writers (see EOED page on Writers and dictionaries).

Title page from the first American edition of A Vindication of the Rights of Woman, 1792. Source: Wikipedia

Is the solution for the lexicographers to keep these quotations from Pope, Cowper and the like, as they appear to be doing, but greatly increase their quotations from other types of source – from female-authored literary works and from non-literary works, whether by males or females, of a wide range of genres?

In some very small way, it appears that OED tried at an earlier stage in its history to correct the under-quotation of female sources. While compiling his four-volume Supplement of 20th-century updatings to OED, published over 1972-86, R. W. Burchfield slipped in a few hundred quotations from the novels, letters or journals of Dorothy Wordsworth, Jane Austen, and Maria Edgeworth, despite the fact that their 18th- and 19th-century origins would appear to have made them ineligible for inclusion at this stage [see Brewer 2015a: 751-4, downloadable on our Library page]. Was he trying to redress an imbalance in the parent dictionary? If OED3 were to extend Burchfield’s policy (if this was what it was) and examine such female-authored sources – which are abundant – more widely and more exhaustively, it could at the same time move towards compensating entirely, rather than only partially, for the shortfall in 18th-century sources quoted in the first edition of OED and still perceptible in the third.

This policy could most productively be extended to other periods in the Dictionary, for example the 19th century, where OED1/2 citations from Dickens (c. 8,200), Tennyson (c. 6,700), Carlyle (c. 6,250), Macaulay (c. 5,450) and others dwarf those from female writers, for example George Eliot (by far the highest quoted female author, with c. 3,100 citations), Harriet Martineau (c. 1,650), Mary Braddon (c. 1,500), or even Jane Austen (c. 1,050). See further EOED pages on Top sources and Top female sources.

The question of what the correct balance of quotation might be between male and female sources (as between different centuries) is a formidably knotted one and requires protracted research and analysis. Should it reflect the proportion of male to female speakers? or writers? or published writers? or some other ratio? It seems unlikely, however, that OED’s present balance is just. As the front page of its website tells us, this great dictionary is the ‘definitive record of the language’. Since it is necessarily based on written sources for much of the historical period it covers, it would seem appropriate to bring its proportions of male to female quotations up to those of the available source literature as an absolute minimum (and it could also be argued that OED ought to represent female-authored texts as much as possible over the earlier periods, given that the proportion of texts written by women is so out of step with the gender proportions of the literate population as a whole). But whatever decision the OED3 lexicographers arrive at, it is vital – in view of the fact that their dictionary furnishes the first port of call for virtually all historical research on English – that they set out and explain the basis on which they select their quotation sources where gender, or indeed any other category of language, is concerned.

