Latest news

Chronological coverage in OED2 and OED3: new scholarship or old?

EOED has just completed its series of pages on the OED’s treatment – both past and present – of different periods in the language. The series starts at Period coverage.

Here is a summary of the main points.

  • The OED3 revision (begun in 2000, now about halfway through the alphabet) is adding enormous quantities of new quotations to its predecessors’ record of the language. Quotations form the evidential basis for the OED, so this increase indicates that the Dictionary’s account of the English language – on both a large and small scale – is changing significantly
  • Unfortunately, we cannot see which quotations are new! This is because the OED website searches do not differentiate between revised and original entries. What we can do, however, is count the quotations per decade in the current hybrid version of the OED, OED Online (i.e. a mixture of old and new scholarship), and compare the results with the equivalent totals in the pre-revision version of the OED (i.e. OED2)
  • Comparison of this sort tells us that, very broadly speaking, over 1500-1989, the revised OED appears to be reproducing the chronological biases of the old OED (click on the link to Chart 3 in the right-hand image below). So there is a bulge of quotations over the late 16th and early 17th centuries (Chart 8), a dip in quotation evidence in the early to mid-18th century (Chart 13), and a steep rise towards the 1880s or so (Chart 19). 20th-century coverage is more uneven (Charts 21 and 37)
  • In a decisive departure from the practice of the first edition of OED, OED3 is no longer gathering huge numbers of quotations from major literary and cultural writers as evidence for the history of vocabulary in English. Instead, the revising lexicographers are raiding vast electronic databases of multi-authored sources for their new quotations – newspapers, journals, and periodicals. See discussions at Top sources in OED3, 1800-1929 in OED3, 1930 onwards in OED3
  • Notwithstanding this changed practice, Shakespeare, Chaucer, Milton, Caxton, Dryden, Dickens and hundreds of other canonical male literary writers continue to dominate the list of most quoted sources in today’s OED. This is because OED3 has simply retained much of the original OED1 quotation evidence rather than archiving the original Dictionary and starting again. In this respect, the OED3 revision is producing a 21st-century dictionary bolted onto a Victorian one (click on links above and Which edition contains what?)
  • As a result, women writers remain significantly under-represented in the OED. As of the June 2020 update, OED Online’s own list of top 1,000 quotation sources includes just 28 women. It is impossible to search OED quotations by gender of author, but inferentially the vast majority of contributors to the newspapers, journals and periodicals that OED3 is now favouring as quotation sources will also be male. On OED’s under-quotation of female sources, see further 1700-1899 in OED3 (Chart 14 and the discussion of Frances Burney beneath; Caroline Herschel and Philosophical Transactions) and information on the top individually-authored sources from 1930 onwards in OED3. Preliminary evidence and notes can also be found at Top female sources, currently in preparation. EOED’s 2009 study of the under-representation of 18th-century women writers (funded by the Leverhulme) is under Topics.
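The per-decade comparison described in the second and third points above can be sketched in a few lines of Python. This is only an illustration of the counting method, not EOED's own procedure: the function names are hypothetical and the two lists of dates are invented placeholders, not OED figures.

```python
from collections import Counter


def quotations_per_decade(years):
    """Count quotations by the decade of their date (e.g. 1605 -> 1600)."""
    return Counter((year // 10) * 10 for year in years)


def compare_decades(old_years, new_years):
    """Return {decade: (old_count, new_count, difference)} for every decade seen."""
    old = quotations_per_decade(old_years)
    new = quotations_per_decade(new_years)
    return {
        decade: (old[decade], new[decade], new[decade] - old[decade])
        for decade in sorted(set(old) | set(new))
    }


# Invented placeholder dates for illustration only, NOT real OED data.
oed2_years = [1594, 1599, 1601, 1725, 1884]
oed_online_years = [1594, 1599, 1601, 1603, 1725, 1884, 1887]

comparison = compare_decades(oed2_years, oed_online_years)
for decade, (old, new, diff) in comparison.items():
    print(f"{decade}s: OED2={old}  OED Online={new}  change={diff:+d}")
```

Because the hybrid OED Online total for a decade includes both retained and newly added quotations, the difference column shows only the net change per decade; it cannot identify which individual quotations are new.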

To read more, click on the pages below

Period coverage

1150-1499 in OED1/2

1150-1499 in OED3

1500-1699 in OED1/OED2

1500-1699 in OED3

1700-1799 in OED1/OED2

1700-1799 in OED3

1800-1929 in OED1/OED2

1800-1929 in OED3

1930-1989 in OED2

1930 onwards in OED3

OED Text Visualizer tool and the current state of OED Online

OED Online has recently put up a new tool on its website at

The case for visualization tools such as these is that they represent different categories of quantitatively assessed data in a visually striking way. They are especially useful when they indicate groupings or relationships between constituent elements of the data that researchers might not previously have noticed or considered.

The OED Text Visualizer certainly has the potential to do this. Users can type in a text of up to 500 words to see the etymological source of each word (Germanic, Romance, etc) and when it first entered the language.

In its present form, however, the tool is problematic. The major issue is as follows. As its accompanying text explains, the Text Visualizer draws on two important components of OED Online entries: etymological origin of a word, and date of first recorded usage. What is not explained is that just under half of these two sets of OED data are significantly out of date, in some cases by a hundred years and more, since the entries from which they are derived are as yet wholly or partially unrevised. 

It follows that the results produced by the Text Visualizer represent an undifferentiated mixture of internally inconsistent lexicography, some of it significantly out of date. The tool needs to be reconfigured so that users can distinguish results derived from modern lexicographical scholarship (i.e. 2000 onwards) from those based on entries first published in earlier stages of the Dictionary (stretching from 1884 to 1989). In its current form the Text Visualizer delivers results which are not yet appropriate for use in academic research.
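To make the requested distinction concrete, here is a minimal Python sketch of how exported results might be tagged by stage of scholarship. It assumes, hypothetically, that each row of the export carried the publication year of the entry it was drawn from; the column names (`word`, `first_use`, `entry_year`) and the sample rows are invented for illustration and do not describe the Text Visualizer's actual output.

```python
import csv
import io


def scholarship_stage(entry_year):
    """Classify an entry using the two date ranges named in the text above."""
    if entry_year >= 2000:
        return "revised"      # modern scholarship, 2000 onwards
    if 1884 <= entry_year <= 1989:
        return "unrevised"    # earlier stages of the Dictionary
    return "unknown"          # outside both ranges named in the text


def tag_rows(csv_text):
    """Read CSV text and add a 'stage' field to every row."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["stage"] = scholarship_stage(int(row["entry_year"]))
        rows.append(row)
    return rows


# Hypothetical export with placeholder values, not real OED data.
sample = "word,first_use,entry_year\nword_a,1400,2012\nword_b,1650,1989\n"

tagged = tag_rows(sample)
for row in tagged:
    print(row["word"], row["first_use"], row["stage"])
```

With a field of this kind, a researcher could analyse revised and unrevised results separately instead of treating them as a single undifferentiated set.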

The Text Visualizer also provides information on the frequency of use of a word, both in the year the user has assigned to the text and in ‘modern English’. This is valuable, but no account is given of the source of this information, which we may guess to have been Google N-grams, presumably manipulated or adapted in some way. Users of the tool need to know the source of the figures cited so that they can understand the assumptions on which they have been produced. This is a basic requirement for academic research.

One excellent feature of the new tool, nevertheless, is that its results are produced in csv and other formats and hence are far easier to work with for research purposes than the search results currently available on OED Online (see under Search tools below).

A more general comment is as follows. Setting aside the criticisms above, the OED visualization tools so far produced (e.g., geographical origin of vocabulary in English over time) have been captivating but over-determined. That is, they make assumptions about what researchers are interested in. By contrast, it is a widely acknowledged truism that good research comes out of giving researchers free and unfettered access to primary data, so that they can explore and think about it independently. The range of search tools on OED Online already provides a generous range of possibilities for new types of research, though of course we would all like more tools and more/better data to be available (for example, the currently provided information on frequency of head words is unsatisfactory). The problem is that these website tools don’t work well and the results are delivered in an unanalysable format, as described on EOED at OED Online.

OUP is now planning ‘a new suite of tools based on an OED Text Annotator engine,’ of which the Text Visualizer critiqued above is an example. Exciting as such tools are, there are other features of the OED Online website in its current form which are so unsatisfactory as to require immediate attention. Sorting these out is a priority of at least equal if not greater importance than a new set of tools, especially if the new tools repeat the flaws of the existing ones. Here is a list.

Urgent issues for OED Online 

Transparency on date of entries and changes to entries

  • OED Online needs to make it entirely clear to users that its website presents a mix of new, revised, and unrevised entries, some of which have been unchanged or little changed for over a hundred years. Electronic searches should distinguish between revised and unrevised entries; otherwise the results are not usable for research purposes. It is worth pointing out that if users were able to search OED3 independently of OED2, they would be in a position to appreciate the quality and characteristics of OED3’s lexicographical innovation and scholarship. The character and achievements of OED3 are currently under-recognized because they are impossible to identify systematically, i.e. across a range of entries.
  • When significant changes are made to revised entries, these should be flagged. An example is the change made to the definition of marriage after new UK legislation in 2013. The entry continues to be dated 2000. Researchers need to be able to make use of and cite dictionary entries with confidence that the dates they bear are accurate. 
  • Similarly, unrevised entries frequently contain unidentified changes and additions (to definitions, editorial notes, quotations and other components) added since date of first or subsequent print/web publication. Again, OED Online needs to find a way of recording significant changes so that academic users can use and cite Dictionary entries with an understanding of their provenance and with confidence that the date-stamping provided by OED itself is accurate.

Quotation sources

  • A pressing issue for the OED is the unevenness of balance in its most heavily cited quotation sources. These sources are listed on the OED website and accessible via a front-page link (‘Explore the top 1,000 authors and works quoted in the OED’). As of June 2020, only 28 are by identifiably female authors. The reasons for this imbalance are evidently not straightforward but the matter needs to be acknowledged and discussed and the editors should say what they are doing to tackle the issue. For example, it would be extraordinarily helpful if it were possible to search by gender of author, where known. See further EOED pages on Top sources, Fe/male sources.
  • The question of the balance of quotations between white and non-white writers of English is also a salient issue, one that OED will certainly be thinking about. Geographical spread in sources quoted is not a reliable proxy, given that many quotations are from (colonial era) white authors.

Search tools

  • Electronic searching of OED’s text continues to yield flawed results, even when using search pathways indicated by the website. For example, if you click on the top item on OED’s list of top 1,000 sources, which is The Times, and follow the directions to identify the quotations in question, many of the results turn out to be from unrelated publications (Musical Times, N.Y. Times, Financial Times, etc). With large bodies of evidence it is impracticable for users to weed out false results by hand or by subsequent searches.
  • The form in which website results are provided is not usable for research purposes. By contrast, the Text Visualizer’s provision of different formats for search results is exemplary. Similar features should be imported into OED Online. 
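The false matches described in the first point above are what one would expect if a source title were matched as a substring rather than as a whole title; that is a guess about the cause, not something the OED website documents. A minimal Python sketch of the difference, using the publication names mentioned above as sample titles (how OED actually abbreviates The Times in its citations is an assumption here, and the function names are hypothetical):

```python
import re


def normalise_title(title):
    """Lower-case, collapse whitespace, and drop a leading 'The'."""
    cleaned = re.sub(r"\s+", " ", title).strip().lower()
    return re.sub(r"^the ", "", cleaned)


def exact_source_matches(titles, wanted):
    """Keep only titles that equal the wanted source after normalisation."""
    target = normalise_title(wanted)
    return [t for t in titles if normalise_title(t) == target]


# Sample citation titles; the forms "Times" and "The Times" are assumed.
titles = ["Times", "Musical Times", "N.Y. Times", "Financial Times", "The Times"]

# Substring matching sweeps in the unrelated publications.
substring_hits = [t for t in titles if "times" in t.lower()]

# Whole-title matching keeps only the intended source.
exact_hits = exact_source_matches(titles, "The Times")

print(substring_hits)
print(exact_hits)
```

Filtering of this kind is only possible once results can be exported in an analysable format, which is the point made in the second item above.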

Editorial principles and practice, and other accompanying information

  • Description of editorial principles and practice. Over its initial 20 years OED3’s editorial practices – and by inference, editorial policies – have varied considerably, e.g. on the provision of and criteria for usage notes and labels of various kinds. Users need full information and guidance here, preferably in one location on the website which is easy to locate, access and search.
  • The ‘About’ section of the website contains much valuable material (e.g. on the history of the OED) but is hard to navigate. Users are often unaware of its contents. It needs to be completely reorganized, with content properly indexed and pages dated.

EOED re-launched

November 2019 sees the launch of the new version of Examining the OED. The site has been rewritten and reorganized and lots of new material added – notably under Period coverage, where we look at the changes the new version of OED (OED3) is gradually making to the OED’s picture of the chronological shape of the language.

To find your way around, have a look at our new Contents page and (for a list of all material on the website) Site map.

All feedback and suggestions welcome.