Victorian Periodical Text Collection

Dr Alexis Antonia


Right click and 'Save as' to download:

Corpus 200

The initial collection was assembled as the working corpus for a research higher degree entitled Anonymity, Individuality and Commonality in Writing in British Periodicals between 1830 and 1890: A Computational Stylistics Approach.

Nova: The University of Newcastle's Digital Repository. Identifier: uon:6075

It was intended that there should be a sufficient number of articles to provide a good representation of the repertoire of discursive prose as it stood at the time. The 200 texts were all published in periodical journals during the sixty-year period from 1829, when the three major quarterlies were dominating the scene, through the 50s, 60s and 70s, when the monthlies came into their own and challenged the quarterlies for reader loyalty, to 1890, after which both began to decline in popularity. Though most of the articles were anonymous at the time of publication, they all appear to have been reliably attributed, thanks largely to the invaluable work of the Wellesley Index.

These initial articles were taken from five quarterlies (99 articles) and six monthlies (101 articles) (table 1 below) and comprised just under two million words.

Table 1

Quarterlies                | Monthlies
The Edinburgh Review       | Blackwood's Edinburgh Magazine
The Quarterly Review       | Cornhill Magazine
The Westminster Review     | The Fortnightly Review (which became monthly)
Bentley's Quarterly Review | Fraser's Magazine for Town and Country
The National Review        | Macmillan's Magazine
                           | Tait's Edinburgh Magazine

Eight women and fourteen men wrote the articles (table 2 below), the gender imbalance reflecting the fact that many more men than women were writing for the journals. Each author is represented by at least five and at most fourteen texts. The authors represent a good spectrum of the variety of writers contributing to the journals at the time: from those who considered themselves primarily journalists to those who contributed articles as a sideline, and from those who wrote from economic necessity to those who combined journalism with other forms of writing.

Table 2

Men                                   | Women
Walter Bagehot (1826-1877)            | Frances Power Cobbe (1822-1904)
John Stuart Blackie (1809-1895)       | George Eliot (1819-1880)
John Hill Burton (1809-1881)          | Christian Johnstone (1781-1857)
Thomas Carlyle (1795-1881)            | Eliza Lynn Linton (1822-1898)
Lord Robert Cecil (1830-1903)         | Harriet Martineau (1802-1876)
John Wilson Croker (1780-1857)        | Anne Mozley (1809-1891)
James Anthony Froude (1819-1894)      | Margaret Oliphant (1828-1897)
William Rathbone Greg (1809-1881)     | Elizabeth Lady Eastlake née Rigby (1809-1893)
Abraham Hayward (1801-1884)           |
Thomas Henry Huxley (1825-1895)       |
Charles Kingsley (1819-1875)          |
George Henry Lewes (1817-1878)        |
Thomas Babington Macaulay (1800-1859) |
Sir Leslie Stephen (1832-1904)        |

Expanding Corpus

In the years following the completion of the research higher degree, a number of additional texts were added to the collection for various reasons: (i) in order to pursue a particular research enquiry; (ii) because of intrinsic interest; or simply (iii) because they were available in electronic form. The corpus itself was used as one of two large corpora for testing the relative merits of word n-grams of different sizes in authorship attribution.

Antonia, Alexis, Hugh Craig and Jack Elliott. "Language chunking, data sparseness, and the value of a long marker list: explorations with word n-grams and authorial attribution". Literary and Linguistic Computing, 2013.

Saturday Review Collection

The Saturday Review collection was initially compiled for the specific purpose of carrying out a number of attribution tests on various anonymous articles. In particular, we were interested in shedding light on the long-standing question of who wrote the rather spiteful and condescending 'Women's Movement' articles in the Saturday in the 1850s.

"Who Wrote the Women's Movement Articles in The Saturday Review?" Nineteenth-Century Gender Studies (2008)

A second line of enquiry looked at the 'Modern Women' series of articles (1866-68), demonstrating the ease with which the methods of computational stylistics could separate similar-seeming articles written by different authors.

A third line of enquiry attempted to describe the distinctiveness of the 'house style' of the Saturday by comparing the articles of six authors who wrote both for the Saturday and for other journals.

Craig, Hugh and Alexis Antonia. "Six Authors and the Saturday Review: A Quantitative Approach to Style". Victorian Periodicals Review, 48, 1, 2015.

Tait's Edinburgh Magazine additional texts

Following correspondence with Eileen Curran concerning her doubts about some of the Wellesley attributions in Tait's Edinburgh Magazine for two Scottish authors (John Stuart Blackie and John Hill Burton), additional texts were prepared so that some of the uncertainly attributed texts could be tested against well-attributed ones.

Antonia, Alexis and Ellen Jordan. "Checking some Wellesley Index Attributions by Empirical 'Internal Evidence': The Case of Blackie and Burton." Authorship, 1.1, Fall 2011.

Christian Remembrancer Collection

A number of Christian Remembrancer texts were prepared for a series of investigations seeking to identify the contributions of Anne Mozley to the journal.

Jordan, Ellen, Hugh Craig and Alexis Antonia. "The Brontë Sisters and the Christian Remembrancer: A Pilot Study in the Use of the 'Burrows Method' to Identify the Authorship of Unsigned Articles in the Nineteenth Century Periodical Press". Victorian Periodicals Review, 2006.

Antonia, Alexis and Ellen Jordan. "Identifying Anne Mozley's Contributions to the Christian Remembrancer: A Computational Stylistic Approach". Victorian Literature and Culture, 42, 2, 2014.

Acquisition of Electronic Texts

A variety of methods was used to obtain the electronic texts of the collection. Most of the texts were transcribed onto the computer from a photo-image or a microfilm copy of the journal article. Some articles were sourced from public domain electronic texts available in online collections: the National Library of Australia's online ProQuest British Periodicals Collection and the Oxford Internet Library of Early Journals, both of which provided photo images of texts for either transcribing or scanning, and the Gutenberg site, which allowed the downloading of texts in editable form. Other articles were sourced from microfilm copies of the journals. Where published editions of periodical articles existed in authorial collections of writings, these were photocopied and then scanned or transcribed.

Editing of the Electronic Texts in Preparation for Computational Stylistic Research Work

Good electronic text preparation is vital to the success of any computational stylistics project and must be done with thoroughness and exactitude. E-texts need to be proof-read, since both keyboarding and OCR scanning can produce unexpected errors. The next step is to prepare the electronic texts for counting. Various protocols were adopted to ensure that, when the counting took place, the machine counted only what it was supposed to count. My practice for ensuring consistency throughout the periodical collection was to use the angle-bracket notation of the Text Encoding Initiative (TEI) protocol for all exclusions and changes (listed below) so that these would remain obvious and recoverable.
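As a rough illustration of why the angle-bracket markup keeps exclusions recoverable while hiding them from the counting stage, the following Python sketch removes marked-up material before tallying words. It is not the software used in this project. The element names quote and note, the helper names strip_exclusions and count_words, and the sample sentence are all hypothetical, chosen only for the example.

```python
import re
from collections import Counter

# Hypothetical element names marking material to be left out of the counts.
# The tags actually used in the collection may differ.
EXCLUDED_TAGS = ("quote", "note")


def strip_exclusions(text, tags=EXCLUDED_TAGS):
    """Drop each excluded element with its content, then any remaining tags."""
    for tag in tags:
        # Non-greedy match so that separate elements are removed one by one.
        text = re.sub(rf"<{tag}\b[^>]*>.*?</{tag}>", " ", text, flags=re.DOTALL)
    # Any other angle-bracket tag is removed, but the text it wraps is kept.
    return re.sub(r"<[^>]+>", " ", text)


def count_words(text):
    """Count lower-cased word forms in the text that survives the exclusions."""
    cleaned = strip_exclusions(text).lower()
    return Counter(re.findall(r"[a-z]+(?:'[a-z]+)*", cleaned))


sample = "The critic writes <quote>To be or not to be</quote> and the reader agrees."
counts = count_words(sample)
```

Because the tagged passage stays in the stored file and is only skipped at counting time, the original wording can always be restored by reading the file without the stripping step.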

Exclusions and Changes