(a) Monolingual
(b) Professional writing (texts of academic standard)
(c) Synchronic (1995-2002)
(d) Regional variety (AmE/BrE/etc.)
(e) Sample (50,000 words per journal)
(f) Selection criteria
To keep the selection of journal texts objective, the project team based content decisions on data from the Journal Citation Reports (JCR), which provides quantitative statistics for systematically assessing the relative importance of journals within their subject categories. As of 2001, the Science Edition of the JCR covered about 5,700 journals. It uses an indicator called “Impact Factor,” which provides a way to evaluate or compare a journal’s relative importance as perceived by others in the same field. Using these data, the journals ranked in the top 20% by impact factor in each field were selected for inclusion in the PERC Corpus. JCR classifications were also used to define the subject fields.
(i) Domains: science and technology, including life science. Texts from approximately 170 subdomains are classified into 22 domains, which can be accessed separately as sub-corpora (for further details, see the sub-corpus sections of the concordancer). The 22 domains are as follows:
Agriculture
Biology
Chemistry
Civil Engineering
Computer Science
Construction & Building Technology
Earth Science
Electrical & Electronic Engineering
Engineering
Environmental Sciences
Fisheries
Food Science
Forestry
General Science
Materials Science
Mathematics
Medicine
Metallurgy & Metallurgical Engineering
Nuclear Science & Technology
Oceanography
Physics
Telecommunications
(ii) Media: academic journals
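The selection rule described under (f), taking the journals in the top 20% by impact factor within each field, can be illustrated with a short sketch. This is not the project's actual procedure: the function name, the input layout (title, field, impact factor), the rounding up of the 20% cut-off, the minimum of one journal per field, and the alphabetical tie-breaking are all assumptions made for illustration.

```python
from collections import defaultdict


def select_top_journals(journals, percent=20):
    """Return, per field, the titles in the top `percent` by impact factor.

    `journals` is an iterable of (title, field, impact_factor) tuples.
    Assumptions (not stated in the corpus description): the cut-off is
    rounded up, every field keeps at least one journal, and ties are
    broken alphabetically by title.
    """
    by_field = defaultdict(list)
    for title, field, impact_factor in journals:
        by_field[field].append((impact_factor, title))

    selected = {}
    for field, entries in by_field.items():
        # Highest impact factor first; title as a deterministic tie-breaker.
        entries.sort(key=lambda entry: (-entry[0], entry[1]))
        # Integer ceiling of len(entries) * percent / 100, at least 1.
        keep = max(1, (len(entries) * percent + 99) // 100)
        selected[field] = [title for _, title in entries[:keep]]
    return selected


if __name__ == "__main__":
    # Hypothetical journals and impact factors, for demonstration only.
    demo = [
        ("Journal A", "Physics", 5.0),
        ("Journal B", "Physics", 4.0),
        ("Journal C", "Physics", 3.0),
        ("Journal D", "Forestry", 1.2),
    ]
    print(select_top_journals(demo))
```

Grouping by field before ranking matters because impact factors are only comparable among journals in the same subject category.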
The following information is indicated by the mark-up:
1. Sentence boundaries, parts of speech, and lemmas
2. Meta-textual information on the source and encoding of individual texts (detailed descriptive information is added to each text in the form of a header, which includes the author's name, title, publication year, journal title, etc.)
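As a rough illustration of the two kinds of mark-up listed above, the sketch below models a text as a header plus a list of sentences, each a list of tokens carrying a part of speech and a lemma. The class names, field names, and the tag labels used in the example are hypothetical; the actual encoding format and tagset of the corpus are not specified here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Token:
    word: str   # surface form as it appears in the text
    pos: str    # part-of-speech tag (tagset assumed, not the corpus's own)
    lemma: str  # dictionary form of the word


@dataclass
class TextHeader:
    # Header fields named in the description; the real header may hold more.
    author: str
    title: str
    year: int
    journal: str


@dataclass
class CorpusText:
    header: TextHeader
    # Each inner list is one sentence, so sentence boundaries are explicit.
    sentences: List[List[Token]] = field(default_factory=list)

    def lemmas(self) -> List[str]:
        """Return all lemmas in reading order, across sentences."""
        return [tok.lemma for sent in self.sentences for tok in sent]


# Hypothetical example text, for demonstration only.
example = CorpusText(
    header=TextHeader("A. Author", "An Example Title", 1999, "Example Journal"),
    sentences=[[Token("Cells", "NOUN", "cell"), Token("divided", "VERB", "divide")]],
)
```

Keeping the header separate from the token data means a sub-corpus can be filtered by journal or year without touching the annotated text.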