CoNSSA: Corpus of Novels of the Spanish Silver Age

The corpus contains novels written by Spanish authors and published between 1880 and 1939. The original corpus contains 358 novels in total; however, due to copyright restrictions, only 219 can currently be published. The corpus was designed using the data of two authoritative histories of literature, which together serve as a specification of the literary population of this period. A subsection of the corpus (CoNSSA-canon) can be considered representative of the most important novels of the period.

A full description of the corpus can be found in chapters 3.1 and 3.2 of the following book (open access and in print):

In addition, an article in Spanish describes the main characteristics of the corpus (open access):

CoNSSA, Text+ and TextGrid Repository

This corpus was previously published through GitHub and Zenodo (DOI).

As part of the activities of the Text+ consortium within the German National Research Data Infrastructure (NFDI), a new version of the corpus is now also available in the TextGrid Repository.

This new version contains:

  1. A better modeling of works, editions and texts in the TEI Header, following the FRBR model
  2. Data from further editions exported from the German catalog [K10plus](https://opac.k10plus.de)
  3. A description of each work using library classification systems such as the Regensburger Verbundklassifikation (RVK), the Basic Classification (Basisklassifikation, BK), and the Göttinger Online-Klassifikation (GOK). In this way, the same classification systems used to describe primary and secondary literature are applied to research data
  4. References for works and authors to Wikidata, to the authority files GND (used in the German-speaking area) and VIAF, and to identifiers of the Spanish National Library (BNE)

Queries in TextGrid Repository

  • Search for words:
    • Madrid
    • dictador
  • Further search options are available, such as wildcards (*) and fuzzy matching (~):
    • Españ*
    • mujeres~, hombres~
  • Search for authors (with complete name, part of the name or GND-ID):
    • work.agent.value: Benito Pérez Galdós
    • work.agent.value: Galdós
    • work.agent.id:"gnd:118641573"
  • Search for gender:
    • work.subject.id.value: authorGender AND work.subject.value: female
  • Search for year of publication:
    • published in: work.dateOfCreation.value:1900
    • published after: work.dateOfCreation.value:>1901
    • published before: work.dateOfCreation.value:<1901
    • published between: work.dateOfCreation.value:>1900 work.dateOfCreation.value:<1910

These searches can be combined to construct fairly complex queries using information about the author, the edition and the text. For example, the following query should find all texts written by women and published between 1890 and 1900 in which the root Españ appears:

  • work.subject.id.value: authorGender AND work.subject.value: female AND work.dateOfCreation.value:>1890 work.dateOfCreation.value:<1900 AND Españ*
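The combined query above can also be assembled programmatically. The following is a minimal sketch in Python, not part of the repository's tooling: the function `build_query` and its parameters are hypothetical names chosen for illustration, and only the field names are taken from the examples above. It assumes that the search accepts an explicit AND between every clause, including the two date clauses.

```python
def build_query(*, gender=None, after=None, before=None, term=None) -> str:
    """Assemble a search string from optional filters.

    Hypothetical helper: the field names come from the example queries
    in this document; the function itself is only an illustration.
    """
    clauses = []
    if gender is not None:
        # Author gender is expressed as a subject with the id "authorGender".
        clauses.append("work.subject.id.value: authorGender")
        clauses.append(f"work.subject.value: {gender}")
    if after is not None:
        clauses.append(f"work.dateOfCreation.value:>{after}")
    if before is not None:
        clauses.append(f"work.dateOfCreation.value:<{before}")
    if term is not None:
        # Free-text term, optionally with a wildcard such as Españ*
        clauses.append(term)
    # Assumption: every clause is joined with an explicit AND.
    return " AND ".join(clauses)


query = build_query(gender="female", after=1890, before=1900, term="Españ*")
print(query)
```

The resulting string can be pasted into the search field of the repository. Leaving out a parameter simply drops the corresponding clause, so the same helper covers the simpler searches listed earlier.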

Why publish this corpus in TextGrid Repository if it was already available in GitHub and Zenodo?

  1. Persistent identifiers
  2. Repository with CoreTrustSeal certification
  3. Repository for XML TEI
  4. Search functions
  5. Filtering functions
  6. Links to GND
  7. Combination
  8. Analysis
  9. Automatic annotation
  10. Manual annotation
  11. Download options
  12. Further developments
  13. Publication for reading

History of the corpus

The corpus was compiled as part of the PhD of José Calvo Tello at the University of Würzburg (Germany). It was part of the project Computational Literary Genre Stylistics (CLiGS), led by Prof. Dr. Christof Schöch. The project was based at the chair of Prof. Dr. Fotis Jannidis.

The goal of the project was to analyze the Spanish novel and its subgenres (adventure, erotic, realistic novel, etc.) in the so-called Silver Age period (1880-1939).

Current version

Because of these changes, the corpus is now at version 2.0. In particular, the new FRBR modeling means that much of the metadata now sits in a different place in the TEI Header, so XPath expressions that extract this information have to be updated.
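
One way to keep extraction scripts working across both versions is to try several candidate XPaths in order. The sketch below uses Python's standard library on a tiny made-up TEI header; the sample document, the helper `first_match`, and the first candidate path are assumptions for illustration only and do not reproduce the actual CoNSSA header layout.

```python
import xml.etree.ElementTree as ET

# Standard TEI namespace, bound to the prefix "tei" for use in paths.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented minimal header, NOT an excerpt from the corpus.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Example title</title>
      </titleStmt>
    </fileDesc>
  </teiHeader>
</TEI>"""

# Candidate locations, tried in order. The first path is hypothetical and
# stands in for "wherever the other version stores the value".
CANDIDATE_PATHS = [
    "./tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:bibl/tei:title",
    "./tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title",
]


def first_match(root, paths):
    """Return the text of the first path that matches, or None."""
    for path in paths:
        node = root.find(path, TEI_NS)
        if node is not None and node.text:
            return node.text.strip()
    return None


root = ET.fromstring(SAMPLE)
title = first_match(root, CANDIDATE_PATHS)
print(title)
```

Listing the paths in one place means that a script written for the earlier version only needs new entries added for the version 2.0 header, rather than changes throughout the code.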