Higher revision available: you are viewing revision 13 of this document. A higher revision of this document has been published: Revision 17.

European Literary Text Collection (ELTeC) in TextGridRep (2021-11-22)

This is the project site of the European Literary Text Collection (ELTeC) in the TextGrid Repository (TextGridRep). The goal of adding the ELTeC to TextGridRep is to publish and archive this valuable set of corpora in European languages and to combine them with the technical possibilities that TextGridRep offers. Below, we list some of the possibilities that TextGridRep offers to researchers and readers who are interested in the ELTeC. Currently, we have imported the 11 subcorpora of the ELTeC that contain more than 50 novels.

Browsing the ELTeC in TextGridRep

Here we present some ways to browse the ELTeC in TextGridRep:

In all these cases, you can add further filters with the options on the left.

Languages

Here are links to all texts from each language subcorpus (edition.language:"[language]"):
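The query pattern above can also be built programmatically. The following is a minimal sketch, not an official client: it assumes that the TextGridRep search page lives at https://textgridrep.org/search and accepts the query in a URL parameter named query, and it uses "fra" only as a placeholder language value. Please check the values that the language facet actually offers.

```python
from urllib.parse import urlencode

# Assumption: address and parameter name of the TextGridRep search page.
SEARCH_URL = "https://textgridrep.org/search"


def language_query(language: str) -> str:
    """Return a query of the form edition.language:"[language]"."""
    return f'edition.language:"{language}"'


def search_url(query: str) -> str:
    """Percent-encode the query and append it to the assumed search URL."""
    return f"{SEARCH_URL}?{urlencode({'query': query})}"


if __name__ == "__main__":
    # "fra" is a placeholder value for illustration only.
    print(search_url(language_query("fra")))
```

The colon and the quotation marks are percent-encoded by urlencode, so the resulting link can be pasted into a browser or a script without further escaping.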

Filtering through Specific Metadata of the ELTeC (Facets)

Because some specific metadata fields are relevant for the composition of the ELTeC, these have been incorporated into TextGridRep as new searchable metadata (facets). For this, the metadata in the TEI files was incorporated into the TextGrid metadata fields. Here we present some possible queries specific to the ELTeC:

Of course, queries combining these facets are possible.
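As a rough illustration of such a combined query, the sketch below joins several field:"value" clauses into one query string. It assumes that clauses can be joined with AND, as in Lucene-style query syntax. The field name example.facet and its value are hypothetical placeholders; only edition.language is taken from this page.

```python
def facet_clause(field: str, value: str) -> str:
    """Build one clause of the form field:"value", escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def combine(clauses, operator: str = "AND") -> str:
    """Join clauses with a boolean operator (assumed Lucene-style syntax)."""
    return f" {operator} ".join(clauses)


if __name__ == "__main__":
    query = combine([
        facet_clause("edition.language", "fra"),   # placeholder language value
        facet_clause("example.facet", "value"),    # hypothetical facet name
    ])
    print(query)
```

Replace the placeholder facet with the facet names shown in the filter options of the search page.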

Benefits of ELTeC in TextGridRep

The ELTeC is already available as GitHub repositories and on Zenodo. So what is the motivation for publishing it in TextGridRep as well? In our opinion, TextGridRep offers a series of advantages to the ELTeC and its community of users:

  1. Long-term archive: TextGridRep is a long-term repository that has been awarded the CoreTrustSeal
  2. Identification: TextGridRep assigns persistent identifiers to all subcorpora, works and editions of the ELTeC
  3. Integration: in TextGridRep, the ELTeC is integrated into one of the largest openly available literary corpora
  4. Combination with other corpora: users can easily combine texts of the ELTeC with other corpora, for example by filtering the entire TextGridRep by language or year of publication
  5. Shelf function: TextGridRep offers the shelf function, with which any user can combine texts
  6. Publication in HTML: in contrast to other platforms, the TEI files are also published as HTML, enabling search engines to find them easily
  7. Transformation: besides the HTML format, all texts in TextGridRep are automatically transformed into other formats (ZIP, EPUB, plain text)
  8. Analysis: TextGrid allows sending single texts or entire subcorpora to Natural Language Processing tools (via Switchboard) and Digital Humanities tools (Voyant)
  9. Integration in the NFDI Consortium Text+ Portfolio: TextGridRep is one of the services of the Consortium Text+, which is part of the German National Research Data Infrastructure (NFDI)
  10. Integration in future services: TextGridRep is being further developed in association with several ongoing projects. Through its integration, the ELTeC will profit from future features and developments

TextGrid Metadata Files

The basic metadata is covered by the TextGrid Metadata schema in the Edition and Work metadata. All additional project-specific metadata is covered by the metadata added to the works. Please see the following two examples:

Technically, there are two kinds of metadata: metadata that can be searched using facets, and metadata that cannot.

Further Internal Documentation

Example Project

Project name: DISTANT READING XI

Project ID: TGPR-d683fb0f-b71d-89fa-6678-61979ac32d0f

Issues

Further, up-to-date issues can be found in the project's GitLab Issues and in the GitLab import project.


Citation Suggestion for this Object
TextGrid Repository (2021). README.md. Distant Reading – 2022-11-22. https://hdl.handle.net/21.T11991/0000-001B-9340-4