Semantic Collaborative Corpora Analysis for Humanities and Social Sciences (Semantic CorA) is a virtual research environment (VRE), i.e. a collaborative, semantic Web 2.0 tool from the area of eHumanities. It was developed as an exemplar for research in the history of education and can be adapted for future use by other disciplines.

As part of the project Virtual Research Environment for Research in the History of Education with semantic Wiki technology, a virtual research environment was developed based on Semantic MediaWiki (SMW). This technology allows for the collaborative analysis of comprehensive digitised text corpora and makes them accessible to members of the research community of educational historians. The virtual research environment was initially developed for the history of education. It allows researchers to integrate digitised objects and their bibliographic metadata and to analyse them collaboratively, both qualitatively and quantitatively, thus connecting the concept of Linked Data with research in the humanities. This eHumanities tool allows libraries to introduce the results of their digitisation projects (primary data) into scientific discourse. Direct semantic linkage of the digitised objects and the outcomes of analyses can generate added value along the chain of scientific value creation, and archiving is possible in integrated form. Semantic CorA supports the generation of research findings through collaborative processes and offers a platform for publishing findings, which thus also become resources for further research.

The virtual research environment "Semantic CorA" is a typical tool from the area of eHumanities. eHumanities (e as in "enhanced" or "enabled") aims to support research in the humanities and qualitative social sciences, usually based on digital infrastructures.

Semantic CorA is iteratively developed and directed towards research practice – including agile computing, stepwise open-source publication and participative design. Researchers themselves are engaged in its design and encouraged to actively participate.

Projects using Semantic CorA

Centre for Digital Research in the Humanities, Social Sciences and Educational Science (CEDIFOR)

As a contact point for eHumanities, CEDIFOR aims to support research in the humanities and social sciences with qualitative orientation. In this context, DIPF specifically addresses transdisciplinary and national educational research.

The Centre for Digital Research in the Humanities, Social Sciences, and Educational Science aims to provide methodological expertise for initiating and supporting research in the humanities and social sciences with qualitative orientation in the Rhine-Main area, including a transdisciplinary educational scientific perspective. The sustainable storage of research data is intended. Pilot projects from eHumanities will be supervised in their conceptualisation and preparation of relevant research tasks, necessary technological support will be assured, and CEDIFOR will also provide an infrastructure and thus store project data and research findings.

We intend CEDIFOR to become a widely recognised point of contact for eHumanities. Drawing on in-depth expertise and a comprehensive toolkit, customised solutions will be compiled for research questions in the humanities. CEDIFOR will also support their implementation in research practice.

During the project funding phase, the CEDIFOR innovation strategy will be demonstrated directly at DIPF through support for three educational research projects. As a key provider of research and information infrastructures in education, DIPF pursues transdisciplinary, nationwide support for educational research within the joint eHumanities project.

The supported research projects are:

  • Forschungskapazitäten für die qualitative Forschung durch Kollaboration und semantische Auszeichnung. Das Beispiel Unterrichtsinteraktion. (in cooperation with Goethe-Universität, Frankfurt)
  • Virtuelle Forschungsumgebung zur kollaborativen Analyse von Klassenraumfotografien. (in cooperation with TU Braunschweig)
  • Interlinking Pictura – Bertuchs „Bilderbuch für Kinder“ als semantisches Netz. (in cooperation with Bibliothek für Bildungsgeschichtliche Forschung (BBF), Berlin)

A Virtual Research Environment for the History of Education based on a Semantic Wiki Technology (Semantic MediaWiki for Collaborative Corpora Analysis: Semantic CorA)

The project "A Virtual Research Environment for the History of Education based on a Semantic Wiki Technology (Semantic MediaWiki for Collaborative Corpora Analysis: Semantic CorA)" aimed to develop a virtual research environment (VRE) based on Semantic MediaWiki (SMW) for the collaborative analysis of comprehensive digitised data corpora and, as an exemplar, to embed it sustainably in the professional community of the history of education. Moreover, the project provided for sharing the researchers' enrichments and analyses and, in the long term, for distributing Semantic CorA as infrastructure to other disciplines. It was funded by the German Research Foundation (DFG) from January 2011 until October 2014.

Owing to its concrete need for collaborative means of analysing pedagogical reference books, the domain of history of education offered a good starting point for realising an exemplary virtual research environment. Well-established cooperation already existed within the community of researchers, librarians and technicians. Such collaborations have, for instance, led to several digitisation projects and a body of research data for this domain. Semantic CorA permitted the integration of digitised documents along with their bibliographic metadata, their collaborative analysis in both a quantitative and a qualitative sense, and the connection of linked data with practical research in the digital humanities. Libraries were enabled to integrate the products of their digitisation projects (primary data) into professional discourse, to generate added scientific value by semantically linking digitised resources with analytic results, and to archive both in integrated form.

Semantic CorA ties in with concrete research projects in the history of education that aim at discourse and field analyses of pedagogical reference works. Dictionaries from Scripta Paedagogica Online (SPO) (1774-1942), hosted by the Research Library for the History of Education (BBF) at the German Institute for International Educational Research (DIPF), are integrated. SPO indexes references to relevant pedagogical works at the level of articles, rendering them accessible online as image files. The corpus started with 25 lexica and a total of nearly 22,000 articles. The researchers have since extended the corpus to more than 80 lexica.

The Virtual Research Environment (VRE) was designed participatively: the researchers were consulted and empowered to take an active part in development. In this way, development was aligned with the research process, and an agile approach with step-by-step open-source publication was realised.

This project, entitled "Entwicklung einer Virtuellen Forschungsumgebung für die Historische Bildungsforschung mit Semantischer Wiki-Technologie - Semantic MediaWiki for Collaborative Corpora Analysis", was funded by the German Research Foundation (DFG) (INST 367/5-1, INST 5580/1-1 and RI 803/10-2, STU 170/21-2, HO 2134/7-2) in the domain of Scientific Library Services and Information Systems (LIS). It was realised in cooperation between the German Institute for International Educational Research (DIPF), the Karlsruhe Institute of Technology (KIT), the Research Library for the History of Education (BBF), and the Georg-August-University Göttingen.


Interrelation between communities in the design of Semantic CorA

Semantic CorA focuses on the social sciences and humanities and establishes VREs in the research community of historical research in education. Collaborative work should be possible in the maintenance and analysis of research data, with special attention paid to the re-use of research data at the beginning and the end of the research process. Semantic CorA therefore relies on Semantic MediaWiki, which ensures a certain degree of interoperability for newly created data through its RDF export features (and other export formats such as CSV and JSON).
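
In Semantic MediaWiki, the RDF export of a single wiki page can be reached through the Special:ExportRDF special page. The following is a minimal Python sketch of how such an export address might be built; the wiki address and the page title are hypothetical placeholders and do not refer to the project's actual installation.

```python
from urllib.parse import urlencode


def rdf_export_url(index_php_url: str, page_title: str) -> str:
    """Build the Special:ExportRDF address for one wiki page.

    index_php_url is the wiki's index.php entry point (an assumption
    about how the wiki is set up); page_title is the title of the page
    whose semantic annotations should be exported.
    """
    # MediaWiki stores spaces in titles as underscores.
    title = "Special:ExportRDF/" + page_title.replace(" ", "_")
    return index_php_url + "?" + urlencode({"title": title})


# Hypothetical wiki and page title, for illustration only.
example_url = rdf_export_url("https://wiki.example.org/index.php", "Example Lexicon")
print(example_url)
```

The resulting document can then be loaded into any RDF-aware tool, which is what makes the data reusable outside the wiki.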

Semantic CorA aims at connecting three different communities which are:

  • researchers in the social sciences and humanities
  • developers
  • digital libraries.

Semantic MediaWiki

Semantic MediaWiki was chosen as a platform because it is a lightweight system with a broad community of developers and users. As its development is open source, the outcomes of our project can easily be reused and adapted by others. Semantic CorA does not aim at developing a (technically) new VRE; instead, the basic modular system architecture of MediaWiki (and thus of Semantic MediaWiki) can be adapted to specific needs. Semantic CorA targets the management and analysis of large corpora but explicitly does not claim to be a large-scale solution for the totality of research fields in the humanities and social sciences. The RDF support of Semantic MediaWiki, and the interoperability it brings, is a fundamental criterion because it ensures that new data can be reused in other RDF-based systems. This interoperability at the data level allows for thinking in terms of smaller-scale, more networked environments in the context of VREs.

Wikis are well known as a tool for collaborative work on the web. Despite some critical usability issues, e.g. regarding the syntax, the use of wiki-based systems draws on fundamental user experiences. Although editing in a wiki-based system is less trivial for technical laypersons than often assumed, they are generally familiar with the concept. A positive effect of Semantic CorA was that, after a while, users were able to construct their own queries and templates to gain more information from their data and interact with it more flexibly. This development is clearly due to the openness of the wiki system, which is highly adjustable compared to large VREs.
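
To give an impression of what a user-built query can look like, the following Python sketch assembles a Semantic MediaWiki ask query, builds the matching request to the wiki's `action=ask` API module, and flattens a JSON result into rows. The category, the property names, the wiki address and the sample response are invented for illustration and are not taken from the Semantic CorA data model.

```python
from urllib.parse import urlencode


def build_ask_query(conditions, printouts, limit=50):
    """Assemble an ask query string: conditions, printouts, then a limit."""
    parts = ["".join(f"[[{c}]]" for c in conditions)]
    parts += [f"?{p}" for p in printouts]
    parts.append(f"limit={limit}")
    return "|".join(parts)


def ask_api_url(api_php_url, query):
    """Build the request address for the ask API with JSON output."""
    params = {"action": "ask", "query": query, "format": "json"}
    return api_php_url + "?" + urlencode(params)


def rows_from_response(response):
    """Turn an ask API result into a list of flat dictionaries."""
    results = response.get("query", {}).get("results", {})
    # The results may arrive as an empty list when nothing matches.
    if not isinstance(results, dict):
        return []
    return [
        {"page": title, **entry.get("printouts", {})}
        for title, entry in results.items()
    ]


# Hypothetical category and properties, for illustration only.
query = build_ask_query(
    ["Category:Lexicon article", "Has lexicon::Example Lexicon"],
    ["Has author"],
    limit=10,
)

# Hand-written sample in the shape of an ask API response.
SAMPLE_RESPONSE = {
    "query": {
        "results": {
            "Article A": {"printouts": {"Has author": ["Jane Doe"]}},
        }
    }
}

print(ask_api_url("https://wiki.example.org/api.php", query))
print(rows_from_response(SAMPLE_RESPONSE))
```

The same query text, wrapped in an `#ask` parser function, can also be placed directly on a wiki page or inside a template, which is how users typically work with it in the wiki itself.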