DARIAH Virtual Annual Event 2020, 7 November 2020, Zagreb, Croatia


Goulis, H., Tsouloucha, E., “BBT User stories. Benefits from joining the thesaurus federation”, Presented at the Scholarly Primitives DARIAH Virtual Annual Event 2020. Zenodo.

The paper briefly presents the benefits of joining the Backbone Thesaurus (BBT) federation and becoming part of a gradually growing community of thesauri maintainers. It focuses on the experience of DAI and FRANTIQ as use cases of existing thesauri holders that have aligned to the BBT. The BBT is the research outcome of work undertaken by the Thesaurus Maintenance WG to design and establish a coherent overarching thesaurus for the Humanities, under which specialist thesauri and structured vocabularies used across scholarly communities can be aligned. Its core feature is that it promotes alignment of cutting-edge terminology to well-formed general terms of the meta-thesaurus that capture general meanings. It favors a loose integration of multiple thesauri by mapping them to a small set of top-level concepts (facets and hierarchies). This enables cross-disciplinary resource discovery while ensuring compatibility with thesauri that cover highly specific scientific domains and areas of knowledge in development.
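The loose integration described above can be illustrated with a minimal sketch. It assumes that alignment is recorded as a simple lookup from a local term to one top-level concept. The thesaurus names, terms, concept labels, and the functions `terms_under` and `related_terms` are hypothetical placeholders, not the actual BBT facets or any federation tooling.

```python
# Hypothetical alignments: (thesaurus, local term) -> top-level concept.
# The concept labels are placeholders, not the real BBT facets.
ALIGNMENTS = {
    ("thesaurus_a", "amphora"): "material things",
    ("thesaurus_a", "excavation"): "activities",
    ("thesaurus_b", "ceramic vessel"): "material things",
    ("thesaurus_b", "field survey"): "activities",
}


def terms_under(concept):
    """Return every aligned local term, from any thesaurus, under one concept."""
    return sorted(
        (thesaurus, term)
        for (thesaurus, term), top in ALIGNMENTS.items()
        if top == concept
    )


def related_terms(thesaurus, term):
    """Find terms in other thesauri that share the given term's top-level concept."""
    top = ALIGNMENTS[(thesaurus, term)]
    return [(t, x) for (t, x) in terms_under(top) if t != thesaurus]


if __name__ == "__main__":
    print(related_terms("thesaurus_a", "amphora"))
```

Because each thesaurus only maps upwards to the shared concepts, neither one has to change its own internal structure for its terms to be discovered together.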

Dritsou, V., M. Ilvanidou, I. Despotidou, V. Liakopoulou, K. Vourvachaki, P. Constantopoulos, “Integrating archival materials for the study of the turbulent Greek 40s”, DARIAH Virtual Annual Event 2020: Scholarly Primitives, 10-13 Nov. 2020, sciencesconf.org:dariah-ae-2020:309708.

The publication received an honourable mention.

Humanities researchers often need to study heterogeneous digitized archives from different sources. But how can they deal with this heterogeneity, both in terms of structure and semantics? What are the digital tools they can use in order to integrate resources and study them as a whole? And what if they are unfamiliar with the methods and tools available? Towards this end, the DARIAH-EU and CLARIN research infrastructures already support researchers in exploiting digital tools. Specific use case research scenarios have also been developed, with the PARTHENOS SSK being a successful example. In this paper we describe our related (ongoing) experience from the development of the Greek research infrastructure APOLLONIS, where, among other things, we have focused on identifying and supporting the workflows that researchers need to follow to perform specific research studies while jointly accessing disparate archives. Using the decade of the 1940s as a use case, a turbulent period in Greek history due to its significant events (WWII, Occupation, Opposition, Liberation, Civil War), we have assembled (digitized) historical archives, coming from different providers and shedding light on different historical aspects of these events. From the acquisition of the resources to the desired final outcome, we record the workflows of the whole research study, including the initial curation process of the digitized archives, the ingestion, the joint indexing of the data, the generation of semantic graph representations and, finally, their publication and searching. After the acquisition of the heterogeneous source materials we perform a detailed investigation of their structure and contents, in order to map the different archive metadata onto a common metadata schema, thus enabling joint indexing and establishing semantic relations among the contents of the archives. The next step is data cleaning, where messy records are cleaned and normalized.
Natural Language Processing methods are then exploited for the extraction of additional information contained in the archival records or in free text metadata fields, such as persons, places, armed units, dates, and topics, which enhance the initial datasets. The outcome is encoded in XML using the common schema and ingested into a common repository through an aggregator implemented using the MoRE system. A joint index based on a set of basic criteria is generated and maintained, thus ensuring joint access to all archival records regardless of their source. In addition, an RDF representation is generated from the encoded archival data, enabling their publication in the form of a semantic graph and supporting interesting complex queries. This is based on a specifically designed extension of CIDOC CRM and a compilation of a list of research queries of varying complexity encoded in SPARQL. Preliminary tests of the entire workflows and the tools used in all steps yielded very encouraging results. Our immediate plans include full scale ingestion and indexing of the material from a number of archives, producing the corresponding semantic graph and streamlining the incorporation of new archives.
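As a rough illustration of the mapping, cleaning, and joint indexing steps described in this abstract, the sketch below renames source-specific fields to a common schema, normalizes the values, and builds one index across both sources. Everything in it is assumed for illustration: the archive names, field names, sample records, and functions are invented, and plain Python dictionaries and tuples stand in for the XML encoding, the MoRE aggregator, and the CIDOC CRM based RDF graph used in the actual work.

```python
from collections import defaultdict

# Hypothetical field mappings: source-specific keys -> common schema keys.
FIELD_MAPS = {
    "archive_a": {"ttl": "title", "dt": "date", "loc": "place"},
    "archive_b": {"name": "title", "year": "date", "location": "place"},
}


def to_common(source, record):
    """Rename a record's fields to the common schema, trim values, tag the origin."""
    mapping = FIELD_MAPS[source]
    common = {mapping[k]: str(v).strip() for k, v in record.items() if k in mapping}
    common["source"] = source
    return common


def build_index(records, fields=("date", "place")):
    """Joint index: (field, lowercased value) -> positions in `records`."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        for field in fields:
            if field in record:
                index[(field, record[field].lower())].append(position)
    return index


def to_triples(records):
    """Flatten records into (subject, predicate, object) tuples, a stand-in for RDF."""
    return [
        (f"rec:{position}", key, value)
        for position, record in enumerate(records)
        for key, value in record.items()
    ]


# Invented sample records, one per hypothetical archive.
records = [
    to_common("archive_a", {"ttl": "Report on supplies", "dt": "1943", "loc": "Athens "}),
    to_common("archive_b", {"name": "Letter to headquarters", "year": 1943, "location": "athens"}),
]
index = build_index(records)
triples = to_triples(records)

if __name__ == "__main__":
    print(index[("place", "athens")])
```

Looking up `("place", "athens")` in the index returns records from both archives, which is the point of mapping to a common schema before indexing.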

Constantopoulos, P., G. Artopoulos, C. Dallas, P. Hacigüzeller, “Process graphs”, DARIAH Virtual Annual Event 2020: Scholarly Primitives, 10-13 Nov. 2020

In the light of Unsworth’s seminal work on scholarly primitives and early initiatives to understand the nature and capture the variety of digital methods and tools in the humanities through classifications (such as the AHDS Taxonomy of Computational Methods and the AHRC ICT Methods Network), the endeavour to understand and explicitly model scholarly processes found its place in the agenda of DARIAH as early as the preparatory phase. The Scholarly Research Activity Model was then formulated, followed by the NeDiMAH Methods Ontology (NeMO) which was developed within the NeDiMAH project and was subsequently adopted and supported by DARIAH VCC2 through the DiMPO WG. NeMO aimed at capturing the scholarly research process through a set of concepts representing the main elements of the humanities research ecosystem, their intrinsic structure, and the relations among them. The explicit representation of relations among concepts enables representations of research processes in the form of semantic networks best suited for associative, exploratory search and inference. Taxonomies, on the other hand, such as those above or TaDiRAH (developed by DARIAH-DE), can be incorporated as hierarchical term dictionaries in NeMO. The development of NeMO was informed by the extensive empirical study of scholarly information practices, needs and attitudes performed by the DiMPO WG across Europe, and was validated in a series of workshops. In subsequent work related to DARIAH-GR, a streamlined process (Research Spotlight) was developed to extract information from research articles, enrich it with relevant information from other Web sources, organize it according to the domain-neutral part of NeMO (appropriately elaborated, called Scholarly Ontology), and republish it in the form of linked data. This enables compiling semantic graph databases capturing who has done what, how, why and with what results.
We propose a synergy session in which the WGs involved will jointly explore the recording and analysis of actual and/or prototypical instances of work processes in their respective domains of interest and the compilation of semantic graph databases supporting advanced documentary and analytical work. The Digital Practices for the Study of Urban Heritage and the GeoHumanities WGs will put forward different classes of problems, all of which call for heavy use of highly differentiated digital methods and corresponding goals. They will thus present complementary views on all aspects of the research process. For example, drawing on FRBRoo and the IfcOWL ontology, and going beyond previous efforts in linking Building Information Modelling elements with a conservation ontology, the proposed effort to bring together the WGs above will highlight the challenges of using a CIDOC CRM-based modelling approach to structure logically more diverse architectural data (i.e., structure, typology and usage) at a scale larger than a building, e.g., clusters of heritage buildings, forming streets and neighbourhoods of cultural and historical value. The Digital Methods and Practices Observatory WG will offer NeMO and its streamlined application as a methodological framework. We expect the session to initiate collaboration among the WGs that will lead to an enriched conceptualization of their own domains, an appreciation of common (and non-common) patterns and, possibly, joint work on developing semantic graph compendium(s) of digital practices.