Announcing the EuropeanaTech conference 2015

On 12-13 February 2015 the 2nd EuropeanaTech Conference will take place at the National Library of France in Paris. The title of this year’s conference is ‘Making the beautiful thing – Transforming technology and culture’. Presenters and participants from Europe and around the globe will be sharing knowledge and collaborating on the themes of data modelling (including the Europeana Data Model), content re-use, discovery, multilingualism and open data. For more information about the conference, including the themes of the breakout sessions and topics of the renowned international keynote speakers, take a look at the conference programme.

Registration costs €60 and is possible through this Eventbrite page.


This blog entry was written by Kristin Dill (Austrian National Library) and is a condensed version of the one originally posted on the Europeana Professional Website.

Open Humanities Awards: finderApp WITTFind update 4

This is the fourth in a series of posts from Dr Maximilian Hadersbeck, the recipient of the DM2E Open Humanities Awards – DM2E track.

The research group “Wittgenstein in Co-Text” is extending the FinderApp WiTTFind, currently used for exploring and researching Wittgenstein’s Big Typescript TS-213 (BT), to the remaining 5,000 pages of Wittgenstein’s Nachlass, which are made freely available by the Wittgenstein Archives at the University of Bergen and are published as linked data through the DM2E project. The work in December focused on implementing new features in WiTTFind, extensive work towards the PISA-Demo milestone, and the talk and demonstration at the DM2E final event on 11 December in Pisa, including the discussions that followed.

Extensive work for our PISA-Demo Milestone

We continued to strengthen our GitLab and Docker environment to reach our aim of producing a high-quality FinderApp for Digital Humanities projects running under different operating systems. For our presentation in Pisa we defined a PISA-Demo milestone that introduced new features, defined in 27 issues, including: setting permanent webpage configuration values; redesigning the webpage with Bootstrap and adding multi-doc behaviour; a new logo and header line; adding a scrollbar to the webpage; sorted display of hits; switching to HD facsimiles; adapting the facsimile reader to the HD facsimiles; rewriting the help page; and new E2E tests. Figures 1, 2 and 3 show the extensive software activity in our GitLab shortly before the PISA-Demo milestone.

fig. 1 Feature and issue list
fig. 2 New multidoc web frontend, see http://dev.wittfind.cis.uni-muenchen.de
fig. 3 GitLab activities before the PISA-Demo milestone


Talk and demonstration of our FinderApp at the DM2E final event, 11 December 2014, Pisa

One aim of our award project was to give a talk and a demo at the DM2E final event in Pisa. Our talk was titled “Open Humanities Awards DM2E track: FinderApp WiTTFind, Wittgenstein’s Nachlass: Computational linguistics and philosophy”, and the authors were Max Hadersbeck, Roman Capsamun, Yuliya Kalasouskaya and Stefan Schweter from the Centrum für Informations- und Sprachverarbeitung (CIS), LMU, Munich. In the talk we first gave a short overview of Ludwig Wittgenstein’s Nachlass and the texts available to our FinderApp. Then we described what kind of “fine-grained computational linguistic perspectives on editions” our finder WiTTFind offers. We showed the open-source aspects of our software and demonstrated the tools. After this we went into the details of our implementation: rule-based access to the data together with local grammars. We explained how our rule-based tool differs from statistical indexing search engines such as Google Books, the Open Library project and Apache Solr. We also gave a short insight into one foundation of our tool, the digital full-form lexicon of Wittgenstein’s Nachlass with 46,000 entries (see figure 4, the digital lexicon WiTTLex).

fig. 4 The Digital Lexicon WiTTLex

In the next part of our talk, we informed the attendees about other important aims of our project, such as extending the data to the 5,000 pages of Wittgenstein’s Nachlass and making our finder openly available to other digital humanities projects by defining APIs and an XML-TEI-P5 tagset. We presented OCR tools for facsimile integration and a facsimile reader for the new multidoc environment. The last aim of our project was that our software should work as an interoperable distributed application (Linux, macOS, Windows) and should be browser- and device-independent. We reached this aim by using GitLab, Docker and Bootstrap. In the final part of our talk we presented our new multidoc browser frontend (see figure 2).

Discussion after the speech in Pisa

After our talk, and at the evening meeting, we had very interesting discussions with the DM2E partners about the rule-based (rather than statistical) access of our FinderApp to the Wittgenstein Nachlass. We argued that rule-based access works very well on limited data such as ours, because with the help of rules (local grammars) we can resolve many ambiguities in semantics and syntax. The second remarkable point in the discussions was our observations from cooperating with humanities researchers, in our case philosophers. We found that philosophers fully accept and use tools only if they find their specific scientific language and categories present, and if the search tool offers close to 100% precision and 100% recall. We admit that these limits can never be reached, but the important point is: humanists are not interested in the sophisticated programming tricks and features that computer scientists love so much; they expect solid and clear algorithms behind finding the sentences around their specified word phrases. They also expect interactive menus with fine-grained settings, to investigate and influence the way the specific text that fits their question is found.

Open Humanities Awards: Early Modern European Peace Treaties Online update 3

This is the third in a series of posts from Dr Michael Piotrowski, one of the recipients of the DM2E Open Humanities Awards – Open track.

Europäische Friedensverträge der Vormoderne online (“Early Modern European Peace Treaties Online”) is a comprehensive collection of about 1,800 bilateral and multilateral European peace treaties from the period of 1450 to 1789, published as an open access resource by the Leibniz Institute of European History (IEG). The goal of the project, funded by a DM2E Open Humanities Award, is to publish the treaties’ metadata as Linked Open Data, and to evaluate the use of nanopublications as a representation format for humanities data.


On December 11, I was invited to speak about the project at the DM2E Final Event in Navacchio, Italy. I gave a talk entitled “Early Modern European Peace Treaties Online—The LOD Remix.” You can find the slides on SlideShare; for a full report of the event, see the blog post Final DM2E & All-WP meeting, 11–12 December, Pisa.

I gave the talk the subtitle “The LOD Remix” because it started with a brief account of the prehistory, i.e., the project that created the database we used as the starting point for our project: “Europäische Friedensverträge der Vormoderne – online,” which was funded by the DFG from 2005 to 2010. I then went on to describe the current state of our work at that time; you can find the main points in my previous blog post.

I could also report on a surprising discovery I had made just a few days before the event. If you’ve read my previous posts, you may have noticed that we’re missing one interesting type of information: the names of the negotiators involved in the negotiation of the treaties. So I was happy to tell the audience that in the context of the BMBF-funded project Übersetzungsleistungen von Diplomatie und Medien im vormodernen Friedensprozess. Europa 1450–1789 (June 2009–May 2012), researchers at the University of Augsburg have gathered all the negotiators relevant to the treaties contained in our database, as well as the languages they’re written in. Some of the data is published as lists on their website, but these lists are actually exported from a Microsoft Access database, which contains additional information.1 We’re in contact with our colleagues in Augsburg and working towards a way to combine their data with our data and to publish everything as LOD. We may not be able to complete the merge in time for the first release, but we hope to finish it soon afterwards.

I also had some interesting conversations during the event, which prompted me to think a bit more about the modeling from a conceptual perspective. Our current modeling essentially represents the contents of the original relational database in RDF. For a future version I’d like to re-examine the relations between the various entities involved, such as those between a conclusion of peace (an event), a peace treaty (perhaps a work in the sense of FRBR), and various copies and versions of a treaty (manifestations).

We hope to release the first version next week, and in the next post I will then describe this release and maybe do a little retrospective of the project.

Footnotes:

1 The database is described in: German Penzholz, Andrea Schmidt-Rösler (2014). “Die Sprachen des Friedens – eine statistische Annäherung”. In: Johannes Burkhardt, Kay Peter Jankrift, Wolfgang E. J. Weber (eds.): Sprache. Macht. Frieden. Augsburg: Wißner. PDF

Modeling the Scholarly Domain

1 Introduction

The aim of DM2E’s Task 3.4 was to investigate how digital humanists use digital research tools and how their actions can be modeled. With respect to the environment of Linked Data and Europeana, the initial questions were “What does the humanist want to do with the digital tools?” and “What are the ‘functional primitives’ of the digital humanities?”

With our model, the Scholarly Domain Model (SDM), we try to initiate and encourage more reflection on the methods of humanities scholars in a digital environment, and, at the same time, to connect the development of applications closer to scholarly practices.

2 Scholarly Domain Model

The model groups together the activities of a generic research process and constitutes the primitives of scholarly work. The model itself consists of different layers of abstraction, each describing the field at a finer granularity. The top layer displays Research as the central aspect of the SDM, but makes it clear that Research depends on input and will ideally produce output. The arrows leading back to input indicate that the output of one iteration can be used as input to a following research process.


Additionally, Research is embedded in a social context, which includes collaborative aspects and a documentary context, like reporting to a funding organization or blogging about a project.

Each layer zooms in further on the different activities that can be part of the scholarly domain. The lowest level, Scholarly Operations, is designed to hold domain-specific implementations of the upper levels: all scholars use references, but how is referencing specifically done, e.g. in linguistics?

We see the SDM as a living model and a framework for discussions on the humanities and the digital domain.

3 Application Scenarios

We hope that by implementing the SDM as an RDFS/OWL ontology we can build a bridge between the model and applications that draw on it. The model will also help to identify gaps in the scholarly workflow and to design tools that fill these gaps. Thirdly, aspects of scholarly work that are not yet covered by tools can be integrated. By providing this model we aim to contribute a significant building block to the digital humanities and to enhance the sustainability of infrastructures.

A full report on the Scholarly Domain Model and the work conducted in Task 3.4 will be published at the end of DM2E in January 2015.

Steffen Hennicke, Humboldt-Universität zu Berlin

Final DM2E & All-WP meeting, 11-12 December, Pisa

The DM2E consortium in Pisa


Last month the DM2E project organised its final event under the title ‘Enabling humanities research in the Linked Open Web‘. Over 50 participants gathered at the Auditorium Incubatore of the Polo Tecnologico di Navacchio near Pisa, Italy to hear more about the final results of the three-year project, as well as those of the winners of the second round of the Open Humanities Awards.


DM2E: context and background

The day started with a welcome on behalf of the project coordinator, Humboldt Universität zu Berlin, by Violeta Trkulja. She briefly introduced the project, the partners involved and the main activities and outcomes.

Then it was time for the keynote: Sally Chambers (DARIAH-EU and Göttingen Centre for the Digital Humanities) gave an inspiring talk on sustainable digital services for humanities research communities in Europe, and the role that the DARIAH infrastructure can play in this regard.

Antoine Isaac (Europeana) followed with an illustration of the relevance of the DM2E results in the wider context of Europeana, Europe’s platform to access cultural heritage. Because of the work done within the project, there is now more material available in Europeana that is relevant to digital humanities researchers, workflows for data aggregation have been improved and of course the EDM (Europeana Data Model) has been specialized for the manuscript domain.


DM2E results

The work package leaders of the four DM2E work packages then went on to present what has been achieved over the course of the project in the areas of:

Content aggregation to Europeana (WP1 – Doron Goldfarb, Austrian National Library);

The interoperability infrastructure, including the DM2E data model, ingestion and contextualization – even with an ‘Oh, yeah?’ button (WP2 – Kai Eckert, University of Mannheim);

The Pundit tool for semantic annotation and enrichment (WP3 – Christian Morbidoni, Università Politecnica delle Marche / Net7 & Alessio Piccioli, Net7);

Experiments conducted to investigate how humanists work with linked data and tools such as Pundit and to better understand their reasoning process (WP3 – Steffen Hennicke, Humboldt Universität zu Berlin);

And community building around open data for cultural heritage and humanities research (WP4 – Lieke Ploeger, Open Knowledge).


Open Humanities Awards – round 2

Another part of the programme was reserved for the winners of the second round of the Open Humanities Awards. The first winner of the Open track, Michael Piotrowski (Leibniz Institute of European History), talked about how the metadata of European peace treaties from the Early Modern period can be published as Linked Open Data, and how the nanopublications format can be evaluated for such content, so that it can become a more valuable resource for the humanities. Read more

Also for the Open track, Rainer Simon (Austrian Institute of Technology) shared the success story of the SEA CHANGE (SEmantic Annotation for Cultural Heritage And Neo-GEography) workshops, where people helped turn raw geographical data into Linked Open Data with the Recogito geo-annotation tool. In just two workshops, nearly 15,000 contributions were made, a great crowdsourcing achievement. Read more

For the DM2E track of the awards, Max Hadersbeck (University of Munich) demonstrated the impressive FinderApp WITTFind, which makes it possible to search Wittgenstein’s Nachlass extensively thanks to the incorporation of a full-form lexicon and features such as highlighting the hits in the displayed facsimile – all very valuable for researchers. Read more

It was very inspiring to see the impact the Open Humanities Awards have had on furthering teaching and research in the humanities – once again congratulations to all winners!


DM2E Final All-WP meeting

On the following day, Friday 12 December, the project consortium held the final All-WP meeting in Pisa. The first part of the meeting was dedicated to wrapping up the achievements of the four work packages: each WP lead gave an overview of the main achievements in the project, with a focus on the final six months.

(for WP2 and 3, the slides were similar to those presented at the final event on 11 December, so you can find the slides above under ‘DM2E results’)

Then some time was reserved for a look at applications developed for creative use of cultural contents in another FP7 research project, AthenaPlus, presented by Gábor Palkó (Petőfi Literary Museum).

The day concluded with a collective look back at lessons learned throughout the project, and preparation for the final month of reporting, wrap-up and review. After two successful days it was time for a concluding consortium lunch.

Many thanks to all participants for joining our final event, and of course you can follow us on either Twitter or our blog to stay updated on all final deliverables!


Open Humanities Awards: SEA CHANGE final update

This is the final post from Dr Rainer Simon, one of the recipients of the DM2E Open Humanities Awards – Open track.

Last Thursday, the University of Applied Sciences Mainz was the scene of our second SEA CHANGE annotation workshop. First things first: Mainz broke the record! Despite a few participants fewer than last time in Heidelberg and, overall, a few minutes less time, they made it. Below, I’ll speculate a bit on how and where Mainz may have scored those extra points. But for the sake of completeness, I should point out that this public boasting on Twitter by our host Kai-Christian Bruhn would have made a day without breaking the record SOMEWHAT embarrassing ;-)

A day without a new record? No longer an option now for Mainz, it seems.

But Kai’s students did not let him down. The day ended with a breathtaking 7,511 contributions, totally smashing our previous record of 6,620 by almost 900! Totes amaze. We were knocked out by their efforts. Kudos to the students at Mainz!

Annotation Session

Like last time, Leif and I kicked off the day by introducing the Pelagios project (into which the SEA CHANGE results feed) to our audience. Participants had a mixed background (engineering and archaeology), attending a joint course of the two universities in Mainz. After a guided tour of our Web-based geo-annotation tool Recogito, people got to work.

One of our participants at work on a 15th century Portolan by Grazioso Benincasa.

At times, the silence in the room was almost eerie as the students set about working, with great concentration, on a selection of both texts (medieval geographic narratives and travel writing) and maps (a set of beautiful maritime charts from the 14th and 15th centuries). Although most students lacked expert knowledge of the texts and their historical background, they obviously found it meaningful to add annotations. They grasped the idea and, to quote Kai, “they won the race because what they were doing during the annotation session was meaningful to them.”

To back this up with some numbers: here’s the sum total of what the day ended with.

  • Approx. 2,600 place name identifications in text. That’s almost identical to our first workshop. So far so good.
  • Almost 3,200 place name identifications on images. Wow! That’s almost 700 more than last time!
  • About 620 map transcriptions. That’s a bit less than last time, when we had 830.
  • My personal favourite: 544 gazetteer resolutions. That’s almost four times as many as last time! Gazetteer resolution is the most complex and time-consuming type of activity. Since our last workshop we have completely overhauled its user interface, and it’s great to see such an improvement.
  • 537 other activities such as corrections, comments and deletions.

It’s good to see how stable the number of place name identifications in text was. This seems to show that (despite the occasional glitch and known issue) Recogito has really reached a level of maturity. It’s also interesting to see how many more place name identifications we had in images this time. My personal take is that the different material may have played a part in this. Portolan charts are very “dense” in place names, and the names are typically arranged in sequence, in the same orientation, so there is less need to search and navigate the map. That may have allowed for slightly speedier tagging this time. On the other hand, the style of lettering in these maps was rather different from last time and much more challenging for the non-expert to decipher, which may well be why we got fewer transcriptions on this occasion. But in any case: the overall result speaks for itself.

Data Re-Use Session

The late afternoon was again dedicated to the topic of data re-use. This time, however, we tried something a little different. We ran two sessions in parallel. Participants could choose between them, depending on their own interest and background. Leif once again walked his half of the audience through a tutorial that uses the Open Source Geographical Information System QGIS to explore a medieval travel itinerary embedded in 3D terrain. (The resulting 3D visualization is available online here). In the meantime I ran a small “Pelagios hack tutorial” in which I guided the other half of the audience through three JavaScript examples that demonstrate how you can easily re-use Pelagios data in your own applications and mashups through our API, e.g. to create Web maps, timelines or network graphs. (The tutorial examples are on GitHub here.)

Leif in action

Well, I guess this concludes the SEA CHANGE project. Leif, Elton, Pau and I are happy to have gotten the opportunity to do this, and are very excited about what came out of it. We would love to repeat workshops like these at some point. (Maybe also in virtual online form?) If you’re interested in participating or hosting: by all means, do get in touch!

Last but not least (my standard reminder…): above all, our project is about gathering data and making it openly available to everyone. So do take a look at the CC-licensed annotation data that is now available for download through Recogito, as well as through the Pelagios API. We’d love to hear from you!

Open Humanities Awards: finderApp WITTFind update 3

This is the third in a series of posts from Dr Maximilian Hadersbeck, the recipient of the DM2E Open Humanities Awards – DM2E track.

The research group “Wittgenstein in Co-Text” is extending the FinderApp WiTTFind, currently used for exploring and researching Wittgenstein’s Big Typescript TS-213 (BT), to the remaining 5,000 pages of Wittgenstein’s Nachlass, which are made freely available by the Wittgenstein Archives at the University of Bergen and are published as linked data through the DM2E project. In November, the group concentrated on delivering a development milestone for the final DM2E event in Pisa, redesigning the WiTTFind web frontend and integrating new facsimiles (typescripts/manuscripts).

Delivering development milestone

We continued to strengthen our GitLab and Docker environment to reach our aim of producing a high-quality FinderApp for Digital Humanities projects running under different operating systems. All software development follows a git branching model, as used by professional software teams. We run one tagged “master” branch, which is deployed on our master server, and a “development” branch running on our development server. All new software features are programmed, maintained and tested within a specific feature branch (see Fig. 1). We define milestones at which all feature developments have to be finished; the feature branches are then merged into the development branch and deployed on the development server (http://dev.wittfind.cis.uni-muenchen.de). After extensive work and testing on the development server we finalise the code on the development branch and transfer it to the master branch, which becomes the new release of our FinderApp.
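The workflow above can be sketched with plain git commands; this is a minimal illustration, and the repository, branch and tag names here are invented for the example, not the project’s actual ones:

```shell
# Sketch of the branching model: master (tagged releases), development
# (integration), and one branch per feature.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.org"
git config user.name "Dev"

echo "base" > finder.txt
git add finder.txt
git commit -qm "initial release"
git branch -M master                  # tagged release branch

git branch development                # long-lived integration branch
git checkout -q -b feature/multidoc development
echo "multidoc" >> finder.txt         # work happens on the feature branch
git commit -qam "add multidoc frontend"

git checkout -q development           # milestone: merge finished features
git merge -q --no-ff -m "merge feature/multidoc" feature/multidoc

git checkout -q master                # after testing on the dev server:
git merge -q --no-ff -m "release" development
git tag v1.1                          # tagged master = new release
git tag -l
```

The `--no-ff` merges keep an explicit merge commit per feature, so the history shows when each feature entered the development and master branches.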

Fig. 1: git branching model

Enlarging our E2E tests for continuous testing

To detect software errors as early as possible, during development and during integration into the development branch, we wrote many automatic E2E (end-to-end) tests, which must succeed before we accept and integrate a new feature. E2E tests are similar to integration tests in that they exercise the interaction between different software components. They are mainly used in web frameworks and webpage development, because they simulate the activities of web users automatically. As our testing environment we use casper.js.
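The principle behind such a test can be illustrated with a small, self-contained Python sketch (standing in for casper.js, which is what the project actually uses): serve a page and assert on what a simulated user would receive. The page content here is a placeholder, not the real WiTTFind frontend.

```python
# Illustrative only: a minimal end-to-end check using the Python standard
# library. The real WiTTFind E2E tests are written for casper.js.
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class Page(BaseHTTPRequestHandler):
    """Stand-in for the web frontend under test."""
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(b"<html><body><h1>WiTTFind</h1></body></html>")
    def log_message(self, *args):  # silence request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Page)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The "user" requests the page; the test asserts on what is rendered.
html = urlopen(f"http://127.0.0.1:{server.server_port}/").read().decode()
server.shutdown()
assert "WiTTFind" in html
print("E2E check passed")
```

A casper.js test follows the same pattern, but drives a headless browser so that JavaScript-driven behaviour (clicks, form input, dynamic rendering) is exercised too.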

Web frontend for multidoc

Discussions with a web specialist and ideas from the “Nietzsche-Source” webpage led to the decision to rewrite our WiTTFind webpage:

  • Many documents have to be searched and the user should not lose the overview
  • There should be the same look & feel for different browsers and web devices
  • Our webpage should use more modern browser features to offer dynamic behavior
  • We use suggestions from the Web Corporate Identity Team of our university

To reach these goals, we decided to use Bootstrap, one of the most popular HTML, CSS and JavaScript frameworks for web development. With this framework, our WiTTFind webpage offers the same look & feel on mobile devices, tablet computers and arbitrary browsers. In Fig. 2 you can see a first screenshot of our new Bootstrap-driven webpage.

Fig. 2: Our new multidoc webpage – http://dev.wittfind.cis.uni-muenchen.de

Integration and OCR of the new HD facsimiles (typescripts/manuscripts)

After integrating the new HD facsimiles into our WiTTFind project structure, we started to OCR the scans and can show first results obtained with the OCR software Tesseract. The OCR results for typescripts are rather good compared to those for manuscripts. In Fig. 3 and Fig. 4 you see the results in the right column. Currently we are working on a multi-user, semi-automatic, web-based correction tool for OCR errors.

Fig. 3 OCR of a Wittgenstein typescript-scan
Fig. 4 OCR of a Wittgenstein manuscript-scan

Video: Extensive work for our PISA demo milestone

For our presentation at the DM2E final event in Pisa we defined a new milestone, in which we fixed 27 issues, including: setting permanent webpage configuration values; redesigning the webpage with Bootstrap and adding multidoc behaviour; a new logo and header line; adding a scrollbar to the webpage; sorted display of hits; switching to HD facsimiles; adapting the facsimile reader to the HD facsimiles; rewriting the help page, semantic finder and graphical finder; and new E2E tests. To show the extensive software activity in our GitLab shortly before the PISA-Demo milestone, we produced a git activity video. You can watch it here: http://wast.cis.uni-muenchen.de/tutorial/gitlab-log/


Putting Linked Library Data to Work: the DM2E Showcase


Last week the DM2E team at the Austrian National Library (ONB) organised a seminar in Vienna on the wider possibilities of scholarly and library (re-)use of Linked Open Data. Max Kaiser, Head of the Research & Development department of the ONB, opened the afternoon and stressed how satisfied the library is with the progress that DM2E has made over the last years, both in aggregating manuscript content into Europeana and in publishing the delivered metadata as Linked Open Data using the DM2E model, a specialised version of the Europeana Data Model (EDM) for the manuscript domain.

After this welcome, Doron Goldfarb (ONB) gave an introduction to the DM2E project and the four main areas of work: aggregation of manuscript metadata and content, interoperability infrastructure, digital humanities applications and community building.

Marko Knepper of the University Library of Frankfurt am Main then went on to explain, with examples from his library’s manuscripts, how library data is transformed into linked data using tools developed in DM2E such as MINT and Pubby, showing the final result as it appears in the Europeana portal.

After the first break, Bernhard Haslhofer (Open Knowledge Austria / AIT) and Lieke Ploeger (Open Knowledge) gave a joint presentation on the value of open data and the OpenGLAM network. Bernhard introduced the topic with a talk on his experiences with Maphub, an application he built which operates on open cultural heritage data and allows users to annotate digitized historical maps. Following on his example, Lieke introduced the OpenGLAM network, a global network of people working on opening up cultural data that was set up in the scope of the DM2E project, and talked about the recent and future activities of the OpenGLAM community.

Next up was Kristin Dill (ONB) with a presentation on the scholarly activities in DM2E. She talked about the Scholarly Domain Model (SDM), which informs the work on digital tools for humanities scholars by addressing gaps in digital workflows and recognising patterns in the behaviour of scholars. She showed the different layers of abstraction of the model, and demonstrated how certain scholarly activities can be identified in the Pundit tool.

The final talk of the day came from Susanne Müller of the EUROCORR project. After a brief introduction to the BurckhardtSource project, she detailed how the semantic annotation tools developed in DM2E have been applied to the European correspondence to Jakob Burckhardt, a Swiss cultural historian of the 19th century, to enrich this data.

The last part of the day was reserved for a workshop based around the Pundit tool for semantic annotation from NET7. After an introduction to the tool, the group was divided into two for a hands-on session, which was received very well by participants.


Open Humanities Awards: Early Modern European Peace Treaties Online update 2

This is the second in a series of posts from Dr Michael Piotrowski, one of the recipients of the DM2E Open Humanities Awards – Open track.

Europäische Friedensverträge der Vormoderne online (“Early Modern European Peace Treaties Online”) is a comprehensive collection of about 1,800 bilateral and multilateral European peace treaties from the period of 1450 to 1789, published as an open access resource by the Leibniz Institute of European History (IEG). The goal of the project, funded by a DM2E Open Humanities Award, is to publish the treaties’ metadata as Linked Open Data, and to evaluate the use of nanopublications as a representation format for humanities data.

We’ve now converted the structured metadata from the legacy database into RDF. In my last post I talked a bit about the structure and content of the legacy database; as we expected, the conversion required a fair bit of interpretation and cleanup work, but all in all it went quite well.

As the basis for our data model we have, not surprisingly, used the DM2E model. Currently we have three main classes of entities: the treaties, the treaty partners (or signatories, though we prefer the term partner to avoid confusion with the negotiators, i.e., the persons who actually signed the treaties), and the locations where the treaties were signed. We use dm2e:Manuscript as the class for treaties, edm:Agent as the class for partners, and edm:Place as the class for locations. Furthermore we use the following properties:

  • dc:title for the treaty titles,
  • dc:date for the treaty date,
  • edm:happenedAt for linking to the location,
  • rdfs:label for the names of partners and locations, and
  • skos:narrower and skos:broader for modeling the hierarchy of partners.

The last point may need some explanation. Partners may be in a hierarchical relationship to each other to model that a power may be part of a larger entity. For example, Austria was a part of the Holy Roman Empire, whereas Milan, Mantova, and Sardinia were (at various points in time) parts of Austria. However, historical realities tend to be quite messy, so these relations are not necessarily “part-of” relations in the strict sense; for example, Austria also had territories outside the Empire. The hierarchy also contains “fictitious partners” as a help for searching; for example, introducing Switzerland or Parts of the Empire as “fictitious partners” makes it easier to search for treaties concerning certain regions of Europe. This pragmatic approach was taken over from the legacy database, as we think it makes sense, at least for the time being.

To link the treaties to the treaty partners we’re currently using the dc:contributor property. We’re not yet completely happy with this solution; it seems to stretch the meaning of “contributor” a bit. Coming up with a better solution (or with arguments in favor of keeping dc:contributor!) is on our todo list.

So, if we take a specific treaty, such as the Provisional convention of subsidy between Great Britain, the States General, and Austria, we have the following data:

Type: Manuscript
Title (dc:title): Provisorischer Subsidienvertrag (de)
Date (dc:date): 1746-08-31
Contributor (dc:contributor):
  • ieg-local:partner/12 = Austria
  • ieg-local:partner/42 = Great Britain
  • ieg-local:partner/49 = States General
Happened at (edm:happenedAt): ieg-local:place/13 = The Hague
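The record above corresponds to a handful of RDF triples. A minimal Python sketch, where the full URIs behind the prefixes are placeholder assumptions rather than the project’s actual identifiers:

```python
# The example treaty expressed as (subject, predicate, object) triples.
# The base URI is a placeholder, not the project's actual namespace.

LOCAL = "http://example.org/ieg-local/"  # assumed base URI

treaty = LOCAL + "treaty/187"

triples = [
    (treaty, "rdf:type", "dm2e:Manuscript"),
    (treaty, "dc:title", "Provisorischer Subsidienvertrag"),
    (treaty, "dc:date", "1746-08-31"),
    (treaty, "dc:contributor", LOCAL + "partner/12"),  # Austria
    (treaty, "dc:contributor", LOCAL + "partner/42"),  # Great Britain
    (treaty, "dc:contributor", LOCAL + "partner/49"),  # States General
    (treaty, "edm:happenedAt", LOCAL + "place/13"),    # The Hague
]

# A treaty may have several contributors but only one signing place.
contributors = [o for s, p, o in triples if p == "dc:contributor"]
print(len(contributors))
```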

This display is somewhat simplified for illustration but should give you an idea. We have loaded the data into Fuseki and set up Pubby (a Linked Data frontend for SPARQL endpoints) on an internal server. For reference, Figure 1 shows the last page of the treaty; the last sentence before the seals and signatures gives the place and the date: Fait à La Haye le trente un du Mois d’Aout de l’année mille Sept cent quarante Six (“Done at The Hague on the thirty-first of the month of August of the year seventeen hundred and forty-six”).

Figure 1: Provisional convention of subsidy between Great Britain, the States General, and Austria (Nationaal Archief, Den Haag, Staten-Generaal, nummer toegang 1.01.02, inventarisnummer 12597.187)
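With the data loaded into Fuseki, the treaties can also be queried through its SPARQL endpoint. A hypothetical query of this kind, listing each treaty with its date and signing place (the query itself is an illustration, not one from the project):

```python
# Hypothetical SPARQL query one could send to the Fuseki endpoint.
# The prefix declarations use the standard namespace URIs for the
# vocabularies named in the text.

QUERY = """
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX edm:  <http://www.europeana.eu/schemas/edm/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?treaty ?title ?date ?placeName WHERE {
  ?treaty dc:title ?title ;
          dc:date ?date ;
          edm:happenedAt ?place .
  ?place rdfs:label ?placeName .
}
ORDER BY ?date
"""

# The query string could be posted to Fuseki's SPARQL endpoint, e.g.
# with urllib.request and the application/sparql-query content type.
print(QUERY.strip().splitlines()[0])
```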

What are the next steps? Now that the data can be easily browsed through Pubby, you naturally spot various smaller errors here and there, which we’re fixing as we go. More importantly, we are currently working on linking the locations and partners to suitable authority files, most notably the GND, which will make the data not just open but also linked. The locations should be relatively straightforward, but the partners may pose some problems; we are taking the obvious approach of handling the easy cases first and then dealing with the rest.
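For the easy cases, such linking can amount to little more than normalizing a label and looking it up. A rough sketch, where the lookup table and GND numbers are placeholders (not real identifiers); in practice one would query a GND service and review candidate matches by hand:

```python
# Sketch of linking place labels to GND authority records. The table
# below is hand-made and the GND numbers are placeholders; real linking
# would query the GND and involve manual review of candidates.

GND_PLACES = {
    # normalized label -> placeholder GND ID
    "the hague": "gnd:0000001-1",
    "paris": "gnd:0000002-2",
}

def link_place(label):
    """Return a GND ID for a place label after simple normalization,
    or None for the cases that need manual review."""
    return GND_PLACES.get(label.strip().lower())

print(link_place("The Hague"))  # easy case: matches after normalization
print(link_place("Ryswick"))    # hard case: no match, handled manually
```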

Open Humanities Awards: finderApp WITTFind update 2

This is the second in a series of posts from Dr Maximilian Hadersbeck, the recipient of the DM2E Open Humanities Awards – DM2E track.

The research group “Wittgenstein in Co-Text” is working on extending the FinderApp WiTTFind tool, which is currently used for exploring and researching Wittgenstein’s Big Typescript TS-213 (BT), to the rest of the 5,000 pages of Wittgenstein’s Nachlass that are made freely available by the Wittgenstein Archives at the University of Bergen and are used as linked data in the DM2E project. In October, the group concentrated on switching to professional open-source software development tools, on virtualizing the FinderApp to open it to other projects, and on submitting a paper and poster about the work to a Digital Humanities conference in 2015.

Switching to professional open-source software development tools

The aims of our award project, enlarging and opening our FinderApp WiTTFind to new fields of the Digital Humanities, led to the decision to switch from our SVN-based “personal software development” to a more powerful distributed revision control and source code management system. We chose Git, an open-source tool that has proven its capabilities during the development of the Linux kernel. We set up a Git server at our institute and now develop, collect and maintain all modules around our FinderApp under the roof of the Git group WAST (Wittgenstein Advanced Search Tools).

Together with the new storage and revision control, we extended our software development with additional practices: Test-Driven Development (TDD), Continuous Integration (CI) with an integrated build system (GitLab CI), continuous delivery and deployment, best-practice bug reporting, build and test engineering, and finally quality assurance. Within the Git group WAST, every module is implemented as its own project with responsible owners. A central web-based Feedback-Application was implemented to enable email- and Git-based submission of errors, problems and new feature requests. All submitted issues are visible to the Git members of the WAST group. The Feedback-Application is widely used: 192 issues have been processed since its installation.
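A continuous-integration setup of the kind described could look roughly like the following .gitlab-ci.yml sketch; the image, stage names and commands are assumptions for illustration, not the project’s actual configuration:

```yaml
# Hypothetical .gitlab-ci.yml sketch; image, stages and commands are
# illustrative, not the actual WAST configuration.
stages:
  - test
  - build

run-tests:
  stage: test
  image: python:2.7
  script:
    - pip install -r requirements.txt
    - python -m pytest tests/
```

With such a configuration, GitLab CI runs the test suite on every push, which is what makes the "tests must exist and succeed" acceptance rule below enforceable.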

Restructuring and integrating our award project under Git control

The modularization and restructuring of our programs and data around WiTTFind is finished, and everything is now managed in the GitLab group WAST. From now on, all data management and documentation is done under Git control (see picture 1). An automatic quality assurance system has been implemented to enable automatic testing of new software developments. Software is only accepted and integrated as a WAST tool if automatic tests exist and succeed. The project is maintained via the Feedback-Application.

 

Picture 1: Commits over time in the WAST Git repository

New logo for Wittgenstein Advanced Search Tools (WAST)

To express our corporate identity, we developed a new logo for our FinderApp WiTTFind, which is part of the WAST tools. We extracted word snippets from facsimiles of Ludwig Wittgenstein’s Nachlass:

 

Picture 2: WAST- Project LOGO

FinderApp for other Digital Humanities projects

One of the biggest aims of our award project is to open our FinderApp WiTTFind and the WAST tools to other Digital Humanities projects. The whole software should run interoperably under Linux, MacOS and Windows. To overcome the extensive software requirements of our application, which differ heavily between platforms and even between releases, we use the virtualization software Docker. This technology, available as open source for various operating systems, collects all required software in one “container” and runs it under the control of the Docker server. In October we released our first Docker container, which runs our FinderApp and WAST tools virtualized on laptops under Linux, and soon under MacOS as well. All programmers in the project group use this technology to develop their software.
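A container setup along these lines might look like the following Dockerfile sketch; the base image, paths, port and start command are assumptions, not the project’s actual build recipe:

```dockerfile
# Hypothetical Dockerfile sketch for running a web application like
# WiTTFind in a container; all names and paths are illustrative.
FROM debian:wheezy
RUN apt-get update && apt-get install -y python python-pip
COPY . /opt/wittfind
WORKDIR /opt/wittfind
RUN pip install -r requirements.txt
EXPOSE 8080
CMD ["python", "run.py"]
```

Building this image once gives every developer the same environment, regardless of their host operating system, which is exactly what makes the container approach attractive for cross-platform Digital Humanities tools.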

Picture 3: Interoperable Virtualized WiTTFind

Paper and poster for the Conference “Digital Humanities im deutschsprachigen Raum”, Graz 2015

To make our Wittgenstein Advanced Search Tools and the FinderApp WiTTFind known to a broader community in the field of Digital Humanities, we submitted a paper and a poster to the conference Digital Humanities im deutschsprachigen Raum (25-27 February 2015, Graz, Austria). The paper “Wittgensteins Nachlass: Erkenntnisse und Weiterentwicklung der FinderApp WiTTFind” (authors Max Hadersbeck, Alois Pichler, Florian Fink, Daniel Bruder and Ina Arends) describes in great detail the latest developments of our project, while the poster “Wittgensteins Nachlass: Aufbau und Demonstration der FinderApp WiTTFind und ihrer Komponenten” (authors Yuliya Kalasouskaya, Matthias Lindinger, Stefan Schweter and Roman Capsamun) accompanies a live demo of WiTTFind.