Mining was a major economic factor in Tyrol (Austria) in the 15th and 16th centuries. As a result, the Tyrolean Regional Archives (TLA) in Innsbruck, for example, contain a large number of historical documents dealing with the administration and organisation of the Tyrolean mining towns. Particularly noteworthy in this context are the many documents belonging to the "Montanistika" collection or the Pestarchiv of the TLA. All these documents illustrate the structures and processes of the mining industry at that time.
The team of the project "T.M.M.M.T." has selected two manuscripts that offer a view of the historical development of mining in the regions of Schwaz and Rattenberg-Brixlegg (Tyrol, Austria): first, the "Verleihbuch der Rattenberger Bergrichter", 1460-1463 (Hs. 37), and second, the "Schwazer Berglehenbuch", ca. 1515 (Hs. 1587).
Both historical documents have been digitised and transcribed. The transcripts are available on the Read and Search Platform Mining Hub as well as on Zenodo.
Verleihbuch der Rattenberger Bergrichter, 1460-1463 (Tiroler Landesarchiv Hs. 37), DOI:
Schwazer Berglehenbuch, approx. 1515 (Tiroler Landesarchiv, Hs. 1587), DOI: 10.5281/zenodo.6274928
The "Verleihbuch der Rattenberger Bergrichter" (Hs. 37) comprises 353 pages (57,797 words) and contains mining claims in the mining district of Rattenberg-Brixlegg (Tyrol, Austria) as well as information on logging for smelting and logging in general.
(c) Tiroler Landesarchiv: Verleihbuch der Rattenberger Bergrichter, Hs. 37 pag. 50
The "Schwazer Berglehenbuch" (Hs. 1587) was already published in 2009 by Wolfgang Tschan. The document comprises 409 pages (115,005 words) and refers to the mining area of Falkenstein in Schwaz (Tyrol, Austria). The manuscript lists the mines that existed at the time and the associated mining claims, as well as judgements in disputes between mines.
(c) Tiroler Landesarchiv, Schwazer Berglehenbuch, Hs. 1587 pag. 4
The manuscripts were scanned on site at the Tyrolean Regional Archives, Innsbruck, using a ScanTent provided by the Transkribus team of the University of Innsbruck. The images were imported into Transkribus via the DocScan app. After the import, structural corrections had to be made to each document because of heavily soiled pages, unusually wide gaps between words, and insertions added to the text from the margin or from above.
For both historical documents, HTR (Handwritten Text Recognition) models were trained. The HTR+ model "Hs37_TLA_1460-1463_v4" is based on approx. 28,000 words and has a Character Error Rate (CER) of 3.04% on the Validation Set. The second model, "Cod_1587_TLA_1515_v2", is based on approx. 20,000 words and has a CER of 3.26%.
Finally, the corrected transcripts were used once again to train new HTR+ models for both historical documents. The reasons for this final training are twofold: 1) Since the whole documents were used for training, a larger number of pages could be assigned to the Train Set and the Validation Set. The final models achieved a CER on the Validation Set of 1.95% for Hs. 37 and of 3.43% for Hs. 1587; the improvement is clearest for Hs. 37 (down from 3.04%). 2) Both final HTR+ models will be released by Read Coop (Transkribus) on their official website.
The generated models were applied to both manuscripts, but manual corrections were still necessary.
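The CER values quoted above express how strongly the recognised text differs from the corrected transcript at character level. As a minimal sketch, assuming the common definition (character-level edit distance divided by the length of the reference text), the measure could be computed as follows in Python. The function names and the sample strings are illustrative and not taken from the project; Transkribus may apply additional normalisation, so its reported figures need not match this sketch exactly.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # character missing in the hypothesis
                curr[j - 1] + 1,     # extra character in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance relative to the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Invented sample strings, used only to illustrate the calculation.
    corrected = "perkwerch"   # stands in for a manually corrected transcript
    recognised = "perkwerck"  # stands in for raw HTR output
    print(f"CER: {cer(corrected, recognised):.2%}")
```

In practice the measure would be computed over all lines of the Validation Set, summing edit distances and reference lengths before dividing, rather than averaging per-line rates.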
Furthermore, a gold standard was created to support the development of a functional NER tool for late Middle High German texts (WP3) and the Semantic Representation (WP4). Transkribus was chosen as the basic annotation tool. As already mentioned, the fundamental research objective is the extraction and representation of the legal relationships between people, claims and mines over space and time. Therefore, only words and phrases belonging to the following entities were annotated: mines, locations, persons and dates.
During a first annotation round in December 2020, we encountered several annotation difficulties that needed further discussion in order to produce a balanced and useful annotation of the historical documents. The team therefore created annotation guidelines for Early New High German mining texts. They were aimed at the other members of the T.M.M.M.T. research group and were uploaded to Zenodo to support anyone annotating late medieval German texts: Annotation Guidelines (DOI: 10.5281/zenodo.6275197).
Transkribus can also export the corrected transcripts and the tags in different formats, e.g. as TEI-compliant files or as Excel spreadsheets. The T.M.M.M.T. project uses the Excel export of the tags for integration into a Postgres database.
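How exported tags might be prepared for the database can be sketched as follows. This is only an illustration under stated assumptions: the table name `annotation`, the column names `document`, `page`, `tag` and `value`, and the sample rows are hypothetical, since neither the layout of the Excel export nor the project's database schema is described here. The rows are assumed to have already been read from the spreadsheet into dictionaries.

```python
from typing import Iterable, List, Mapping, Tuple

# Hypothetical table and column names; the real schema is not given in the text.
INSERT_SQL = (
    "INSERT INTO annotation (document, page, tag, value) "
    "VALUES (%s, %s, %s, %s)"
)

# The four entity types named in the project description.
ALLOWED_TAGS = {"mine", "location", "person", "date"}

Params = Tuple[str, int, str, str]


def rows_to_params(rows: Iterable[Mapping[str, object]]) -> List[Params]:
    """Turn exported tag rows into parameter tuples for a parameterised INSERT.

    Rows whose tag is not one of the four entity types are skipped.
    """
    params: List[Params] = []
    for row in rows:
        tag = str(row["tag"]).strip().lower()
        if tag not in ALLOWED_TAGS:
            continue
        params.append((
            str(row["document"]).strip(),
            int(row["page"]),
            tag,
            str(row["value"]).strip(),
        ))
    return params


if __name__ == "__main__":
    # Invented rows standing in for the content of an exported spreadsheet.
    exported = [
        {"document": "Hs. 37", "page": "50", "tag": "Person", "value": " example person "},
        {"document": "Hs. 37", "page": "50", "tag": "abbrev", "value": "ignored"},
    ]
    for values in rows_to_params(exported):
        print(INSERT_SQL, values)
```

With a PostgreSQL driver such as psycopg, the resulting list could be passed to `cursor.executemany(INSERT_SQL, params)`. Keeping the values separate from the SQL string leaves quoting to the driver.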
Finally, a first version of a Part-of-Speech (PoS) tagged corpus, Montanistika v.1.0, was created and is now available online. The words in the corpus were annotated using the RNNTagger MHD.
Access corpus: Montanistika