To develop a workflow for the automatic recognition of place and person names (Named Entity Recognition), T.M.M.M.T. combines two research approaches:
NER for historical texts
The pipeline performs Named Entity Recognition (NER) in two steps: (1) a specialised NER step that, in simple terms, compares tokens against pre-built lists of place and person names, and (2) NER as part of the part-of-speech tagging step.
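The list-based step (1) can be illustrated with a minimal sketch. This is not the project's actual implementation: the name lists, the function names `build_index` and `tag_entities`, and the longest-match strategy are all assumptions chosen for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical name lists; real ones would come from the project's gazetteers.
PLACES = {"Schwaz", "Hall in Tirol"}
PERSONS = {"Hans Maier"}


def build_index(places: Iterable[str], persons: Iterable[str]) -> Dict[Tuple[str, ...], str]:
    """Map each lower-cased, whitespace-split name to its entity label."""
    index: Dict[Tuple[str, ...], str] = {}
    for name in places:
        index[tuple(name.lower().split())] = "PLACE"
    for name in persons:
        index[tuple(name.lower().split())] = "PERSON"
    return index


def tag_entities(tokens: List[str], index: Dict[Tuple[str, ...], str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, label) spans, preferring the longest list match."""
    max_len = max((len(key) for key in index), default=0)
    spans: List[Tuple[int, int, str]] = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate first so multi-token names win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in index:
                match = (i, i + n, index[key])
                break
        if match:
            spans.append(match)
            i = match[1]  # continue after the matched name
        else:
            i += 1
    return spans


index = build_index(PLACES, PERSONS)
tokens = "Hans Maier reiste von Schwaz nach Hall in Tirol".split()
spans = tag_entities(tokens, index)
```

Exact, case-insensitive matching is the simplest possible comparison. Historical spelling variation would likely call for normalisation or fuzzy matching, which this sketch leaves out.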
Information Extraction & Gazetteers
Names of mines, places, and persons were extracted using Postgres. The resulting gazetteers and registers were used to develop and support the NER workflow.
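As a rough sketch of how a gazetteer could be derived from a database of extracted names: the example below assumes a hypothetical table `mentions(name, entity_type)` and sample rows invented for illustration. It uses Python's built-in `sqlite3` module as a stand-in for Postgres so that it runs without a database server; the project's actual schema and queries are not documented here.

```python
import sqlite3
from typing import List


def build_gazetteer(conn: sqlite3.Connection, entity_type: str) -> List[str]:
    """Return the sorted, de-duplicated names recorded for one entity type."""
    rows = conn.execute(
        "SELECT DISTINCT name FROM mentions WHERE entity_type = ? ORDER BY name",
        (entity_type,),
    ).fetchall()
    return [row[0] for row in rows]


# In-memory database with a hypothetical schema and invented sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mentions (name TEXT, entity_type TEXT)")
conn.executemany(
    "INSERT INTO mentions VALUES (?, ?)",
    [
        ("Schwaz", "place"),
        ("Hall", "place"),
        ("Schwaz", "place"),      # duplicate mention, collapsed by DISTINCT
        ("Hans Maier", "person"),
    ],
)

place_gazetteer = build_gazetteer(conn, "place")
person_gazetteer = build_gazetteer(conn, "person")
```

The same `SELECT DISTINCT ... ORDER BY` query is valid in Postgres; only the connection setup and the parameter placeholder style would differ with a Postgres driver.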
Schmid, Helmut. "Probabilistic part-of-speech tagging using decision trees." International Conference on New Methods in Language Processing. 1994.
Schmid, Helmut. "Deep learning-based morphological taggers and lemmatizers for annotating historical texts." Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage. 2019.