Diachronic linguistics team

Language changes over time: it was used differently 200 years ago than it is today. Diachronic linguistics studies natural language with this temporal context in mind. Beyond expanding purely linguistic knowledge, our studies give us insight into how people thought in the past and which topics mattered to them. They also have practical value, allowing us to build better machine learning models. For example, knowing the past equivalent of a word (e.g., Polish aeroplan for today’s samolot, meaning ‘airplane’) makes it possible to build a better search engine for historical texts.

Research in this field has traditionally been conducted using the methods of linguistics. More recently, machine learning methods (in particular, neural models) have been gaining popularity. Our team combines both approaches, as it is composed of specialists in both linguistics and computer science.

Currently, we focus mainly on Polish and English. Most of the texts we study come from periodicals dating back to the 19th century. We also work with more recent, shorter slices of time: a few years or a decade or two. The problems we solve include dating texts (determining when a text was created or published), language modeling, and the detection and analysis of trends in the use of phrases over time.

We function as part of the DARIAH-PL consortium, which is engaged in general work in the digital humanities in Poland.

Team members

M.Sc.

Kacper Dudzic

Master of Computer Science and Japanese Studies, researching the intersection of computer science and linguistics. Specializes in computational linguistics and NLP, focusing on digital humanities, machine translation, and large models.

Professor

Piotr Wierzchoń

Scientific Areas: Chronologization of 19th- and 20th-century Polish vocabulary, historical lexicography, automation of natural language processing, translation lexicography, disinformation theory, photodocumentation, machine learning: text classifiers, image recognition, time series analysis.

Consultant

prof. UAM

Filip Graliński

Works on text processing and machine learning. His main interests are language modeling with neural networks (including models that take time and graphical information into account), historical text processing, and the evaluation of machine learning systems. Author of the Gonito and GEval systems.

Professor

Krzysztof Jassem

Director of the Artificial Intelligence Center. Author of a doctoral dissertation on machine translation titled “Electronic Bilingual System in Automatic Text Translation” (1997), which initiated the development of machine translation in Poland.

PhD

Marek Kubis

Dr Kubis conducts research on conversational systems and discourse modeling techniques. His interests include methods of geometric deep learning applied to natural language processing, techniques for text data augmentation, and methods for evaluating the robustness of large language models.