Diachronic linguistics team

Language changes over time: it was used differently 200 years ago than it is today. Diachronic linguistics studies natural language with this temporal context in mind. In addition to expanding purely linguistic knowledge, our studies give us insight into how people thought in the past and which topics were important to them. They are also of practical value, since they allow us to build better machine learning models. For example, by knowing the past equivalent of a word (e.g., Polish aeroplan for today’s samolot, meaning ‘airplane’), one can create a better search engine for historical texts.
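The search-engine example above can be illustrated with a minimal sketch of query expansion. It is only an illustration under stated assumptions: the names HISTORICAL_EQUIVALENTS, expand_query, and search are hypothetical, the lexicon holds only the samolot/aeroplan pair mentioned above, and matching is done on plain whitespace-separated tokens rather than with a real search index.

```python
# Hypothetical lexicon mapping modern words to historical equivalents.
# Only the samolot/aeroplan pair comes from the text; a real lexicon
# would be far larger and built from historical sources.
HISTORICAL_EQUIVALENTS = {
    "samolot": ["aeroplan"],
}


def expand_query(query: str) -> list[str]:
    """Return each query term followed by its known historical variants."""
    expanded = []
    for term in query.lower().split():
        expanded.append(term)
        expanded.extend(HISTORICAL_EQUIVALENTS.get(term, []))
    return expanded


def search(query: str, documents: list[str]) -> list[str]:
    """Return the documents that contain at least one expanded query term."""
    terms = set(expand_query(query))
    return [doc for doc in documents if terms & set(doc.lower().split())]


if __name__ == "__main__":
    # Toy documents, invented for this example.
    docs = ["Aeroplan nad miastem", "Pociąg do Poznania"]
    print(search("samolot", docs))
```

With such an expansion step, a query using the modern word can also retrieve older texts that use only the historical form.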

Research in this field has traditionally been conducted using the methods of linguistics. More recently, machine learning methods (in particular, neural models) have been gaining popularity. Our team combines both approaches, as it includes specialists in both linguistics and computer science.

Currently, we focus mainly on Polish and English. Most of the texts we study come from periodicals dating back to the 19th century. We also deal with more recent, shorter slices of time: a few years or a decade or two. The problems we solve include dating texts (determining when a text was created or published), language modeling, and the detection and analysis of trends in the use of phrases over time.
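As a rough illustration of what trend detection in phrase usage involves, the sketch below computes the relative frequency of a phrase per year. It is a simplified example, not the team's actual method: the function name relative_frequency_by_year and the (year, text) corpus format are assumptions, and tokenization is plain whitespace splitting.

```python
from collections import Counter


def relative_frequency_by_year(corpus, phrase):
    """Map each year to the number of phrase occurrences per token.

    `corpus` is an iterable of (year, text) pairs (an assumed format).
    Matching is case-insensitive on whitespace-separated tokens.
    """
    phrase_tokens = phrase.lower().split()
    n = len(phrase_tokens)
    if n == 0:
        raise ValueError("phrase must contain at least one token")

    hits = Counter()    # phrase occurrences per year
    totals = Counter()  # total tokens per year
    for year, text in corpus:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        # Slide a window of n tokens over the text and count exact matches.
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == phrase_tokens:
                hits[year] += 1

    return {
        year: hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year] > 0
    }


if __name__ == "__main__":
    # Toy corpus, invented for this example.
    toy_corpus = [
        (1900, "aeroplan leci nad miastem"),
        (1950, "samolot leci nad miastem"),
    ]
    print(relative_frequency_by_year(toy_corpus, "aeroplan"))
```

Normalizing by the number of tokens per year keeps years with more surviving text from dominating the trend.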

We are part of the DARIAH-PL consortium, which carries out broad work in the digital humanities in Poland.

Team members

M.Sc.

Jakub Pokrywka

Doctoral student. He works on neural processing of natural language, language modeling (including language diachrony), information extraction, dialogue agents, and, to a lesser extent, computer vision and data analysis. He has worked for Samsung, Allegro, and SuperMemo, among others.

M.Sc.

Marcin Pigulak

He holds a master’s degree in history and film studies and is a doctoral student at AMU. His research interests include audiovisual and digital manifestations of historical culture, ways of constructing discourses about the past, and the source-related aspects of the digital humanities.

Consultant

AMU Professor

Filip Graliński

He works on text processing and machine learning. His main interests are language modeling with neural networks (including models that take time and graphical information into account), historical text processing, and the evaluation of machine learning systems. He is the author of the Gonito and GEval systems.