Antske Fokkens holds a University Research Chair on Computational Linguistic Methods. She coordinates the Text Mining/Language and AI track of the linguistics master's and the Human Language Technology track of the Research Master's together with Hennie van der Vliet.
She is Vice Dean of Research in the Faculty of Humanities.
Research Statement
My main interest lies in methodological aspects of research in Computational Linguistics. I am driven by the question of how computational models of language work: what patterns and systems are found in natural language? How can they be modeled computationally? Which computational methods are suitable for modeling or analyzing which phenomena?
I am currently working on two main topics. The first topic focuses on gaining a better understanding of language models: what information is contained in their representations and how is this used? This interest came out of a topic more oriented towards social aspects of language: investigating in what (subtle) ways people express perspectives. Here, I design and implement tools that extract patterns of how specific groups of people, events or concepts are described in large amounts of text. For instance, do the media systematically talk differently about events when the actors have a certain ethnic background? What is said about health and weight, and how has that changed over time? The basic system extracts transparent patterns, using labels that historians, social scientists and other interested researchers can read directly.
Since I joined the VU in 2012, I have mainly focused on methodological aspects of the application of NLP to digital humanities. This work is mainly carried out as part of the BiographyNet project. In this project, we (a historian, a computer scientist and I) work together to see how we can use NLP and Semantic Web technology to enhance historical research on the Biography Portal of the Netherlands. My research addresses how we can identify information in text that is useful for historians, and how we can make sure that historians can assess the reliability of the output of tools whose inner workings they do not know.
The Network Institute projects Time will tell a different story and Political Discourse in the News also addressed the question of how NLP can be used in historical research and communication science, respectively. As part of investigating methodological issues, I have also worked on issues regarding the system architecture in NewsReader and am coordinating the Enlighten Your Research project Can we Handle the News, where we push the limits of large-scale processing and investigate what would be needed to process all the news that is published every day.
My PhD thesis proposed a new methodology for developing linguistic precision grammars. The main idea of the proposal, storing alternative analyses in a metagrammar so that they may be compared at different stages of the development process, can be applied in any theory. I particularly looked at grammars developed as part of the DELPH-IN consortium, within which open-source HPSG-based grammars are developed. The method is also closely related to the LinGO Grammar Matrix.
Teaching Statement
In my view, the best way to train students in critical thinking and methodological reflection is to let them experience the possibilities and pitfalls first-hand. I use assignments in which students are trained to ground their hypotheses in theory and apply solid evaluation methods. They thus largely learn by doing: this way, students not only hear that careful evaluation and analysis of linguistic phenomena and data are important, but also experience that this is the case. As such, they come to understand that reflecting on linguistic properties, computational models, domain-specific information and structural evaluation is essential regardless of whether they end up working as researchers or in a company as linguistic experts or text mining specialists.
Studying through assignments has two main advantages: (1) I can see whether students have really understood the theory or whether certain components need to be explained further; (2) students experience directly why they need specific knowledge and get direct feedback as to whether they have mastered the necessary skills.
Since our group also does a lot of research grounding computational linguistics in linguistic theory and methodology, we can directly incorporate insights from our own research into our courses. This makes our courses attractive to students, who can immediately see the relevance of the topic or be inspired to think about current open questions in the field.