Category Archives: Latest News

Minh Le and Antske Fokkens’ long paper accepted for EACL 2017

Title: Tackling Error Propagation through Reinforcement Learning: A Case of Greedy Dependency Parsing

Conference: EACL 2017 (European Chapter of the Association for Computational Linguistics), held in Valencia, Spain, 3-7 April 2017.

Authors: Minh Le and Antske Fokkens

Abstract:
Error propagation is a common problem in NLP. Reinforcement learning explores erroneous states during training and can therefore be more robust when mistakes are made early in a process. In this paper, we apply reinforcement learning to greedy dependency parsing which is known to suffer from error propagation. Reinforcement learning improves accuracy of both labeled and unlabeled dependencies of the Stanford Neural Dependency Parser, a high performance greedy parser, while maintaining its efficiency. We investigate the portion of errors which are the result of error propagation and confirm that reinforcement learning reduces the occurrence of error propagation.

Papers accepted at COLING 2016

Two papers from our group have been accepted at the 26th International Conference on Computational Linguistics (COLING 2016), held in Osaka, Japan, from 11 to 16 December 2016.


Semantic overfitting: what ‘world’ do we consider when evaluating disambiguation of text? by Filip Ilievski, Marten Postma and Piek Vossen

Abstract
Semantic text processing faces the challenge of defining the relation between lexical expressions and the world to which they make reference within a period of time. It is unclear whether the current test sets used to evaluate disambiguation tasks are representative for the full complexity considering this time-anchored relation, resulting in semantic overfitting to a specific period and the frequent phenomena within.
We conceptualize and formalize a set of metrics which evaluate this complexity of datasets. We provide evidence for their applicability on five different disambiguation tasks. Finally, we propose a time-based, metric-aware method for developing datasets in a systematic and semi-automated manner.

More is not always better: balancing sense distributions for all-words Word Sense Disambiguation by Marten Postma, Ruben Izquierdo and Piek Vossen

Abstract
Current Word Sense Disambiguation systems show an extremely low performance on low frequent senses, which is mainly caused by the difference in sense distributions between training and test data. The main focus in tackling this problem has been on acquiring more data or selecting a single predominant sense and not necessarily on the meta properties of the data itself. We demonstrate that these properties, such as the volume, provenance and balancing, play an important role with respect to system performance. In this paper, we describe a set of experiments to analyze these meta properties in the framework of a state-of-the-art WSD system when evaluated on the SemEval-2013 English all-words dataset. We show that volume and provenance are indeed important, but that perfect balancing of the selected training data leads to an improvement of 21 points and exceeds state-of-the-art systems by 14 points while using only simple features. We therefore conclude that unsupervised acquisition of training data should be guided by strategies aimed at matching meta-properties.

VU Master’s Day, Mar. 12 2016

Visit our Research Master Linguistic Engineering at VU Master’s Day, Saturday 12 March 2016.
Flyer Linguistic Engineering, Specialization of the Research Master Linguistics.
Overview Courses Linguistic Engineering.

On 12 March 2016 you will have the opportunity to visit the Master’s Day and obtain detailed information on our Research Master Linguistic Engineering, Specialization of the Research Master Linguistics.

Date: Saturday, 12 March 2016
Time: 9.30 am – 2.30 pm
Target group: Higher education students and professionals
Location: Main Building, VU University Amsterdam, De Boelelaan 1105
Please note: Preregistration is open until 12.00 pm on Friday 11 March

Programme


Specialization ‘Linguistic Engineering’ 2017—2018

Linguistic Engineering is a specialization in the Research Master Linguistics at VU Amsterdam. More details on the programme, admission and application.

Overview Courses Research Master Specialization: Linguistic Engineering
View/download flyer Research Master Linguistic Engineering. Programme, admission and application.

Language technology is a rapidly developing field of research. In humanities research today, a firm background in language technology is extremely valuable when working with large datasets. The Computational Lexicology and Terminology Lab (CLTL) offers a specialization in the Research Master Linguistics in which students are trained as linguistic engineers. A linguistic engineer has knowledge of language technology as used in computer applications (e.g. search engines) and of the relevant linguistics.

WHY STUDY AT VU AMSTERDAM?
• The Computational Lexicology and Terminology Lab (CLTL) is one of the world’s leading research institutes in Linguistic Engineering.
• Prof. Dr. Piek Vossen, winner of the NWO Spinoza Prize, leads the group of researchers and several national and international interdisciplinary projects, including the Spinoza project ‘Understanding Language by Machines’.
• Become part of an international group of researchers at Vrije Universiteit Amsterdam!

CAREER PROSPECTS
You can set up your own field of research as a PhD student or you can embark on a career at a research institute. Other opportunities are in industry, which is in need of linguists with a technical background. Being a graduate of the CLTL will certainly enhance your chances.


ADMISSION REQUIREMENTS
• Applicants must have at least a Bachelor’s degree in Linguistics, Artificial Intelligence or a comparable Bachelor’s programme.
• Applicants who do not meet the requirement(s) are also encouraged to apply, provided that they have a sound academic background and a demonstrated interest in and knowledge of engineering and/or linguistics.

SPECIALIZATION: LINGUISTIC ENGINEERING
IN RESEARCH MASTER: LINGUISTICS
LANGUAGE: ENGLISH
DURATION: 2 YEARS FULL-TIME
DEADLINE: APRIL 1 2016 (NON-EU), JUNE 1 2016 FOR DUTCH AND EU STUDENTS

For more details on the programme, admission and application:
WWW.FGW.VU.NL
WWW.VU.NL/MA-LINGUISTICS
Dr. H. D. van der Vliet: +31 (0)20 598 6466
EMAIL: Dr. H. D. van der Vliet

Computational Lexicology and Terminology Lab (CLTL)
Language, Literature and Communication
Faculty of Humanities
VU Amsterdam
De Boelelaan 1105
1081 HV Amsterdam
The Netherlands

General information on the Research Master’s in Linguistics at VU Amsterdam.

Specialization ‘Forensic Linguistics/ Language and the Law’ 2016—2017

Forensic Linguistics/ Language and the Law is a specialization in the Research Master Linguistics at VU University Amsterdam. More details on the programme, admission and application.

Overview Courses Research Master Specialization: Forensic Linguistics/ Language and the Law
View/download flyer Research Master Forensic Linguistics/ Language and the Law. Programme, admission and application.

Forensic Linguistics is a new and exciting field which has both a narrow and a broad definition. In its more specific sense it denotes the use of linguistic evidence in the courtroom. In its broader sense it refers to all areas of overlap between language and the law, including the language used in legal or quasi-legal settings by participants including judges, lawyers, witnesses, police officers and interpreters. Graduates of this program will have acquired the theoretical background and practical casework experience to be able to analyze disputed texts, recognize a “language crime” such as bribery or threatening communication (nowadays often sent via social media), and identify participants in the police station or courtroom who are at a linguistic disadvantage and therefore vulnerable to miscarriages of justice.

WHY STUDY AT VRIJE UNIVERSITEIT AMSTERDAM?
• The program in Forensic Linguistics/ Language and the Law is the first program in this area to be offered in the Netherlands (and is currently the only program of its kind in mainland Europe).

• The program has close links with the Linguistic Engineering specialization (led by Prof. Dr. Piek Vossen, winner of the NWO Spinoza Prize), giving students unique options within the area of Forensic Linguistics.

• The lecturers on the program are an international team with extensive teaching experience in the UK and USA, as well as the Netherlands, with research networks worldwide.

• Your master’s program will include a period of 2-3 months spent abroad at a university in another country.

• Become part of an international group of researchers at Vrije Universiteit Amsterdam!

CAREER PROSPECTS
You can develop your own specialized research topic as a PhD student, or you can embark on a career at a research institute. Depending on the country where you are seeking to work, you may find employment with a government body, such as the ministry of justice or a forensic institute. Commercial companies also often need linguists with a technical background. Being a graduate in Forensic Linguistics, especially if you have taken courses in linguistic engineering, will certainly enhance your chances.

ADMISSION REQUIREMENTS
• Applicants must have at least an (honors) Bachelor’s degree in Linguistics, Modern Languages, Cognitive or Communication Sciences or a comparable Bachelor’s program.
• Applicants who do not meet the requirement(s) are also encouraged to apply, provided that they have a sound academic background and a demonstrated interest in and knowledge of linguistics and/or law.

SPECIALIZATION: FORENSIC LINGUISTICS/ LANGUAGE AND THE LAW
IN RESEARCH MASTER: LINGUISTICS
LANGUAGE: ENGLISH
DURATION: 2 YEARS FULL-TIME
DEADLINE: APRIL 1 2016 (NON-EU), JUNE 1 2016 FOR DUTCH AND EU STUDENTS

For more details on the program, admission and application:
WWW.FGW.VU.NL
WWW.VU.NL/MA-LINGUISTICS
CONTACT PERSON: Dr. F. van der Houwen
EMAIL: f.vander.houwen@vu.nl

Language, Literature and Communication
Faculty of Humanities
VU University
De Boelelaan 1105
1081 HV Amsterdam
The Netherlands

General information on the Research Master’s in Linguistics at VU University Amsterdam.

Video release: Meet NewsReader’s Reading Machine

A Reading Machine in 4 languages

Meet NewsReader’s Reading Machine! — Video explaining NewsReader’s Reading Machine

The volume of news data is enormous and expanding, covering billions of archived documents with millions of documents added daily. These documents are also getting more and more interconnected with knowledge from other sources such as biographies and company databases. NewsReader built a system that extracts what happened to whom, when and where from these sources and stores them in a structured database, enabling more precise search over this immense stack of information. Currently, our system supports English, Spanish, Italian and Dutch. Pilot projects are underway with government and financial information specialists, but the system can be useful to anyone looking to make sense of large amounts of news text.

NewsReader in a nutshell

NewsReader in a nutshell — From Newspapers to Knowledge, Visualised
The project is described in this brochure (PDF).

http://www.newsreader-project.eu/


LREC2016 Accepted Papers

CLTL has 11 accepted papers at LREC2016. We’ll see you in Portorož in May!

ORAL PRESENTATIONS

“Evaluating Entity Linking: An Analysis of Current Benchmark Datasets and a Roadmap for Doing a Better Job” by Marieke van Erp, Pablo Mendes, Heiko Paulheim, Filip Ilievski, Julien Plu, Giuseppe Rizzo and Joerg Waitelonis

“Context-enhanced Adaptive Entity Linking” by Giuseppe Rizzo, Filip Ilievski, Marieke van Erp, Julien Plu and Raphael Troncy

“MEANTIME, the NewsReader Multilingual Event and Time Corpus” by Anne-Lyse Minard, Manuela Speranza, Ruben Urizar, Begoña Altuna, Marieke van Erp, Anneleen Schoen and Chantal van Son

“Crowdsourcing Salient Information from News and Tweets” by Oana Inel, Tommaso Caselli and Lora Aroyo

“Temporal Information Annotation: Crowd vs. Experts” by Tommaso Caselli, Rachele Sprugnoli and Oana Inel

“Addressing the MFS bias in WSD systems” by Marten Postma, Ruben Izquierdo, Eneko Agirre, German Rigau and Piek Vossen

POSTER/DEMO PRESENTATIONS

“A multi-layered annotation scheme for perspectives” by Chantal van Son, Tommaso Caselli, Antske Fokkens, Isa Maks, Roser Morante, Lora Aroyo and Piek Vossen

“The VU Sound Corpus: Adding more fine-grained annotations to the Freesound database” by Emiel van Miltenburg, Benjamin Timmermans and Lora Aroyo

“NLP and public engagement: The case of the Italian School Reform” by Tommaso Caselli, Giovanni Moretti, Rachele Sprugnoli, Sara Tonelli, Damien Lanfrey and Donatella Solda Kutzman

“Two architectures for parallel processing for huge amounts of text” by Mathijs Kattenberg, Zuhaitz Beloki, Aitor Soroa, Xabier Artola, Antske Fokkens, Paul Huygen and Kees Verstoep

“The Event and Implied Situation Ontology: Application and Evaluation” by Roxane Segers, Marco Rospocher, Piek Vossen, Egoitz Laparra, German Rigau and Anne-Lyse Minard

CLIN 26 Organised by CLTL

CLIN26, Computational Linguistics in The Netherlands, Amsterdam, December 18 2015

The 26th Meeting of Computational Linguistics in the Netherlands (CLIN26) was organized by the CLTL group of VU University Amsterdam and took place at Hotel Casa 400 in Amsterdam on December 18, 2015.

Presentations CLIN26
Pictures of CLIN26 on Facebook CLTLVU

Invited speaker Miriam Butt, Professor for General and Computational Linguistics at the University of Konstanz.

STIL Thesis Prize awarded to Nikos Voskarides

Organising Committee:
Antske Fokkens
Ruben Izquierdo
Roser Morante
Marten Postma
Piek Vossen