VENI grant for Antske Fokkens

Antske Fokkens received a VENI grant for her proposal “Reading between the lines”. The project aims to identify so-called implicit perspectives in text.

Perspectives are conveyed in many ways. Explicit opinions or highly subjective terms are easily identified. However, perspectives are also expressed more subtly. For instance, Nick Wing argues that media describe white suspects (e.g. brilliant, athletic) more positively than black victims (e.g. gang member, drug problems). Ivar Vermeulen (p.c.) observes in a small Dutch corpus that Moroccan perpetrators are readily called thieves (implying generic behavior), whereas perpetrators of Dutch origin are said only to have stolen something (implying incidental behavior). These observations are anecdotal, but they reveal how choices about what information to include or how to describe someone’s role may display a specific perspective.

This project will investigate how linguistic analyses may be used to identify these more implicit ways of expressing perspectives in text. The research will be carried out in three stages. First, large-scale corpus analyses will be applied to identify distributions of semantic roles (what entities do) and other properties assigned to them (their characteristics). In the second stage, generic participants will be linked to the semantic role they imply (e.g. a thief will be linked to the perpetrator of stealing). With these links, we can investigate whether thieves are described differently from people who steal. In the third stage, emotion and sentiment lexica will be used to identify the sentiment associated with descriptions of people, enabling research into whether people are depicted positively or negatively.
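The three stages can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy observations, the group labels, the noun-to-event mapping and the polarity scores are hypothetical and do not come from the project's data or tools.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (group, description) pairs as a parser might extract them.
# A description is either a generic role noun ("thief") or an event predicate ("steal").
OBSERVATIONS = [
    ("group_a", "thief"),
    ("group_a", "thief"),
    ("group_a", "steal"),
    ("group_b", "steal"),
    ("group_b", "steal"),
    ("group_b", "athletic"),
]

# Stage 2 (assumed mapping): generic participant nouns linked to the event they imply.
NOUN_TO_EVENT = {"thief": "steal"}

# Stage 3 (assumed lexicon): toy polarity scores in [-1, 1].
SENTIMENT = {"thief": -1.0, "steal": -0.5, "athletic": 1.0}


def role_distribution(observations):
    """Stage 1: count how often each description is assigned to each group."""
    counts = defaultdict(Counter)
    for group, description in observations:
        counts[group][description] += 1
    return counts


def generic_share(counts, noun):
    """Stage 2: share of mentions of an event that use the generic noun."""
    event = NOUN_TO_EVENT[noun]
    generic = counts[noun]
    total = generic + counts[event]
    return generic / total if total else 0.0


def mean_sentiment(counts):
    """Stage 3: frequency-weighted mean polarity of a group's descriptions."""
    scored = [(SENTIMENT[d], n) for d, n in counts.items() if d in SENTIMENT]
    total = sum(n for _, n in scored)
    return sum(s * n for s, n in scored) / total if total else 0.0


dist = role_distribution(OBSERVATIONS)
for group in sorted(dist):
    share = generic_share(dist[group], "thief")
    polarity = mean_sentiment(dist[group])
    print(group, round(share, 2), round(polarity, 2))
```

A higher generic share for one group than for another would be the kind of difference the second stage is meant to surface. A real study would need parsed corpora and validated lexica rather than hand-written dictionaries.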

The research is carried out in the context of digital humanities and social sciences. Evaluation and experimental setup will be guided towards identifying differences in perspective between sources. In addition to correctness of linguistic analyses (intrinsic evaluation), the possibility of using the method for identifying changes in perspective over time (historic research) or differences in perspective between sources (communication science) will be investigated.

VU University scientists cluster responses to NWO’s National Research Agenda

Led by Piek Vossen, a group of scientists at VU University automatically divided 11,700 questions from NWO’s National Research Agenda into clusters. Using language technology and mathematical calculations based on the most important words, they found slightly over 60 clusters of questions, which in turn were divided into a few hundred sub-clusters. Important themes are health and energy, but also big data, art, and sports. NWO is happy with this analysis. The VU Topic Browser allows NWO to quickly and efficiently process the large number of responses.

National Science Agenda VU Topic Browser, by Emiel van Miltenburg, Kasper Welbers, Hennie van der Vliet, Wouter van Atteveldt, Piek Vossen
The graph shows 60 clusters and a few hundred sub-groups found in 11,700 questions from NWO’s National Research Agenda.
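The announcement does not say which clustering method was used. As a rough illustration of grouping questions by their most important words, the sketch below weights words with TF-IDF and assigns each question to the first cluster whose seed question is similar enough. The weighting scheme, the greedy clustering, the threshold and the example questions are all assumptions, not the method behind the VU Topic Browser.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Weight each word by term frequency times inverse document frequency."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    return [
        {term: count * math.log(n / df[term]) for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_clusters(docs, threshold=0.3):
    """Assign each document to the first cluster whose seed is similar enough."""
    vectors = tfidf_vectors(docs)
    clusters = []  # each cluster is a list of document indices; index 0 is the seed
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical example questions, invented for this sketch.
QUESTIONS = [
    "how can solar energy be stored",
    "how can wind energy be stored",
    "what causes heart disease",
    "what prevents heart disease",
]

print(greedy_clusters(QUESTIONS))
```

Running the same procedure again inside each cluster with a stricter threshold would be one simple way to obtain sub-clusters.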

Netherlands Organisation for Scientific Research (NWO): Dutch Science Agenda
Scientists determine the Dutch Science Agenda together with companies, civil society organisations and interested citizens. The agenda consolidates the themes that science will focus on in the coming years. What are the favourable opportunities for Dutch science and how can science contribute to finding solutions for societal challenges and making the most of economic opportunities?

Position AAA Data Science Postdoc or PhD Student in Computational Linguistics

The Amsterdam Academic Alliance (AAA) is a joint initiative of the two Amsterdam-based universities – VU and the UvA – aimed at intensifying collaboration with each other and with other knowledge institutions in the region. The objective of the AAA is to cement Amsterdam’s position as a major international player and hub of academic excellence. The alliance is to result in different outcomes in each scientific field.

This advertisement concerns one of the 14 positions. The Network Institute of the VU University Amsterdam is looking for a motivated postdoctoral researcher or PhD student for the project “From text to Deep Data”. The candidate will be part of the Network Institute and will work within multidisciplinary teams of humanities researchers and computer scientists.
The work will be done in the context of a larger research program called “QuPiD2: Quality and Perspectives in Deep Data”, in collaboration with other researchers who jointly aim to achieve a formal model of quality and perspectives.

As part of the QuPiD2 research team, the candidate will develop 1) a perspective model for representing the subjective relation between the source of information and the statements in it, and 2) software to detect such interpersonal communication layers and perspectives from text. The project will transform big unstructured text data into deep data that show the emotions, opinions and viewpoints on the changing world. It will reveal the social networks and dynamics within trust networks that influence our world views.

1. Studying existing models for handling provenance, attribution, sentiments, opinions and emotions as expressed in text.
2. Developing an overarching perspective model for representing the subjective relations between sources and their statements. The model will initially be based on textual data but should show the capacity to model perspectives on any type of (big) data.
3. Using semantic web standards, e.g. RDF, SPARQL, to represent and access the data within the project.
4. Studying existing NLP approaches to detect perspective relations in texts. Both English and Dutch texts will be considered.
5. Developing machine-crowd empowered processing of textual sources for populating the QuPiD2 model.
6. Creating data sets for training and evaluation through expert annotation and crowd annotation.
7. Developing new components and approaches to obtain the perspective values within the model from textual data.
8. Evaluating the components against the data sets developed and within an application environment.
9. Collaborating with the QuPiD2 program research team.
10. Publishing the results of the work as scientific articles in high-ranked journals and conferences, as well as presenting the work at relevant scientific venues.

The candidate should have a strong background (MA) in computational linguistics and semantic web technology with expertise in data modelling, modelling perspectives, subjectivity and attribution relations expressed in natural language. The candidate should have sufficient programming skills and experience with data engineering and text mining.

Further particulars
The appointment will be for a period of three years for a postdoc and four years for a PhD student. Our excellent employment conditions include:
• An 8.3% end-of-year bonus and 8% holiday allowance;
• Solid pension scheme (ABP);
• A minimum of 29 holidays in case of full-time employment;
• Possibilities to save up holidays for sabbatical leave.

For a postdoc, the salary will be in accordance with university regulations for academic personnel and, depending on experience, will range from a minimum of € 2,476 gross per month up to a maximum of € 3,908 gross per month (salary scale 10), based on full-time employment.

For a PhD student, the salary will be in accordance with university regulations for academic personnel and will range from a minimum of € 2,125 gross per month in the first year up to a maximum of € 2,717 (salary scale 85.0-3), based on full-time employment.

For additional information please contact:
Prof. Piek Vossen
phone: +31 681773878 or +31 20 59 86457
e-mail: / attention Piek Vossen

Dr. Lora Aroyo
Phone: +31 620329972

Applications may only be submitted via To process your application immediately, please quote the vacancy number and the title of the position you are applying for in the subject-line. Applications must include a detailed curriculum vitae, a motivation letter explaining why you are the right candidate, list of projects you have worked on with brief descriptions of your contributions and the names and contact addresses of two academic references from which information about the candidate can be obtained. All these should be grouped in one PDF attachment.

Applications will be accepted until 10 May 2015.

Any other correspondence in response to this advertisement will not be dealt with.

Call for papers Global Wordnet Conference (GWC2016)

The Global Wordnet Association is pleased to announce the 8th International Global Wordnet Conference (GWC2016).
Bucharest, Romania, January 27-30, 2016
Global Wordnet Association:
Conference website:

Research Institute for Artificial Intelligence “Mihai Drăgănescu” of the Romanian Academy

The conference will be hosted by the Research Institute for Artificial Intelligence “Mihai Drăgănescu” of the Romanian Academy (local organization: Verginica Mititelu and Corina Forăscu).

Details about the Association and the conference can be found on the conference website.

Release Cornetto Demo

Cornetto is a lexical resource for the Dutch language which combines two resources with different semantic organisations: the Dutch Wordnet with its synset organisation and the Dutch Reference Lexicon which includes definitions, usage constraints, selectional restrictions, syntactic behaviours, illustrative contexts, etc. For more information on the contents of Cornetto, see Cornetto user documentation.

Cornetto Demo by INL, VU University and CLARIN-NL. Visualization of Synset Relations of the Dutch Word Form: ‘bloem’, Part of Speech: Noun, Sense Number: 3.

The Cornetto demo provides possibilities to query the resource by choosing one of the following options:

Simple Search for Lexical Entries
Advanced Search for Lexical Entries
Visualization of Synset Relations

Cornetto is also available in XML (following the Lexicon Markup Format) and RDF (for more information, please refer to Open Source WordNet).

The Cornetto demo is realized as part of the Cornetto-LMF-RDF project, which has been funded by CLARIN-NL.

NLeSC Project granted: Visualizing uncertainty and perspectives

Visualizing uncertainty and perspectives: Netherlands eScience Center, project number 027.014.402 (April 01, 2015 – April 01, 2016)


The Netherlands eScience Center is pleased to announce the initiation of six new projects in the areas of Environment and Sustainability, Life Sciences & eHealth, Humanities and Social Sciences, and Physics and Beyond. The projects, scheduled to start in 2015, are collaborations with research teams from multiple Dutch academic groups and represent the latest step in the continued development of NLeSC’s project portfolio. Two projects will be funded in the areas of Humanities and Social Sciences.

Visualizing uncertainty and perspectives
Prof. Piek Vossen and Antske Fokkens are co-applicants of the project “Visualizing uncertainty and perspectives”: one of the six proposals that will receive funding to the value of 125K euro. This project aims to develop a tool that visualizes subjectivity, perspective and uncertainty to make them controllable variables in Humanities research. The tool should allow users to compare information from different sources representing alternative perspectives and visualize subjectivity and uncertainty. Such a visualization enables improved and comprehensive source criticism, provides new directions of research and strengthens the methodology of digital humanities.

Hackathon NewsReader Amsterdam: Jan. 21, 2015

Porsches to Pizza – Hack 6,000,000 automotive news articles #NewsReader


The global automotive industry has a value of the order of $1 trillion annually.

The industry comprises a massive network of suppliers, manufacturers, advertisers, marketeers and journalists. Attracting and supporting the industry is a significant goal of industrial policy.

On January 21st 2015 we’re running an event which should be of interest if:

  • You’re a data journalist on an automotive desk;
  • You’re an analyst sifting daily news looking for information on your company or on competitors;
  • You’re a data analyst looking to understand how your customers operate their supply chain;
  • You’re an analyst trying to find secondary events that could influence an investment decision

We’ve developed a powerful new tool called ‘NewsReader’ which utilises natural language understanding and semantic web technology. This helps you to better understand the interactions between companies and key individuals, derived from news articles.

We’re processing 6 million news articles from sources around the world, both general and specialist media, to obtain a searchable database of the news on the automotive sector, which you can play with at our Hack event.

Over the summer we ran a Hack Day on news surrounding the World Cup. NewsReader enabled the attendees to pull out networks of interactions between politicians, football players and people in FIFA: not only who they were interacting with, but also what they were doing.

Early analysis of the automotive data is giving us some interesting insights. For example, news stories between 2005 and 2009 reported that Porsche was buying an ever larger stake in Volkswagen, prompting speculations that Porsche would take over Volkswagen. However, in 2009 the tables turned and Volkswagen took a majority stake and eventually took over Porsche. Our system is able to discriminate between articles mentioning that Volkswagen was taking over Porsche and vice versa rather than simple co-occurrences that are generally found in aggregated news analysis systems. In the slipstream of this take-over, Wendelin Wiedeking, longtime Porsche senior executive, was fired from Porsche. In 2013 he opened the first of a chain of Italian restaurants. As we have a structured database in which similar events can easily be retrieved, it is a small effort to find out that Jürgen Schrempp, former CEO of DaimlerChrysler, also opened a restaurant after retiring from the car industry.
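The difference between role-aware retrieval and plain co-occurrence can be sketched in a few lines of Python. The `Event` record, the predicate names and the helper functions below are hypothetical simplifications for illustration. They are not NewsReader's actual data model or query interface, and the records only restate the examples from the paragraph above.

```python
from typing import NamedTuple


class Event(NamedTuple):
    """A hypothetical, simplified event record: who did what to whom."""
    agent: str
    predicate: str
    patient: str


# Toy records restating the examples in the text.
EVENTS = [
    Event("Porsche", "buy_stake_in", "Volkswagen"),
    Event("Volkswagen", "take_over", "Porsche"),
    Event("Wendelin Wiedeking", "open", "restaurant"),
    Event("Jürgen Schrempp", "open", "restaurant"),
]


def co_occurring(events, a, b):
    """Co-occurrence view: events mentioning both names, ignoring their roles."""
    return [e for e in events if {a, b} <= {e.agent, e.patient}]


def directed(events, agent, patient):
    """Role-aware view: events in which `agent` acts on `patient`."""
    return [e for e in events if e.agent == agent and e.patient == patient]


def agents_of(events, predicate, patient):
    """Retrieve everyone who took part in the same kind of event."""
    return sorted(
        {e.agent for e in events if e.predicate == predicate and e.patient == patient}
    )


print(len(co_occurring(EVENTS, "Porsche", "Volkswagen")))
print(directed(EVENTS, "Volkswagen", "Porsche"))
print(agents_of(EVENTS, "open", "restaurant"))
```

The co-occurrence query cannot tell the two Porsche and Volkswagen events apart, while the directed query returns only the take-over by Volkswagen.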

We are running an event in London on January 30th 2015, and if you cannot make the 21st in Amsterdam you may want to join us there.

NewsReader helps you find a needle in a haystack.



Mini-seminar: Disambiguating entities: Dec. 12 2014

Presentations Mini-seminar: Disambiguating entities and their roles in texts based on background knowledge on December 12, 2014:

Introduction (pdf) by Prof. dr. Piek Vossen
Towards a Dutch FrameNet-style Semantic Role Labeler (pdf) Presentation by Chantal van Son
Named Entity Disambiguation with two-stage coherence optimization (pdf) Presentation by Filip Ilievski

Invitation below:

Mini-seminar: Disambiguating entities and their roles in texts based on background knowledge

Dear all,

We cordially invite you to our mini-seminar “Disambiguating entities and their roles in texts based on background knowledge”, in which we will present our Master’s thesis topics and our current and future work. It will take place on Friday, December 12 from 10:00 to 12:00 in room C-121 (W&N Building).

An array of text processing tools is currently used to extract events, recognize and link entities, and discover relations between the two. We, Filip Ilievski and Chantal van Son, tackle these types of Natural Language Processing tasks by using background knowledge from lexical resources and the Semantic Web. The disambiguation of entities and their context is at the core of both approaches: Filip’s thesis aims to disambiguate them by determining their identity, while Chantal’s thesis aims at disambiguating the roles they play in context. You can find the descriptions of both projects below.

Prof. Piek Vossen will kick off the mini-seminar by sketching the background of the problem and presenting the existing approaches. Prof. Frank van Harmelen will conclude the event with a discussion on the integration of background knowledge in language processing.

10:00 – 10:15 Introduction by Piek Vossen
10:20 – 10:50 Towards a Dutch FrameNet-style Semantic Role Labeler (Chantal van Son)
10:55 – 11:25 Named Entity Disambiguation with two-stage coherence optimization (Filip Ilievski)
11:30 – 12:00 Closing remarks and discussion led by Frank van Harmelen

Towards a Dutch FrameNet-style Semantic Role Labeler (Chantal van Son)
Semantic role labeling (SRL) is one of the key tasks in Natural Language Processing for deep text understanding. Because of its rich and fine-grained categorization of different conceptual scenarios and their specific semantic roles, FrameNet is a popular resource to serve as a basis for SRL systems in English. For Dutch, however, there is currently no FrameNet-like resource available that can be used to train an SRL system, and creating such a resource usually takes a great deal of expensive manual effort. This study investigates how existing tools and resources, such as the SoNaR Semantic Role Labeler (SSRL) and the Predicate Matrix, can be exploited for FrameNet-based SRL in Dutch. In this talk I will present this method and discuss some of its difficulties and possible solutions.

Named Entity Disambiguation with two-stage coherence optimization (Filip Ilievski)
Contemporary Natural Language Processing modules solve Entity Linking, Event Detection, and Semantic Role Labeling as separate problems. From the semantic point of view, each of these processes adds another brush-stroke onto the canvas of meaning: entities and events are components that occur in relations which correspond to roles. The approach presented here extends such NLP processes with a semantic process of coherence optimization. I use both binary logic and probabilistic models built through manual and automatic techniques. The binary filtering phase relies on restrictions from VerbNet and a domain-specific ontology. The optimization phase aims to maximize the coherence between the remaining candidates in a probabilistic manner based on available background knowledge about the entities.
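As a rough illustration of the two stages described in this abstract, the Python sketch below first filters candidate entities with hard type restrictions and then picks the combination with the highest pairwise relatedness. The mentions, candidate identifiers, allowed types and relatedness scores are all hypothetical. The actual thesis relies on VerbNet restrictions, a domain-specific ontology and probabilistic models that are not reproduced here.

```python
from itertools import product

# Hypothetical candidates per mention: (entity identifier, entity type).
CANDIDATES = {
    "Ford": [
        ("Ford_(company)", "Organization"),
        ("Ford_(politician)", "Person"),
    ],
    "Detroit": [
        ("Detroit_(city)", "Place"),
        ("Detroit_(film)", "Work"),
    ],
}

# Stage 1 input (assumed): types each mention may have, given the role it plays.
ALLOWED_TYPES = {"Ford": {"Organization", "Person"}, "Detroit": {"Place"}}

# Stage 2 input (assumed): toy relatedness scores between entity pairs.
RELATEDNESS = {
    frozenset({"Ford_(company)", "Detroit_(city)"}): 0.9,
    frozenset({"Ford_(politician)", "Detroit_(city)"}): 0.2,
}


def binary_filter(candidates, allowed_types):
    """Stage 1: drop every candidate whose type violates the role restriction."""
    return {
        mention: [entity for entity, etype in cands if etype in allowed_types[mention]]
        for mention, cands in candidates.items()
    }


def coherence(entities):
    """Sum of pairwise relatedness over one joint assignment of entities."""
    return sum(
        RELATEDNESS.get(frozenset((a, b)), 0.0)
        for i, a in enumerate(entities)
        for b in entities[i + 1:]
    )


def optimize(filtered):
    """Stage 2: choose the joint assignment with maximal coherence."""
    mentions = list(filtered)
    combos = product(*(filtered[m] for m in mentions))
    best = max(combos, key=coherence, default=())
    return dict(zip(mentions, best))


filtered = binary_filter(CANDIDATES, ALLOWED_TYPES)
print(optimize(filtered))
```

Exhaustive search over all combinations is only feasible for a handful of mentions. It is used here purely to keep the sketch short.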

Kind regards,
Filip Ilievski & Chantal van Son

Round table on ‘Time and Language’: Oct. 30, 2014

Event date:
Thursday, 30 October, 2014 – 18:30 to 20:00

The round table “Time and Language”, organized by Tommaso Caselli (VUA, Amsterdam) and Rachele Sprugnoli (DH-FBK), will be held in Genoa on Thursday October 30 as part of “Festival della Scienza”.

Time is a pervasive element of human life that is also reflected in language. But how is time encoded in the various languages of the world? How long is an event? What happens if we want to teach a computer to reconstruct the temporal order of events in a text? The philosopher of language Andrea Bonomi, the linguist Pier Marco Bertinetto and the computational linguist Bernardo Magnini will answer these questions to reveal to the public the role that time has in language and the challenges for technology in this field. Philosophy, linguistics and technology come together and introduce the public to the fascinating relationship between time and language.

TiNT: Terminologie in het Nederlandse Taalgebied: Nov. 14, 2014

Impression of TiNT 2014, November 14, 2014, and link to programme

On 14 November 2014, the NL-Term association, in collaboration with the Steunpunt Nederlandstalige Terminologie, organises the TiNT day for the sixth time. TiNT stands for Terminologie in het Nederlandse Taalgebied (Terminology in the Dutch Language Area). This annual event is intended to present current research and professional practice in the field of Dutch-language terminology to a broad audience. TiNT 2014 takes place at the Ministry of Foreign Affairs in The Hague, from 9:30 to 18:00. This year’s theme is terminology in the communication between government and citizens. We have been able to secure very interesting speakers who are relevant to the theme, including Alex Brenninkmeijer of the European Court of Auditors, until recently National Ombudsman; Geert Joris, General Secretary of the Nederlandse Taalunie; and Jac Brouwer, National House Style Coordinator of the Belastingdienst.

We hope once again this year for a full room, interesting presentations and lively discussion.

More information, the preliminary programme and the registration form can be found on our website:

Kind regards,
on behalf of the SNT and NL-Term
Anneleen Schoen