NLP@ITU at the first EurNLP summit

We are participating in the first EurNLP summit at Facebook in London this week!

  • Barbara is program co-chair of EurNLP, together with Sebastian Ruder (DeepMind) and with the general chairs from Facebook AI: Sebastian Riedel, Fabrizio Sebastiani and Armand Joulin – EurNLP organization
  • Natalie is giving the talk “Neural Syntactic Parsing seems so simple. Is it?”
  • Poster presentations by Alan and Manuel:

    Alan Ramponi, Barbara Plank and Rosario Lombardo. On the impact of cross-domain edge detection in biomedical event extraction

    Manuel Ciosici, Leon Derczynski and Ira Assent. Characterizing the information content of Brown clusters

Marija Stepanovic joins NLP at ITU

Marija is a new PhD student in computer science working on automatic speech recognition. Her prior education includes bachelor and master studies in theoretical and applied linguistics with a specialization in phonetics, phonology, and cognitive semantics, as well as master studies in computational cognitive science with an emphasis on machine learning and natural language processing. Her research incorporates linguistic, computational, and statistical analyses of spoken and textual data with the aim of identifying and modeling cognitive processes behind recurring concepts and patterns across languages for the purpose of bringing machines closer to truly understanding natural language.

 

At ITU, Marija will be working as a PhD student under the supervision of Barbara Plank. Her project is concerned with improving speech recognition for low-resource dialects of Danish and English through a comparative acoustic analysis of their vowel systems.

Marija

Where to find us: NODALIDA 2019

Being a research group located in the Nordics, ITU NLP has a strong presence at NODALIDA this year, held in Turku. The conference’s general chair is Barbara Plank from ITU NLP, for whose efforts we are all very grateful. You can find us here:

Monday September 30

NLPL Workshop on Deep Learning for Natural Language Processing, 09:00-17:00 PUB2

  • Co-organiser: Leon Derczynski (also first session chair, 09:20-10:00)

Deep Transfer Learning: Learning across Languages, Modalities and Tasks

Barbara Plank. Keynote, 10:30-11:30, NLPL DL4NLP, PUB2

Tuesday October 1

Lexical Resources for Low-Resource PoS Tagging in Neural Times.

Barbara Plank and Sigrid Klerke. Talk: 11:25-11:50, Parallel session A, PUB1

Bornholmsk Natural Language Processing: Resources and Tools.

Leon Derczynski and Alex Speed Kjeldsen. Poster: 16:45-17:45, Poster and demo session, Entrance hall

We introduce language processing resources and tools for Bornholmsk, a language spoken on the island of Bornholm, with roots in Danish and closely related to Scanian. We present an overview of the language and available data, and the first NLP models for this living, minority Nordic language.

The Lacunae of Danish Natural Language Processing.

Andreas Kirkedal, Barbara Plank, Leon Derczynski and Natalie Schluter. Poster: 16:45-17:45, Poster and demo session, Entrance hall

Danish has received relatively little attention from a technological perspective. In this paper, we review Natural Language Processing (NLP) research, digital resources and tools which have been developed for Danish. We find that availability of models and tools is limited, which calls for work that lifts Danish NLP a step closer to the privileged languages.

UniParse: A universal graph-based parsing toolkit.

Daniel Varab and Natalie Schluter. Demo: 16:45-17:45 

Come by for a chat on how UniParse works and how it may be useful for your research.

Wednesday October 2

Political Stance Detection for Danish.

Rasmus Lehmann and Leon Derczynski. Talk: 11:10-11:35, Parallel session A: Sentiment Analysis and Stance, PUB1

The presented research concerns identifying the stance towards immigration in quotes from politicians published in Danish newspapers. The presentation covers the creation of a dataset of stance-annotated quotes from Danish politicians, the first of its kind, and of two deep-learning-based stance detection models, one using an LSTM architecture and one using a basic feed-forward architecture, together with the results of testing these models.

Neural Cross-Lingual Transfer and Limited Annotated Data for Named Entity Recognition in Danish.

Barbara Plank. Talk: 11:10-11:35, Parallel session B: Named Entity Recognition, PUB3

Session chairing – Parallel session A: Text Generation and Language Model Applications
14:00-15:15, PUB1. Leon Derczynski

Joint Rumour Stance and Veracity Prediction.

Anders Edelbo Lillie, Emil Refsgaard Middelboe and Leon Derczynski. Talk: 11:35-12:00, Parallel session A: Sentiment Analysis and Stance, PUB1

We present an end-to-end stance and veracity prediction system that works at SotA level on Danish despite low data, and show that stance-based veracity prediction models can be transferred across languages and platforms with negligible performance drop.

NEALT business meeting

13:00-14:00, PUB1

Watch out for ✨exciting✨ items from ITU NLP here…

 

 

We hope to meet you in Turku!

Rob van der Goot joins NLP at ITU

Rob has a background in information science, but quickly became interested in the field of natural language processing, especially in the problem of building robust models. His expertise lies in automatically deriving syntactic analyses of natural language (parsing), with a focus on low-resource settings. During his PhD, he improved the automatic syntactic analysis of social media texts by first translating them to a more ‘standard’ form (try it yourself: www.robvandergoot.com/monoise). More broadly, he is interested in the automatic processing of all types of language varieties without having explicit training data.
 
Rob will be working at ITU as a postdoc under the supervision of Barbara Plank (partially funded by Amazon). Together, they will develop natural language processing models for low-resource languages and language varieties.
 
Rob van der Goot

Alan Ramponi joins NLP at ITU

Alan is a Ph.D. student in natural language processing at Fondazione The Microsoft Research – University of Trento COSBI, Italy. His research focuses on unsupervised domain adaptation and deep learning methods for biomedical information extraction from scientific publications. Broadly, his interests are centered on building robust language models which are resilient to domain shift, thus being readily applicable to real-world problems in which the target domain is not known in advance.

Alan will be doing his work as a visiting Ph.D. fellow with Barbara Plank, researching domain adaptation methods for all the stages of the task of biomedical event extraction.

Rasmus Lehmann joins NLP at ITU

We’re very happy to welcome Rasmus Lehmann to NLP at ITU!

Rasmus works at the intersection of business, communication and technology, with a Bachelor’s degree in organizational communication and economics from CBS, and a Master’s degree in software development, specialized in Business Intelligence and Machine Learning. Rasmus’ interest in the field of NLP was sparked while implementing a deep learning-based model for rumor identification, and he went on to write his thesis, titled “Stance Detection in Danish Politics”. The focus of this project was to build a dataset of quotes from Danish politicians for use in stance detection in Danish, and to apply a deep learning-based approach to this classification task. The project was subsequently turned into a submission for the NoDaLiDa 2019 conference on Computational Linguistics, where the paper was accepted.

Rasmus will be working closely with Leon Derczynski on creating tools for NLP in Danish.

Rasmus Lehmann

Daniel Varab joins NLP at ITU

We are delighted to welcome Daniel Varab back to NLP at ITU! Daniel introduces himself:

“I come from a traditional computer science background and somewhat by chance ended up writing my thesis titled “Contradiction Detection in Natural Languages” in the scope of natural language inference (NLI). This sparked my interest in NLP and has caused all my work since to revolve around the field. I am now two years out after graduation and have spent a year as a research assistant exploring NLI and graph-based dependency parsing, followed by a year at the Danish/Swedish company Karnov Group where I have worked on helping lawyers navigate the ever-growing pile of legislation with the use of NLP techniques. I am now excited to be heading back into academia where I will be working on text summarization together with Natalie Schluter and supporting courses of ITU’s data science bachelor degree.

With regard to interests, I genuinely enjoy work on simple models with well-founded inductive biases, work on so-called less privileged languages, and good old thorough research.”

Daniel Varab

Mateusz Jurewicz joins NLP at ITU

We are delighted to welcome Mateusz Jurewicz to NLP at ITU! Mateusz’ project is on Deep Learning Generative Models for Content Structuring. He introduces himself:

I’m currently working as a Machine Learning Engineer at Tjek A/S (also known as ShopGun, eTilbudsavis & Mattilbud in other countries) and have just started my Industrial PhD.

I have previously worked as a software engineer at Intel, working on their Nervana ML project, as well as a business analyst at a number of other companies. I received my Master’s degree at the University of Warsaw, back in Poland where I’m originally from.

I’ll be working on generative approaches towards structuring product catalogs, such as the one you can see here:
https://bit.ly/2ZhRoY7

I really enjoy solving problems through code (particularly in Python), reading unusual books (e.g. Kim Stanley Robinson’s Years of Rice and Salt), rock climbing and Dungeons and Dragons.

If you’d like to check out some of my engineering projects, you can take a look at my github portfolio here:
https://github.com/mateuszjurewicz

I look forward to working with you 🙂 

Amrith Krishna joins NLP at ITU

We are delighted to welcome Amrith Krishna to NLP at ITU! Amrith introduces himself:

A Passepartout when it comes to my research interests. Broadly, I am interested in anything that comes under computational linguistics and Natural Language Processing. Specifically, my research interests lie in Morphology, Free word order languages, structured prediction, and program synthesis. My Ph.D. thesis titled, “Addressing Characteristics for Data-Driven Modelling of Lexical, Syntactic and Prosodic Tasks in Sanskrit”, was under the supervision of Prof Pawan Goyal at the Dept. of Computer Science and Engineering, IIT Kharagpur (under review). Currently, I work with Dr. Natalie Schluter, where we explore research at the intersection of formal languages, algorithms, and machine learning.

Amrith Krishna

ITU Copenhagen at ACL 2019, Florence

We’re glad to have the following papers at the Annual meeting of the Association for Computational Linguistics 2019 (ACL) in Florence:

  • Claudio Greco, Barbara Plank, Raquel Fernández, Raffaella Bernardi. Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering. In ACL 2019. Tuesday July 30, 15:03, Hall 4
  • Nils Rethmeier and Barbara Plank. MoRTy: Unsupervised Learning of Task-specialized Word Embeddings by Autoencoding. In RepL4NLP, ACL 2019 workshop. Friday August 2

Barbara Plank has also co-chaired the workshops for all ACL conferences this year: ACL, NAACL and EMNLP. Also, rumour has it that Natalie Schluter may be making a presentation during the final day’s closing talks. Enjoy Florence, and we hope to see you here!