Daniel Varab joins NLP at ITU

We are delighted to welcome Daniel Varab back to NLP at ITU! Daniel introduces himself:

“I come from a traditional computer science background and somewhat by chance ended up writing my thesis, titled ‘Contradiction Detection in Natural Languages’, on natural language inference (NLI). This sparked my interest in NLP, and all my work since has revolved around the field. I am now two years out from graduation: I spent a year as a research assistant exploring NLI and graph-based dependency parsing, followed by a year at the Danish/Swedish company Karnov Group, where I worked on helping lawyers navigate the ever-growing pile of legislation using NLP techniques. I am now excited to be heading back into academia, where I will be working on text summarization together with Natalie Schluter and supporting courses in ITU’s data science bachelor’s degree.

With regard to interests, I genuinely enjoy work on simple models with well-founded inductive biases, work on so-called less-privileged languages, and good old thorough research.”

Daniel Varab

Mateusz Jurewicz joins NLP at ITU

We are delighted to welcome Mateusz Jurewicz to NLP at ITU! Mateusz’s project is on Deep Learning Generative Models for Content Structuring. He introduces himself:

I’m currently working as a Machine Learning Engineer at Tjek A/S (also known as ShopGun, eTilbudsavis and Mattilbud in other countries) and have just started my Industrial PhD.

I previously worked as a software engineer at Intel on their Nervana ML project, as well as a business analyst at a number of other companies. I received my Master’s degree from the University of Warsaw in Poland, where I’m originally from.

I’ll be working on generative approaches towards structuring product catalogs, such as the one you can see here:

I really enjoy solving problems through code (particularly in Python), reading unusual books (e.g. Kim Stanley Robinson’s The Years of Rice and Salt), rock climbing, and Dungeons & Dragons.

If you’d like to check out some of my engineering projects, you can take a look at my GitHub portfolio here:

I look forward to working with you 🙂 

Amrith Krishna joins NLP at ITU

We are delighted to welcome Amrith Krishna to NLP at ITU! Amrith introduces himself:

I am a Passepartout when it comes to my research interests: broadly, I am interested in anything under computational linguistics and Natural Language Processing. Specifically, my research interests lie in morphology, free-word-order languages, structured prediction, and program synthesis. My Ph.D. thesis, titled “Addressing Characteristics for Data-Driven Modelling of Lexical, Syntactic and Prosodic Tasks in Sanskrit” (under review), was supervised by Prof. Pawan Goyal at the Dept. of Computer Science and Engineering, IIT Kharagpur. Currently, I work with Dr. Natalie Schluter, where we explore research at the intersection of formal languages, algorithms, and machine learning.

Amrith Krishna

ITU Copenhagen at ACL 2019, Florence

We’re glad to have the following papers at the Annual Meeting of the Association for Computational Linguistics (ACL 2019) in Florence:

  • Claudio Greco, Barbara Plank, Raquel Fernández, Raffaella Bernardi. Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering. In ACL 2019. Tuesday July 30, 15:03, Hall 4
  • Nils Rethmeier and Barbara Plank. MoRTy: Unsupervised Learning of Task-specialized Word Embeddings by Autoencoding. In RepL4NLP, ACL 2019 workshop. Friday August 2

Barbara Plank has also co-chaired the entire set of workshops at this year’s ACL conferences, including ACL as well as NAACL and EMNLP. Also, rumour has it that Natalie Schluter may be giving a presentation during the final day’s closing talks. Enjoy Florence, and we hope to see you there!

Manuel Ciosici joins NLP at ITU

We are delighted to welcome Manuel Ciosici to NLP at ITU! Manuel has recently handed in his Ph.D. thesis at Aarhus University. During his studies, Manuel researched word representations and their role in Natural Language Processing. Word representation induction methods take in large corpora of natural language text and compute representations of words that computer algorithms can work with. He studied word representations based on word clusters and showed that they are highly effective at representing syntactic information. Using word representations based on word vectors, he proposed a method for determining the meaning of abbreviations from their use in sentences.
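To give a flavour of what word representation induction means in practice, here is a toy count-based sketch (the corpus, window size, and similarity measure are illustrative only, not Manuel’s actual methods): words that appear in similar contexts end up with similar vectors.

```python
from collections import Counter, defaultdict

# Toy corpus; real induction methods use very large corpora.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
]

# Count co-occurrences within a symmetric window of one word.
cooc = defaultdict(Counter)
for sent in corpus:
    for i, word in enumerate(sent):
        for j in range(max(0, i - 1), min(len(sent), i + 2)):
            if j != i:
                cooc[word][sent[j]] += 1

# Each word's vector is its row of co-occurrence counts over the vocabulary.
vocab = sorted({w for sent in corpus for w in sent})

def vector(word):
    return [cooc[word][context] for context in vocab]

# "cat" and "dog" occur in the same contexts ("the _ sat"), so their
# vectors overlap; this overlap is what algorithms can exploit.
similarity = sum(a * b for a, b in zip(vector("cat"), vector("dog")))
```

Modern word vectors are induced with neural methods rather than raw counts, but the underlying intuition, representing a word by the company it keeps, is the same.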

Manuel will be doing postdoc work with Leon Derczynski, researching deep learning approaches to multi-lingual stance detection for misinformation detection, as part of the internal MultiStance project funded by ITU Computer Science.

Manuel Ciosici

ITU Copenhagen at NODALIDA 2019, Turku

We are excited to have seven papers accepted at the Nordic Conference on Computational Linguistics, NODALIDA:

  • UniParse: A universal graph-based parsing toolkit
    Daniel Varab and Natalie Schluter (arXiv)
  • The Lacunae of Danish Natural Language Processing
    Andreas Kirkedal, Barbara Plank, Leon Derczynski and Natalie Schluter
  • Bornholmsk Natural Language Processing: Resources and Tools
    Leon Derczynski and Alex Speed Kjeldsen
  • Political Stance in Danish
    Rasmus Lehmann and Leon Derczynski
  • Cross-Lingual Transfer and Very Little Labeled Data for Named Entity Recognition in Danish
    Barbara Plank
  • Joint Rumour Stance and Veracity Prediction
    Emil Refsgaard Middelboe, Anders Edelbo Lillie and Leon Derczynski
  • Lexical Resources for Low-Resource PoS Tagging in Neural Times
    Sigrid Klerke and Barbara Plank

We hope to see you in Finland!

Natalie Schluter wins Carlsberg Foundation award

ITU’s Natalie Schluter has won a prestigious infrastructure award from the Carlsberg Foundation. The goal of the grant is to improve the level of language technology for Danish, affecting the digital lives of all Danish citizens. The project implements a bootstrapping methodology for high-speed, high-quality development of large-scale Danish language research resources: syntactic, semantic, and discursive. In doing so, it will quickly bring the Danish language into relevant realms of modern research, in particular Deep Learning research. The project is allocated 800,000 kr. and officially starts in Spring 2019; its outputs will form a basic foundation for future researchers and technology companies working with or in Danish.

The project title is Danish Language Inclusion: High-speed high-quality bootstrapping of large-scale research resources. The PI is Associate Professor Natalie Schluter, who also runs the Data Science program at ITU and is part of the WiDS Denmark conference leadership, also funded in 2019 by the Carlsberg Foundation.

One of the Jelling stones, containing key ancient Danish texts

Barbara Plank wins Amazon Research Award

Barbara Plank from ITU’s Natural Language Processing (NLP) research group has received a prestigious Amazon Research Award (ARA) for her work on multi-task deep learning for NLP under adverse conditions. Such adverse conditions include learning from noisy domains, up to the extreme case of adapting to new languages. The project will be carried out in collaboration with Amazon Alexa AI, Aachen, and aims to extend natural language understanding to dozens of domains and languages.

The ARA awards are granted to foster innovation and collaboration with major research institutions around the globe. The annual award offers up to $80,000 in funding to faculty members at academic institutions worldwide and $20,000 in Amazon Web Service credits to support research in a variety of Artificial Intelligence areas such as computer vision, natural language processing, robotics and security. 

This year, 82 faculty around the globe received the award. In 2017, Amazon granted awards to 49 researchers out of over 800 submissions, of which only 5 were in Europe.

ITU Copenhagen at NAACL 2019, Minnesota

We are excited to have papers accepted at the NAACL conference this year:

  • “Recurrent models and lower bounds for projective syntactic decoding” – Natalie Schluter
  • “A closer look at jointly learning to see, ask, and GuessWhat” – Ravi Shekhar, Aashish Venkatesh, Tim Baumgärtner, Elia Bruni, Barbara Plank, Raffaella Bernardi and Raquel Fernández
  • “Quantifying the morphosyntactic content of Brown Clusters” – Manuel Ciosici, Leon Derczynski and Ira Assent

Andreas Søeborg Kirkedal joins NLP at ITU

We are delighted to welcome Dr. Andreas Søeborg Kirkedal to NLP at ITU! Dr. Kirkedal will be collaborating with the group and working as a lecturer.

Andreas writes:

I am interested in speech recognition and related NLP tasks such as language modelling, intent classification, dialogue modelling and NLU.

I am an external lecturer at the IT University of Copenhagen, in the NLP research group, where I will give lectures on automatic speech technology and run a mini-project on language modelling. I am also a senior speech research scientist at Interactions LLC, where I create ASR models for a variety of uses and research transfer learning and multilingual ASR.

Former positions:

  • Independent researcher, consultant and CEO at Seacastle
  • Chief Science Officer, Corti ApS
  • Postdoctoral researcher in ASR, University of Copenhagen
  • Industrial PhD researcher, Mirsk Digital ApS
  • Scientific Assistant, Danish Centre for Applied Speech Technology and Centre for Research in Translation Technology at Copenhagen Business School
  • Student Researcher, Machine Translation group at German Research Centre for Artificial Intelligence, Saarbrücken, Germany

Education:
  • PhD in Automatic Speech Recognition
  • Erasmus Mundus Master programme in Language, Communication and Technology:
  • M.Sc. Language Science and Technology from Universität des Saarlandes in Saarbrücken
  • M.Sc. Cognitive Science and Applications from Université de Lorraine in Nancy
  • BA.ling.merc in English and communication from Copenhagen Business School

Public resources:
The result of my PhD thesis work and the basis for much of the work and research that came after. To my knowledge, it is still the only public ASR system for Danish complete with data and a training recipe (though not up to date with the latest developments in Kaldi).


I helped a Swedish master’s student port the code to support the Swedish portion of the Språkbank corpus: