ITU Copenhagen at ACL 2019, Florence

We’re glad to have the following papers at the Annual Meeting of the Association for Computational Linguistics 2019 (ACL) in Florence:

  • Claudio Greco, Barbara Plank, Raquel Fernández, Raffaella Bernardi. Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering. In ACL 2019. Tuesday July 30, 15:03, Hall 4
  • Nils Rethmeier and Barbara Plank. MoRTy: Unsupervised Learning of Task-specialized Word Embeddings by Autoencoding. In RepL4NLP, ACL 2019 workshop. Friday August 2

Barbara Plank has also co-chaired the workshops for all ACL conferences this year: ACL, NAACL and EMNLP. Rumour also has it that Natalie Schluter may be making a presentation during the final day’s closing talks. Enjoy Florence, and we hope to see you here!

Manuel Ciosici joins NLP at ITU

We are delighted to welcome Manuel Ciosici to NLP at ITU! Manuel has recently handed in his Ph.D. thesis at Aarhus University. During his studies, Manuel researched word representations and their role in Natural Language Processing. Word representation induction methods take in large corpora of natural language text and compute representations of words that computer algorithms can work with. He studied word representations based on word clusters and showed that they are highly effective at learning to represent syntactic information. Using word representations based on word vectors, he proposed a method for determining the meaning of abbreviations based on their use in sentences.

Manuel will be doing postdoc work with Leon Derczynski, researching deep learning approaches to multi-lingual stance detection for misinformation detection, as part of the internal MultiStance project funded by ITU Computer Science.

Manuel Ciosici

ITU Copenhagen at NODALIDA 2019, Turku

We are excited to have seven papers accepted at the Nordic Natural Language Processing conference, NODALIDA:

  • UniParse: A universal graph-based parsing toolkit
    Daniel Varab and Natalie Schluter (arXiv)
  • The Lacunae of Danish Natural Language Processing
    Andreas Kirkedal, Barbara Plank, Leon Derczynski and Natalie Schluter
  • Bornholmsk Natural Language Processing: Resources and Tools
    Leon Derczynski and Alex Speed Kjeldsen
  • Political Stance in Danish
    Rasmus Lehmann and Leon Derczynski
  • Cross-Lingual Transfer and Very Little Labeled Data for Named Entity Recognition in Danish
    Barbara Plank
  • Joint Rumour Stance and Veracity Prediction
    Emil Refsgaard Middelboe, Anders Edelbo Lillie and Leon Derczynski
  • Lexical Resources for Low-Resource PoS Tagging in Neural Times
    Sigrid Klerke and Barbara Plank

We hope to see you in Finland!

Natalie Schluter wins Carlsberg Foundation award

ITU’s Natalie Schluter has won a prestigious infrastructure award from the Carlsberg Foundation. The goal of the grant is to improve the level of language technology for Danish, affecting the digital lives of all Danish citizens. The project implements a bootstrapping methodology for high-speed, high-quality development of large-scale Danish language research resources: syntactic, semantic, and discursive. In doing so, it will quickly bring the Danish language into relevant realms of modern research, in particular Deep Learning research. The project is allocated 800,000 kr. and officially starts in Spring 2019; its outputs will be a basic foundation for future researchers and technology companies working with or in Danish.

The project title is Danish Language Inclusion: High-speed high-quality bootstrapping of large-scale research resources. The PI is Associate Professor Natalie Schluter, who also runs the Data Science program at ITU and is part of the WIDS Denmark conference leadership, also funded in 2019 by the Carlsberg Foundation.

One of the Jelling stones, containing key ancient Danish texts

Barbara Plank wins Amazon Research Award

Barbara Plank from ITU’s Natural Language Processing (NLP) research group has received a prestigious Amazon Research Award (ARA) for her work on multi-task deep learning for NLP under adverse conditions. Such adverse conditions range from learning for noisy domains to the extreme case of adaptation: learning new languages. The project will be carried out in collaboration with Amazon Alexa AI, Aachen. The aim of the project is to expedite bringing natural language understanding to dozens of domains and languages.

The ARA awards are granted to foster innovation and collaboration with major research institutions around the globe. The annual award offers up to $80,000 in funding to faculty members at academic institutions worldwide and $20,000 in Amazon Web Services credits to support research in a variety of Artificial Intelligence areas such as computer vision, natural language processing, robotics and security.

This year, 82 faculty around the globe received the award. In 2017, Amazon awarded 49 researchers, of whom only 5 were in Europe, from over 800 submissions.

ITU Copenhagen at NAACL 2019, Minnesota

We are excited to have papers accepted at the NAACL conference this year:

  • “Recurrent models and lower bounds for projective syntactic decoding” – Natalie Schluter
  • “A closer look at jointly learning to see, ask, and GuessWhat” – Ravi Shekhar, Aashish Venkatesh, Tim Baumgärtner, Elia Bruni, Barbara Plank, Raffaella Bernardi and Raquel Fernández
  • “Quantifying the morphosyntactic content of Brown Clusters” – Manuel Ciosici, Leon Derczynski and Ira Assent

Andreas Søeborg Kirkedal joins NLP at ITU

We are delighted to welcome Dr. Andreas Søeborg Kirkedal to NLP at ITU! Dr. Kirkedal will be collaborating with the group and working as a lecturer.

Andreas writes:

I am interested in speech recognition and related NLP tasks such as language modelling, intent classification, dialogue modelling and NLU.

I am an external lecturer at the IT University of Copenhagen, in the NLP research group, where I will give lectures on automatic speech technology and run a mini-project on language modelling. I am also a Sr. speech research scientist at Interactions LLC, where I create ASR models for a variety of uses and research transfer learning and multilingual ASR.

Former positions:

  • Independent researcher, consultant and CEO at Seacastle
  • Chief Science Officer, Corti ApS
  • Postdoctoral researcher in ASR, University of Copenhagen
  • Industrial PhD researcher, Mirsk Digital ApS
  • Scientific Assistant, Danish Centre for Applied Speech Technology and Centre for Research in Translation Technology at Copenhagen Business School
  • Student Researcher, Machine Translation group at German Research Centre for Artificial Intelligence, Saarbrücken, Germany

Education:

  • PhD in Automatic Speech Recognition
  • Erasmus Mundus Master programme in Language, Communication and Technology:
      • M.Sc. Language Science and Technology from Universität des Saarlandes in Saarbrücken
      • M.Sc. Cognitive Science and Applications from Université de Lorraine in Nancy
  • BA.ling.merc in English and communication from Copenhagen Business School

Public resources:
The result of my PhD thesis work and the basis for much of the work and research that came after. To my knowledge, it is still the only public ASR system for Danish complete with data and training recipe (not up to date with the latest developments of Kaldi).

https://github.com/kaldi-asr/kaldi/tree/master/egs/sprakbanken 

I helped a Swedish master’s student port the code to support the Swedish portion of the Språkbank corpus:
https://github.com/kaldi-asr/kaldi/tree/master/egs/sprakbanken_swe

ITU Copenhagen at EMNLP 2018, Brussels

Our group has had success at EMNLP 2018 (Empirical Methods in Natural Language Processing) to be held in Brussels; excellent news! See us here:

  • Dimensions of Variation in User-generated Text. Leon Derczynski
    Thursday Nov 1 at 9:05, keynote at the Workshop on Noisy, User-generated Text (WNUT).
  • Toward Universal Dependencies for Shipibo-Konibo. Alonso Vásquez, Renzo Ego Aguirre, Candy Angulo, John Miller, Claudia Villanueva, Željko Agić, Roberto Zariquiey and Arturo Oncevay.
    Thursday Nov 1 at 11:00–12:30, at the Workshop on Noisy, User-generated Text (WNUT).
  • Learning X^2 – Natural Language Processing Across Languages and Domains. Barbara Plank
    Thursday Nov 1 at 14:00, keynote at the Universal Dependencies Workshop (UDW).
  • Low-resource named entity recognition via multi-source projection: Not quite there yet? Jan Vium Enghoff, Søren Harrison and Željko Agić
    Thursday Nov 1 at the Workshop on Noisy, User-generated Text (WNUT); 14:45–15:15 Lightning Talks & 15:15–16:30 Poster Session
  • Session 1B, Semantics is chaired by Natalie Schluter
    Friday Nov 2, 11:00-12:30
  • Distant Supervision from Disparate Sources for Low-Resource Part-of-Speech Tagging. Barbara Plank and Željko Agić
    Friday Nov 2 at 13:45 (Multilingual Methods I session, first talk after lunch).
  • Session 6D, Multilingual Methods II is chaired by Barbara Plank
    Saturday Nov 3, 11:00-12:30
  • The glass ceiling in NLP. Natalie Schluter.
    Saturday Nov 3, 13:45; Session 7B Social Applications II
  • When data permutations are pathological: the case of neural natural language inference. Natalie Schluter and Daniel Varab.
    Sunday Nov 4, 13:45. Semantic Parsing and Semantic Inference posters

We hope to see you there!

Sigrid Klerke joins NLP at ITU

We are delighted to welcome Dr. Sigrid Klerke to NLP at ITU! Sigrid completed her PhD at Copenhagen University and has since worked in industry. Dr. Klerke was a fundamental part of EyeJustRead, who develop an advanced eye-tracking tool that supports early reading by enabling students and teachers to better understand and work with each student’s specific challenges.

Her research focuses on finding ways to make good use of old and new language technologies for teaching the finer details of speaking a non-native language. Sigrid will be doing postdoc work with Barbara Plank.

ITU Copenhagen at NAACL 2018, New Orleans

We’re excited to be part of NAACL this year, in New Orleans. This conference is about using computational techniques to analyse and process natural language: a powerful way of understanding how language works, and one of the hardest AI challenges. You can find us in many places throughout the event.

At the Widening NLP workshop (WiNLP), Natalie Schluter will give an invited talk, “The glass ceiling in NLP”, on Friday June 1st at 11am in Strand 12.

Barbara Plank will also be at the Widening NLP workshop, serving on the career panel. Friday June 1st in Strand 12, at 14.30.

The session on Phonology, Morphology and Word Segmentation (1) will be chaired by Barbara Plank. Empire B, 10.30-11.30, on Saturday, June 2nd.

On Saturday June 2nd, Natalie Schluter presents “The Word Analogy Testing Caveat”, identifying problems with analogy testing methods common in our field, and proposing solutions. Elite Hall B, 15.30-17.00.

Barbara Plank will be on the Ethics in NLP panel in the Industry track; this is from 15.30-16.50, on Sunday June 3rd, in Empire D.

DuoLingo is a powerful education tool. Barbara Plank worked with Sigrid Klerke and Héctor Martínez Alonso on data from this source, as part of a shared task in the Building Education Applications workshop (BEA), which will be presented on Tuesday, June 5th at 14.00 in Strand 12A. The paper is “Grotoco@SLAM: Second Language Acquisition Modeling with Simple Features, Learners and Task-wise Models”.

The Style-Var workshop on Tuesday June 5th covers stylistic variation. The closing invited talk is by Barbara Plank, “Author profiling from text and beyond”, in Bolden 3, at 16.00.

The PEOPLES workshop covers computational modeling of people’s opinions, personality, and emotions in social media. Barbara Plank is chairing this workshop, on Wednesday June 6th in Strand 11. 

Sub-word modelling is an effective tool that we’re almost all using. Barbara Plank published some great work on using this, and will give an invited talk at the SCLeM workshop on Wednesday June 6th at 11.00am: “Not All That Glitters is Gold” (Bolden 6).

Enjoy the conference!