Resources

Code

For all our research code, see our ITU NLP GitHub repository.

Danish pipeline

With tokenization, part-of-speech tagging and parsing:

For automatic tokenization, part-of-speech annotation, and dependency parsing of Danish text.

Danish named entity recognition

For location, person, and organization names:

For automatic recognition of location, person, and organization names in Danish text.

Danish word representations

  • dansk-brown.tar.bz2: Brown clusters induced on Danish text from Wikipedia and Common Crawl (input length |S| = 134M tokens; window a = 5000; vocabulary |V| = 778K word types). These are generalised Brown clusters, so a clustering of any size can be generated instantly from the download (see the README).
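As a rough illustration of why a hierarchical clustering can serve several granularities at once, the sketch below groups words by a prefix of their bit-string path, which is a common way of using classic Brown clusters. The toy paths and the `coarsen` helper are hypothetical and are not taken from the download. The generalised clusters distributed here have their own procedure and file layout, which the README documents.

```python
from collections import defaultdict

# Hypothetical toy data: word -> bit-string path in a Brown cluster hierarchy.
# The real paths and file layout are described in the download's README.
paths = {
    "københavn": "0010",
    "aarhus": "0011",
    "hund": "1100",
    "kat": "1101",
    "løber": "0111",
}


def coarsen(paths, prefix_len):
    """Group words by the first `prefix_len` bits of their path.

    Shorter prefixes give fewer, broader clusters; longer prefixes give
    more, finer clusters.
    """
    clusters = defaultdict(list)
    for word, bits in paths.items():
        clusters[bits[:prefix_len]].append(word)
    return {prefix: sorted(words) for prefix, words in clusters.items()}


coarse = coarsen(paths, 2)  # broad clusters
fine = coarsen(paths, 4)    # one cluster per distinct full path

for prefix, words in sorted(coarse.items()):
    print(prefix, words)
```

Using several prefix lengths side by side, for example as features in a tagger, lets a model back off from fine to broad word classes when data is sparse.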

Bornholmsk resources

To work with Bornholmsk:

For working in Bornholmsk.

Dansk NLP mailing list

https://mailman.itu.dk/mailman/listinfo/dansknlp

 

Our research papers on the Danish language

The Lacunae of Danish Natural Language Processing. Andreas Kirkedal | Barbara Plank | Leon Derczynski | Natalie Schluter. Proceedings of the 22nd Nordic Conference on Computational Linguistics

Neural Cross-Lingual Transfer and Limited Annotated Data for Named Entity Recognition in Danish. Barbara Plank. Proceedings of the 22nd Nordic Conference on Computational Linguistics

Political Stance in Danish. Rasmus Lehmann | Leon Derczynski. Proceedings of the 22nd Nordic Conference on Computational Linguistics

Joint Rumour Stance and Veracity Prediction. Anders Edelbo Lillie | Emil Refsgaard Middelboe | Leon Derczynski. Proceedings of the 22nd Nordic Conference on Computational Linguistics

Bornholmsk Natural Language Processing: Resources and Tools. Leon Derczynski | Alex Speed Kjeldsen. Proceedings of the 22nd Nordic Conference on Computational Linguistics

DKIE: Open Source Information Extraction for Danish. Leon Derczynski | Camilla Vilhelmsen | Kenneth S. Bøgh. Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics

Simple Natural Language Processing Tools for Danish. Leon Derczynski. arXiv preprint arXiv:1906.11608

Misinformation on Twitter During the Danish National Election: A Case Study. Leon Derczynski | Torben Oskar Albert-Lindqvist | Marius Venø Bendsen | Nanna Inie | Viktor Due Pedersen | Jens Egholm Pedersen. Proceedings of the Conference for Truth and Trust Online (TTO)