Jenna Kanerva
jmnybl@utu.fi | ORCID identifier: https://orcid.org/0000-0003-4580-5366
language technology, natural language processing, machine learning, corpus annotation
I am a doctoral researcher at the Department of Computing, University of Turku, where I work in the TurkuNLP research group on language technology and natural language processing (NLP). I received my Master of Science degree from the University of Turku in 2014, majoring in computer science.
My PhD research is in language technology, with a particular interest in machine learning methods for processing Finnish. I also greatly enjoy and respect foundational corpus work, having taken part in the data collection and annotation of several Finnish language resources built at the TurkuNLP group. Once these foundational resources are in place, the datasets are used to develop language processing tools based on the latest machine learning methods.
Since 2014, I have been the responsible or co-responsible teacher for the Introduction to Language Technology course, taught annually at the University of Turku. In addition, I have lectured or co-lectured several other language technology courses and lectures at the University of Turku, and I have been invited to lecture as a part-time teacher at the Arcada University of Applied Sciences and the University of Tampere (Pori unit). To develop as a teacher, I completed a 25 ECTS study module in university pedagogy in 2019-2021.
- Semantic search as extractive paraphrase span detection (2024)
  Language Resources and Evaluation
  (Refereed journal article or data article (A1))
- Understanding the structure and meaning of Finnish texts: From corpus creation to deep language modelling (2024)
  Kanerva Jenna
  (Doctoral dissertation (article) (G5))
- FinGPT: Large Generative Models for a Small Language (2023)
  Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
  Luukkonen Risto, Komulainen Ville, Luoma Jouni, Eskelinen Anni, Kanerva Jenna, Kupari Hanna-Mari, Ginter Filip, Laippala Veronika, Muennighoff Niklas, Piktus Aleksandra, Wang Thomas, Tazi Nouamane, Scao Le Teven, Wolf Thomas, Suominen Osma, Sairanen Samuli, Merioksa Mikko, Heinonen Jyrki, Vahtola Aija, Antao Samuel, Pyysalo Sampo
  (Refereed article in conference proceedings (A4))
- Towards diverse and contextually anchored paraphrase modeling: A dataset and baselines for Finnish (2023)
  Natural Language Engineering
  (Refereed journal article or data article (A1))
- Deep Learning and Film History: Model Explanation Techniques in the Analysis of Temporality in Finnish Fiction Film Metadata (2022)
  CEUR Workshop Proceedings
  (Refereed article in conference proceedings (A4))
- GEMv2: Multilingual NLG Benchmarking in a Single Line of Code (2022)
  Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
  Gehrmann Sebastian, Bhattacharjee Abhik, Mahendiran Abinaya, Wang Alex, Papangelis Alexandros, Madaan Aman, McMillan-Major Angelina, Shvets Anna, Upadhyay Ashish, Bohnet Bernd, Yao Bingsheng, Wilie Bryan, Bhagavatula Chandra, You Chaobin, Thomson Craig, Garbacea Cristina, Wang Dakuo, Deutsch Daniel, Xiong Deyi, Jin Di, Gkatzia Dimitra, Radev Dragomir, Clark Elizabeth, Durmus Esin, Ladhak Faisal, Ginter Filip, Winata Genta Indra, Strobelt Hendrik, Hayashi Hiroaki, Novikova Jekaterina, Kanerva Jenna, Chim Jenny, Zhou Jiawei, Clive Jordan, Maynez Joshua, Sedoc João, Juraska Juraj, Dhole Kaustubh, Chandu Khyathi Raghavi, Perez-Beltrachini Laura, Ribeiro Leonardo F.R., Tunstall Lewis, Zhang Li, Pushkarna Mahima, Creutz Mathias, White Michael, Kale Mihir Sanjay, Eddine Moussa Kamal, Daheim Nico, Subramani Nishant, Dusek Ondrej, Liang Paul Pu, Ammanamanchi Pawan Sasanka, Zhu Qi, Puduppully Ratish, Kriz Reno, Shahriyar Rifat, Cardenas Ronald, Mahamood Saad, Osei Salomey, Cahyawijaya Samuel, Štajner Sanja, Montella Sebastien, Jolly Shailza, Mille Simon, Hasan Tahmid, Shen Tianhao, Adewumi Tosin, Raunak Vikas, Raheja Vipul, Nikolaev Vitaly, Tsai Vivian, Jernite Yacine, Xu Ying, Sang Yisi, Liu Yixin, Hou Yufang
  (Refereed article in conference proceedings (A4))
- Out-of-Domain Evaluation of Finnish Dependency Parsing (2022)
  LREC Proceedings
  (Refereed article in conference proceedings (A4))
- Paimen, piika ja emäntä. Arvot ja ammatit suomalaisessa näytelmäelokuvassa 1907–2017 (2022)
  Lähikuva
  (Refereed journal article or data article (A1))
- Textual Paraphrase Dataset for Deep Language Modelling (2022)
  European Language Grid: A Language Technology Platform for Multilingual Europe
  Kanerva Jenna, Ginter Filip, Chang Li-Hsin, Skantsi Valtteri, Kilpeläinen Jemina, Kupari Hanna-Mari, Piirto Aurora, Saarni Jenna, Sevón Maija, Tarkka Otto
  (Refereed article in compilation book (A3))
- Towards Automatic Short Answer Assessment for Finnish as a Paraphrase Retrieval Task (2022)
  Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)
  Chang Li-Hsin, Kanerva Jenna, Ginter Filip
  (Refereed article in conference proceedings (A4))
- Finnish Paraphrase Corpus (2021)
  Linköping Electronic Conference Proceedings
  (Refereed article in conference proceedings (A4))
- Quantitative Evaluation of Alternative Translations in a Corpus of Highly Dissimilar Finnish Paraphrases (2021)
  Proceedings for the First Workshop on Modelling Translation: Translatology in the Digital Age
  Chang Li-Hsin, Pyysalo Sampo, Kanerva Jenna, Ginter Filip
  (Refereed article in conference proceedings (A4))
- Universal Lemmatizer: A sequence-to-sequence model for lemmatizing Universal Dependencies treebanks (2021)
  Natural Language Engineering
  (Refereed journal article or data article (A1))
- WikiBERT Models: Deep Transfer Learning for Many Languages (2021)
  Linköping Electronic Conference Proceedings
  (Refereed article in conference proceedings (A4))
- Dependency parsing of biomedical text with BERT (2020)
  BMC Bioinformatics
  (Refereed journal article or data article (A1))
- The FISKMO project: Resources and tools for Finnish-Swedish machine translation and cross-linguistic research (2020)
  Proceedings of the 12th Language Resources and Evaluation Conference
  Jörg Tiedemann, Tommi Nieminen, Mikko Aulamo, Jenna Kanerva, Akseli Leino, Filip Ginter, Niko Papula
  (Refereed article in conference proceedings (A4))
- Turku Enhanced Parser Pipeline: From Raw Text to Enhanced Graphs in the IWPT 2020 Shared Task (2020)
  Annual Meeting of the Association for Computational Linguistics
  (Refereed article in conference proceedings (A4))
- Is Multilingual BERT Fluent in Language Generation? (2019)
  Linköping Electronic Conference Proceedings
  (Refereed article in conference proceedings (A4))
- Neural Dependency Parsing of Biomedical Text: TurkuNLP entry in the CRAFT Structural Annotation Task (2019)
  Proceedings of the 5th Workshop on BioNLP Open Shared Tasks
  Thang Minh Ngo, Jenna Kanerva, Filip Ginter, Sampo Pyysalo
  (Refereed article in conference proceedings (A4))
- Parse me if you can: Artificial treebanks for parsing experiments on elliptical constructions (2019)
  LREC 2018 - 11th International Conference on Language Resources and Evaluation
  Droganova K., Zeman D., Kanerva J., Ginter F.
  (Refereed article in conference proceedings (A4))