Dr. Branislava Šandrih Todorović

Building language technology for the under-resourced.

Computational linguist and data scientist. Ten years building NLP tools, corpora, and models for Serbian, Slovenian, and other South Slavic languages — bridging rigorous linguistic methodology with modern deep learning.

01 / About

A computational linguist at the intersection of research, industry & education.

Dr. Branislava Šandrih Todorović is a Senior Data Scientist at NLB DigIT, working as an NLP expert in close collaboration with the Center of Excellence within NLB Bank in Ljubljana. She received her PhD from the University of Belgrade, Faculty of Mathematics, in 2020, with the thesis "Impact of Text Classification on Natural Language Processing Applications".

Her fields of research are machine learning and deep learning applied to the development of tools, resources, and models for the Serbian and Slovenian languages — from corpus construction and pre-training BERT and GPT-style models, to named entity recognition, terminology extraction, sentiment analysis, and authorship identification. She has published more than 30 papers in journals and proceedings of scientific conferences.

Alongside her research and industry work, Branislava is the founder of AIkademija — an initiative dedicated to making artificial intelligence accessible through education, public lectures, and community programs.

Until 2022, Branislava was a visiting researcher at the Research Group in Computational Linguistics at the University of Wolverhampton, Editorial Administrator for the journal Natural Language Engineering, and Editor-in-Chief of Infotheca, the Journal for Digital Humanities. She is a member of JeRTeh, the Society for Language Resources and Technologies.

Branislava has developed several NLP tools and built international collaborations through European projects (COST actions CA18231, CA16204, CA16105). During her PhD studies, she attended three summer schools on NLP and machine learning: LxMLS 2018, ESSLLI 2018, and DLinNLP 2019.

In 2021, she received the Annual Award of the Mathematical Institute of the Serbian Academy of Sciences and Arts in the field of computing for PhD students.

02 / Education

A long path through mathematics and language.

2015 – 2020
Faculty of Mathematics, University of Belgrade · summa cum laude
2014 – 2015
Faculty of Mathematics, University of Belgrade · magna cum laude, 9.81 / 10
2010 – 2014
Faculty of Mathematics, University of Belgrade · magna cum laude, 9.48 / 10
2006 – 2010
School of Electrotechnics "Nikola Tesla", Pančevo

Summer Schools & Seminars

2019
Summer School on Deep Learning in Natural Language Processing
2018
30th European Summer School in Logic, Language and Information
2018
8th Lisbon Machine Learning School

03 / Publications

30+ peer-reviewed works.

Spanning journals, conference proceedings, books, and a PhD thesis — with focus on Serbian & Slovenian language resources, NER, terminology, and text classification.

Articles in Journals

2022
Erkut Erdem et al. (incl. Branislava Šandrih). Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning. Journal of Artificial Intelligence Research, 73: 1131–1207.
2022
Ranka Stanković, Mihailo Škorić, and Branislava Šandrih Todorović. Parallel Bidirectionally Pretrained Taggers as Feature Generators. Applied Sciences, 12(10): 5028.
2022
Tanja Ivanović, Ranka Stanković, Branislava Šandrih Todorović, and Cvetana Krstev. Corpus-Based Bilingual Terminology Extraction in Power Engineering Domain. Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication.
2021
Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, and Mihailo Škorić. Annotation of the Serbian ELTeC Collection. Infotheca, 21(2): 43–59.
2021
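Danilo Aleksić and Branislava Šandrih. Аутоматска ексцерпција парова речи за учење изговора у настави српског као страног језика [Automatic Extraction of Word Pairs for Pronunciation Learning in Teaching Serbian as a Foreign Language]. Srpski jezik, 26(1): 567–584.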
2020
Branislava Šandrih, Cvetana Krstev, and Ranka Stanković. Two Approaches to Compilation of Bilingual Multi-Word Terminology Lists from Lexical Resources. Natural Language Engineering, 26(4): 455–479.
2019
Branislava Šandrih and Ranka Stanković. Extraction of Bilingual Terminology using Graphs, Dictionaries and GIZA++. Infotheca, 19(2).
2019
Jelena Andonovski, Branislava Šandrih, and Olivera Kitanović. Bilingual Lexical Extraction Based on Word Alignment for Improving Corpus Search. The Electronic Library, 37(2).
2019
Branislava Šandrih. SMS Sentiment Classification based on Lexical Features, Emoticons and Informal Abbreviations. Serdica Journal of Computing, 13(1-2).
2018
Branislava Šandrih. Informatics for Library and Information Science students with special focus on Python. Infotheca, 18(1): 63–77.
2017
Branislava Šandrih, Dušan Tošić, and Vladimir Filipović. Towards Efficient and Unified XML/JSON Conversion: a New Conversion Method. Transactions on Internet Research, 13(1): 58–64.
2017
Branislava Šandrih, Vladimir Filipović, Saša Malkov, and Aleksandar Kartelj. Distributed Computing Among Independent Web Browsers Applied to Text and Image Processing. Review of the National Center for Digitization, 31: 30–39.

PhD Thesis

2020
Branislava Šandrih. Impact of Text Classification on Natural Language Processing Applications. PhD thesis, University of Belgrade, Faculty of Mathematics.

Conference Proceedings

2024
Danka Jokić, Ranka Stanković, and Branislava Šandrih Todorović. Abusive Speech Detection in Serbian using Machine Learning. NLP & AI for Cyber Security, Lancaster.
2023
Cvetana Krstev, Ranka Stanković, Branislava Šandrih Todorović, and Milica Ikonić Nešić. Нове технологије за оживљавање старих текстова [New Technologies for Reviving Old Texts]. Digital Humanities & Slavic Cultural Heritage II, pp. 79–96.
2023
Branislava Šandrih Todorović, Katarina Josipović, and Jurij Kodre. Three Approaches to Client Email Topic Classification. RANLP 2023.
2022
Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Duško Vitas, Mihailo Škorić, and Milica Ikonić Nešić. Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection. LREC 2022, Marseille.
2021
Rina Zviel-Girshin et al. (incl. Branislava Šandrih Todorović). Developing Pedagogically Appropriate Language Corpora through Crowdsourcing and Gamification. EUROCALL 2021.
2021
Danka Jokić, Ranka Stanković, Cvetana Krstev, and Branislava Šandrih. A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian. LDK 2021, Dagstuhl.
2021
Branislava Šandrih Todorović, Cvetana Krstev, Ranka Stanković, and Milica Ikonić Nešić. Serbian NER&Beyond: The Archaic and the Modern Intertwined. RANLP 2021.
2020
Ranka Stanković, Branislava Šandrih, Cvetana Krstev, Miloš Utvić, and Mihailo Škorić. Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian. LREC 2020, Marseille.
2019
Cvetana Krstev, Jelena Jaćimović, Branislava Šandrih, and Ranka Stanković. Analysis of the First Serbian Literature Corpus of the Late 19th and Early 20th Century with the TXM Platform. DH_Budapest 2019.
2019
Ranka Stanković, Branislava Šandrih, Rada Stijović, Cvetana Krstev, Duško Vitas, and Aleksandra Marković. SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian. eLex 2019, Sintra.
2019
Branislava Šandrih, Cvetana Krstev, and Ranka Stanković. Development and Evaluation of Three Named Entity Recognition Systems for Serbian: The Case of Personal Names. RANLP 2019.
2019
Branislava Šandrih, Ranka Stanković, and Mirjana Gočanin. Чији је пример? Анализа лексичких обележја на примерима Речника САНУ [Whose Example Is It? An Analysis of Lexical Features in Examples from the SASA Dictionary]. Međunarodni slavistički centar, 48(3): 299–316.
2018
Branislava Šandrih. Fingerprints in SMS Messages: Automatic Recognition of a Short Message Sender Using Gradient Boosting. CLIB 2018, Sofia.
2018
Branislava Šandrih and Duško Vitas. Квантитативни преглед језика кратких порука [A Quantitative Overview of the Language of Short Messages]. Međunarodni slavistički centar, 47(3): 155–165.
2018
Cvetana Krstev, Branislava Šandrih, Ranka Stanković, and Miljana Mladenović. Using English Baits to Catch Serbian Multi-Word Terminology. LREC 2018, Paris.
2017
Cvetana Krstev, Duško Vitas, Miloš Utvić, and Branislava Šandrih. The New Clothes for an Old Cookbook. LTC 2017, Poznań.
2017
Ranka Stanković, Branislava Šandrih, Olivera Kitanović, Ivan Obradović, and Miloš Manić. An E-Learning Approach to Social Sciences. eLearning 2017, Belgrade.
2016
Branislava Šandrih. Mogući doprinos LaTeX-a u obrazovnom procesu [The Possible Contribution of LaTeX to the Educational Process]. Slobodan softver u obrazovanju, Sremski Karlovci.

Proceedings & Books (Editor)

2019
Venelin Kovatchev, Irina P. Temnikova, Branislava Šandrih, Ivelina Nikolova (editors). Proceedings of the Student Research Workshop Associated with RANLP 2019.

Abstracts in Books of Abstracts

2020
Tanara Zingano Kuhn, Branislava Šandrih Todorović, et al. Crowdsourcing Pedagogical Corpora for Lexicographical Purposes. EURALEX XIX.
2021
Tanara Zingano Kuhn, Rina Zviel-Girshin, Špela Arhar Holdt, Branislava Šandrih Todorović, et al. Gamifying the Path to Corpus-Based Pedagogical Dictionaries. eLex 2021.
2019
Peter Dekker, Tanara Zingano Kuhn, Branislava Šandrih, and Rina Zviel-Girshin. Corpus Cleaning via Crowdsourcing for Developing a Learner's Dictionary. eLex 2019.
2019
Tanara Zingano Kuhn, Peter Dekker, Branislava Šandrih, Rina Zviel-Girshin, Špela Arhar Holdt, and Tanneke Schoonheim. Crowdsourcing Corpus Cleaning for Language Learning Resource Development. EUROCALL 2019, p. 159.
2018
Branislava Šandrih. SMS Sentiment Classification based on Emoticons, Informal Abbreviations and other Text Features. QUALICO 2018, p. 73.
2017
Branislava Šandrih, Vladimir Filipović, Saša Malkov, and Aleksandar Kartelj. Globalna izračunavanja u mreži internet pregledača – primena u obradi slika [Global Computing in a Network of Internet Browsers: Application to Image Processing]. NCD 2017, Belgrade.

04 / Lab & Initiatives

Tools, resources & AI education.

Publicly available NLP toolkits and language resources, mostly for Serbian — alongside AIkademija, an initiative bringing artificial intelligence into public conversation.

European Research Projects

05 / Work

A decade across research and industry.

Founder — AIkademija
2025 — present

Public AI literacy initiative — lectures, workshops, and community programs introducing artificial intelligence to general audiences. Includes a lecture series at the Pančevo City Library and ongoing public-engagement activities.

Senior Data Scientist · Team Lead
2022 — present
NLB DigIT, Belgrade — closely collaborating with the Center of Excellence within NLB Bank, Ljubljana

NLP solutions for the bank: sentiment, topic, and priority classification for client emails (Slovenian); voice transcription & sentiment; RAG-powered knowledge bases; BERT-based QA bots; LLM-based zero-shot/few-shot/RAG evaluation pipelines; contract classification; market-risk early-warning system. Led a 10+ member data science team (2023–2025).

Assistant Professor — Faculty of Philology
2016 — 2022
Courses taught
  • Informatics for Librarians (BSc)
  • Practicum of Informatics (BSc)
  • Digital Text (BSc)
  • Structure of Information (BSc)
  • Language Technologies (BSc)
  • Multimedia Documents (BSc)
  • Information Retrieval (BSc)
  • Advanced Methods in IR (MSc)
  • Advanced Language Technologies (MSc)
  • Structuring & Management of Web Content (MSc)
Multidisciplinary Studies — University of Belgrade
2017 — 2022
  • Programming for Linguists (MSc)
  • Introduction to Cognitive Linguistics (MSc)
  • Natural Language Processing (PhD)
  • Machine Learning (PhD)
Junior Lecturer — Faculty of Mathematics
2015 — 2016
  • Programming (BSc)
  • Introduction to Computer Architecture (BSc)
  • Object-Oriented Programming (BSc)

Software Development

06 / Mentorship

Students I've guided.

Master Students

PhD Students

07 / Service

Awards, editorial roles, and program committees.

Awards & Internships

Invited Seminars

Program Committees & Reviewing

Editorial & Conference Organization

Memberships

Languages

08 / Personal

Beyond the research.

In August 2021, Branislava welcomed her first son into the world — Mihailo (Miki).

In June 2024, her family grew again with the arrival of Danilo (Daki).

🌻 ❤️ 🌻