Tools

1. Advanced Interactive Sinhala Dictionary

2. Tamil POS Tagger is a deep-learning-based POS tagger developed using the Stanza framework and trained on 11K POS-tagged sentences along with Facebook's fastText model.

Tamil POS Tagger

3. ThamizhiMorph is a morphological analyser and generator developed using the Finite-State Transducer approach. The tool accepts text, in the form of either a word or a sentence, and provides the analysis.

ThamizhiMorph

4. ThamizhiUDp is a neural dependency parser that provides a complete pipeline for the dependency parsing of Tamil text using the Universal Dependencies formalism.

ThamizhiUDp

5. SinMorphy is a morphological analyzer for the Sinhala language. The current version of SinMorphy can handle 1.6 million words, including nouns, verbs, particles, adjectives, and adverbs.

SinMorphy

6. Ottos 2.0 is a tool built to extract text from PDFs written in Sinhala and Tamil legacy fonts and convert it into Unicode. It currently supports FM fonts, Shree Tamil, and TAM fonts.

Ottos 2.0

7. SenCAT stands for Sentiment Categorization. This platform lets you find the sentiment of Sinhala and English texts using novel deep learning models. All you need to do is send an API request containing your text string. Our backend will then process the text and return the sentiment.

SenCAT

8. SimDocSin is a cross-lingual document similarity checking tool from the University of Moratuwa. It currently handles Sinhala and English and may be extended to other languages. It may be used to extract parallel data from the web or from multilingual corpora for your NLP project.

SimDocSin

9. SinSRL is the first-ever semantic role labeller for the Sinhala language. It uses both “Neural Network” and “Annotation Projection” approaches to annotate a given Sinhala sentence with the semantic tags defined in the PropBank corpus.

SinSRL

10. SinSpell is a Sinhala spell checker and corrector that identifies writing errors accurately, auto-corrects evidently misspelt words, and provides better suggestions for the remaining misspelt words.

SinSpell

11. Glossary is a dataset extracted from the Official Language Department, together with the system developed by the National Language Processing Center.

Glossary

NLPC Tools

Natural Language Processing Centre,
University of Moratuwa, Sri Lanka
