Computational Linguistics Research Group
The Computational Linguistics Research Group (CLRG) at AU-KBC Research Centre works on the scientific study of language from a computational perspective. We develop computational models of various linguistic phenomena, with the aim of building practical natural language processing systems.
Our research interests span a broad range of topics in Computational Linguistics and Natural Language Processing. Our work combines traditional and contemporary approaches based on linguistic knowledge with statistical and machine learning methods.
The group focuses on translation and on converting unstructured data into structured data. We work at both the intra-sentential and inter-sentential (discourse) levels. At the discourse level, we carry out cognitive analysis of discourse, such as coherence analysis, anaphora resolution and connective resolution. The language families we are interested in are Dravidian, Indo-European and Indo-Aryan.
Glimpses of our research work
Works in Tamil Computing
1. Nigazaayvi - நிகழாய்வி
A Tamil mobile app that fetches events from the Web - "It brings Events into your hand"
Extracts the latest events happening across the globe and provides the user with:
• the people associated with the event
• the place in which the event happens
• the cause & effect of the event
'Nigazhaayvi' is available at the link here
2. Machine Translation (MT) Systems (Tamil <==> Malayalam, Tamil <==> Hindi)
We have developed machine translation systems between Indian languages, focusing on translation from Tamil into other Indian languages and vice versa.
• The Tamil - Hindi MT system is available at the link TA-HI MT System
3. Corpus and Other Lexical Resources Released
• We have released Tamil Part-of-Speech (POS) and Named Entity (NE) annotated corpora, free for research purposes, to help researchers across the world advance research in Tamil and other Indian languages. Recently we released a large (500K-word) POS annotated corpus (here).
• Other corpora and lexical resources, such as "Tamil WordNet", have also been released. The details are available at the link (here).
4. 'Searchko' - A Tamil Web Portal
Searchko is a Tamil portal with a Tamil search engine as its main constituent. It also offers news aggregation and AdTrans, an automatic advertisement translation system. More about Searchko at www.searchko.co.in
Work on Malayalam Computing
1. Malayalam NLP Stack
The Malayalam NLP Stack includes a Morphological Analyser, POS Tagger, Chunker and Named Entity Recogniser. Malayalam Stack demo link
Work on Social Media
1. Sentiment Analyser on Twitter Data
A web API that gives the sentiment polarity (Positive/Negative/Neutral) of tweets. Given a phrase, it fetches tweets containing that phrase from the Web and analyses the fetched tweets for polarity.
This Web API is available at the link <TweetSentiSys>