AU-KBC RESEARCH CENTRE

New Tamil POS Annotated Corpus and POS Engine Release: on 24th May, 2016 at 6.30 PM in WILDRE-3, LREC Conference

'Searchko' on Dinamalar (26th July 2010)

Anna University students develop portal with search engine in Tamil - Deccan Chronicle (28th June 2010)

'Searchko' on The Telegraph (14th March 2010)

Workshops Conducted

iDravidian 2017 - 3rd Symposium on Natural language Processing for Dravidian Languages (iDravidian 2017)

9th International Conference on Natural Language Processing (ICON 2011)

8th Discourse Anaphora and Anaphora Resolution Colloquium (DAARC 2011)

7th Discourse Anaphora and Anaphora Resolution Colloquium (DAARC 2009)

Talk

Title: The Annotation and Use of Multimodal Corpora as Basis for Modeling Plausible Communicative Devices

Speaker: Costanza Navarretta
     University of Copenhagen

Date

Time

Abstract

Human communication is naturally multimodal: people use their whole body when they talk to each other, both as speakers and as listeners. Communication is cooperative, and it is influenced by many factors: the social activity, the participants, their number, roles, degree of familiarity, position in the room, physical capacities, etc. Determining the relation between the various modalities, including speech and non-verbal behaviors such as head movements, facial expressions, gaze, body postures and hand gestures, is important for understanding how humans communicate and for modeling human-like behaviors in various types of devices and applications.

She will describe the collection, annotation and use of multimodal corpora of various types of video-recorded conversations, with a focus on interactive and referential communicative behaviors as well as behaviors displaying affective states. She will also briefly discuss intercultural studies on multilingual comparable data.

J. Allwood, L. Cerrato, K. Jokinen, C. Navarretta and P. Paggio. The MUMIN coding scheme for the annotation of feedback in multimodal corpora: a prerequisite for behavior simulation. In Language Resources and Evaluation, Special Issue: J.-C. Martin, P. Paggio, P. Kuehnlein, R. Stiefelhagen, F. Pianesi (Eds.), Multimodal Corpora for Modeling Human Multimodal Behavior, Volume 41, Nr. 3-4:273-287, 2007, Springer, www.springerlink.com

C. Navarretta and P. Paggio. Classification of Feedback Expressions in Multimodal Data. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala, Sweden, July 11-16, 2010, pp. 318-324.

P. Paggio and C. Navarretta. Learning to classify the feedback function of head movements in a Danish Corpus of first encounters. In Proceedings of ICMI 2011 Workshop Multimodal Corpora for Machine Learning: Taking Stock and Road mapping the Future, Alicante, Spain, November 2011, 8 pages.

C. Navarretta, E. Ahlsén, J. Allwood, K. Jokinen, P. Paggio. Feedback in Nordic First-Encounters: a Comparative Study. To appear in Proceedings of LREC 2012, May 2012, Istanbul, Turkey.

C. Navarretta and P. Paggio. Verbal and Non-Verbal Feedback in Different Types of Interactions. To appear in Proceedings of LREC 2012, May 2012, Istanbul, Turkey.

C. Navarretta. Annotating and Analyzing Emotions in a Corpus of First Encounters. In Proceedings of the 3rd IEEE International Conference on Cognitive Infocommunications, Kosice, Slovakia, 2-5 December 2012.

Talk

Title: Detection of Plagiarism and Text Reuse

Speaker: Alberto Barrón-Cedeño 
     Department of Information Systems and Computation,
     Universidad Politécnica de Valencia, Spain

Date

Time

Abstract

In recent years, the plethora of text resources easily reachable on the WWW has increased the number of cases of text reuse and plagiarism. One countermeasure to this phenomenon is the development of tools for automatic text reuse detection.

In this tutorial, we will give an overview of the state of the art in plagiarism detection, as well as of the freely available and commercial tools. Special emphasis will be given to the analysis of cross-language text reuse and plagiarism, a problem that has so far received little attention. Moreover, we will discuss the results of the 2nd PAN competition on plagiarism detection (sponsored by Yahoo! Research), which is held in the framework of CLEF and in which Indian teams also participated. (http://pan.webis.de)

Talk

Title: Figurative Language Processing: Mining Humour and Irony from Social Media

Speaker: Prof. Paolo Rosso
     Department of Information Systems and Computation,
     Universidad Politécnica de Valencia, Spain

Date

Time

Abstract

Figurative language is one of the most arduous topics facing natural language processing (NLP). Unlike literal language, it takes advantage of linguistic devices such as metaphor, analogy, ambiguity and irony to project more complex meanings which, most of the time, represent a real challenge not only for computers but for humans as well. This is the case of humor and irony. This presentation aims to show how these two specific domains of figurative language, humor and irony, may be handled automatically by considering linguistic devices, such as ambiguity and incongruity, and meta-linguistic devices, such as polarity and emotional scenarios. We especially focus on discussing how underlying knowledge, which relies on shallow and deep linguistic layers, may provide relevant information for automatically identifying figurative uses of language. In particular, and contrary to most research dealing with figurative language, we aim to identify figurative uses of language in social media.

iDravidian 2017 - 3rd Symposium on Natural language Processing for Dravidian Languages (iDravidian 2017)
9th International Conference on Natural Language Processing (ICON 2011)
8th Discourse Anaphora and Anaphora Resolution Colloquium (DAARC 2011)
7th Discourse Anaphora and Anaphora Resolution Colloquium (DAARC 2009)

International Workshop on Referential Entity Resolution 2008
Workshop on Named entity Resolution for ILMT and CLIA Project
Indo-German Phase I and Phase II (with Prof. C.N. Krishnan)