Workshop Program

January 22-23, 2015, Clock Tower Centennial Hall, Kyoto University, Japan

January 22nd

10:10-10:40 Invited Talk (Prof. Toru Ishida, Donghui Lin, Yohei Murakami)

"The Language Grid: Past, Present, and Future"

Abstract: To develop a multilingual environment that can handle various situations in various communities, existing language resources should be easily shared and customized. We proposed and developed the Language Grid as service-oriented collective intelligence; it allows users to freely create language services from existing language resources and combine those language services to develop new services to meet their own requirements. This talk explains the past, present and future of the Language Grid. First, we introduce the design concept, service architecture and research achievements of the Language Grid. Then, we describe how the Language Grid is used as a platform for current services science research by using the example of YMC-Viet: a youth mediated communication project in Vietnam. Finally, we introduce the concept of Open Language Grid, which aims at providing a worldwide language service infrastructure for language resource sharing through global collaboration.

Bio: Toru Ishida has been a professor at Kyoto University since 1993. His research interest lies in autonomous agents and multiagent systems, and he has been working on this theme for more than twenty years. He served as a program co-chair of the second ICMAS, a chair of the first PRIMA, and a general co-chair of the first AAMAS. He was also an editor-in-chief of the Journal of Web Semantics (Elsevier) and an associate editor of IEEE PAMI and the Journal on Autonomous Agents and Multi-Agent Systems (Springer). He was a board member of the International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS). He also started workshops and conferences on Digital Cities and Intercultural Collaboration. Since 2006, he has been running the Language Grid project.

Yohei Murakami has been an associate professor in the Unit of Design at Kyoto University since 2014. He received his Ph.D. degree in informatics from Kyoto University in 2006. His research interests lie in services computing and multi-agent systems, and he has been working on the Language Grid for almost ten years. He founded the Technical Committee on Services Computing in the Institute of Electronics, Information and Communication Engineers in 2009.

Donghui Lin has been an assistant professor at Kyoto University since 2012. He received his Ph.D. degree in informatics from Kyoto University in 2008. His research interests include services computing, business process management and intercultural collaboration. He served as a program co-chair of the International Conference on Culture and Computing 2013-2015, and as a program committee member of recent major international conferences in the area of services computing and services science.

10:40-11:10 Invited Talk (Prof. Nancy Ide)

"The Language Applications Grid"

Abstract: The Language Application (LAPPS) Grid project is establishing a framework that enables language service discovery, composition, and reuse and promotes sustainability, manageability, usability, and interoperability of natural language processing (NLP) components. It is based on the service-oriented architecture (SOA), a more recent, web-oriented version of the "pipeline" architecture that has long been used in NLP for sequencing loosely-coupled linguistic analyses. The LAPPS Grid provides access to basic NLP tools and resources and enables pipelining such tools to create custom NLP applications, as well as composite services such as question answering and machine translation together with language resources such as mono- and multi-lingual corpora and lexicons that support NLP. The transformative aspect of the LAPPS Grid is that it orchestrates access to and deployment of language resources and processing functions available from servers around the globe and enables users to add their own language resources, services, and even service grids to satisfy their particular needs.

Bio: Nancy Ide is Professor and Chair of Computer Science at Vassar College in Poughkeepsie, New York, USA. She has been in the field of computational linguistics for over 30 years and made significant contributions to research in word sense disambiguation, computational lexicography, discourse analysis, and the use of semantic web technologies for language data. She is the founder of the Text Encoding Initiative, and developer of the XML Corpus Encoding Standard and the ISO LAF/GrAF representation format for linguistically annotated data. She has developed major corpus resources for American English, including the Open American National Corpus (OANC) and the Manually Annotated Sub-Corpus (MASC), and was a pioneer in efforts toward open data and resources. She is Co-Editor-in-Chief of the journal Language Resources and Evaluation and Editor of the Springer book series Text, Speech, and Language Technology. She has been the Principal Investigator (PI) or co-PI on multiple major US National Science Foundation and EU-funded projects, and is currently co-PI of the LAPPS Grid project.

11:10-12:00 1st Session: Language Resources and Services (25min. x 2)

Building Uyghur Dependency Treebank: Design principles, Annotation Schema and Tools
Mairehaba Aili, Aziguli Xialifu, Saimaiti Maimaitimin, Maihefureti

Vietnamese multimedia Agricultural information retrieval system as an Info Service
Nhut Pham, Nam Cao, Thi Luong, Hieu Pham, Quan Vu

12:00-12:20 Short Presentations (5min. x 4)

Building Contemporary Uyghur Grammatical Information Dictionary
Jiamila Wushouer, Wayiti Abulizi, Kahaerjiang Abiderexiti, Tuergen Yibulayin, Maierhaba Aili, Saimaiti Maimaitimin

Design and Implementation of The Language Grid based Short-Message Translation Assistant
Mirsalijan Sabit, Winira Musajan, Marhaba Eli

Building Indonesian Local Language Detection Tools Using Wikipedia Data
Puji Martadinata, Bayu Distiawan Trisedya, Ruli Manurung, Mirna Adriani

Kachako: all-in-one automated NLP platform and multiple domain UIMA toolkit
Yoshinobu Kano

13:30-14:45 2nd Session: Metadata/Annotation (25min. x 3)

Combining and extending data infrastructures with linguistic annotation services
Stelios Piperidis, Dimitrios Galanis, Juli Bakagianni, Sokratis Sofianopoulos

The Language Application Grid Web Service Exchange Vocabulary
Nancy Ide, Keith Suderman, Marc Verhagen, James Pustejovsky

The LAPPS Interchange Format
Marc Verhagen, Keith Suderman, Di Wang, Nancy Ide, Chunqi Shi, Jonathan Wright, James Pustejovsky

January 23rd

09:30-10:20 Special Talk (Dr. Koichi Takeda)

"Watson Question-Answering System and Linguistic Resources"

Abstract: In this talk, I would like to introduce technical foundations of the Watson Question-Answering System, and focus on the importance of linguistic resources for such a project. It will be shown that domain-dependent applications - healthcare diagnosis and contact center engagement support - are also significantly dependent on available linguistic resources. Open linguistic resources and question-answer pairs (examples) clearly play a key role for efficient and scalable system development.

Bio: Koichi Takeda has been leading natural language processing and text analytics research at IBM Research - Tokyo for the last 30 years, since he joined IBM. He has made numerous achievements in the field, including an inter-lingual machine translation framework, a pattern-based machine translation framework, textual information visualization, and text mining. In 2003, he built a life science literature mining system for the entire MEDLINE citation data. He became a member of the Watson Question-Answering project in December 2007, and continues to work on Watson applications.

10:30-11:00 Invited Talk (Dr. Nicoletta Calzolari)

"An excursus through policy issues: from dreams to reality"

Abstract: Language Technology (LT) is a data-intensive field and major breakthroughs have stemmed from a better use of more and more Language Resources (LRs). The challenges ahead depend on a coherent strategy involving not only the best methods and technologies but also many LR related dimensions.
I will highlight some policy issues that must be considered when making up a strategy for the future of the field: issues such as sharing resources, services and tools, adopting the paradigm of accumulation of knowledge and allowing replicability of research results.
In the paradigm of open, distributed language infrastructures based on sharing LRs, services and tools, the only way for our field to achieve the status of a mature science lies in initiatives that enable us to join forces both in the creation of large LR pools and in big collaborative experiments using these LRs. This will better serve the needs of language applications, making it possible to build on each other's achievements, integrate results (also with Linked Data), and make them accessible to various systems.
This also requires an effort to push towards a culture of "service to the community" where everyone has to contribute. This "cultural change" is not a minor issue. In this respect I will point out how initiatives like the LRE Map, Share your LRs, and ISLRN are steps towards promoting the concept of Open Science. I will therefore highlight the role of ELRA and LREC in pushing towards this vision.

Bio: Nicoletta Calzolari Zamorani is Research Associate and former Director of Research and Director (2003-08) of the Institute of Computational Linguistics-CNR, Pisa.
She received an Honorary Doctorate in Philosophy from the University of Copenhagen and was awarded the title of "ACL Fellow" in the ACL (Association for Computational Linguistics) Fellows founding group for "significant contributions to computational lexicography, and for the creation and dissemination of language resources".
She coordinates international, European (recently the EC FLaReNet Network), and national projects and strategic initiatives.
She is President of ELRA (European Language Resources Association), permanent member of ICCL, vice-president of META-TRUST, chair of ISO/TC 37/SC 4, committee member of ISO/TC 37/AG 0, former convenor of the ISO Lexicon WG, president of the PAROLE Association, former chair of the Scientific Board of CLARIN, and former member of the ACL Exec, of the META-NET Council, of the ESFRI Social Sciences and Humanities Working Group, and of many international committees and advisory boards (e.g., ELSNET, SENSEVAL, ECOR, SIGLEX).
She is General Chair of LREC (since 2004), of COLING 2016, and of COLING-ACL-2006, and has been an invited speaker, program committee member, and organiser for many international conferences and workshops.
She is Co-editor-in-chief of the journal Language Resources and Evaluation (Springer) and a member of journal editorial and advisory boards. She has more than 400 publications.

11:00-11:30 Invited Talk (Mr. Khalid Choukri)

"The MLi Hub, a European project for the specification of the next generation of language Grid"

Abstract: The talk presents the objectives of the MLi project, a European-funded project that aims at collecting and compiling the specification of the next generation of language grids, with an emphasis on the infrastructural aspects as well as the resources and the access to language technology components.

Bio: Dr. Khalid Choukri obtained an Electrical Engineering degree (1983) from École Nationale de l'Aviation Civile (ENAC, Toulouse, France), and a Master's degree (1984) and Doctoral degree (1987) in computer science and signal processing at the École Nationale Supérieure des Télécommunications (ENST, Télécom ParisTech) in Paris, France, in partnership with Alcatel-Lucent. Since 1998, he has worked on speech processing and oral dialogues, managing several European-funded R&D projects. Since 1995, he has been the executive director, now Secretary General, of the European Language Resources Association (ELRA), the Founder and Managing Director of the distribution agency (ELDA), and part of the organizing committee of the Language Resources and Evaluation Conference (LREC), a major event in the language technology field with more than 1200 attendees and over 600 publications.

11:30-12:00 Invited Talk (Prof. Núria Bel)

"The users of a language service infrastructure"

Abstract: The talk will present the Spanish CLARIN Center of Competence, which has been deployed taking into account the particular characteristics of researchers in the humanities and social sciences. Developed in the framework of Linked Open Data, each web service is linked to descriptions of the use of the tool it gives access to, such as project descriptions, research papers in different areas, etc.

Bio: Dr. Núria Bel is Associate Professor at the Department of Translation and Language Sciences of the Universitat Pompeu Fabra (UPF) and Academic Secretary of the IULA Institute for Applied Linguistics, also at UPF. Her area of research is Natural Language Processing, in particular the creation and induction of language resources. She coordinated the project PANACEA and has participated in infrastructure-related projects: CLARIN, METANET4U and DASISH.

13:30-14:45 3rd Session: Service Platform/Service Management (25min. x 3)

A Policy-Aware Parallel Execution Control Framework for Language Application
Mai Xuan Trang, Yohei Murakami, Toru Ishida

Intellectual Property Rights Management with Web Service Grids
Christopher Cieri, Denise DiPersio

Language Mashup: Personal Grid for Language Resources
Masayuki Otani, Takao Nakaguchi, Yohei Murakami, Donghui Lin, Toru Ishida

14:45-16:00 4th Session: Application (25min. x 3)

Mining Opinion Polarity from Multilingual Song Lyrics
Qian Liu, Zhiqiang Gao

Collaborative Philology on the way to Web Services: the case of CoPhiWordnet
Federico Boschetti, Riccardo Del Gratta, Angelo Del Grosso, Monica Monachini, Ouafae Nahli

Effectiveness of Keyword and Semantic Relation Extraction for Knowledge Map Generation
Virach Sornlertlamvanich, Canasai Kruengkrai