RIKEN Press Release March 31, 2009

RIKEN adopts Semantic Web as the data-release standard of its database-construction infrastructure

- Launching the operation of the Life Science Networking System (RIKEN SciNeS) -

  • An information system was developed that helps RIKEN's researchers from various academic fields play a hub role in international data collaboration.
  • Its efficiency is demonstrated in the life-science database integration project, in which large volumes of data need to be handled.
  • The system is expected to function as a new academic medium for publishing individual databases as entire sets of research results.
  • The system promotes sharing of data under the Creative Commons Public License.

Summary

RIKEN has established the common infrastructure "RIKEN SciNeS" to enable the large-scale release of data that complies with the international standard "Semantic Web" format. It does so by supplying, across RIKEN, database-construction infrastructure systems whose primary focus is life science. This work is a research result of RIKEN's Bioinformatics and Systems Engineering division (RIKEN BASE; Director, Tetsuro Toyoda).

Life science has recently developed into a discipline that handles vast amounts of data. As a result, alongside presenting findings in research papers, researchers increasingly release their findings as databases that can be accessed via the Web. However, whereas specialized media such as scientific journals have developed for publishing research papers, individual researchers wanting to publish a database must launch and manage their own websites. The operating costs of maintaining continuous service after a database is released have been a particular burden for researchers. Moreover, as uncoordinated websites launched by individual researchers have proliferated, the number of websites that do not comply with international standards has increased. As a result, such sites have become confusing for users, and integrated use of the services has been hampered.

To address this, RIKEN BASE developed "RIKEN SciNeS" as a common infrastructure that enables researchers to publish individual databases as entire sets of their own research results, without requiring the researchers to maintain web servers. RIKEN SciNeS helps researchers organize virtual research projects in cyberspace themselves, and it is envisaged that it will accommodate tens of thousands of research project groups or more. This newly developed database-construction system compartmentalizes each research project with a high level of confidentiality and allows each project to flexibly configure the management of unpublished information and the research workflow that large-scale data passes through. This capability is expected to improve data management within research projects and to serve as a new type of academic medium. In addition, databases constructed with this platform adopt international standards and are easy to release. RIKEN will therefore continue the intensive operation and maintenance of such databases via the "RIKEN Hub Database", allowing RIKEN's researchers from various academic fields to play a hub role in international collaborative data research.

The present research was conducted as an internal collaborative promotional project with strategic discretionary research funding within RIKEN. RIKEN SciNeS is being released as a trial version that can be browsed with the Firefox Web browser. It was released on March 31 together with the RIKEN Hub Database.


1. Overview

Following the recent development of the Internet, the Web has become an increasingly popular medium for publishing research findings as well as academic presentations. Most recently, research findings have frequently been reported by releasing individual databases on the Web. However, while the specialized medium of academic journals has developed for the publication of research articles, researchers who want to publish databases must launch and manage their own websites. In addition, even after publication, the operating costs of maintaining continuous service have been a huge burden for researchers. Moreover, as uncoordinated websites launched by individual researchers have proliferated, the number of websites that do not comply with international standards has increased. As a result, such sites have become confusing for users, and integrated use of the services has been hampered.

Generally, a Wiki is used when a large number of people collaborate to create content on the Web. A Wiki is a convenient tool for many people to add content that is to be read and interpreted directly by people rather than by machines; it is like an encyclopedia. However, in the life science research field, such content needs to be compared automatically by computer with vast amounts of experimental data, and the outcome needs to be used to interpret the experimental data. It has therefore been difficult to construct databases with such content using a Wiki alone. As a result, there has been strong demand for "an incubation infrastructure for database construction" that each researcher can use freely and that maintains high security over the long term, from the construction of a database through to its release. In addition, due to the global shortage of specialists in the field of bioinformatics, it has become necessary to develop a new database-construction infrastructure system that supports content creation and quality control by standardizing formats and vocabulary, thus allowing automated, integrated processing of various types of content by computer.

To date, the significance of providing a superior information infrastructure internationally has not been fully recognized in Japan. Consequently, even research results databases to which RIKEN researchers have made considerable contributions have tended to become visible through information infrastructures provided more promptly by other institutes. It therefore became an urgent task to upgrade our system to a superior database-construction infrastructure system that enables RIKEN researchers to demonstrate leadership and play a hub role in international collaborative data-driven research projects.


2. Research methods and results

RIKEN BASE developed "RIKEN SciNeS" as a common infrastructure that enables the publication of individual databases as entire sets of research results without requiring researchers to maintain web servers (Figure 1). RIKEN SciNeS has the following characteristics:
  • Allows parallel operation of tens of thousands of individual database construction processes by a large number of researchers via the Internet.
  • Allows flexible setting of the operational flow that mediates large-scale data and facilitates personal collaboration and automatic processing.
  • Enables database construction in an unreleased state while compartmentalizing each active group with high security.
  • Enables direct publication of the constructed database from the infrastructure.
  • Enables researchers to continually update the contents within the infrastructure without being liable for the system maintenance costs even after the publication.
  • Allows for the easy dissemination of data that comply with various international standard formats.
RIKEN SciNeS is a database-construction infrastructure system developed with the goal of helping researchers organize virtual research projects in cyberspace themselves, and of accommodating tens of thousands of research project groups or more (Figure 2). To compartmentalize each project with high security, and to allow flexible configuration of the management of unpublished information and of the research workflow that large-scale data passes through, we developed a unique technology that complies with international standards such as the Semantic Web and incorporates security management. As a result, in addition to writing content using the Wiki function, researchers can create content with standardized formatting and vocabulary, so that the data can be processed automatically and in an integrated manner by computer. RIKEN SciNeS provides a user-friendly interface for creating data in a Semantic Web format, so a large number of researchers can collaborate and efficiently create research data that can be used for various purposes. The unified data created in the Semantic Web standard can be converted automatically and easily into the specific data formats used in various technological fields, making it possible to disseminate data that comply with various international standard formats. As such, the system can be used as a novel type of academic medium for individual researchers to publish their databases. RIKEN BASE will maintain the operation of the databases published through this infrastructure as the "RIKEN Hub Database".
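
The conversion from unified Semantic Web data into field-specific formats can be pictured with a small sketch. The Python code below is an illustration only and is not the RIKEN SciNeS implementation: it assumes that data are held as simple (subject, predicate, object) triples, and every identifier, property name, and function name in it (such as ex:protein1, ex:organism, group_by_subject) is hypothetical. It shows how one set of triples might be exported as both JSON and CSV.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical records expressed as (subject, predicate, object) triples.
# The "ex:" identifiers and property names are invented for illustration.
TRIPLES = [
    ("ex:protein1", "ex:name", "Example protein A"),
    ("ex:protein1", "ex:organism", "Arabidopsis thaliana"),
    ("ex:protein2", "ex:name", "Example protein B"),
    ("ex:protein2", "ex:organism", "Oryza sativa"),
]


def group_by_subject(triples):
    """Collect each subject's predicate/object pairs into one record."""
    records = defaultdict(dict)
    for subject, predicate, obj in triples:
        records[subject][predicate] = obj
    return dict(records)


def to_json(triples):
    """Export the triples as a JSON object keyed by subject."""
    return json.dumps(group_by_subject(triples), indent=2, sort_keys=True)


def to_csv(triples):
    """Export the triples as a table: one row per subject, one column per predicate."""
    records = group_by_subject(triples)
    columns = sorted({p for fields in records.values() for p in fields})
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id"] + columns)
    for subject in sorted(records):
        row = [subject] + [records[subject].get(c, "") for c in columns]
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_json(TRIPLES))
    print(to_csv(TRIPLES))
```

Because both exporters read from the same triples, adding a further output format in this sketch would only require one more function, without changing how the data are stored.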

Traditionally, Japan has not provided information infrastructure for international research collaboration in a timely manner. As a result, even research results databases to which RIKEN researchers have made considerable contributions have tended to become visible through information infrastructures provided more promptly by other institutes. RIKEN SciNeS can be used for various purposes as an information collaboration infrastructure for international collaborative projects that deal with semantically well-structured, large-scale data. It will therefore provide Japanese researchers with opportunities to demonstrate leadership as a core party in international collaborative research (Figure 3).


3. Future plans

RIKEN is using the RIKEN SciNeS infrastructure to promote the integrated-database project commissioned by the Ministry of Education, Culture, Sports, Science and Technology. We will soon release a plant omics integrated database as well as an experimental protein crystallization database containing more than 10 million data records through RIKEN SciNeS in the Semantic Web format that meets international standards. Furthermore, we plan to enable data downloads and promote sharing of research result data under the Creative Commons Public License.

The Semantic Web expresses the meaning and relationship of each datum in a format that computers can interpret automatically. It is therefore expected that, rather than simply browsing the vast number of databases available, researchers will attempt automated interpretation of the data by computer from various perspectives. As a result, this will enable the use of web-science results in life science.
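
As a rough illustration of what automated interpretation of relationships can look like, the Python sketch below stores each datum as a (subject, predicate, object) triple and answers a question by following two relationships in sequence. It is a simplified sketch under stated assumptions, not the technology used in RIKEN SciNeS: the data, the ex: identifiers, and the function names match and genes_with_product_in are all hypothetical.

```python
# Hypothetical data: each tuple states one relationship between two items.
# The "ex:" identifiers and relationship names are invented for illustration.
TRIPLES = [
    ("ex:geneA", "ex:encodes", "ex:proteinA"),
    ("ex:proteinA", "ex:locatedIn", "ex:chloroplast"),
    ("ex:geneB", "ex:encodes", "ex:proteinB"),
    ("ex:proteinB", "ex:locatedIn", "ex:nucleus"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def genes_with_product_in(triples, location):
    """Follow gene -encodes-> protein -locatedIn-> location and return the genes."""
    proteins = {
        s for s, _, _ in match(triples, predicate="ex:locatedIn", obj=location)
    }
    return sorted(
        s for s, _, o in match(triples, predicate="ex:encodes") if o in proteins
    )


if __name__ == "__main__":
    print(genes_with_product_in(TRIPLES, "ex:chloroplast"))
```

The point of the sketch is that the program never needs free-text descriptions: because every relationship is explicit and named, a question spanning several records can be answered mechanically.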

Furthermore, RIKEN SciNeS will serve as an information infrastructure for uniting life science researchers who are dispersed across distant locations, and expectations are even greater for the future use of information collaboration infrastructures in the translational research field. Currently, in collaboration with the RIKEN Research Center for Allergy and Immunology, we plan to use the RIKEN SciNeS infrastructure to construct a network linking primary immunodeficiency specialists and clinicians in Asia. We plan to continue to support international research projects in the future.

Figure 1. RIKEN BASE systematically increases standardized database publications and bioinformatics collaborations within RIKEN by providing four distinct functions: "Publication Function" to collaborate with individual researchers to publish databases from the platform provided by BASE; "Integration Function" to integrate the databases for worldwide scientific convenience; "Collaboration Function" to provide researchers with a secure database platform which is necessary to collaborate internationally; and "Cyber Infrastructure Function" to provide transparently accessible web-service interfaces with global standards. Hence, RIKEN BASE bridges individual research activities and the world's database integration projects.

Figure 2. RIKEN BASE has developed an original scalable technology making a single system capable of hosting numerous heterologous databases simultaneously and securely. BASE is now providing an information infrastructure in which hundreds of database activities are underway. Using our infrastructure, RIKEN has started promoting numerous database releases based on the international standard called "Semantic Web." The system is accessible worldwide on the Internet at http://database.riken.jp/.

Figure 3. The Life Science Networking System, "RIKEN SciNeS" for short, unites worldwide intelligence and the latest technologies to contribute to the wide range of life sciences and cultivation of human resources.

RIKEN
Public Relations Office

2-1, Hirosawa, Wako, Saitama 351-0198 Japan

Contact person:
Ms. Saeko Okada

Phone:
+81-48-467-4094

Fax:
+81-48-462-4715

E-mail:
koho@riken.jp

Internet:
http://www.riken.jp/

RIKEN, one of Japan's leading research institutes, conducts basic and applied experimental research in a wide range of science and technology fields including physics, chemistry, medical science, biology and engineering. Initially established as a private research foundation in Tokyo in 1917, RIKEN became an independent administrative institution in 2003. For more information, visit www.riken.jp
Copyright(c) RIKEN, Japan. All rights reserved