May 31, 2011

Web interface defines new paradigm for life science data sharing

Multiple life-science databases accessible through RIKEN Scientists' Networking System using new Semantic-JSON programming interface

A new lightweight web service interface for accessing massive amounts of life science research data across multiple public and private domains has been developed by researchers at RIKEN, Japan's flagship research institute. Through the powerful RIKEN Scientists' Networking System (SciNetS), the service provides a secure, flexible and lightweight interface to millions of data records and their network of semantic relationships, ushering in a new era of collaboration, analysis and information-sharing for life science research and applied innovation.

Gene annotation, protein structure analysis, plant ontologies, transcriptomes - dramatic increases in the size, variety and complexity of data resources in the life sciences have accentuated the challenges of data analysis in the information age. Adding to these challenges, much of the data handled at each step of the research process is private, making integration with public data more difficult and hindering collaboration. Overcoming these challenges requires systems for securely integrating data resources and making their information widely available through a flexible interface.

The RIKEN Bioinformatics And Systems Engineering (BASE) division, Japan's leading research institute focusing on the integration and publication of life science research data, has now developed such an interface. Referred to as Semantic-JSON, the interface accesses the Scientists' Networking System (SciNetS), a "virtual laboratory cloud centre" also developed at BASE, which brings together, as of May 2011, a total of 192 public database projects both internal and external to RIKEN. SciNetS creates common ground for sharing life science data resources by linking these resources together in a network of semantic relationships based on standardized Semantic Web techniques.

Semantic-JSON provides a flexible interface to SciNetS on the web, enabling bioinformaticians to access specific data from across the SciNetS network using the programming languages and information tools they normally use in their research. The interface does so by defining a set of simple but relevant commands for accessing and searching SciNetS data and their semantic relationships, delivering results in the widely-used JavaScript Object Notation (JSON) format.

Already, RIKEN has successfully applied Semantic-JSON to a number of projects, including international data collaborations on mouse phenotypes, domestic integrated database projects, and the GenoCon International Rational Genome Design Contest. Looking ahead, RIKEN plans to use the interface to distribute life science data across its research centres and with international collaborators via the SciNetS project, broadening the life-sciences Semantic Web data universe and promising to achieve not just comprehensive understanding of various life phenomena, but also collaborative breakthroughs for medicine, industry and the environment.

This research result will appear in the online version of the British scientific journal Nucleic Acids Research on June 1.


Life science research depends crucially on the availability of informatics infrastructure for systematically storing and integrating vast amounts of diverse bioinformatics data. Indeed, a deep understanding of data collected using today's cutting-edge bioinformatics technologies is impossible without this infrastructure, yet conventional databases are limited in the types of data they can handle. For more sophisticated processing and analysis, infrastructure is needed that can simultaneously sort and organize the vast variety of different types of life science data and make this data available for public use.

At the RIKEN Bioinformatics And Systems Engineering (BASE) division, researchers have developed a novel research infrastructure around a set of virtual laboratories (collaboration via the cloud) that allows researchers to store massive amounts of life-sciences data and schematically and semantically organise relationships between individual records in a virtually-constructed, closed, secure data space. This collaboration centre, the Scientists' Networking System (SciNetS), does more than just publish data from RIKEN to the web. As an infrastructure for life science data sharing, it also encourages new forms of research collaboration, enabling scientific discoveries not possible through individual research activities alone.

Fully exploiting this collaborative potential, however, requires that SciNetS data be made available on the web through an easy-to-use interface, to be accessed and analysed via commonly-used programming languages. Semantic-JSON is the technological innovation which makes this possible.


To encourage its worldwide distribution and use, data organized in SciNetS is formatted according to the Semantic Web standard, a data format which is understandable not only to humans, but also to computers. The new Semantic-JSON programming interface, developed at BASE and made available for public use as of June 1, enables bioinformaticians to access this Semantic Web data on the web via the programming languages and information tools they normally use in their research. Data obtained through the interface is described in the widely-used, highly-portable JavaScript Object Notation (JSON) format, freeing researchers from depending on any specific programming languages for their data analysis.

Semantic-JSON also achieves a second major advance in life science research by bridging the gap between public data available for general use, and private data held by individual researchers or research groups. Researchers often need to unite public and private data for analysis; yet doing so is far from trivial due to differences in access permissions across virtual laboratories. Freely releasing such data, on the other hand, poses significant security issues. What is thus needed is a technology to enable virtual laboratories to manage their own data access permissions in a secure way, while also accessing relationship information and merging (public and private) original data from different virtual labs.

To accomplish this union of data, Semantic-JSON employs a trick similar to the URL shortening tools used on common social media services such as Twitter. The Semantic-JSON interface shrinks URLs for data internal and external to SciNetS into shorter identifiers, and uses these to look up permissions for specific data, returning only the data appropriate to the access privileges of a given user. Unlike conventional URL shortening services, however, a short identifier in Semantic-JSON points not only to a URL but also to a wealth of relationships between data, thus realising a unified domain semantic web structure.
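The short-identifier idea can be illustrated with a minimal Python sketch. It is not the Semantic-JSON implementation or its API: the class and method names, the hash-based identifier scheme, the lab names and the example URLs are all assumptions made up for illustration. The sketch only shows the general pattern described above, in which a short identifier resolves to a URL plus its relationships, filtered by the caller's access privileges.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Record:
    url: str                  # original (long) URL of the data item
    lab: str                  # virtual laboratory that owns the item (hypothetical)
    public: bool              # True if anyone may read the item
    links: list = field(default_factory=list)  # short IDs of related items


class ShortIdRegistry:
    """Hypothetical registry mapping short identifiers to records."""

    def __init__(self):
        self._records = {}

    def register(self, url, lab, public):
        # Assumption: derive the short ID from a hash of the URL.
        short_id = hashlib.sha1(url.encode("utf-8")).hexdigest()[:8]
        self._records[short_id] = Record(url, lab, public)
        return short_id

    def link(self, source_id, target_id):
        # Record a relationship from one item to another.
        self._records[source_id].links.append(target_id)

    def _can_read(self, short_id, user_labs):
        record = self._records.get(short_id)
        return record is not None and (record.public or record.lab in user_labs)

    def lookup(self, short_id, user_labs):
        # Return only what the caller is allowed to see, or None.
        if not self._can_read(short_id, user_labs):
            return None
        record = self._records[short_id]
        visible = [t for t in record.links if self._can_read(t, user_labs)]
        return {"id": short_id, "url": record.url, "links": visible}


registry = ShortIdRegistry()
gene = registry.register("http://example.org/gene/1", "lab_a", public=True)
pheno = registry.register("http://example.org/phenotype/9", "lab_b", public=False)
registry.link(gene, pheno)

anonymous_view = registry.lookup(gene, user_labs=set())
member_view = registry.lookup(gene, user_labs={"lab_b"})
```

In this sketch an anonymous caller sees the public record without its private link, while a member of the hypothetical `lab_b` sees the same record with the link included.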

By incorporating such security considerations, Semantic-JSON achieves a form of data access not implemented in conventional Semantic Web data tools. Researchers can thus access both public and private original data on SciNetS under data access control, and use Semantic-JSON to traverse individual virtual labs, obtaining relationships not only for public data but for private data as well. Simply by selecting a single data item, a user can access related public and (depending on their privileges) private data from different data constellations, enabling deeper integration of widely-dispersed data resources.
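As a rough illustration of traversal under access control, the following Python sketch walks outward from a single selected item and collects only the related records the user is permitted to read. The record identifiers, lab names and data layout are invented for this example and do not reflect the actual SciNetS data model or the Semantic-JSON command set.

```python
from collections import deque

# Hypothetical data: record id -> (owning lab, is_public, linked record ids)
RECORDS = {
    "gene:1": ("lab_a", True, ["pheno:9", "prot:4"]),
    "pheno:9": ("lab_b", False, ["strain:2"]),
    "prot:4": ("lab_c", True, []),
    "strain:2": ("lab_b", True, []),
}


def readable(record_id, user_labs):
    """A record is readable if it is public or owned by one of the user's labs."""
    lab, is_public, _ = RECORDS[record_id]
    return is_public or lab in user_labs


def related_records(start, user_labs):
    """Breadth-first traversal that never passes through unreadable records."""
    if start not in RECORDS or not readable(start, user_labs):
        return []
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        current = queue.popleft()
        order.append(current)
        for neighbour in RECORDS[current][2]:
            if (
                neighbour in RECORDS
                and neighbour not in seen
                and readable(neighbour, user_labs)
            ):
                seen.add(neighbour)
                queue.append(neighbour)
    return order


public_only = related_records("gene:1", user_labs=set())
with_lab_b = related_records("gene:1", user_labs={"lab_b"})
```

One design choice in this sketch is that a public record reachable only through a private one stays hidden from users without the privilege, since the traversal does not pass through records the user cannot read. Whether the real service behaves this way is not stated in the source.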

RIKEN BASE has already applied Semantic-JSON to the implementation of a tool that allows users to create programs on their web browsers by accessing SciNetS data. This tool was successfully employed in 2010 by contestants in GenoCon, the first International Rational-Genome-Design Contest, for designing Arabidopsis plant genome sequences using data managed on SciNetS.

Future applications

Since its foundation in 2008, research at RIKEN BASE has focused on the development, through SciNetS, of an infrastructure for enabling collaboration between researchers (virtual laboratory centre). Internationally, BASE has played a key role in the release of data in Japan for an international collaboration on Arabidopsis and mouse phenotypes. In Japan, BASE is one of the core institutions supporting activities of the Japan Science and Technology Agency (JST) Bio-sciences Database Centre. In each of these roles, the interface for data interchange is of key importance. By enabling this interchange for data published from virtual laboratories on SciNetS, Semantic-JSON achieves a major milestone, opening the door to data sharing via a variety of different devices such as mobile phones and PCs.

Through the use of SciNetS and Semantic-JSON, RIKEN aims to broaden the application of research results to society, developing the life-sciences information infrastructure necessary to accelerate data schematisation research both in Japan and across the world.


  • Norio Kobayashi, Manabu Ishii, Satoshi Takahashi, Yoshiki Mochizuki, Akihiro Matsushima and Tetsuro Toyoda. "Semantic-JSON: a lightweight web service interface for Semantic Web contents integrating multiple life-science databases." Nucleic Acids Research, 2011, doi:10.1093/nar/GKR353


Tetsuro Toyoda, Director
Bioinformatics And Systems Engineering (BASE) division
RIKEN Yokohama Institute, RIKEN
Tel: +81-(0)45-503-9111 / Fax: +81-(0)45-503-9533

Jens Wilkinson
RIKEN Global Relations and Research Coordination Office
Tel: +81-(0)48-462-1225 / Fax: +81-(0)48-463-3687

Figure 1: Virtual laboratory cloud centre: SciNetS

The SciNetS cloud service provides virtual laboratories that undertake advanced research activities by collaboration among scientists on the Web, achieving systematic sharing of life-science data resources obtained utilising the latest bioinformatics technologies.

Figure 2: Integrated databases on RIKEN SciNetS

Pink circles represent individual "virtual laboratory" projects. Yellow squares and green circles denote, respectively, research centres at RIKEN and organisations outside RIKEN. Blue lines show links between data, with line thickness proportional to the number of links. Red lines show relationships between organisations that produced the data, and green lines show comprehensive collaboration within RIKEN.

Figure 3: The Semantic-JSON concept

Semantic-JSON extends the concept of short URL services to the Semantic Web. It also provides functions for data access control, data search and inference, and access to biomedical raw data such as DNA sequences.