RIKEN Center for Advanced Intelligence Project
Succinct Information Processing Unit
Unit Leader: Yasuo Tabei (D.Sc.)
Research Summary

Massive datasets, so-called big data, are ubiquitous in research and industry. Data mining researchers and practitioners must process and analyze such datasets for knowledge discovery in various fields. However, coping with big data is challenging because of the high computational cost. One important approach to this bottleneck in the big data era is to (i) build indexes from datasets as a preprocessing step, using space-efficient data structures, and (ii) process the datasets on those indexes.
A succinct data structure (SDS) is a space-efficient representation of a data structure that still supports fast operations directly on the representation. Recently, various types of SDSs have been proposed for compactly representing and indexing strings, trees, graphs, sets of integers, and so on. We study the foundations of SDSs and their applications to artificial intelligence and knowledge discovery for scalable information processing.
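As a rough illustration of the "build an index, then query the index" idea, the sketch below implements a rank query on a bit vector, a basic building block of many SDSs. It is a minimal teaching example and not the unit's own implementation: the class name `RankBitVector`, the `block_size` parameter, and the use of a plain Python list are all assumptions made for readability. A practical SDS would pack bits into machine words and keep the auxiliary counts within o(n) extra bits.

```python
class RankBitVector:
    """Bit vector with sampled prefix counts for fast rank queries.

    Illustrative only: bits are stored in a Python list, which is not
    space-efficient. Real succinct structures pack bits into words.
    """

    def __init__(self, bits, block_size=64):
        self.bits = list(bits)
        self.block_size = block_size
        # Preprocessing step: block_ranks[k] holds the number of 1s
        # in bits[0 : k * block_size].
        self.block_ranks = [0]
        for start in range(0, len(self.bits), block_size):
            ones = sum(self.bits[start:start + block_size])
            self.block_ranks.append(self.block_ranks[-1] + ones)

    def rank1(self, i):
        """Return the number of 1s in bits[0:i]."""
        if not 0 <= i <= len(self.bits):
            raise IndexError("position out of range")
        block = i // self.block_size
        start = block * self.block_size
        # Stored count up to the block boundary, plus a scan of at most
        # block_size - 1 bits inside the block.
        return self.block_ranks[block] + sum(self.bits[start:i])


bv = RankBitVector([1, 0, 1, 1, 0, 0, 1, 0], block_size=4)
print(bv.rank1(5))  # number of 1s among the first five bits
```

Each query reads one stored count and scans at most one block, so its cost depends on `block_size` rather than on the length of the whole vector.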
Main Research Fields
- Computer Science
Related Research Fields
- Biology & Biochemistry
- Pharmacology & Toxicology
Research Subjects
- Data compression
- Data mining
- Artificial intelligence
Selected Publications
Papers with an asterisk (*) are based on research conducted outside of RIKEN.
- 1.*Tabei, Y., Saigo, H., Yamanishi, Y., Puglisi, S. J.:
"Scalable partial least squares regression on grammar compressed data matrices"
In Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), (2016)
- 2.*Tabei, Y., Yamanishi, Y., Kotera, M.:
"Simultaneous prediction of enzymatic orthologs from chemical transformation patterns for de novo metabolic pathway reconstructions"
In Proceedings of the 23rd International Conference on Intelligent Systems for Molecular Biology (ISMB), (2016)
- 3.*Belazzougui, D., Cording, P., Puglisi, S. J., Tabei, Y.:
"Access, rank, and select in grammar-compressed strings"
In Proceedings of the 23rd European Symposium on Algorithms (ESA), (2015)
- 4.*Maruyama, S., Tabei, Y.:
"Fully-online grammar compression in constant space"
In Proceedings of the 24th Data Compression Conference (DCC), (2014)
- 5.*Tabei, Y., Kishimoto, A., Kotera, M., Yamanishi, Y.:
"Succinct interval-splitting tree for scalable similarity search of compound-protein pairs with property constraints"
In Proceedings of the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), (2013)
- 6.*Tabei, Y., Takabatake, Y., Sakamoto, H.:
"A succinct grammar compression"
In Proceedings of the 24th Annual Symposium on Combinatorial Pattern Matching (CPM), (2013)
- 7.*Maruyama, S., Tabei, Y., Sakamoto, H., Sadakane, K.:
"Fully-online grammar compression"
In Proceedings of the 20th International Symposium on String Processing and Information Retrieval (SPIRE), (2013)
- 8.*Tabei, Y., Pauwels, E., Stoven, V., Takemoto, K., Yamanishi, Y.:
"Identification of chemogenomic features from drug-target interaction networks using interpretable classifiers"
In Proceedings of the 11th European Conference on Computational Biology (ECCB), (2012)
- 9.*Tabei, Y.:
"Succinct Multibit Tree: Compact representation of multibit trees by using succinct data structures in chemical fingerprint searches"
In Proceedings of the 12th Workshop on Algorithms in Bioinformatics (WABI), (2012)
- 10.*Tabei, Y., Tsuda, K.:
"Kernel-based similarity search in massive graph databases with wavelet trees"
In Proceedings of the 11th SIAM International Conference on Data Mining (SDM), (2011)
Lab Members
Principal investigator
- Yasuo Tabei, Unit Leader
Core members
- Yoshitaka Yamamoto, Visiting Scientist
- Masakazu Ishihata, Visiting Scientist
- Hiroto Saigo, Visiting Scientist
Contact Information
Nihonbashi 1-chome Mitsui Building, 15th floor,
1-4-1 Nihonbashi,
Chuo-ku, Tokyo
103-0027, Japan
Email: yasuo.tabei [at] riken.jp