Preliminary Evaluation of Fast Flash Memory Oriented Key Value Stores

Hirotaka Ogawa, Hidemoto Nakada, Akinobu Mita, Takahiro Hirofuchi, Ryousei Takano and Tomohiro Kudoh

National Institute of Advanced Industrial Science and Technology (AIST) / Fixstars Corporation

The practical need for efficient execution of large-scale data-intensive applications propels the research and development of Data-Intensive Scalable Computing (DISC) systems, which manage, process, and store massive data sets in a distributed manner. MapReduce is representative of such DISC systems. Meanwhile, the HPC community is gaining access to very fast SSDs (Solid State Drives) with Gbit/sec-class read/write performance. When such fast storage devices are coupled with MapReduce systems, however, much of their benefit can easily be lost to the software overhead of the MapReduce systems themselves. To resolve this problem, we aim to design and implement a novel DISC system called SSS, which can effectively exploit the I/O performance of clusters with Gbit/sec-class flash memories. In this paper, we first outline our prototype MapReduce system, which utilizes a distributed key-value store. We then perform an extensive benchmark to evaluate existing open-source implementations of key-value stores.

MapReduce 1) is a representative DISC system and runs on top of a distributed file system, the Google File System (GFS) 2). On the storage side, NAND-flash SSDs (Solid State Drives) are becoming available to the HPC community: PCI-Express SSDs such as the Fusion-io ioDrive 3) and ioDrive Duo offer Gbit/sec-class read/write performance.

© Information Processing Society of Japan
A MapReduce computation consists of three phases: Map, Shuffle & Sort, and Reduce.

  Map:            (in_key, in_value) -> list of (out_key, int_val)
  Shuffle & Sort: groups the intermediate values by out_key into (out_key, list(int_val))
  Reduce:         (out_key, list(int_val)) -> (out_key, out_value)

[Figure: MapReduce Execution. Input Splits are processed by Map tasks on the servers, the intermediate data passes through Shuffle & Sort, and Reduce tasks write the output Parts.]

The input and output key-value data reside in a distributed file system: the Google File System (GFS) 2) in Google's implementation and HDFS (Hadoop Distributed File System) 4) in Hadoop. The input is divided into M pieces processed by M Map tasks, and the intermediate data is divided into R pieces processed by R Reduce tasks. Google's MapReduce 1) assigns these Map/Reduce tasks to worker machines. In Hadoop MapReduce 5), the open-source counterpart of Google's implementation, a JobTracker hands the tasks to TaskTrackers.
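The three phases and their key-value signatures can be sketched in a few lines of single-process Python. This is a minimal illustration of the model only, not the Google, Hadoop, or SSS implementation. The function names (`run_mapreduce`, `index_map`, `index_reduce`) and the inverted-index example are assumptions introduced here.

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn):
    """Run Map, Shuffle & Sort, and Reduce in one process (illustrative only)."""
    # Map: (in_key, in_value) -> zero or more (out_key, int_val) pairs
    intermediate = []
    for in_key, in_value in records:
        intermediate.extend(map_fn(in_key, in_value))

    # Shuffle & Sort: group the int_vals by out_key
    groups = defaultdict(list)
    for out_key, int_val in intermediate:
        groups[out_key].append(int_val)

    # Reduce: (out_key, list(int_val)) -> (out_key, out_value), keys in sorted order
    return [(out_key, reduce_fn(out_key, groups[out_key]))
            for out_key in sorted(groups)]


# Hypothetical example job: build an inverted index (word -> document ids).
def index_map(doc_id, text):
    for word in set(text.split()):
        yield word, doc_id


def index_reduce(word, doc_ids):
    return sorted(doc_ids)


docs = [("d1", "flash memory store"), ("d2", "key value store")]
index = run_mapreduce(docs, index_map, index_reduce)
```

In a real system the Map and Reduce calls run on different servers and the grouping step moves data across the network. The sketch keeps everything in memory to show only the data flow.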
In Hadoop, each worker node runs both a DataNode and a TaskTracker, so the storage layer (GFS/HDFS) and the Map tasks share the same machines.

SSS builds MapReduce on a distributed key-value store (KVS) in the style of memcached instead of on GFS/HDFS. Data is held as 3-tuples (distribution-key, local-key, value). The distribution-key selects the KVS server that stores the tuple, and the local-key identifies the tuple among those that share the same distribution-key.

[Figure: A Client issues set <dist-key, local-key, val> and get <dist-key, local-key> requests, which are routed to one of the KVS servers according to dist-key.]

Map and Reduce tasks are started per distribution-key, so each task processes the tuples that share one distribution-key.
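The (distribution-key, local-key, value) model can be sketched as a toy in-process store. This is a sketch under stated assumptions, not the SSS code: the class name `TripleStore`, the `scan` method, and the CRC32-modulo placement rule are all hypothetical. The text only establishes that tuples are located through their distribution-key.

```python
import zlib


class TripleStore:
    """Toy stand-in for a cluster of KVS servers holding 3-tuples."""

    def __init__(self, num_servers):
        # One dict per simulated server: {(dist_key, local_key): value}
        self.servers = [dict() for _ in range(num_servers)]

    def server_for(self, dist_key):
        # Assumption: the server is chosen by hashing the distribution-key.
        return zlib.crc32(dist_key.encode("utf-8")) % len(self.servers)

    def set(self, dist_key, local_key, value):
        self.servers[self.server_for(dist_key)][(dist_key, local_key)] = value

    def get(self, dist_key, local_key):
        # Returns None when the tuple does not exist.
        return self.servers[self.server_for(dist_key)].get((dist_key, local_key))

    def scan(self, dist_key):
        """Return all (local_key, value) pairs under one distribution-key."""
        server = self.servers[self.server_for(dist_key)]
        return sorted((lk, v) for (dk, lk), v in server.items() if dk == dist_key)


store = TripleStore(num_servers=3)
store.set("Hello", "a", 1)
store.set("Hello", "b", 1)
store.set("World", "a", 1)
```

Because placement depends only on the distribution-key, every tuple under "Hello" ends up on the same simulated server, and a task working on that key can read them all locally with `scan("Hello")`.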
[Figure: MapReduce on the KVS. The Client submits a job to the Master. Map tasks are started for the input dist-keys and write intermediate tuples under int-keys, Shuffle & Sort takes place through the KVS, and Reduce tasks are started for the int-keys.]

A Map task reads the tuples of one input distribution-key and writes intermediate tuples whose distribution-key is the intermediate key. A Reduce task is then started for each intermediate distribution-key. Because tuples with the same distribution-key are stored together, writing the Map output to the KVS takes the place of a separate Shuffle step.

Example: word count. The input consists of two tuples, one per input file. The distribution-key of each tuple is the file name and the value is the file content: "Hello World Bye World" for the first file and "Hello SSS Goodbye SSS" for the second. The list of input distribution-keys is the pair of file names.

A Map task is started for each input distribution-key. For every word in the value it writes a tuple <word, local-key, 1>, where the local-key is built from the Map ID and the position of the word. The Map task for the first file writes tuples for Hello, World, Bye, and World, giving the distribution-key list <Hello, World, Bye, World>. The Map task for the second file writes tuples for Hello, SSS, Goodbye, and SSS, giving <Hello, SSS, Goodbye, SSS>. After both Map tasks finish, the set of intermediate distribution-keys is <Hello, World, Bye, SSS, Goodbye>.

A Reduce task is started for each intermediate distribution-key. The Reduce task for Hello, for example, reads the two <Hello, local-key, 1> tuples written by the Map tasks for the two files.
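The Reduce task for the distribution-key Hello writes the tuple <Hello, count, 2>, and its output distribution-key list is <Hello>. When all Reduce tasks have finished, the output consists of <Hello, count, 2>, <World, count, 2>, <Bye, count, 1>, <SSS, count, 2>, and <Goodbye, count, 1>, with the output distribution-key list <Hello, World, Bye, SSS, Goodbye>.

In this design, Shuffle & Sort is realized by writing the Map output under the intermediate distribution-key. In Google's and Hadoop's implementations, by contrast, intermediate data moves from the Map workers (TaskTrackers) to the Reduce workers (TaskTrackers) outside the GFS/HDFS storage layer. Keeping all data in a KVS backed by ioDrive 3) lets both the Map and the Reduce side locate their data by distribution-key.

The target environment couples 10Gb Ethernet (Myrinet Myri-10G) with Fusion-io ioDrive. As a preliminary evaluation, we benchmarked three existing key-value store implementations on it: MemcacheDB 6), Tokyo Tyrant 7), and Chunkd from the Hail Cloud Computing Project 8). Each benchmark node is equipped with a 10GbE NIC and an ioDrive.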
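Returning to the word-count walkthrough, the flow of tuples can be reproduced with a small Python sketch. A nested dictionary stands in for the distributed KVS. The input key names ("file1", "file2", "text") and the "map{id}-word{pos}" local-key format are assumptions made for illustration. Only the two input strings and the resulting counts are taken from the example.

```python
from collections import defaultdict


def new_store():
    # store[distribution_key][local_key] = value
    return defaultdict(dict)


def word_count_map(map_id, text, out):
    # One tuple per word occurrence: <word, local-key, 1>.
    # The local-key format is hypothetical; it only has to be unique per occurrence.
    for pos, word in enumerate(text.split()):
        out[word][f"map{map_id}-word{pos}"] = 1


def word_count_reduce(word, tuples, out):
    # Sum the 1s stored under one intermediate distribution-key.
    out[word]["count"] = sum(tuples.values())


inputs = new_store()
inputs["file1"]["text"] = "Hello World Bye World"  # key names are illustrative
inputs["file2"]["text"] = "Hello SSS Goodbye SSS"

# Map: one task per input distribution-key
intermediate = new_store()
for map_id, dist_key in enumerate(inputs, start=1):
    for text in inputs[dist_key].values():
        word_count_map(map_id, text, intermediate)

# Reduce: one task per intermediate distribution-key
results = new_store()
for word in intermediate:
    word_count_reduce(word, intermediate[word], results)

counts = {word: results[word]["count"] for word in results}
```

No explicit shuffle step appears in the sketch: grouping by word happens as a side effect of writing each Map output tuple under its distribution-key.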
Table: Benchmarking environment

  CPU         Intel Xeon E53 @.66GHz x
  Memory      GB (DDR-667)
  10GbE NIC   Myricom Myri-10G
  ioDrive     Fusion-io ioDrive 6GB

Table: ioDrive 6GB catalog spec

  NAND Type         Single Level Cell (SLC)
  Write Bandwidth   67MB/s (3K packet size)
  Read Bandwidth    75MB/s (3K packet size)
  IOPS              6,6 (K read packet size); 93,99 (75/5 r/w mix, K packet size)
  Access Latency    6 µs (Read)
  Bus Interface     PCI-Express x

The nodes run CentOS 5 with the Myri-10G and ioDrive drivers installed.

Key-value store implementations. We evaluated MemcacheDB, Tokyo Tyrant, and Chunkd. MemcacheDB uses BerkeleyDB as its storage backend, and Tokyo Tyrant uses Tokyo Cabinet. The versions used are:

  MemcacheDB     memcachedb: svn revision 9, db-..6, libevent-..3-stable
  Tokyo Tyrant   tokyotyrant-..39, tokyocabinet-..
  Chunkd         cld: fcfcc53c6937e9fb7dbbfe33a63, chunkd: 6f6dcfc5b7cd9b33ed99b99

Benchmark method. Each benchmark issues set and get requests with a fixed Key size over a range of Value sizes. The number of requests is chosen according to the Value size.

Results. The results are reported per Value size, separately for set and get, and compare Tokyo Tyrant, MemcacheDB, and Chunkd.
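The measurement behind a throughput figure in MB/sec can be illustrated as follows. This is a hedged sketch, not the benchmark program used in the paper. A plain Python dict stands in for a KVS client. The helper names, the 1 MB = 10^6 bytes convention, and the rule that the request count shrinks as the Value grows (so that the total payload stays roughly constant) are all assumptions.

```python
import time


def throughput_mb_per_sec(value_size, count, elapsed_sec):
    """Payload throughput in MB/sec, assuming 1 MB = 10**6 bytes."""
    return value_size * count / elapsed_sec / 1e6


def count_for(total_bytes, value_size):
    # Assumption: fewer requests for larger Values, same total payload.
    return max(1, total_bytes // value_size)


def bench_set_get(store, value_size, count):
    """Time `count` sets followed by `count` gets against a dict-like store."""
    value = b"x" * value_size
    keys = [f"key{i:08d}" for i in range(count)]

    t0 = time.perf_counter()
    for k in keys:
        store[k] = value          # set
    t1 = time.perf_counter()
    for k in keys:
        _ = store[k]              # get
    t2 = time.perf_counter()

    # Guard against a zero interval on very coarse timers.
    set_tp = throughput_mb_per_sec(value_size, count, max(t1 - t0, 1e-9))
    get_tp = throughput_mb_per_sec(value_size, count, max(t2 - t1, 1e-9))
    return set_tp, get_tp


set_tp, get_tp = bench_set_get({}, value_size=100, count=count_for(100_000, 100))
```

To benchmark a real server, `store` would be replaced by a client object for that server. The arithmetic that turns Value size, request count, and elapsed time into MB/sec stays the same.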
[Figure: Get Throughput of Tokyo Tyrant. Vertical axis: Throughput [MB/sec]. Series: tokyotyrant-get for several Value sizes.]

Related work includes gridool 9), an infrastructure for parallel job execution that provides MapReduce.

5. This paper outlined a prototype MapReduce system built on a distributed key-value store and aimed at clusters with Gbit/sec-class flash memories, and benchmarked existing open-source key-value store implementations.

Acknowledgments: This work was supported by NEDO.

References

1) Dean, J. and Ghemawat, S.: MapReduce: simplified data processing on large clusters, Communications of the ACM, Vol.51, No.1, pp.107-113 (2008).
2) Ghemawat, S., Gobioff, H. and Leung, S.-T.: The Google file system, SOSP '03: Proceedings of the nineteenth ACM symposium on Operating systems principles, New York, NY, USA, ACM, pp.29-43 (2003).
3) Fusion-io: Fusion-io :: Products, http://www.fusionio.com/products.aspx.
4) Borthakur, D.: HDFS Architecture, http://hadoop.apache.org/core/docs/current/hdfs_design.html.
5) Apache Hadoop Project: Hadoop, http://hadoop.apache.org/.
6) Chu, S.: MemcacheDB, http://memcachedb.org/.
7) Hirabayashi, M.: Tokyo Tyrant: network interface of Tokyo Cabinet, http://1978th.net/tokyotyrant/.
8) Garzik, J.: Hail Cloud Computing Wiki, http://hail.wiki.kernel.org/index.php/Main_Page.
9) Yui, M.: gridool: An Infrastructure of Parallel Job Execution on Grid, http://code.google.com/p/gridool/.

[Figure: Benchmark result for set operations at the smallest Value size. Series: tokyotyrant-set, memcachedb-set, chunkd-set.]
[Figures: Benchmark results for set operations, one panel per combination of Value size (bytes) and number of sets. Series in each panel: tokyotyrant-set, memcachedb-set, chunkd-set.]
[Figures: Benchmark results for set operations at the larger Value sizes (series: tokyotyrant-set, memcachedb-set, chunkd-set), followed by benchmark results for get operations at the smaller Value sizes (series: tokyotyrant-get, memcachedb-get, chunkd-get).]
[Figures: Benchmark results for get operations, one panel per combination of Value size (bytes) and number of gets. Series in each panel: tokyotyrant-get, memcachedb-get, chunkd-get.]