MapTask 678 Map 関数バッファ管理モジュールリングバッファ 45#$% *+,-./ 0123!"#$% &'() 外部記憶装置 1 MapReduce IFIle IFIle MapReduce 25% MapReduce 2 MapReduce OS

DEIM Forum 2014 D1-3 MapReduce 180 8585 3 9 11 E-mail: {ozawa.tsuyoshi,oikawa.kazuki,onizuka.makoto,honjo.toshimori}@lab.ntt.co.jp MapReduce 1 Google, Facebook, Yahoo! MapReduce MapReduce MapReduce MapReduce 1. MapReduce, Hadoop, MapReduce 1 Google, Facebook, Yahoo! [2], [6], [16], [18] MapReduce Map Map Reduce [6] MapReduce ETL RDBMS ( [4], [13]) MapReduce 2 1 Mapper 1 Reducer Mapper Map MapTask Reducer Reduce ReduceTask MapReduce Reducer 1 WordCount MapTask 1 Reducer Reducer 2 1 Map- Task Reducer IO MapReduce 1 Hadoop [1] gzip bzip2 LZO LZ4 Snappy 2 MapReduce Reducer [12] : 1 Map inmapper combining 2 stripes 3 Map memory-backed join Map MapTask Map MapReduce [12] 1 Hadoop [17] 2 MapReduce MapReduce

MapTask 678 9:;<<=>?>@;A> Map 関数バッファ管理モジュールリングバッファ 45#$% *+,-./ 0123!"#$% &'() 外部記憶装置 1 MapReduce IFIle IFIle MapReduce 25% MapReduce 2 MapReduce OSS Hadoop 3 4 5 6 2. 3 2 Hadoop IFile Key1の長さ (1-5バイト) (Key1, Value1) Value1の長さ (1-5バイト) (Key2, Value2) (Key2, Value3) Key1 のバイト列 (Key3, Value4) Value1 のバイト列 Hadoop Shuffle (IFile) 2. 1 MapReduce 1 MapReduce MapReduce Map Reduce 2 Map D Key/Value (K 1, V 1 ) Map Key/Value (K 2, V 2 ) Reduce Key Shuffle Reduce Key/Value (K 3, V 3 ) D map(k 1, V 1 ) {K 2, V 2 } shuffle({k 2, V 2 }) {K 2, {V 2 }} reduce(k 2, {V 2 }) (K 3, V 3 ) MapReduce DFS MapReduce InputSplit InputSplit MapTask MapTask Mapper Map Key/Value Key shuffle ReduceTask Key ReduceTask Reducer Reduce Key/Value DFS 2. 2 Hadoop Shuffle MapReduce MapTask MapReduce Hadoop 2

Hadoop Map Key Value Key Value Key Value IFile IFile 3 IFile Key Key Value 1 Key Value Key Value MapReduce IFile Writer/Reader Key Value MapReduce [17] 2. 3 MapReduce Task MapTask ReduceTask 2 MapTask MapTask ReduceTask MapTask ReduceTask MapTask MapTask/ReduceTask Hadoop Shuffle MapTask/ReduceTask MapTask [11] Hadoop MapTask 3. 3. 1 OLAP C-Store MonetDB [14], [17] MapTask Map 関数型 1 用のバッファ (Key, Value) Keyの一部バッファ管理モジュール列指向バッファ Valueの一部 Valueの一部型 2 用のバッファ Keyの一部型 3 用のバッファ型 4 用のバッファ外部記憶装置 CIFile CIFile 4 numtypes:1バイト ( 型の個数 ) CIFile Type1:1バイト ( 含まれている型 1) ヘッダ型 1 用のバッファ Type2:1バイト ( 含まれている型 2) Type3:1バイト型 2 用のバッファ型 3 用のバッファ ( 含まれている型 3) lentype1: 4バイト ( 型 1のバッファ長 ) lentype2: 4バイト ( 型 2のバッファ長 ) lentype2: 4バイト ( 型 3のバッファ長 ) 5 (CIFile) [22] 4 MapTask CIFile(Columnar IFile) 3. 2 CIFile CIFile 5 CIFile CIFile 1 numtypes 1

numtypes 4 numtypes CIFile Key Value 4. CIFile IFile Key Value MapTask MapTask [12] 2 Key Value MapTask TextInputFromat Key Value Raw An apple is red An apple apple is is red 1 An apple is, apple is red Mapper Key Value 1 Reducer Reducer MapReduce Hadoop IFile.Writer CIFile DataOutput Amazon EC2 m2.4xlarge 4. 1 IFile CIFile bzip2 Snappy bzip2 Hadoop Bzip2Codec Snappy snappy-java [15] Brisk [5] SnappyCodec PUMA Benchmark Suite [7] Wikipedia 50GB (file87) 8355840 28 1 IFile CIFile IFile Key Value CIFile CIFile Snappy 25% 4. 2 CIFile CIFile PUMA Benchmark Suite Wikipedia 50GB (file87) 2 CIFile MapTask Raw p4delta [23] 4. 3 CIFile Map- Task IFile CIFile Snappy PUMA Benchmark Suite Wikipedia 50GB 8K IO IFile CIFile 3 CIFile IFile 21% IO 33% IO MapReduce 5. Spark [21] MapReduce (DAG) DSL Spark RDD [20] DSL Shark [19] Spark SQL DB Spark Shark RDD Spark/Shark CIF [8] [10] IO IO MapReduce CIF

1 IFile CIFile ( ) bzip2 ( ) Snappy ( ) IFile Raw 9206558 1361111 3554898 CIFile Raw 9025188(2% ) 1214726(11% ) 3283301(8% ) IFile 18806076 1819063 6095471 CIFile 17923310(5% ) 1633830(11% ) 4591160(25% ) 2 CIFile ( ) bzip2 ( ) Snappy ( ) Raw 99917 32749((67% ) 54374(45% ) Raw 8274496 915891(88% ) 2850701(65% ) 442138 173477(60% ) 296318(32% ) 13957109 1448904(89% ) 4128572(70% ) 3 IFile CIFile ( ) Snappy ( ) IFile 4307 36799837380 CIFile 2880(33% ) 29365516361(21% ) co-location RCFile [9] ORCFile [3] MapReduce MapReduce MapReduce blob 2 1:1 CIF RCFile ORCFile CIFile CIF RCFile ORCFile CIF RCFile ORCFile Shuffle CIFile MapReduce 6. MapReduce Shuffle CIFile Hadoop MapReduce Hadoop Hive [1] : Apache Hadoop, http://hadoop.apache.org/. [2] : Apache Hadoop Wiki, http://wiki.apache.org/hadoop/ PoweredBy. [3] : Create a new Optimized Row Columnar file format for Hive, https://issues.apache.org/jira/browse/ HIVE-3874 (2013). [4] : Treasure Data s Plazma: Columnar Cloud Storage http://blog.treasure-data.com/post/53534943282/ treasure-datas-plazma-columnar-cloud-storage (2013). [5] DataStax: Brisk, https://github.com/riptano/brisk. [6] Dean, J. and Ghemawat, S.: MapReduce: simplified data processing on large clusters, Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6, OSDI 04, Berkeley, CA, USA, USENIX Association, pp. 10 10 (2004). [7] Faraz Ahmad, Seyong Lee, M. T.: PUMA: Purdue MapReduce Benchmarks Suite. [8] Floratou, A., Patel, J. M., Shekita, E. J. and Tata, S.: Column-oriented Storage Techniques for MapReduce, Proc. VLDB Endow., Vol. 4, No. 7, pp. 419 429 (2011). [9] He, Y., Lee, R., Huai, Y., Shao, Z., Jain, N., Zhang, X. and Xu, Z.: RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems, Proceedings of the 2011 IEEE 27th International Conference on Data Engineering, ICDE 11, Washington, DC, USA, IEEE Computer Society, pp. 1199 1208 (2011). [10] Kaldewey, T., Shekita, E. J. and Tata, S.: Clydesdale: Structured Data Processing on MapReduce, Proceedings of the 15th International Conference on Extending Database Technology, EDBT 12, New York, NY, USA, ACM, pp. 15 25 (2012). [11] Li, B., Mazur, E., Diao, Y., McGregor, A. and Shenoy, P.: A platform for scalable one-pass analytics using MapReduce, SIGMOD 11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, New York, NY, USA, ACM, pp. 985 996 (2011). [12] Lin, J. and Dyer, C.: Data-Intensive Text Processing with MapReduce, Morgan and Claypool Publishers (2010). [13] Ohta, K.: Hadoop meets Cloud with Multi-Tenancy, http: //www.slideshare.net/treasure-data/hadoop-meets-cloud-with-multitena (2013). [14] Peter Boncz, Marcin Zukowski, N. N.: MonetDB/X100: Hyper-Pipelining Query Execution, Conference on Innovative Data Systems Research 2005 (2005). [15] Saito, T. L.: snappy-java, https://code.google.com/p/

snappy-java/. [16] Silberstein, A. E., Sears, R., Zhou, W. and Cooper, B. F.: A batch of PNUTS: experiences connecting cloud batch and serving systems, Proceedings of the 2011 ACM SIG- MOD International Conference on Management of data, SIGMOD 11, New York, NY, USA, ACM, pp. 1101 1112 (2011). [17] Stonebraker, M., Abadi, D. J., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., Lau, E., Lin, A., Madden, S., O Neil, E., O Neil, P., Rasin, A., Tran, N. and Zdonik, S.: C-store: A Column-oriented DBMS, Proceedings of the 31st International Conference on Very Large Data Bases, VLDB 05, VLDB Endowment, pp. 553 564 (2005). [18] Thusoo, A., Shao, Z., Anthony, S., Borthakur, D., Jain, N., Sen Sarma, J., Murthy, R. and Liu, H.: Data warehousing and analytics infrastructure at facebook, Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, SIGMOD 10, New York, NY, USA, ACM, pp. 1013 1020 (2010). [19] Xin, R. S., Rosen, J., Zaharia, M., Franklin, M. J., Shenker, S. and Stoica, I.: Shark: SQL and Rich Analytics at Scale, Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, SIGMOD 13, New York, NY, USA, ACM, pp. 13 24 (2013). [20] Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M. J., Shenker, S. and Stoica, I.: Resilient Distributed Datasets: A Fault-tolerant Abstraction for In-memory Cluster Computing, Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, NSDI 12, Berkeley, CA, USA, USENIX Association, pp. 2 2 (2012). [21] Zaharia, M., Chowdhury, M., Franklin, M. J., Shenker, S. and Stoica, I.: Spark: Cluster Computing with Working Sets, Proceedings of the 2Nd USENIX Conference on Hot Topics in Cloud Computing, HotCloud 10, Berkeley, CA, USA, USENIX Association, pp. 10 10 (2010). [22] Zukowski, M.: Balancing Vectorized Query Execution with Bandwidth Optimized Storage, PhD thesis, Universiteit van Amsterdam (2009). [23] Zukowski, M., Heman, S., Nes, N. and Boncz, P.: Super- Scalar RAM-CPU Cache Compression, Proceedings of the 22Nd International Conference on Data Engineering, ICDE 06, Washington, DC, USA, IEEE Computer Society, pp. 59 (2006).

MapTask 678 Map 関数 バッファ管理モジュール リングバッファ 45#$% *+,-./ 0123!"#$% &'() 外部記憶装置 1 MapReduce IFIle IFIle MapReduce 25% MapReduce 2 MapReduce OS

MapTask 678 Map 関数バッファ管理モジュールリングバッファ 45#$% *+,-./ 0123!"#$% &'() 外部記憶装置 1 MapReduce IFIle IFIle MapReduce 25% MapReduce 2 MapReduce OS