PostgreSQL 9.3パーティションの効果検証

PostgreSQL 9.3 パーティションの効果検証テクノロジーコンサルティング事業統括オープンソース部高橋智雄 2014 年 7 月

変更履歴版日付作成修正者説明 1.0 2013/12/16 日本 HP 高橋智雄初版作成 1.1 2014/7/8 日本 HP 高橋智雄テンプレート等を修正 2

はじめに本書は PostgreSQL9.3 のパーティション表の検索に関する性能を評価したレポートです性能評価は同一の SQL 文を非パーティション表とパーティション表で実行しその処理時間の差異を比較することで評価していますパーティション表の効果としてメンテナンス性の向上に加え検索処理の性能が向上する点を検証しました 3

目次 1. 検証環境構成と事前準備 (1) インフラ構成 (2) PostgreSQL 環境 (3) 非パーティション表作成 (4) パーティション表作成 2. 検索処理時間測定 (1) 実行する検索処理 (2) 非パーティション表検索 ( デフォルト : インデックス使用 ) (3) 非パーティション表検索 (Seq Scan) (4) パーティション表検索 ( デフォルト : Seq Scan ) (5) パーティション表検索 ( インデックス使用 ) 3. まとめ 4

1. 検証環境構成と事前準備

1-1. インフラ構成ハードウェアサーバー HP ProLiant SL335s G7 CPU AMD Opteron 4176HE 2.4GHz 2P/12Core メモリ 32GB 内蔵 HDD 900GB 4 (RAID10により論理容量は1.8TB) NIC Gigabit Ethernet 仮想化環境仮想化ソフトウェア VMware ESXi 5.0u2 仮想マシンCPU 4core 仮想マシンメモリ 16GB 仮想マシンHDD 500GB 1 OS Red Hat Enterprise Linux 6.2 (x86_64) PostgreSQL PostgreSQL 9.3.0 64bit 6

1-2. PostgreSQL 9.3 環境 (1/2) PostgreSQL 動作環境 PostgreSQL 起動ユーザーインストールディレクトリデータディレクトリ WALディレクトリ ARCHIVEファイル配置ディレクトリ postgresユーザー PATH 環境変数 PGDATA 作業用ディレクトリ postgres /usr/local/pgsql /disk1/data /disk2/pg_xlog /disk3/archive /usr/local/pgsql/bin を追加 /disk1/data /work (postgres ユーザーで読み書き可能となっている前提 ) データディレクトリ WAL ディレクトリは実際には同一 LUN 上に作成 7

1-2. PostgreSQL 9.3 環境 (2/2) PostgreSQL の主なパラメータ [/disk1/data/postgresql.conf] listen_addresses = '*' shared_buffers = 4GB checkpoint_segments = 64 logging_collector = on 8

1-3. 非パーティション表作成 (1/5) テーブル作成 # su - postgres $ createdb testdb $ psql testdb 携帯電話の通話情報と料金を管理するテーブルのイメージ以下の SQL を実行 create table test_calllog( customer_id char(12) not null, phone_number char(11) not null, call_start_date char(8) not null, call_start_time char(6) not null, call_duration numeric(8,0) not null, call_charge numeric(8,0) not null ); 9

1-3. 非パーティション表作成 (2/5) データ作成用 SQL 文 [ /work/make_calllog.sql ] insert into test_calllog values ( trim(to_char(generate_series(:start,:end), '000000000000')), --customer_id trim(to_char(generate_series(:start,:end), '00000000000')), -- phone_number trim(to_char( current_date - round((random() * 365)::integer, 0)::integer, 'YYYYMMDD')), trim(to_char( current_timestamp, 'HH24MISS')), round((random() * 3600)::integer, 0), 30 ); customer_id: 連番 phone_number: 連番 call_start_date: 実行日から1 年前までのランダムな日付 call_start_time:current_timestampの時刻部分 call_duration:0から3600までのランダムな値 call_charge:30に固定 10

1-3. 非パーティション表作成 (3/5) SQL 文実行用シェルを利用して 10 億レコードのデータ作成 [ /work/make_calllog.sh ] #!/bin/sh i=1 TIMES=1000 UNIT=1000000 SID=1 EID=`expr ${SID} + ${UNIT} - 1` while [ ${i} -le ${TIMES} ] do psql testdb -f /work/make_calllog.sql -v start=${sid} -v end=${eid} i=`expr ${i} + 1` SID=`expr ${SID} + ${UNIT}` EID=`expr ${EID} + ${UNIT}` 11 done

1-3. 非パーティション表作成 (4/5) シェルの実行とプライマリキー検索用索引の作成 $ nohup /work/make_calllog.sh & この方式では insert 文を繰り返すので長時間かかるあらかじめ CSV を作成し COPY 文でロードする方が効率は良い $ nohup psql -c "alter table test_calllog add constraint test_calllog_pk primary key (customer_id, phone_number, call_start_date, call_start_time)" testdb & 実際は 1 行 $ nohup psql -c "create index test_calllog_idx1 on test_calllog ( call_start_date )" testdb & 検証環境ではプライマリーキー検索用索引の作成にそれぞれ約半日合計 1 日かかった 12

1-3. 非パーティション表作成 (5/5) データサイズの確認 $ psql testdb testdb=# testdb=# select pg_size_pretty(pg_total_relation_size('test_calllog')); pg_size_pretty ---------------- 175 GB (1 row) 13

1-4. パーティション表作成 (1/6) パーティション表の親テーブルを作成 # su - postgres $ psql testdb 非パーティション表と同じ構造以下の SQL を実行 create table part_test_calllog( customer_id char(12) not null, phone_number char(11) not null, call_start_date char(8) not null, call_start_time char(6) not null, call_duration numeric(8,0) not null, call_charge numeric(8,0) not null ); 14

1-4. パーティション表作成 (2/6) 子テーブルおよび制約インデックスの作成 # su - postgres $ psql testdb _YYYYMM を末尾に付加して月毎のパーティションを作成 create table part_test_calllog_201210 ( CHECK( call_start_date >= '20121001' and call_start_date < '20121101') ) INHERITS (part_test_calllog); alter table part_test_calllog_201210 add constraint part_test_calllog_201210_pk primary key (customer_id, phone_number, call_start_date, call_start_time); create index part_test_calllog_201210_idx on part_test_calllog_201210 (call_start_date); 後略 (201310 まで同様に作成 ) 15

1-4. パーティション表作成 (3/6) INSERT トリガー用関数の作成 CREATE OR REPLACE FUNCTION func_part_test_calllog_insert() RETURNS TRIGGER AS $$ BEGIN IF ( NEW.call_start_date >= '20121001' AND NEW.call_start_date < '20121101' ) THEN INSERT INTO part_test_calllog_201210 VALUES (NEW.*); ELSIF ( NEW.call_start_date >= '20121101' AND NEW.call_start_date < '20121201' ) THEN INSERT INTO part_test_calllog_201211 VALUES (NEW.*); 中略 ELSIF ( NEW.call_start_date >= '20131001' AND NEW.call_start_date < '20131101' ) THEN INSERT INTO part_test_calllog_201310 VALUES (NEW.*); ELSE RAISE EXCEPTION 'ERROR! Date out of range.'; END IF; RETURN NULL; END; $$ LANGUAGE plpgsql; 中略 16

1-4. パーティション表作成 (4/6) INSERT トリガ作成 CREATE TRIGGER trig_part_test_calllog_insert BEFORE INSERT ON part_test_calllog FOR EACH ROW EXECUTE PROCEDURE func_part_test_calllog_insert(); 17

1-4. パーティション表作成 (5/6) 非パーティション表のデータを利用してパーティション表にデータをロード $ nohup psql -c "insert into part_test_calllog select * from test_calllog" testdb & 検証環境では約 25 時間を要した 18

1-4. パーティション表作成 (6/6) データサイズの確認 $ psql testdb testdb=# testdb=# select pg_size_pretty( pg_database_size('testdb') - pg_total_relation_size('test_calllog')); pg_size_pretty ---------------- 208 GB (1 row) testdb=# select pg_size_pretty(pg_total_relation_size('part_test_calllog_201310')); pg_size_pretty ---------------- 14 GB (1 row) パーティション表の合計は 208GB 検証で検索するパーティションは 14GB( 全体の約 7%) 19

2. 検索処理時間測定

2-1. 実行する検索処理 2013 年 10 月の通話回数合計通話時間合計通話料金をもとめる SQL を 3 回実行し平均時間を求めるそれぞれデフォルトの状態と実行計画を変更させた状態で測定を行う非パーティション表 select count(*), sum(call_duration), sum(call_charge) from test_calllog where call_start_date between '20131001' and '20131031' ; パーティション表 select count(*), sum(call_duration), sum(call_charge) from part_test_calllog where call_start_date between '20131001' and '20131031' ; 21

2-2. 非パーティション表検索 ( デフォルト : インデックス使用 ) 処理時間 ( 秒 ) 実行計画 22 1 回目 2 回目 3 回目平均 3,571 3,625 3,577 3,591 testdb=# explain analyze select count(*), sum(call_duration), sum(call_charge) from test_calllog where call_start_date between '20131001' and '20131031' ; QUERY PLAN ----------------------------------------------------------------------------------------------------- Aggregate (cost=14098819.93..14098819.94 rows=1 width=9) (actual time=3556514.991..3556514.992 rows=1 loops=1) -> Bitmap Heap Scan on test_calllog (cost=1750658.62..13586962.83 rows=68247614 width=9) (actual time=106438.201..3432933.369 rows=69095685 loops=1) Recheck Cond: ((call_start_date >= '20131001'::bpchar) AND (call_start_date <= '20131031'::bpchar)) Rows Removed by Index Recheck: 929558067 -> Bitmap Index Scan on test_calllog_idx1 (cost=0.00..1733596.72 rows=68247614 width=0) (actual time=106409.711..106409.711 rows=69095685 loops=1) Index Cond: ((call_start_date >= '20131001'::bpchar) AND (call_start_date <= '20131031'::bpchar)) Total runtime: 3556533.867 ms (7 rows) インデックスを使用する実行計画となっている

2-3. 非パーティション表検索 (Seq Scan)(1/2) 実行計画が Seq Scan になるようにパラメータを変更 testdb=# set enable_bitmapscan=off; SET testdb=# testdb=# set enable_indexscan=off; SET 23 testdb=# select name, setting from pg_settings where name like 'enable%'; name setting ----------------------+--------- enable_bitmapscan off enable_hashagg on enable_hashjoin on enable_indexonlyscan on enable_indexscan off enable_material on enable_mergejoin on enable_nestloop on enable_seqscan on enable_sort on enable_tidscan on (11 rows)

2-3. 非パーティション表検索 (Seq Scan)(2/2) 処理時間 ( 秒 ) 1 回目 2 回目 3 回目平均 810 815 813 813 実行計画 testdb=# explain analyze select count(*), sum(call_duration), sum(call_charge) from test_calllog where call_start_date between '20131001' and '20131031' ; QUERY PLAN ------------------------------------------------------------------------------------------------------ Aggregate (cost=26324435.58..26324435.59 rows=1 width=9) (actual time=858397.025..858397.025 rows=1 loops=1) -> Seq Scan on test_calllog (cost=0.00..25812578.48 rows=68247614 width=9) (actual time=49.807..791015.290 rows=69095685 loops=1) Filter: ((call_start_date >= '20131001'::bpchar) AND (call_start_date <= '20131031'::bpchar)) Rows Removed by Filter: 930904315 Total runtime: 858423.650 ms (5 rows) Seq Scan が選択されているデフォルト状態で選択された実行計画よりも 4 倍以上高速 24

2-4. パーティション表検索 ( デフォルト :Seq Scan) 処理時間 ( 秒 ) 1 回目 2 回目 3 回目平均 167 131 135 144 非パーティション表のデフォルト状態よりも 30 倍高速実行計画 25 testdb=# explain analyze select count(*), sum(call_duration), sum(call_charge) from part_test_calllog where call_start_date between '20131001' and '20131031' ; QUERY PLAN ------------------------------------------------------------------------------------------------------ Aggregate (cost=2266979.99..2266980.00 rows=1 width=9) (actual time=160513.650..160513.650 rows=1 loops=1) -> Append (cost=0.00..1748762.32 rows=69095689 width=9) (actual time=0.101..107414.956 rows=69095685 loops=1) -> Seq Scan on part_test_calllog (cost=0.00..0.00 rows=1 width=28) (actual time=0.002..0.002 rows=0 loops=1) Filter: ((call_start_date >= '20131001'::bpchar) AND (call_start_date <= '20131031'::bpchar)) -> Seq Scan on part_test_calllog_201310 (cost=0.00..1748762.32 rows=69095688 width=9) (actual time=0.096..92189.284 rows=69095685 loops=1) Filter: ((call_start_date >= '20131001'::bpchar) AND (call_start_date <= '20131031'::bpchar)) Total runtime: 160513.784 ms (7 rows) デフォルトで Seq Scan が選択されている

2-5. パーティション表検索 ( インデックス使用 )(1/2) インデックスを使用した実行計画になるようにパラメータを変更 testdb=# set enable_seqscan=off; SET testdb=# select name, setting from pg_settings where name like 'enable%'; name setting ----------------------+--------- enable_bitmapscan on enable_hashagg on enable_hashjoin on enable_indexonlyscan on enable_indexscan on enable_material on enable_mergejoin on enable_nestloop on enable_seqscan off enable_sort on enable_tidscan on (11 rows) 26

2-5. パーティション表検索 ( インデックス使用 )(2/2) 処理時間 ( 秒 ) 1 回目 2 回目 3 回目平均 230 234 234 233 Seq Scan よりも処理時間が長くなる実行計画 27 testdb=# explain analyze select count(*), sum(call_duration), sum(call_charge) from part_test_calllog where call_start_date between '20131001' and '20131031' ; QUERY PLAN ------------------------------------------------------------------------------------------------------ Aggregate (cost=10004283987.36..10004283987.37 rows=1 width=9) (actual time=262048.574..262048.574 rows=1 loops=1) -> Append (cost=10000000000.00..10003765769.69 rows=69095689 width=9) (actual time=81719.992..203146.731 rows=69095685 loops=1) -> Seq Scan on part_test_calllog (cost=10000000000.00..10000000000.00 rows=1 width=28) (actual time=0.003..0.003 rows=0 loops=1) Filter: ((call_start_date >= '20131001'::bpchar) AND (call_start_date <= '20131031'::bpchar)) -> Bitmap Heap Scan on part_test_calllog_201310 (cost=2017007.37..3765769.69 rows=69095688 width=9) (actual time=81719.988..186096.782 rows=69095685 loops=1) Recheck Cond: ((call_start_date >= '20131001'::bpchar) AND (call_start_date <= '20131031'::bpchar)) -> Bitmap Index Scan on part_test_calllog_201310_idx (cost=0.00..1999733.45 rows=69095688 width=0) (actual time=81717.462..81717.462 rows=69095685 loops=1) Index Cond: ((call_start_date >= '20131001'::bpchar) AND (call_start_date <= '20131031'::bpchar)) Total runtime: 262048.733 ms (9 rows) インデックスを使用した実行計画になっている

3. まとめ

3. まとめテーブル構造非パーティション表パーティション表アクセスパス ( 実行計画 ) インデックス使用 ( デフォルト ) 平均処理時間 ( 秒 ) 3,591 Seq Scan 813 Seq Scan 144 ( デフォルト ) インデックス使用 233 検索処理においてもパーティション表は有効データ量が非常に多いためにキャッシュに乗り切らないテーブルからある程度大量のレコードを検索するような処理の場合特に有効 29

付録

参考テーブル構造非パーティション表パーティション表別環境での測定結果も同様アクセスパス ( 実行計画 ) 別環境 1 における処理時間別環境 2 における処理時間インデックス使用 551 146 ( デフォルト ) Seq Scan 171 264 Seq Scan 47 58 ( デフォルト ) インデックス使用 52 64 31

参考 ( 続き ) 別環境 1 別環境 2 サーバー DL380p Gen8 DL980 G7 CPU タイプ Intel Xeon CPU E5-2690 (2.90GHz Cache 20MB) CPU( コア ) 数 2P16C 8P80C Intel Xeon CPU E7-4870 (2.40GHz Cache 30MB) メモリ 128GB (8GB PC3-12800x16) 2TB (16GB PC3L-10600x128) HDD 300GB 6Gbps SAS 15K rpm 2.5inch x 8 900GB 6Gbps SAS 10K rpm 2.5inch x 8 DB 領域同上 HP 3PAR StoreServ 7400 (900GB 6Gbps SAS 10K rpm 2.5inch HDD x20 で VRAID1) OS Red Hat Enterprise Linux 6.3 (x86-64) Red Hat Enterprise Linux 6.3 (x86-64) 32

Thank You