
Pedestrian Matching across Cameras using STHOG Features

Ryo Kawai,1 Yasushi Makihara1 and Yasushi Yagi1

In this paper, we propose a method for matching pedestrians across CCTV cameras for the purpose of cross-camera pedestrian tracking. Spatio-Temporal Histograms of Oriented Gradient (STHOG) are adopted as the pedestrian feature, because STHOG is a segmentation-free feature and is robust to hue and brightness differences across cameras. First, a STHOG sequence is extracted from the image sequence captured by each camera. Then, the distance between two STHOG sequences is calculated based on phase synchronization, and the gallery pedestrian with the minimum distance is identified. In the experiments, we used image sequences captured by CCTV cameras in an elementary school and by cameras in a university. We compared the proposed method with a previous segmentation-free method and confirmed the effectiveness of the proposed method.

1.
2011 1 10 STHOG

1 Osaka University
© 2011 Information Processing Society of Japan

2 3 STHOG 4 5

2.
2.1
1)-7) 3 8) 3
2.2
9)-13) Sarkar et al. 14)
2.3
Kobayashi and Otsu 15) CHLAC (Cubic Higher-order Local Auto-Correlation) 2 HLAC 3 Cai et al. 16) Covariance Descriptor Covariance Descriptor R, G, B 3 Covariance Descriptor 2 STHOG STHOG

3.
3.1
STHOG AdaBoost 17) Wang and Yagi 18)-20)

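3.2
3.2.1
Consider pedestrian $i_1$ observed by camera $c_1$ and pedestrian $i_2$ observed by camera $c_2$, each represented by a STHOG sequence. For camera $c$, pedestrian $i$ and frame $j$, the STHOG feature is written $a^{c}_{i,j}$, the number of frames is $J^{c}_{i}$, and the whole sequence is

$$A^{c}_{i} = \{a^{c}_{i,0}, \ldots, a^{c}_{i,J^{c}_{i}-1}\}.$$

The subsequence of $p$ frames from $a^{c}_{i,j}$ to $a^{c}_{i,j+p-1}$ is written $A^{c}_{i,j}$. The distance between two subsequences is

$$D(A^{c_1}_{i_1,j_1}, A^{c_2}_{i_2,j_2}) = \sum_{t=0}^{p-1} \left\| a^{c_1}_{i_1,j_1+t} - a^{c_2}_{i_2,j_2+t} \right\| \tag{1}$$

The first sequence is divided every $p$ frames into $k = \lfloor J^{c_1}_{i_1}/p \rfloor$ subsequences $A^{c_1}_{i_1,0}, A^{c_1}_{i_1,p}, \ldots, A^{c_1}_{i_1,(k-1)p}$. For each $A^{c_1}_{i_1,lp}$ $(0 \le l < k)$,

$$m(A^{c_1}_{i_1,lp}, A^{c_2}_{i_2}) = \min_{0 \le s \le J^{c_2}_{i_2}-p} D(A^{c_1}_{i_1,lp}, A^{c_2}_{i_2,s}) \tag{2}$$

that is, Eq. (2) searches the $p$-frame subsequences of pedestrian $i_2$ for the one closest to $A^{c_1}_{i_1,lp}$. The minimum over all subsequences of the first sequence,

$$m(A^{c_1}_{i_1}, A^{c_2}_{i_2}) = \min_{0 \le l < k} m(A^{c_1}_{i_1,lp}, A^{c_2}_{i_2}) \tag{3}$$

is used as the distance $m(A^{c_1}_{i_1}, A^{c_2}_{i_2})$ between the two sequences.

3.2.2 STHOG
STHOG (Spatio-Temporal Histograms of Oriented Gradient) 17) extends HOG 21). HOG uses the spatial gradient magnitude and orientation

$$|\nabla I(x, y)| = \sqrt{I_x^2 + I_y^2}, \qquad \phi(x, y) = \tan^{-1}(I_y / I_x),$$

whereas STHOG uses two orientations, $\phi$ and $\theta$, obtained from the spatio-temporal gradient

$$\nabla I = [I_x, I_y, I_t]^T = \left[\frac{\partial I}{\partial x}, \frac{\partial I}{\partial y}, \frac{\partial I}{\partial t}\right]^T \tag{4}$$

The magnitude $|\nabla I|$ and the orientations $\theta$ and $\phi$ are given by Eqs. (5)-(7):

$$|\nabla I(x, y, t)| = \sqrt{I_x^2 + I_y^2 + I_t^2} \tag{5}$$

$$\theta(x, y, t) = \tan^{-1}\left(\frac{I_t}{\sqrt{I_x^2 + I_y^2}}\right) \tag{6}$$

$$\phi(x, y, t) = \tan^{-1}\left(\frac{I_y}{I_x}\right) \tag{7}$$

Histograms of $\phi$ and $\theta$ are computed from $\nabla I$ with a cell size of 10 × 10 × 3, a block size of 2 × 2 × 1 cells, 5 × 7 × 1 blocks, and a window size of 100 × 140 × 3. The pedestrian region is normalized to an aspect ratio of 100 : 140, and STHOG is extracted from the resulting 100 × 140 image. The ranges 0 ≤ θ < 180 and 0 ≤ ϕ < 90 (degrees) are each quantized into 9 bins. Each histogram therefore has 9 × (2 × 2 × 1) × (5 × 7 × 1) = 140 × 9 = 1260 dimensions, and the two histograms together give 2 × 140 × 9 = 2520 dimensions.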
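The sequence matching of Section 3.2.1, Eqs. (1)-(3), can be illustrated with a minimal sketch. It assumes that each STHOG sequence is stored as a NumPy array of shape (frames, dimensions). The function names, the array layout and the `norm_ord` parameter are illustrative assumptions and are not taken from the original method description.

```python
import numpy as np


def subsequence_distance(a1, j1, a2, j2, p, norm_ord=2):
    """Eq. (1): sum of frame-wise distances over p aligned frames."""
    diff = a1[j1:j1 + p] - a2[j2:j2 + p]
    # One vector norm per frame (row), then summed over the p frames.
    return float(np.sum(np.linalg.norm(diff, ord=norm_ord, axis=1)))


def sequence_distance(a1, a2, p, norm_ord=2):
    """Eqs. (2)-(3): minimum over probe chunks l and gallery start frames s."""
    k = len(a1) // p  # number of non-overlapping p-frame chunks in a1
    if k == 0 or len(a2) < p:
        raise ValueError("both sequences need at least p frames")
    best = float("inf")
    for l in range(k):                      # Eq. (3): loop over chunks of a1
        for s in range(len(a2) - p + 1):    # Eq. (2): loop over start frames in a2
            d = subsequence_distance(a1, l * p, a2, s, p, norm_ord)
            best = min(best, d)
    return best


# Toy data: 10 frames of 4-dimensional features (real STHOG has 2520 dimensions).
gallery = np.arange(40, dtype=float).reshape(10, 4)
probe = gallery[3:7].copy()  # same content, starting at a different phase
print(sequence_distance(probe, gallery, p=4))  # 0.0
```

Searching over the start frame `s` is what absorbs the phase difference between the two walking sequences; the exhaustive double loop is chosen for clarity rather than speed.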

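IPSJ SIG Technical Report

4. Experiments

4.1 Datasets

4.1.1 CCTV footage from the elementary school
We use images of pupils arriving at school, captured by three CCTV cameras installed at Seido Elementary School in Ashiya, Hyogo (hereafter simply "the elementary school"). The specifications of the three cameras are shown in Table 1. Regarding resolution, the images are resized to 960 × 720 pixels before STHOG features are extracted. The cameras are installed at a height of about 10 m as shown in Fig. 1, and the pupils walk through the area hatched in red.
Fig. 2 shows examples from the dataset. In the experiments, pedestrians are matched across cameras for Left and Right1 and for Right1 and Right2. As shown in Fig. 1, Left and Right1 have slightly different viewpoints, but their color tones are almost the same. In contrast, Right1 and Right2 have almost the same viewpoint but different color tones. The STHOG feature sequences of the 30 people captured by these cameras were compared.

Table 1 Specification of CCTVs in the elementary school.
Model                               Frame rate   Resolution
AXIS network camera AXIS 223M       9 fps        1600 × 1200 pixels

Fig. 1 Arrangement of CCTVs in the elementary school. (Label: Approx. 6m)
Fig. 2 Example of images captured by CCTVs in the elementary school. (a) Left (b) Right1 (c) Right2

4.1.2 Footage from the university
On a road within the Institute of Scientific and Industrial Research, Osaka University (hereafter "the university"), gait videos of 27 subjects were captured with four cameras installed at a height of 2 m as shown in Fig. 3. The specifications of the four cameras are shown in Table 2, and each subject walks back and forth along the red lines in Fig. 3. Cameras A, B and cameras C, D are placed sufficiently far apart in order to collect data under greatly different background conditions. Brightness, color tone and similar settings were not unified across cameras; the videos were captured with automatic gain control.
Because the four cameras capture walking in both directions, eight gait image sequences are obtained per subject. Sequence numbers are defined for these sequences as in Table 3, where the camera labels and walking directions are the same as those defined in Fig. 3. Of these, this paper evaluates matching performance for sequence 2 and sequence 6, for which the walking direction relative to the camera is almost the same but the backgrounds differ. Fig. 4 shows example images from each camera.

Table 2 Specification of cameras in the university.
Model                                               Frame rate   Resolution
Point Grey Research IEEE1394 camera Flea2           30 fps       640 × 480 pixels

Table 3 Definition of sequence number.
Camera              A      A      B      B      C      C      D      D
Walking direction   Left   Right  Left   Right  Left   Right  Left   Right
Number              0      1      2      3      4      5      6      7

Fig. 3 Arrangement of cameras in the university. (Labels: cameras A, B, C, D; 30m; 30 deg.; 4m; Approx. 40m)
Fig. 4 Example of images captured by cameras in the university. (a) Camera B, sequence 2 (b) Camera D, sequence 6

4.2 Compared methods

4.2.1 Color features
This study addresses tracking of movement between CCTV cameras. Under the assumption that clothing does not change in the short time between a person leaving the view of one camera and appearing in another, we consider matching pedestrians using color features, starting with the color of their clothes.
Hue, saturation and value are computed from the image, and the following two histograms are created for each cell in the same way as for the STHOG feature and used as features: (i) a histogram with hue as the bins, in which the saturation of each pixel is voted into the corresponding bin, and (ii) a histogram with value as the bins, in which the plain pixel count is voted.
Because these features are computed for each frame, as the STHOG features are, the matching method follows that of the STHOG features. Matching using four histograms, combining these two with the two STHOG histograms, was also performed.

4.2.2 CHLAC features
The CHLAC features described in Section 2.3 are used for comparison. First, inter-frame differences are computed and binarized with a threshold of 8. Then higher-order local auto-correlations up to the third order are computed from neighboring pixels and frames, and a feature vector with a total of 251 dimensions is extracted. Normalized auto-correlation was used for the CHLAC comparison.

4.3 Performance evaluation
We describe the matching results and discuss them. This paper considers two points: the usefulness of the STHOG feature itself, and the optimal norm for matching.
The usefulness of the STHOG feature is examined using CMC (Cumulative Match Characteristic) curves. In 1-to-N identification, the galleries are ranked by their similarity to the probe according to the matching result, and the CMC curve shows the proportion of probes whose correct match is included within each rank. We focus in particular on the identification rate at rank 1, that is, the probability that the person computed to have the most similar feature is actually the same person (hereafter simply "the recognition rate"). The CMC curves show the results obtained with the L2 norm.
The optimal norm is examined for the STHOG feature using graphs that compare the recognition rates at the top CMC ranks for the L2, L1, L0.5, L0.4, L0.3, L0.2 and L0.1 norms.

4.3.1 CCTV footage from the elementary school
Fig. 5 and Fig. 6 show the matching results for the elementary school CCTV footage. The numbers in the legends of the norm comparisons in Fig. 5(b) and Fig. 6(b) indicate ranks in the CMC curve. The same applies to the subsequent results.

Fig. 5 Matching result using CCTVs (Left and Right1) in the elementary school. (a) CMC curve (b) Comparison of norms (STHOG features)

Looking first at the STHOG feature alone, the recognition rate is 100% for Right1 and Right2, which have almost the same direction, and changing the norm does not degrade performance (Fig. 6). However, when the direction changes, as between Left and Right1, the recognition rate is 80%, which shows that a difference in observation direction also strongly affects recognition performance (Fig. 5). When the norm is changed, from L0.4 to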
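The CMC curve and the L_p norms described in Section 4.3 can be sketched as follows. This is a minimal illustration under stated assumptions: identification is closed-set (every probe identity appears exactly once in the gallery), the distance matrix is already computed, and the names `lp_distance` and `cmc_curve` are hypothetical helpers introduced here, not part of the original work.

```python
import numpy as np


def lp_distance(x, y, p=2.0):
    """L_p distance (sum |x - y|^p)^(1/p); for p < 1 it is not a true norm."""
    diff = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return float(np.sum(diff ** p) ** (1.0 / p))


def cmc_curve(dist, probe_ids, gallery_ids):
    """Return the CMC values for ranks 1..N.

    dist[i, j] is the distance between probe i and gallery j.
    Assumes each probe identity appears exactly once in gallery_ids.
    """
    dist = np.asarray(dist, dtype=float)
    probe_ids = np.asarray(probe_ids)
    gallery_ids = np.asarray(gallery_ids)
    order = np.argsort(dist, axis=1)         # galleries sorted by ascending distance
    ranked_ids = gallery_ids[order]          # identities in ranked order, per probe
    hits = ranked_ids == probe_ids[:, None]  # True where the correct identity sits
    first_hit = hits.argmax(axis=1)          # 0-based rank of the correct gallery
    n_gallery = dist.shape[1]
    return np.array([(first_hit < r).mean() for r in range(1, n_gallery + 1)])


# Toy example with three probes and three galleries.
toy_dist = [[0.1, 0.5, 0.9],
            [0.4, 0.3, 0.2],
            [0.7, 0.6, 0.1]]
cmc = cmc_curve(toy_dist, probe_ids=[0, 1, 2], gallery_ids=[0, 1, 2])
print(cmc)  # the rank-1 value (the recognition rate) is 2/3 here
```

The first element of the returned array corresponds to the recognition rate used in the discussion, and replacing the exponent `p` in `lp_distance` reproduces the kind of norm comparison (L2, L1, L0.5, ...) described above.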

Fig. 6 Matching result using CCTVs (Right1 and Right2) in the elementary school. (a) CMC curve (b) Comparison of norms (STHOG features)
Fig. 7 Matching result using cameras in the university. (a) CMC curve (b) Comparison of norms (STHOG features)

L0.2 93% 2 100% Left Right1 93% Right1 Right2 53% STHOG Left Right1 93% Right1 Right2 67% CMC STHOG CHLAC Left Right1 13% Right1 Right2 30% 2 Left Right1

4.3.2
7 STHOG 26% L1 52% L0.5 L0.4 L0.5 L0.3 18% 22% CHLAC 18% STHOG 0 STHOG 8(a)

Fig. 8 Improved result using cameras in the university. (a) (b) (c)

STHOG 8(b) 1 26% 67% L2 L0.4 8(c) STHOG 12 12 27 15 12 9 9 Spatial gradient ϕ 4 9 12

Fig. 9 Comparison result for subject 12. (Left: probe, Right: gallery)

5.
STHOG 2 STHOG CHLAC STHOG CHLAC L0.5 L0.3

4.3.2 STHOG Covariance Descriptor 22

1) Kass, M., Witkin, A. and Terzopoulos, D.: Snakes: Active Contour Models, International Journal of Computer Vision, Vol.1, No.4, pp.321-331 (1988).
2) Vol.40, No.3, pp.1127-1137 (1999).
3) Turk, M. and Pentland, A.: Eigenfaces for recognition, J. Cognitive Neuroscience, Vol.3, No.1, pp.71-86 (1991).
4) Moghaddam, B. and Pentland, A.: Probabilistic Visual Learning for Object Representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.19, No.7, pp.696-710 (1997).
5) D-II Vol.84, No.3, pp.500-508 (2001).
6) Viola, P. and Jones, M.: Robust Real-Time Face Detection, International Journal of Computer Vision, Vol.57, No.2, pp.137-154 (2004).
7) Mita, T., Kaneko, T. and Hori, O.: Joint Haar-like Features for Face Detection, Tenth IEEE International Conference on Computer Vision (ICCV'05), Vol.2, pp.1619-1626 (2005).
8) Bronstein, A., Bronstein, M. and Kimmel, R.: Three-Dimensional Face Recognition, International Journal of Computer Vision, Vol.64, No.1, pp.5-30 (2005).
9) Vol.49, No.2(CVIM22), pp.76-85 (2007).
10) Cuntoor, N., Kale, A. and Chellappa, R.: Combining Multiple Evidences for Gait Recognition, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol.3, pp.33-36 (2003).
11) Han, J. and Bhanu, B.: Individual Recognition Using Gait Energy Image, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.28, No.2, pp.316-322 (2006).
12) Murase, H. and Sakai, R.: Moving Object Recognition in Eigenspace Representation: Gait Analysis and Lip Reading, Pattern Recognition Letters, Vol.17, pp.155-162 (1996).
13) Nixon, M. and Carter, J.: Automatic Recognition by Gait, Proceedings of the IEEE, Vol.94, No.11, pp.2013-2024 (2006).
14) Sarkar, S., Phillips, J., Liu, Z., Vega, I., Grother, P. and Bowyer, K.: The HumanID Gait Challenge Problem: Data Sets, Performance, and Analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.27, No.2, pp.162-177 (2005).
15) Kobayashi, T. and Otsu, N.: Action and Simultaneous Multiple-Person Identification Using Cubic Higher-order Local Auto-Correlation, International Conference on Pattern Recognition, pp.741-744 (2004).
16) Cai, Y., Takala, V. and Pietikainen, M.: Matching Groups of People by Covariance Descriptor, Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul (2010).
17) Hua, C., Makihara, Y. and Yagi, Y.: Pedestrian Detection by Combining the Spatio and Temporal Features, Meeting on Image Recognition and Understanding (2010).
18) Wang, J. and Yagi, Y.: Adaptive Mean-Shift Tracking with Auxiliary Particles, IEEE Transactions on Systems, Man and Cybernetics - Part B, Vol.39, No.6, pp.1578-1589 (2009).
19) Wang, J. and Yagi, Y.: Visual tracking and segmentation using appearance and spatial information of patches, Proceedings of 2008 IEEE International Conference on Robotics and Automation, Anchorage, Alaska, USA (2010).
20) Wang, J. and Yagi, Y.: Tracking and segmentation using Min-Cut with consecutive shape priors, Paladyn. Journal of Behavioral Robotics, Versita, co-published with Springer-Verlag GmbH, Vol.1, No.1, pp.73-86 (2010).
21) Dalal, N. and Triggs, B.: Histograms of Oriented Gradients for Human Detection, IEEE Conference on Computer Vision and Pattern Recognition, Vol.1, pp.886-893 (2005).