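RI-002
Encoding-oriented video generation algorithm based on control with high temporal resolution
Yukihiro BANDOH, Seishi TAKAMURA, Atsushi SHIMIZU

1 Introduction
[Body text not recoverable. Surviving fragments: a CMOS image sensor with a readout speed of 1 Tpixel/s [1]; 4K (4096 × 2160 pixels/frame) at 900 Hz; frame rates of 50 Hz and 60 Hz, and of 240 Hz and 300 Hz [2]; FA; related work [3] [4] [5] [6] [7] [8] [9]; NTT, 30 Hz and 1000 Hz [10].]

2 Video generation by temporal filtering
2.1 Temporal filter model
Frame $i$ of the generated sequence is obtained by applying a $(2\Delta+1)$-tap temporal filter to the high-frame-rate input signal:

$$\hat{f}(x, iM\delta_t, w_i, p_i) = \sum_{j=-\Delta}^{\Delta} w_i[j]\, f\bigl(x, (iM + \tfrac{M}{2} + p_i + j)\,\delta_t\bigr) \tag{1}$$

Here $i$ is the frame index of the generated sequence, $\delta_t$ is the frame interval of the input signal, which is sampled at $t = j\delta_t$ $(j = 0, 1, \ldots)$, and $f(x, t)$ is the pixel value at time $t$ and position $x$ $(x = 0, \ldots, X-1)$.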
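As an illustration of Eq. (1), the following is a minimal sketch in Python with NumPy. The function name `filtered_frame`, the array layout `f[t, x]`, and the rounding of $M/2$ down to an integer are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def filtered_frame(f, i, M, w, p):
    """Generate frame i of the low-frame-rate sequence as in Eq. (1).

    f : array of shape (J, X); f[t, x] holds f(x, t * delta_t)
    M : down-sampling ratio (one output frame per M input frames)
    w : 2*Delta + 1 filter coefficients, w[0] corresponding to j = -Delta
    p : integer shift of the filter position
    """
    w = np.asarray(w, dtype=float)
    delta = (len(w) - 1) // 2
    # Assumption: M/2 is rounded down so that the centre is an integer index.
    centre = i * M + M // 2 + p
    if centre - delta < 0 or centre + delta >= len(f):
        raise ValueError("filter support falls outside the input sequence")
    taps = f[centre - delta : centre + delta + 1]   # shape (2*Delta+1, X)
    return w @ taps                                 # weighted sum over j
```

With `M = 9` and a 3-tap filter, frame `i = 1` is centred on input frame 13 for `p = 0` and on input frame 14 for `p = 1`.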
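The filter coefficients $w_i[j]$ are normalized so that

$$\sum_{j=-\Delta}^{\Delta} w_i[j] = 1,$$

and are collected in the vector $w_i = (w_i[-\Delta], \ldots, w_i[\Delta])$. The shift $p_i$ of the filter position takes one of the values $0, \pm 1, \ldots, \pm P$. Applying Eq. (1) once per $M$ input frames gives a sequence with frame interval $M\delta_t$; over all admissible shifts, the filter for one frame can cover $2\Delta + 2P + 1$ input frames. Fig. 1 shows the filter positions for $p_{i-1}, p_i, p_{i+1}$ with $M = 9$, $\Delta = 1$.

Fig. 1: Filter positions. (a) $\Delta = 1$, $M = 9$, $p_{i-1} = p_i = p_{i+1} = 0$. (b) $\Delta = 1$, $M = 9$, $p_{i-1} = 0$, $p_i = 1$, $p_{i+1} = 1$.

2.2 Cost function
The filter coefficients are chosen from $N$ candidates $\gamma_n = (\gamma_n[-\Delta], \ldots, \gamma_n[\Delta])$ $(n = 0, \ldots, N-1)$, and the shift from $2P+1$ candidates, so each frame has $N(2P+1)$ candidate combinations. The set of coefficient candidates is written $\Gamma_N = (\gamma_0, \ldots, \gamma_{N-1})$.

The frame $\hat{f}(x, iM\delta_t, w_i, p_i)$ of $X$ pixels is divided into $K$ blocks $B[k]$ $(k = 0, 1, \ldots, K-1)$, and each block is predicted from the previous frame $\hat{f}(x, (i-1)M\delta_t, w_{i-1}, p_{i-1})$ with a displacement $d_i[k]$; the displacements are collected in $d_i = (d_i[0], \ldots, d_i[K-1])$. For $x \in B[k]$ the prediction error is

$$e_i(x, w_i, w_{i-1}, p_i, p_{i-1}) = \hat{f}(x, iM\delta_t, w_i, p_i) - \hat{f}(x - d_i[k], (i-1)M\delta_t, w_{i-1}, p_{i-1}).$$

The errors $e_i(x, w_i, w_{i-1}, p_i, p_{i-1})$ over all positions $x$ $(x = 0, \ldots, X-1)$ are written collectively as $e_i(w_i, w_{i-1}, p_i, p_{i-1})$. The code amount of frame $i$ is

$$\Psi(w_i, w_{i-1}, p_i, p_{i-1}) = R_h + R_d(d_i) + R_e\bigl(e_i(w_i, w_{i-1}, p_i, p_{i-1})\bigr) \tag{2}$$

where $R_e(e_i(w_i, w_{i-1}, p_i, p_{i-1}))$ is the code amount of the prediction error, $R_d(d_i)$ is the code amount of the displacements $d_i$, and $R_h$ is the code amount of the header information. As Eq. (2) shows, $\Psi(\cdot)$ for frame $i$ depends on $w_i, p_i$ and also on $w_{i-1}, p_{i-1}$ of frame $i-1$.

The distortion of frame $i$ with respect to the input signal is measured over the $M$ input frames $iM, \ldots, iM+M-1$, that is, $iM\delta_t \le t < (iM+M)\delta_t$:

$$\Phi[w_i, p_i] = \sum_{k=0}^{M-1} \sum_{x=0}^{X-1} \bigl\{ f(x, (iM+k)\delta_t) - \hat{f}(x, iM\delta_t, w_i, p_i) \bigr\}^2 \tag{3}$$

The cost of frame $i$ combines the two terms with a weight $\lambda$:

$$\Xi[w_i, w_{i-1}, p_i, p_{i-1}] = \Psi[w_i, w_{i-1}, p_i, p_{i-1}] + \lambda\,\Phi[w_i, p_i] \tag{4}$$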
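The distortion of Eq. (3) and the combined cost of Eq. (4) can be sketched as follows. This is only an illustration: the names `filtered_frame`, `distortion`, and `rd_cost` are hypothetical, $M/2$ is assumed to be rounded down, and the code amount $\Psi$ is passed in as a number because measuring it requires running an encoder, which is outside this example.

```python
import numpy as np

def filtered_frame(f, i, M, w, p):
    """Eq. (1); f[t, x] = f(x, t * delta_t). Assumes M/2 is rounded down."""
    w = np.asarray(w, dtype=float)
    delta = (len(w) - 1) // 2
    centre = i * M + M // 2 + p
    return w @ f[centre - delta : centre + delta + 1]

def distortion(f, i, M, w, p):
    """Phi[w_i, p_i] of Eq. (3): squared error between the generated
    frame i and each of the input frames iM, ..., iM + M - 1."""
    fhat = filtered_frame(f, i, M, w, p)      # shape (X,)
    segment = f[i * M : (i + 1) * M]          # shape (M, X)
    return float(np.sum((segment - fhat) ** 2))

def rd_cost(psi_bits, phi, lam):
    """Xi = Psi + lambda * Phi of Eq. (4). psi_bits is the code amount
    reported by an encoder (supplied by the caller in this sketch)."""
    return psi_bits + lam * phi
```

For a one-pixel input whose value equals the frame index, `M = 4`, and the averaging filter (1/3, 1/3, 1/3) with zero shift, frame 0 takes the value 2 and the distortion against input frames 0 to 3 is 4 + 1 + 0 + 1 = 6.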
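Fig. 2: Candidate paths over frames $i$, $i+1$, $i+2$.

2.3 Selection of filters by dynamic programming
The filter coefficients and shifts of all $J/M$ generated frames are selected so as to minimize the sum of the cost (4):

$$(w_0^*, \ldots, w_{J/M-1}^*, p_0^*, \ldots, p_{J/M-1}^*) = \arg\min_{\substack{w_0, \ldots, w_{J/M-1} \in \Gamma_N \\ p_0, \ldots, p_{J/M-1}}} \sum_{i=1}^{J/M-1} \Xi[w_i, w_{i-1}, p_i, p_{i-1}] \tag{5}$$

Fig. 2 illustrates the case $P = 0$, $N = 3$: each frame has the three candidates $\gamma_0, \gamma_1, \gamma_2$, so the number of candidate sequences over $J/M$ frames is $3^{J/M}$. In general, with $N$ coefficient candidates and $2P+1$ shift candidates, the number of candidates for $(w_0, \ldots, w_{J/M-1}, p_0, \ldots, p_{J/M-1})$ is $\{N(2P+1)\}^{J/M}$.

Because $\Xi[w_i, w_{i-1}, p_i, p_{i-1}]$ depends only on $w_i, p_i$ and $w_{i-1}, p_{i-1}$, the minimization (5) over $w_i, p_i$ $(i = 1, \ldots, J/M-1)$ can be carried out frame by frame. Define the minimum accumulated cost up to frame $i$, given that frame $i$ uses $w_i, p_i$:

$$S_i(w_i, p_i) = \min_{\substack{w_0, \ldots, w_{i-1} \in \Gamma_N \\ p_0, \ldots, p_{i-1}}} \sum_{j=1}^{i} \Xi[w_j, w_{j-1}, p_j, p_{j-1}] \tag{6}$$

Since the last term $\Xi[w_i, w_{i-1}, p_i, p_{i-1}]$ involves only $w_{i-1}, p_{i-1}$ among the variables being minimized, $S_i(w_i, p_i)$ satisfies the recurrence

$$S_i(w_i, p_i) = \min_{\substack{w_{i-1} \in \Gamma_N \\ p_{i-1}}} \bigl\{ \Xi[w_i, w_{i-1}, p_i, p_{i-1}] + S_{i-1}(w_{i-1}, p_{i-1}) \bigr\} \tag{7}$$

so that $S_i(w_i, p_i)$ is computed from $S_{i-1}(w_{i-1}, p_{i-1})$. Let $n_i$ be the index in $\Gamma_N$ of $w_i$. For each pair $(n_i, p_i)$, the index and the shift that minimize $\Xi[w_i, w_{i-1}, p_i, p_{i-1}] + S_{i-1}(w_{i-1}, p_{i-1})$ in (7) are stored as $\hat{n}_{i-1}(n_i, p_i)$ and $\hat{p}_{i-1}(n_i, p_i)$. Using (7), the minimum of (5) is

$$\min_{\substack{w_{J/M-1} \in \Gamma_N \\ p_{J/M-1}}} S_{J/M-1}(w_{J/M-1}, p_{J/M-1}) \tag{8}$$

With (7), the solution $(w_0^*, \ldots, w_{J/M-1}^*, p_0^*, \ldots, p_{J/M-1}^*)$ of (5) is obtained with on the order of $\{N(2P+1)\}^2 J/M$ evaluations of $\Xi$, and hence of the code amount $\Psi[w_i, w_{i-1}, p_i, p_{i-1}]$.

The sequence $(w_0^*, \ldots, w_{J/M-1}^*, p_0^*, \ldots, p_{J/M-1}^*)$ is recovered as follows. The pair $w_{J/M-1}, p_{J/M-1}$ that attains (8) is

$$(w_{J/M-1}^*, p_{J/M-1}^*) = \arg\min_{\substack{w_{J/M-1} \in \Gamma_N \\ p_{J/M-1}}} S_{J/M-1}(w_{J/M-1}, p_{J/M-1}).$$

Let $n_{J/M-1}^*$ be the index of $w_{J/M-1}^*$. Given $n_{J/M-1}^*$ and $p_{J/M-1}^*$ for frame $J/M-1$, the selection for frame $J/M-2$ is read from the stored values $\hat{n}_{J/M-2}(n_{J/M-1}^*, p_{J/M-1}^*)$ and $\hat{p}_{J/M-2}(n_{J/M-1}^*, p_{J/M-1}^*)$.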
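The recurrence (7), the final minimization (8), and the backward read-out of the stored predecessors can be sketched as below. This is a minimal sketch, not the authors' implementation: the function name `viterbi_select`, the callable interface `xi(i, cur, prev)`, the representation of a candidate as a pair `(n, p)`, and the initial value $S_0 = 0$ (the empty sum in Eq. (6)) are assumptions of this example.

```python
def viterbi_select(num_frames, states, xi):
    """Minimise sum_{i=1}^{num_frames-1} xi(i, s_i, s_{i-1}) over sequences.

    num_frames : number of generated frames (J/M in the text)
    states     : list of candidate pairs (n, p); N * (2P + 1) entries
    xi         : hypothetical callable xi(i, cur, prev) returning the cost
                 Xi of frame i for the current and previous candidates
    Returns (path, total) with path[i] the candidate chosen for frame i.
    """
    S = {s: 0.0 for s in states}     # assumption: S_0 = 0 (empty sum)
    back = []                        # back[i-1][s]: best predecessor of s
    for i in range(1, num_frames):
        S_new, ptr = {}, {}
        for cur in states:
            # Recurrence (7): best previous candidate for this current one.
            best_prev = min(states, key=lambda prev: xi(i, cur, prev) + S[prev])
            S_new[cur] = xi(i, cur, best_prev) + S[best_prev]
            ptr[cur] = best_prev
        S = S_new
        back.append(ptr)
    last = min(states, key=lambda s: S[s])   # minimisation (8)
    total = S[last]
    path = [last]
    for ptr in reversed(back):               # follow stored predecessors
        path.append(ptr[path[-1]])
    path.reverse()
    return path, total
```

Each frame transition evaluates `xi` for every pair of current and previous candidates, which is the $\{N(2P+1)\}^2$ factor per frame; in practice the values of `xi` would be cached, since each call implies an encoding pass.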
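The selection for frame $J/M-2$ is

$$w_{J/M-2}^* = \gamma_{\hat{n}_{J/M-2}(n_{J/M-1}^*,\, p_{J/M-1}^*)}, \qquad p_{J/M-2}^* = \hat{p}_{J/M-2}(n_{J/M-1}^*, p_{J/M-1}^*),$$

and the remaining frames follow in the same way, going backward:

$$w_{J/M-3}^* = \gamma_{\hat{n}_{J/M-3}(n_{J/M-2}^*,\, p_{J/M-2}^*)}, \qquad p_{J/M-3}^* = \hat{p}_{J/M-3}(n_{J/M-2}^*, p_{J/M-2}^*),$$
$$\vdots$$
$$w_0^* = \gamma_{\hat{n}_0(n_1^*,\, p_1^*)}, \qquad p_0^* = \hat{p}_0(n_1^*, p_1^*).$$

3 Experiments
The input signals were RGB (24 bits/pixel) converted to YCbCr (8 bits/pixel), captured at 1000 [Hz], with 900 frames of 640 × 480 [pixels]. Three sequences were used: two building scenes ("Building A" and "Building B") and one ship scene ("Ship"). The generated sequences were encoded with x264 in lossless mode, with a GOP consisting of I and P pictures.

The down-sampling ratio was $M = 32$ and $\Delta = 1$, that is, the generated sequences have a frame rate of 31.25 [Hz] and the filters have 3 taps. The shift candidates were $0, \pm 1, \pm 2$ (5 candidates), and the filter coefficient candidates were the 5 candidates listed in Table 1.

Table 1: Filter coefficient candidates ($n$ is the candidate index)

n    Coefficients
0    (1/3, 1/3, 1/3)          1.000
1    (29/96, 19/48, 29/96)    0.991
2    (13/48, 11/24, 13/48)    0.967
3    (35/96, 13/48, 35/96)    0.991
4    (19/48, 5/24, 19/48)     0.967

Table 2 compares the code amount obtained with the selection method of Section 2.3 against that obtained with the fixed filter $n = 0$ of Table 1, which averages the input over $(2\Delta+1)\delta_t$ [sec]. The code amount was reduced by 3.01% on average.

Table 3 shows how often each combination of filter coefficients and shift was selected. Filters other than $n = 0$ were selected in most frames, and all 5 shift candidates were used. The largest of the shift candidates ($\pm 2$) was selected in 27.6% ("Building A"), 20.7% ("Building B"), and 17.2% ("Ship") of the frames. In "Building B", the candidates of Table 1 that are closest to the averaging filter ($n = 0$), namely $n = 1, 3$, were selected most often. Compared with "Building B", "Building A" selected candidates other than $n = 0, 1, 3$ (that is, $n = 2, 4$) more often. In "Ship", $n = 4$ was selected in most frames.

4 Conclusion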
Table 2: Code amount

Sequence      Baseline [bits/pixel]   Proposed [bits/pixel]   Reduction [%]
Building A    2.54                    2.49                    2.04
Building B    2.80                    2.77                    1.23
Ship          3.69                    3.48                    5.77

Table 3: Selection rate [%] of each combination of filter coefficients (index $n$) and shift $p$

(a) Building A
n    |p|=2    |p|=1    p=0      |p|=1    |p|=2
0     0.00     0.00     0.00     0.00     3.45
1     6.90     3.45    10.34     3.45     0.00
2     3.45    10.34     3.45    10.34     0.00
3     3.45    10.34    13.79    10.34     0.00
4    10.34     0.00     0.00     0.00     0.00

(b) Building B
n    |p|=2    |p|=1    p=0      |p|=1    |p|=2
0     0.00     6.90     3.45     3.45     0.00
1    10.34    24.14     6.90     3.45     0.00
2     0.00     0.00     3.45     3.45     0.00
3     6.90    20.69     3.45     3.45     3.45
4     0.00     0.00     0.00     0.00     0.00

(c) Ship
n    |p|=2    |p|=1    p=0      |p|=1    |p|=2
0     0.00     0.00     0.00     0.00     0.00
1     0.00     3.45     0.00     0.00     0.00
2     0.00     0.00     0.00     0.00     0.00
3     3.45     0.00     3.45     0.00     0.00
4    10.34    34.48    27.59    17.24     3.45

In the experiments, the code amount was reduced by 3.01% on average.

References
[1] K. Hanzawa, Y. Kato, R. Kuroda, H. Mutoh, R. Hirose, H. Tominaga, K. Takubo, Y. Kondo, and S. Sugawa. A global-shutter CMOS image sensor with readout speed of 1Tpixel/s burst and 780Mpixel/s continuous. IEEE Int. Solid-State Circuits Conf. Digest of Technical Papers, pp. 382-384, 2012.
[2] Y. Kuroki, T. Nishi, S. Kobayashi, H. Oyaizu, and S. Yoshimura. A psychophysical study of improvements in motion-image quality by using high frame rates. Journal of the Society for Information Display, Vol. 15, No. 1, pp. 61-68, 2007.
[3] Y. Chen, K. Rose, J. Han, and D. Mukherjee. A pre-filtering approach to exploit decoupled prediction and transform block structures in video coding. Proc. IEEE Int. Conf. Image Process., pp. 4137-4140, 2014.
[4] L. J. Kerofsky, R. Vanam, and Y. A. Reznik. Improved adaptive video delivery system using a perceptual pre-processing filter. Proc. IEEE Global Conf. Signal & Inf. Process., 2014.
[5] N. Tsapatsoulis, K. Rapantzikos, and C. Pattichis. An embedded saliency map estimator scheme: Application to video coding. Int. J. Neural Syst., Vol. 17, No. 4, pp. 289-304, 2007.
[6] C. Dikici and H. I. Bozma. Attention-based video streaming. EURASIP J. Signal Process.: Image Commun., 2010.
[7] A. Ben Hamida, M. Koubaa, H. Nicolas, and C. Ben Amar. Spatio-temporal video filtering for video surveillance applications. IEEE Int. Conf. Multimedia and Expo Workshops, 2013.
[8] J. Ohm.
Advances in scalable video coding. Proc. IEEE, Vol. 93, No. 1, pp. 42-56, 2005.
[9] A. Golwelkar and J. Woods. Motion-compensated temporal filtering and motion vector coding using biorthogonal filter. IEEE Trans. Circuits Syst. Video Technol., Vol. CSVT-17, No. 4, 2007.
[10] ,.. (A), Vol. J96-A, No. 8, pp. 562-571, 2013.