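Independent component analysis (ICA), also called blind source separation (BSS), is the problem of recovering $n$ source signals from $m$ observed mixtures.

Source signals: $s(t) = (s_1(t), \dots, s_n(t))^\top \in \mathbb{R}^n$

Observed signals: $x(t) = (x_1(t), \dots, x_m(t))^\top \in \mathbb{R}^m$

The $i$-th source $s_i(t)$ reaches the $j$-th sensor multiplied by a coefficient $a_{ji} \in \mathbb{R}$. Collecting the coefficients in the mixing matrix $A = (a_{ji})$ gives

$$x(t) = A s(t). \tag{1}$$

We assume $n = m$ and that $A$ is full rank. (The case $n \neq m$ is related to Stiefel manifolds and sparse coding.) Observations $x(1), \dots, x(T)$ are given at times $t = 1, 2, 3, \dots, T$ with $T > n$, and the task is to estimate $s(1), \dots, s(T)$ from them.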
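If $A$ were known, the sources could be recovered as $s(t) = A^{-1}x(t)$. Since $A$ is unknown, ICA relies instead on the assumption that $s_1(t), \dots, s_n(t)$ are mutually independent. Let $W$ denote an estimate of $A^{-1}$ and define

$$y(t) = W x(t) \tag{2}$$

as the estimate of $s(t)$; $W$ is chosen so that the components of $y(t)$ become independent.

We assume that $s(t)$ is i.i.d. (independently identically distributed) with density $p_s(s)$. Because of the i.i.d. assumption, the time index $t$ is dropped below. Independence of $s_1, \dots, s_n$ means

$$p_s(s) = \prod_{i=1}^{n} p_{s_i}(s_i), \tag{3}$$

where $p_{s_i}(s_i)$ is the marginal density of $s_i$, $p_{s_i}(s_i) = \int p_s(s)\, ds_{-i}$, and $s_{-i}$ denotes $s$ with $s_i$ removed, $(s_1, \dots, s_{i-1}, s_{i+1}, \dots, s_n)$.

For any permutation $i_1, \dots, i_n$ of the indices and any nonzero constants $a_1, \dots, a_n$, the variables $a_1 s_{i_1}, \dots, a_n s_{i_n}$ are again independent, so the sources can be identified at best up to order and scale. Whether the problem is well-defined apart from this ambiguity is answered by the following result.

Theorem 1. Let $x_1, x_2$ be independent random variables and let

$$y_1 = a_{11}x_1 + a_{12}x_2, \qquad y_2 = a_{21}x_1 + a_{22}x_2, \tag{4}$$

with a nonsingular coefficient matrix. If $y_1, y_2$ are also independent, then either $a_{11}a_{12} = 0$ and $a_{21}a_{22} = 0$ (each $y_i$ depends on only one of $x_1, x_2$), or the variables are Gaussian.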
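The mixing model (1), the unmixing model (2), and the permutation and scale ambiguity can be illustrated with a small numerical sketch. The matrix `A`, the Laplace sources, and the names `P` and `D` are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 2, 1000

# Independent, zero-mean, non-Gaussian sources (Laplace is an arbitrary choice).
S = rng.laplace(size=(n, T))

# Hypothetical full-rank mixing matrix A.
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
X = A @ S                       # x(t) = A s(t), one column per time step

# If A were known, W = A^{-1} would recover the sources exactly.
W = np.linalg.inv(A)
Y = W @ X

# Permuting and rescaling the rows of W gives another separating matrix:
# its outputs are still independent, only their order and scale change.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])      # permutation
D = np.diag([2.0, -0.5])        # nonzero scale factors
Y_alt = (D @ P @ W) @ X         # rows are 2*s_2 and -0.5*s_1
```

Both `W` and `D @ P @ W` are equally valid answers to the ICA problem, which is why results are only compared up to permutation and scaling.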
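Theorem 2. Let $s = (s_1, \dots, s_n)^\top$ be the sources, $x = (x_1, \dots, x_n)^\top$ the observations, and $y = Wx$ with $y = (y_1, \dots, y_n)^\top$. If the components of $y$ are mutually independent, then $y$ agrees with $s$ up to permutation and scaling (apart from the Gaussian case).

Sketch of the proof of Theorem 1. Write the densities of $x_1, x_2, y_1, y_2$ as $\exp(\phi_1(x_1))$, $\exp(\phi_2(x_2))$, $\exp(\psi_1(y_1))$, $\exp(\psi_2(y_2))$. Since $x_1, x_2$ are independent and $y_1, y_2$ are independent, the joint densities are

$$p(x_1, x_2) = \exp(\phi_1(x_1) + \phi_2(x_2)), \qquad q(y_1, y_2) = \exp(\psi_1(y_1) + \psi_2(y_2)). \tag{5}$$

For a region $B$ and the linear map $y = Ax$,

$$\Pr[x \in B] = \int_B p_x(x)\,dx = \int_{B'} p_x(A^{-1}y)\,\frac{dy}{|\det A|} = \Pr[y \in B'] = \int_{B'} p_y(y)\,dy, \tag{6}$$

where $B'$ is the image of $B$ under $A$. Hence

$$p_y(y) = \frac{p_x(A^{-1}y)}{|\det A|}. \tag{7}$$

Applying this to $(x_1, x_2)$ and $(y_1, y_2)$ with the Jacobian

$$c = \left|\det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\right| \tag{8}$$

gives $p(x_1, x_2) = c\,q(y_1, y_2)$, that is,

$$\phi_1(x_1) + \phi_2(x_2) = \psi_1(a_{11}x_1 + a_{12}x_2) + \psi_2(a_{21}x_1 + a_{22}x_2) + \log c. \tag{9}$$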
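Differentiating (9) with respect to $x_1$ and then $x_2$, the left-hand side vanishes and we obtain

$$a_{11}a_{12}\,\psi_1''(a_{11}x_1 + a_{12}x_2) + a_{21}a_{22}\,\psi_2''(a_{21}x_1 + a_{22}x_2) = a_{11}a_{12}\,\psi_1''(y_1) + a_{21}a_{22}\,\psi_2''(y_2) = 0. \tag{10}$$

The case $a_{11}a_{12} = 0$ corresponds to $y_1$ depending on only one of $x_1, x_2$. Suppose $a_{11}a_{12} \neq 0$ and consider the line $C = \{(x_1, x_2) \mid y_1 = \text{const.}\}$. As $(x_1, x_2)$ moves along $C$, $y_2$ varies while $\psi_1''(y_1)$ stays fixed, so (10) requires

$$\psi_2''(y_2) = \text{const.} \tag{11}$$

In the same way $\psi_1''(y_1) = \text{const.}$, and therefore

$$\psi_i(y_i) = \alpha_i y_i^2 + \beta_i y_i + \gamma_i \tag{12}$$

with $\alpha_i < 0$ so that the density is integrable. In other words, $y_1$ and $y_2$ are Gaussian.

1. …
2. … (fMRI, …)
3. At second order, independent zero-mean sources are uncorrelated: $E[s_i s_j] = 0$ $(i \neq j)$.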
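Arrange the observations in the data matrix $X = [x(1), \dots, x(T)]^\top$ and take its singular value decomposition

$$X = U D V^\top. \tag{13}$$

Here $X$ is a $T \times n$ full-rank matrix, $U$ is $T \times n$ with $U^\top U = I$, $V$ is an $n \times n$ orthogonal matrix, and $D$ is an $n \times n$ diagonal matrix. Multiplying $X$ by $V$ gives $UD$. (If $U$ and $D$ are partitioned as $U = [U_1, U_2]$, $D = \mathrm{diag}[D_1, D_2]$ with the larger singular values in $D_1$, then $U_1 D_1$ gives the best low-rank approximation of $X$ in the Frobenius norm.) Multiplying $UD$ further by $D^{-1}$ gives $U$, whose columns are orthonormal. The whitened data $U$ can then be used in place of $X$ when estimating $W$.

The components of $y$ are independent when

$$p_y(y) = \prod_{i=1}^{n} p_{y_i}(y_i). \tag{14}$$

The discrepancy between the two sides, which is $0$ exactly when they coincide, can be measured by the Kullback-Leibler (KL) divergence

$$D[p_y(y)\,\|\,q_y(y)] = E_{p_y}[\log p_y(y) - \log q_y(y)]. \tag{15}$$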
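The whitening step based on the SVD (13) can be sketched as follows. The synthetic data and the $\sqrt{T}$ rescaling (which turns orthonormal columns into unit sample variance) are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 2000, 3

# Hypothetical correlated observations, one row per time step (T x n).
X = rng.standard_normal((T, n)) @ np.array([[2.0, 0.0, 0.0],
                                            [1.0, 1.0, 0.0],
                                            [0.5, 0.3, 0.2]])
X = X - X.mean(axis=0)          # centre each column

# Thin SVD: X = U diag(D) Vt, with U of shape (T, n) and Vt of shape (n, n).
U, D, Vt = np.linalg.svd(X, full_matrices=False)

# Whitening: X V D^{-1} = U. Scaling by sqrt(T) gives unit sample variance.
Z = np.sqrt(T) * (X @ Vt.T) / D

cov_Z = Z.T @ Z / T             # sample covariance of the whitened data
```

After this step `cov_Z` is the identity matrix up to rounding, so any remaining unmixing matrix can be searched for among orthogonal matrices.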
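Taking $q_y(y) = \prod_{i=1}^{n} p_{y_i}(y_i)$, we use

$$L(W) = D\Big[p_y(y)\,\Big\|\,\prod_{i=1}^{n} p_{y_i}(y_i)\Big] \tag{16}$$

as the cost function to be minimized over $W$. With the entropy $H(\cdot) = -E[\log p(\cdot)]$, this can be written as

$$L(W) = \sum_{i=1}^{n} H(y_i) - H(y). \tag{17}$$

Since $y = Wx$, we have $p_y(y) = p_x(W^{-1}y)/|\det W|$, so

$$H(y) = -\int p_y(y) \log p_y(y)\,dy = -\int \frac{p_x(W^{-1}y)}{|\det W|} \log \frac{p_x(W^{-1}y)}{|\det W|}\,dy = -\int p_x(x)\big(\log p_x(x) - \log|\det W|\big)\,dx \tag{18}$$

$$= H(x) + \log|\det W|. \tag{19}$$

Therefore

$$L(W) = \sum_{i=1}^{n} H(y_i) - \log|\det W| - H(x). \tag{20}$$

$H(x)$ does not depend on $W$. If $|\det W| = 1$, minimizing $L(W)$ amounts to minimizing the sum of the marginal entropies of the $y_i$.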
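Identity (19) can be checked in closed form for a Gaussian, whose differential entropy is $\tfrac12 \log\det(2\pi e\,\Sigma)$. The covariance `cov_x` and the matrix `W` below are arbitrary illustrative values.

```python
import numpy as np

def gaussian_entropy(cov):
    """Differential entropy of N(0, cov): 0.5 * log det(2*pi*e*cov)."""
    n = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (n * np.log(2.0 * np.pi * np.e) + logdet)

# Hypothetical covariance of x and an arbitrary nonsingular W.
cov_x = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
W = np.array([[1.0, -0.4],
              [0.7,  2.0]])

cov_y = W @ cov_x @ W.T         # covariance of y = W x

lhs = gaussian_entropy(cov_y)                                   # H(y)
rhs = gaussian_entropy(cov_x) + np.log(abs(np.linalg.det(W)))   # H(x) + log|det W|
```

The two values agree up to rounding, as (19) predicts for any nonsingular `W`.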
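In practice only the samples $x(1), \dots, x(T)$ are available, so expectations are replaced by sample means $E_T$:

$$H(y_i) \approx -E_T[\log p_{y_i}(y_i)], \tag{21}$$

$$E_T[f(y_i)] = \frac{1}{T}\sum_{t=1}^{T} f(y_i(t)). \tag{22}$$

The true $\log p_{y_i}(y_i)$ is unknown and is replaced by an approximation $\log q_i(y_i)$. Typical choices of $q_i(y_i)$ are:

1. A parametric model with parameter $\theta$,

$$q_{y_i}(y_i) = f(y_i; \theta), \tag{23}$$

2. A sample-based model built from a function $f$,

$$q_{y_i}(y_i) = E_T[f(y_i(t); \theta)], \tag{24}$$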
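3. Assuming that $y_i$ has mean $0$ and variance $1$, the Gram-Charlier expansion

$$q_i(y_i) = \varphi(y_i)\left[1 + \sum_{k \ge 3} \frac{\kappa_k}{k!}\, h_k(y_i)\right], \tag{25}$$

where $\varphi$ is the standard normal density, $h_k$ is the Hermite polynomial of degree $k$, and $\kappa_k$ is the $k$-th cumulant.

The cumulants are defined through the characteristic function

$$c_x(\omega) = \int \exp(i\omega x)\,p_x(x)\,dx \tag{26}$$

by the expansion

$$\log c_x(\omega) = \sum_{k \ge 1} \frac{\kappa_k}{k!}(i\omega)^k. \tag{27}$$

For a random variable $x$ with mean $0$,

$$\kappa_1 = E[x], \quad \kappa_2 = E[x^2], \quad \kappa_3 = E[x^3], \quad \kappa_4 = E[x^4] - 3E[x^2]^2.$$

$\kappa_3$ and $\kappa_4$ are related to the skewness and the kurtosis. For a symmetric distribution the third cumulant is $0$, so the fourth cumulant (kurtosis) is the lowest-order quantity that measures departure from the Gaussian.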
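As a rough illustration of the cumulant formulas above, the sketch below estimates $\kappa_2, \kappa_3, \kappa_4$ from samples of three unit-variance distributions. The helper name `sample_cumulants`, the sample size, and the choice of distributions are assumptions for the example.

```python
import numpy as np

def sample_cumulants(x):
    """Sample estimates of kappa_2, kappa_3, kappa_4 after centring x."""
    x = x - x.mean()
    m2, m3, m4 = (np.mean(x ** k) for k in (2, 3, 4))
    return m2, m3, m4 - 3.0 * m2 ** 2

rng = np.random.default_rng(2)
T = 400_000

# Three zero-mean, unit-variance distributions.
samples = {
    "gaussian": rng.standard_normal(T),                          # kappa_4 = 0
    "uniform": rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), T),      # kappa_4 = -1.2
    "laplace": rng.laplace(scale=1.0 / np.sqrt(2.0), size=T),    # kappa_4 = 3
}

# The comments above give the theoretical values; the estimates fluctuate around them.
kappa4 = {name: sample_cumulants(x)[2] for name, x in samples.items()}
```

The sign of `kappa4` separates sub-Gaussian (uniform) from super-Gaussian (Laplace) sources, which is the information a fourth-order contrast exploits.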
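With these approximations the cost function becomes

$$L_q(W) = -\sum_{i=1}^{n} E_T[\log q_i(y_i)] - \log|\det W|. \tag{28}$$

It is minimized over $W$ by gradient descent: starting from an initial $W$, the gradient $\partial L_q(W)/\partial W$ is used to update

$$W' = W - \epsilon\,\frac{\partial L_q(W)}{\partial W}. \tag{29}$$

Consider first the term in $\det W$, assuming $\det W > 0$. Let $D_{ij}$ be the cofactors of $W$. For any $j = 1, \dots, n$,

$$\det W = \sum_{i=1}^{n} w_{ij} D_{ij}, \tag{30}$$

so that

$$\nabla \log\det W = \frac{1}{\det W}\nabla \det W = \frac{1}{\det W}(D_{ij}), \tag{31}$$

where $\nabla = (\partial/\partial w_{ij})$. Since $W^{-1} = (1/\det W)(D_{ji})$, writing $W^{-\top} = (W^{-1})^\top$ we get

$$\nabla \log\det W = W^{-\top}. \tag{32}$$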
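Formula (32) can be checked against central finite differences. The test matrix `W` below is an arbitrary example with positive determinant.

```python
import numpy as np

# Arbitrary test matrix with det W > 0 (about 2.7).
W = np.array([[2.0, 0.5, 0.1],
              [0.3, 1.5, 0.2],
              [0.1, 0.4, 1.0]])

def log_det(M):
    return np.log(np.linalg.det(M))

# Central finite differences with respect to each entry w_ij.
eps = 1e-6
num_grad = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        E = np.zeros_like(W)
        E[i, j] = eps
        num_grad[i, j] = (log_det(W + E) - log_det(W - E)) / (2.0 * eps)

analytic_grad = np.linalg.inv(W).T      # eq. (32): W^{-T}
```

The numerical and analytic gradients agree to within the finite-difference error.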
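For the first term of (28),

$$\nabla \sum_{i} \log q_i(y_i) = \left(\frac{\partial \log q_i(y_i)}{\partial w_{ij}}\right) = \left(\frac{\partial \log q_i\big(\sum_{k=1}^{n} w_{ik}x_k\big)}{\partial w_{ij}}\right) = \left(\frac{d \log q_i(y_i)}{dy_i}\,x_j\right) = -\phi(y)\,x^\top, \tag{33}$$

where $\phi(y) = (\phi_1(y), \dots, \phi_n(y))^\top$ with $\phi_i(y) = -d\log q_i(y_i)/dy_i$. Combining this with (32),

$$\nabla L_q(W) = E_T[\phi(y)x^\top] - W^{-\top} = E_T[\phi(y)y^\top - I]\,W^{-\top}. \tag{34}$$

The gradient of a function $f$ deserves a closer look. When the distance between two points $W_1, W_2$ is measured by the Euclidean metric, the direction of steepest change is the ordinary (Euclidean) gradient

$$\nabla f(W) = \left(\frac{\partial f(W)}{\partial w_{ij}}\right). \tag{35}$$

On a Riemannian manifold the steepest direction is the Riemannian gradient $\mathrm{grad}_W f(W)$, which depends on the metric. The set of nonsingular matrices $GL(n)$ is a Lie group, and its group structure suggests a natural metric. A point $W \in GL(n)$ is carried to the identity $I$ by multiplication with $W^{-1}$, and tangent vectors $V_1, V_2 \in T_W GL(n)$ are carried to $V_1 W^{-1}, V_2 W^{-1} \in T_I GL(n)$. Using the Euclidean inner product on $T_I GL(n) \cong \mathbb{R}^{n \times n}$ defines

$$g(V_1, V_2) = \mathrm{tr}[W^{-\top} V_1^\top V_2 W^{-1}] = \mathrm{tr}[W^{-1} W^{-\top} V_1^\top V_2]. \tag{36}$$

If $G$ is the matrix representing this metric, then

$$\mathrm{grad}_W f = G^{-1}\,\mathrm{vec}[\nabla f(W)], \tag{37}$$

where $\mathrm{vec}$ stacks the entries of a matrix into a vector. The same result can be derived directly by looking for the steepest direction among the $V \in T_W GL(n)$ with $\|V\|^2 = g(V, V) = c$.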
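Consider the update

$$W' = W - \epsilon V, \qquad f(W') = f(W - \epsilon V), \tag{38}$$

and look for the $V$ with $g(V, V) = c$ that decreases $f$ the most. Expanding in $\epsilon$,

$$f(W') - f(W) = -\epsilon\,\mathrm{tr}[\nabla f(W)^\top V] + o(\epsilon). \tag{39}$$

Introducing a Lagrange multiplier $\lambda$, we consider

$$L(V) = -\epsilon\,\mathrm{tr}[\nabla f(W)^\top V] - \lambda\big(c - \mathrm{tr}[W^{-1}W^{-\top}V^\top V]\big). \tag{40}$$

Setting the derivative with respect to $V$ to zero gives

$$-\epsilon\,\nabla f(W) + 2\lambda\, V W^{-1} W^{-\top} = 0, \tag{41}$$

$$V = \frac{\epsilon}{2\lambda}\,\nabla f(W)\,W^\top W. \tag{42}$$

Choosing $c$ so that $\epsilon/(2\lambda) = 1$,

$$V = \nabla f(W)\,W^\top W. \tag{43}$$

For $f(W) = L_q(W)$, substituting (34) gives

$$\mathrm{grad}_W f(W) = E_T[\phi(y)y^\top - I]\,W. \tag{44}$$

Unlike (34), this expression does not require inverting $W$.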
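The update $W' = W - \epsilon\,\mathrm{grad}_W f(W)$ with the gradient (44) can be sketched as below. This is a minimal sketch under assumptions that are not from the text: the nonlinearity $\phi(y) = \tanh(y)$ (corresponding to $q_i(y_i) \propto 1/\cosh(y_i)$, a common choice for super-Gaussian sources), the step size, the iteration count, and the synthetic Laplace sources.

```python
import numpy as np

def natural_gradient_ica(X, step=0.1, n_iter=500):
    """Iterate W <- W - step * E_T[phi(y) y^T - I] W, cf. eq. (44).

    X has shape (n, T). phi = tanh is an assumed score function.
    """
    n, T = X.shape
    W = np.eye(n)
    for _ in range(n_iter):
        Y = W @ X
        phi = np.tanh(Y)
        grad = (phi @ Y.T / T - np.eye(n)) @ W   # sample version of eq. (44)
        W = W - step * grad
    return W

# Synthetic test: two Laplace sources and a hypothetical mixing matrix A.
rng = np.random.default_rng(3)
S = rng.laplace(size=(2, 5000))
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
W = natural_gradient_ica(A @ S)

# W A should be close to a scaled permutation: one dominant entry per row.
C = np.abs(W @ A)
C_sorted = np.sort(C, axis=1)
crosstalk = C_sorted[:, 0] / C_sorted[:, 1]      # smaller / larger entry per row
```

Success is judged through `W @ A` rather than `W` itself because of the permutation and scale ambiguity; in a real problem `A` is unknown and this check is not available.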
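After whitening (sphering) the data, $W$ can be restricted to the orthogonal group $O(n)$, which is also a Lie group. To find the tangent space $T_W O(n)$, take a curve $W(t)$ in $O(n)$ passing through $W$ at $t = 0$:

$$W(t)^\top W(t) = I, \qquad W(0) = W. \tag{45}$$

Differentiating at $t = 0$,

$$\dot W(0)^\top W + W^\top \dot W(0) = 0. \tag{46}$$

In particular, at $W = I$,

$$\dot W(0)^\top + \dot W(0) = 0, \tag{47}$$

so every $V \in T_I O(n)$ is skew-symmetric, $V^\top = -V$. Conversely, for a skew-symmetric $V$ the curve $c(t) = (I + tV/2)(I - tV/2)^{-1}$ lies in $O(n)$ and satisfies $\dot c(0) = V$, hence $V \in T_I O(n)$. Thus $T_I O(n)$ is the set of skew-symmetric matrices, which is the Lie algebra $so(n)$ of $SO(n)$.

Multiplication by $W$ carries $T_I O(n)$ to $T_W O(n)$, and the inner product $\mathrm{tr}[V_1^\top V_2]$ is preserved (the map is isometric; in contrast to $GL(n)$, here $W^{-1} = W^\top$). At $I$, any $V \in T_I \mathbb{R}^{n \times n}$ decomposes into a component in $T_I O(n)$ and a component orthogonal to it:

$$V = \frac{V - V^\top}{2} + \frac{V + V^\top}{2}. \tag{48}$$

To compute the gradient on $O(n)$ we use the following fact.

Theorem 3. Let $(M, g)$ be a Riemannian submanifold of $(\mathbb{R}^m, h)$, with $g$ induced by $h$. For $a \in M$ and a function $f$, the gradient $\mathrm{grad}^{M}_a f$ is the orthogonal projection of $\mathrm{grad}^{\mathbb{R}^m}_a f$ onto $T_a M$.

Regarding $O(n)$ as a submanifold of $\mathbb{R}^{n \times n}$, the Riemannian gradient is therefore obtained by projecting $\nabla f$ onto $T_W O(n)$.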
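1. Translate $\nabla f$ at $W$ to $T_I \mathbb{R}^{n \times n}$:

$$W^\top \nabla f. \tag{49}$$

2. Project onto $T_I O(n)$ within $T_I \mathbb{R}^{n \times n}$ using (48):

$$W^\top \nabla f \;\mapsto\; \tfrac{1}{2}\big(W^\top \nabla f - \nabla f^\top W\big). \tag{50}$$

3. Translate back to $T_W O(n)$ by multiplying with $W$:

$$\tfrac{1}{2}\,W\big(W^\top \nabla f - \nabla f^\top W\big) = \tfrac{1}{2}\big(\nabla f - W \nabla f^\top W\big). \tag{51}$$

Dropping the factor $1/2$,

$$\mathrm{grad}_W f = \nabla f - W \nabla f^\top W. \tag{52}$$

The update $W' = W - \epsilon\,\mathrm{grad}_W f$ does not stay in $O(n)$ in general, even when $W \in O(n)$. Instead of a straight line, one moves from $W$ along a geodesic in the direction of $\mathrm{grad}_W f$. The geodesic of $O(n)$ starting at $I$ in the direction $X \in so(n)$ is

$$\psi(I, t, X) = \exp(tX). \tag{53}$$

For $W \in O(n)$ the update is obtained as follows.

1. Translate $\mathrm{grad}_W f$ to $T_I O(n)$:

$$\nabla f - W \nabla f^\top W \;\mapsto\; W^\top \nabla f - \nabla f^\top W. \tag{54}$$

2. Move from $I$ along the geodesic in the direction $W^\top \nabla f - \nabla f^\top W$:

$$\exp\big(t(W^\top \nabla f - \nabla f^\top W)\big). \tag{55}$$

3. Translate back by $W$:

$$W \exp\big(t(W^\top \nabla f - \nabla f^\top W)\big). \tag{56}$$

Taking $t = -\epsilon < 0$ gives a descent step, in agreement with $W' = W - \epsilon\,\mathrm{grad}_W f$ to first order.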
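The geodesic update (54) to (56) can be sketched as below and compared with a plain Euclidean step along (52). The toy objective $f(W) = \tfrac12\|W - M\|_F^2$, the target rotation `M`, and the helper `expm_skew` (which computes $\exp(X)$ for skew-symmetric $X$ from the Hermitian eigendecomposition of $iX$) are assumptions for the example.

```python
import numpy as np

def expm_skew(X):
    """exp(X) for real skew-symmetric X, using that iX is Hermitian."""
    lam, Q = np.linalg.eigh(1j * X)              # iX = Q diag(lam) Q^H
    return ((Q * np.exp(-1j * lam)) @ Q.conj().T).real

def riemannian_grad(W, egrad):
    """Eq. (52): Riemannian gradient on O(n) from the Euclidean gradient."""
    return egrad - W @ egrad.T @ W

def geodesic_step(W, egrad, step):
    """Eqs. (54)-(56) with t = -step, i.e. a descent step that stays on O(n)."""
    X = W.T @ egrad - egrad.T @ W                # skew-symmetric, eq. (54)
    return W @ expm_skew(-step * X)

# Toy objective (an assumption): f(W) = 0.5 * ||W - M||_F^2, M a rotation.
theta = 0.8
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

def f(W):
    return 0.5 * np.sum((W - M) ** 2)

W0 = np.eye(2)
egrad = W0 - M                                   # Euclidean gradient of f at W0

W_geo = geodesic_step(W0, egrad, step=0.1)       # remains orthogonal
W_euc = W0 - 0.1 * riemannian_grad(W0, egrad)    # straight-line step leaves O(n)
```

`W_geo` satisfies $W^\top W = I$ up to rounding and lowers `f`, whereas `W_euc` is visibly not orthogonal, which is why the straight-line update needs the geodesic correction.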
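Computing the matrix exponential can be avoided. For $X \in so(n)$ define

$$\Theta_\alpha(X) = \Big(I + \frac{X}{\alpha}\Big)^{\alpha/2}\Big(I - \frac{X}{\alpha}\Big)^{-\alpha/2}, \qquad \alpha \neq 0,\ \alpha \in \mathbb{R}. \tag{57}$$

For $\alpha = 1$ this is the orthogonal factor of the polar decomposition of $I + X$, for $\alpha = 2$ it is the Cayley transform, and for $\alpha \to \infty$ it is the exponential map:

$$\Theta_1(X) = \{(I + X)(I + X)^\top\}^{-1/2}(I + X), \tag{58}$$

$$\Theta_2(X) = \Big(I + \frac{X}{2}\Big)\Big(I - \frac{X}{2}\Big)^{-1}, \tag{59}$$

$$\Theta_\infty(X) = \exp(X). \tag{60}$$

All $\Theta_\alpha(tX)$ agree up to second order in $t$:

$$\Theta_\alpha(tX) = I + tX + \frac{t^2 X^2}{2} + O(t^3). \tag{61}$$

Another well-known algorithm is FastICA.

In ICA the quantity of interest is the mixing matrix $A$, while the source densities are unknown. Consider therefore a statistical model $p(x; \theta, \xi)$, where $\theta$ is the parameter of interest and $\xi$ is a nuisance parameter, and define the score functions

$$u(x; \theta, \xi) = \frac{\partial}{\partial\theta}\log p(x; \theta, \xi), \tag{62}$$

$$v(x; \theta, \xi) = \frac{\partial}{\partial\xi}\log p(x; \theta, \xi). \tag{63}$$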
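With the Fisher information blocks $G_u = E[uu^\top]$, $G_{uv} = E[uv^\top] = G_{vu}^\top$, $G_v = E[vv^\top]$, the Cramér-Rao bound for the covariance $V$ of an unbiased estimator of $(\theta, \xi)$ from $T$ samples is

$$V \ge \frac{1}{T}\begin{pmatrix} G_u & G_{uv} \\ G_{vu} & G_v \end{pmatrix}^{-1}, \tag{64}$$

so that for the $\theta$ part

$$V_\theta \ge \frac{1}{T}\big(G_u - G_{uv}G_v^{-1}G_{vu}\big)^{-1}, \tag{65}$$

whereas if $\xi$ were known the bound would be

$$V_\theta \ge \frac{1}{T}\,G_u^{-1}. \tag{66}$$

A function $z(x, \theta)$ that does not depend on $\xi$ is called an estimating function if, for all $\theta, \xi$,

$$E_{\theta,\xi}[z(x, \theta)] = 0, \tag{67}$$

$$\det K \neq 0, \qquad K = E_{\theta,\xi}\left[\frac{\partial}{\partial\theta} z(x, \theta)\right], \tag{68}$$

$$E_{\theta,\xi}[z(x, \theta)z(x, \theta)^\top] < \infty. \tag{69}$$

(If $z$ is an estimating function, so is $R(\theta)z$ for a nonsingular matrix $R(\theta)$.) The solution $\hat\theta$ of the estimating equation

$$\sum_{t=1}^{T} z(x(t), \theta) = 0 \tag{70}$$

is called an M-estimator. For large $T$, the covariance of the M-estimator is asymptotically

$$\frac{1}{T}\,K^{-1} E_{\theta,\xi}[zz^\top]\,K^{-\top}. \tag{71}$$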
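For ICA,

$$F(y, W) = I - \psi(y)\,y^\top \tag{72}$$

is an estimating function. Consider its $(i, j)$ element $F_{ij}$. For $i \neq j$, at the true $W$ the components $y_i, y_j$ are independent with mean $0$, so

$$E_{y_i, y_j}[\psi(y_i)y_j] = 0. \tag{73}$$

For $i = j$ the element is

$$1 - E_{y_i}[\psi(y_i)y_i], \tag{74}$$

which vanishes when the scale of $y_i$ is fixed so that $E_{y_i}[\psi(y_i)y_i] = 1$.

For further topics, such as formulations on the Stiefel manifold, see [1, 2, 3].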
[1] … In: … (…) I, … (2002)
[2] A. Hyvärinen, J. Karhunen, E. Oja: Independent Component Analysis, John Wiley & Sons (2001)
[3] … (2004)