5 ()
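Let $p(a)$ be the probability that event $a$ occurs, and let $I(a)$ be the amount of information gained on learning that $a$ has occurred. The smaller $p(a)$ is, the larger $I(a)$; the larger $p(a)$ is, the smaller $I(a)$.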
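Self-information
The self-information $I(a)$ of an event $a$ should satisfy $I(a) = 0$ when $p(a) = 1$, $I(a) \ge 0$ in general, and $I(a)$ should decrease as $p(a)$ increases. These requirements are met by

$$I(a) = \log_2 \frac{1}{p(a)} = -\log_2 p(a) \quad \text{[bit]}$$

The unit is the bit because the logarithm is taken to base 2.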
[Figure: $I(a) = -\log_2 p(a)$ plotted against $p(a)$ for $0 < p(a) \le 1$. At $p(a) = 1/2$, $I(a) = 1$ bit; at $p(a) = 1$, $I(a) = 0$.]
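Examples
A newborn is a boy or a girl, each with probability $1/2$:

$$I(\text{boy}) = \log_2 2 = 1 \text{ bit}, \qquad I(\text{girl}) = \log_2 2 = 1 \text{ bit}$$

An event with probability $1/8$, and its complement with probability $7/8$:

$$I = -\log_2 2^{-3} = 3 \text{ bit}$$

$$I = -\log_2 \frac{7}{8} = -\log_2 7 + \log_2 8 = -2.807 + 3 = 0.193 \text{ bit}$$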
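The examples above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `self_information` is an illustrative choice, not something defined in the slides.

```python
import math


def self_information(p: float) -> float:
    """Return I(a) = log2(1 / p(a)) in bits for an event of probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    return math.log2(1.0 / p)


if __name__ == "__main__":
    # Probabilities taken from the examples: 1/2, 1/8, 7/8, and a certain event.
    for label, p in [("1/2", 1 / 2), ("1/8", 1 / 8), ("7/8", 7 / 8), ("1", 1.0)]:
        print(f"p = {label}: I = {self_information(p):.3f} bit")
```

Rare events carry more information: the event with probability 1/8 yields 3 bit, while its complement yields only about 0.193 bit.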
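Additivity
If event $E$ is the joint occurrence of two independent events $E_1$ and $E_2$, then

$$I(E) = I(E_1) + I(E_2)$$

Example: one card is drawn from a deck of 52. For the ace (A) of a given suit,

$$I(\text{suit and A}) = \log_2 52 \approx 5.7 \text{ bit}$$

$$I(\text{suit}) = \log_2 4 = 2 \text{ bit}, \qquad I(\text{A}) = \log_2 13 \approx 3.7 \text{ bit}$$

so $I(\text{suit and A}) = I(\text{suit}) + I(\text{A})$.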
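The card example can be checked numerically. The sketch below assumes a standard deck of 4 suits and 13 ranks drawn uniformly; the variable names are illustrative.

```python
import math


def self_information(p: float) -> float:
    """Self-information log2(1 / p) in bits, for 0 < p <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    return math.log2(1.0 / p)


# One card drawn uniformly from 52 cards (assumed: 4 suits x 13 ranks).
p_suit = 1 / 4             # probability of a given suit
p_rank = 1 / 13            # probability of a given rank, e.g. the ace
p_card = p_suit * p_rank   # suit and rank are independent, so p = 1/52

i_suit = self_information(p_suit)
i_rank = self_information(p_rank)
i_card = self_information(p_card)

if __name__ == "__main__":
    print(f"I(suit) = {i_suit:.3f} bit")
    print(f"I(rank) = {i_rank:.3f} bit")
    print(f"I(card) = {i_card:.3f} bit")
```

Because the logarithm turns a product of probabilities into a sum, the information of the specific card equals the sum of the suit and rank informations.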
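Logarithm formulas
1. $\log_a b = c \iff b = a^c$ (definition)
2. $\log_a b = \dfrac{\log_{10} b}{\log_{10} a}$
3. $\log_a (xy) = \log_a x + \log_a y$
4. $\log_a \dfrac{x}{y} = \log_a x - \log_a y$
5. $\log_a x^y = y \log_a x$
6. $\log_a \dfrac{1}{x} = -\log_a x$ (formula 5 with $y = -1$)

To compute $\log_2 x$ from common logarithms, use formula 2:

$$\log_2 x = \frac{\log_{10} x}{\log_{10} 2} = \frac{\log_{10} x}{0.3010} \approx 3.322 \log_{10} x$$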
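As a quick check of the change-of-base formula, the following sketch computes $\log_2 x$ from $\log_{10} x$ and compares it with Python's built-in `math.log2`. The helper name `log2_via_log10` is hypothetical.

```python
import math


def log2_via_log10(x: float) -> float:
    """Compute log2(x) with the change-of-base formula log10(x) / log10(2)."""
    if x <= 0:
        raise ValueError("x must be positive")
    return math.log10(x) / math.log10(2.0)


if __name__ == "__main__":
    print(f"1 / log10(2) = {1.0 / math.log10(2.0):.4f}")
    for x in (2.0, 7.0, 8.0, 52.0):
        print(f"log2({x:g}) = {log2_via_log10(x):.4f} (math.log2: {math.log2(x):.4f})")
```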
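Average information
Let the set of events (the information source) be $A = \{a_1, a_2, \ldots, a_n\}$, where each of the $n$ events $a_i$ occurs with probability $p(a_i)$ and carries self-information $I(a_i)$. The average information $H(A)$ is the expected value of the self-information:

$$H(A) = \sum_{i=1}^{n} p(a_i) I(a_i) = -\sum_{i=1}^{n} p(a_i) \log_2 p(a_i)$$

Writing $p_i$ for $p(a_i)$,

$$H(A) = -\sum_{i=1}^{n} p_i \log_2 p_i \quad \text{[bit]}$$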
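The average information is bounded:

$$0 \le H(A) \le \log_2 n \quad \text{[bit]}$$

$H(A) = 0$ when one event $a_i$ of $A$ is certain, $p(a_i) = 1$, and all other events have probability 0. $H(A) = \log_2 n$ when all events of $A$ are equally likely, $p(a_i) = 1/n$.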
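Example: four events with probabilities $1/4$, $1/2$, $1/4$, and $0$.

$$H(A) = -\sum_{i=1}^{4} p_i \log_2 p_i = -\frac{1}{4}\log_2\frac{1}{4} - \frac{1}{2}\log_2\frac{1}{2} - \frac{1}{4}\log_2\frac{1}{4} - 0 \log_2 0 = \frac{2}{4} + \frac{1}{2} + \frac{2}{4} + 0 = 1.5 \text{ bit}$$

The term $0 \log_2 0$ is taken to be 0 because $x \log_2 x \to 0$ as $x \to 0$.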
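The average information can be sketched in Python as follows. The function name `entropy` and its input validation are illustrative assumptions; zero-probability events are skipped, which implements the convention that $0 \log_2 0 = 0$.

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Average information H = -sum(p_i * log2 p_i) in bits.

    Events with p_i == 0 are skipped, i.e. 0 * log2(0) is treated as 0.
    """
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


if __name__ == "__main__":
    print(entropy([1 / 4, 1 / 2, 1 / 4, 0.0]))  # the four-event example
    print(entropy([1.0, 0.0, 0.0, 0.0]))        # lower bound: a certain event
    print(entropy([1 / 4] * 4))                 # upper bound: log2(4)
```

The last two calls illustrate the bounds: a certain event gives the minimum, and four equally likely events give the maximum $\log_2 4$.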
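Entropy
In statistical mechanics, entropy is written in the form

$$H = -K \sum_{k} n_k \ln n_k$$

where $K$ is a constant and $n_k$ is the weight of state $k$. The average information has the same form,

$$H = -\sum_{i} p_i \log_2 p_i$$

and is therefore also called entropy.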
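Example: three outcomes with probabilities 40%, 30%, and 30%.

$$H = -0.4 \log_2 0.4 - 0.3 \log_2 0.3 - 0.3 \log_2 0.3 = 1.57 \text{ bit}$$

If one outcome has probability 100%:

$$H = -1.0 \log_2 1.0 - 0 \log_2 0 - 0 \log_2 0 = 0 \text{ bit}$$

When an outcome is 100% certain, there is no uncertainty and the entropy is 0.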
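The two forms of entropy differ only in the base of the logarithm. The sketch below, with a hypothetical `base` parameter, computes the 40/30/30 example in bits and in natural units, and shows that dividing the natural-log value by $\ln 2$ (that is, choosing $K = 1/\ln 2$, under the assumption that the weights $n_k$ are probabilities) recovers the value in bits.

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float], base: float = 2.0) -> float:
    """Entropy -sum(p * log_base(p)); zero-probability terms contribute 0."""
    return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)


probs = [0.4, 0.3, 0.3]
h_bits = entropy(probs)                # base-2 logarithm, in bits
h_nats = entropy(probs, base=math.e)   # natural logarithm

if __name__ == "__main__":
    print(f"H = {h_bits:.2f} bit")
    print(f"H (natural log) / ln 2 = {h_nats / math.log(2.0):.2f} bit")
    print(f"H for a certain outcome = {entropy([1.0, 0.0, 0.0]):.2f} bit")
```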
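Maximum entropy (two events)
Consider the source

$$A = \begin{pmatrix} a_1 & a_2 \\ p_1 & p_2 \end{pmatrix}, \qquad p_1 + p_2 = 1$$

with entropy $H = -p_1 \log_2 p_1 - p_2 \log_2 p_2$. To maximize $H$ under the constraint $p_1 + p_2 = 1$, use the method of Lagrange multipliers:

$$L = -p_1 \log_2 p_1 - p_2 \log_2 p_2 + \lambda (1 - p_1 - p_2)$$

$$\frac{\partial L}{\partial p_i} = -\log_2 p_i - \log_2 e - \lambda = 0 \quad (i = 1, 2), \qquad \frac{\partial L}{\partial \lambda} = 1 - p_1 - p_2 = 0$$

The first condition gives $\log_2 p_1 = \log_2 p_2$, so $p_1 = p_2 = 1/2$ and

$$H_{\max} = \log_2 2 = 1 \text{ bit}$$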
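Method of Lagrange multipliers
To find an extremum of $f(\mathbf{x})$ under the constraint $g(\mathbf{x}) = 0$, introduce a multiplier $\lambda$ and define

$$L = f(\mathbf{x}) - \lambda g(\mathbf{x})$$

Then solve

$$\nabla_{\mathbf{x}} L = \nabla f - \lambda \nabla g = 0, \qquad \frac{\partial L}{\partial \lambda} = 0$$

These are $d + 1$ equations in the $d + 1$ unknowns $x_1, \ldots, x_d, \lambda$.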
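[Figure: contours $f(\mathbf{x}) = \text{const.}$ and the constraint curve $g(\mathbf{x}) = 0$, with the gradients $\nabla f$ and $\nabla g$. At the extremum of $f(\mathbf{x})$ on the curve $g(\mathbf{x}) = 0$, the gradients are parallel: $\nabla f = \lambda \nabla g$.]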
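Maximum entropy ($n$ events)
The two-event result extends to $n$ events. For

$$A = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \\ p_1 & p_2 & \cdots & p_n \end{pmatrix}, \qquad H = -\sum_{i=1}^{n} p_i \log_2 p_i$$

proceed as in the two-event case:

$$L = -\sum_{i=1}^{n} p_i \log_2 p_i + \lambda \left( 1 - \sum_{i=1}^{n} p_i \right)$$

$$\frac{\partial L}{\partial p_i} = -\log_2 p_i - \log_2 e - \lambda = 0, \qquad \frac{\partial L}{\partial \lambda} = 1 - \sum_{i=1}^{n} p_i = 0$$

Hence $p_1 = p_2 = \cdots = p_n = 1/n$ and

$$H_{\max} = -\sum_{i=1}^{n} \frac{1}{n} \log_2 \frac{1}{n} = \log_2 n \text{ bit}$$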
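Examples of maximum entropy
Six equally likely outcomes:

$$H_{\max} = -\sum_{i=1}^{6} \frac{1}{6} \log_2 \frac{1}{6} = \log_2 6 = 2.585 \text{ bit}$$

27 equally likely symbols (A to Z plus one more symbol):

$$H_{\max} = -\sum_{i=1}^{27} \frac{1}{27} \log_2 \frac{1}{27} = \log_2 27 = 4.755 \text{ bit}$$

1945 equally likely symbols:

$$H_{\max} = -\sum_{i=1}^{1945} \frac{1}{1945} \log_2 \frac{1}{1945} = \log_2 1945 = 10.925 \text{ bit}$$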
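The claim that the uniform distribution maximizes entropy can be spot-checked numerically. The sketch below is not a proof: it compares $\log_2 n$ with the entropy of randomly generated distributions. The helper `random_distribution` and the fixed seed are illustrative assumptions.

```python
import math
import random
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Entropy in bits; zero-probability terms contribute 0."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


def random_distribution(n: int, rng: random.Random) -> List[float]:
    """Draw n random positive weights and normalize them to sum to 1."""
    weights = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def max_random_entropy(n: int, trials: int, rng: random.Random) -> float:
    """Largest entropy observed among `trials` random distributions."""
    return max(entropy(random_distribution(n, rng)) for _ in range(trials))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    for n in (2, 6, 27):
        uniform = entropy([1.0 / n] * n)
        best = max_random_entropy(n, 1000, rng)
        print(f"n = {n}: log2 n = {math.log2(n):.3f}, "
              f"uniform = {uniform:.3f}, best random = {best:.3f}")
```

In every trial the random distributions stay at or below $\log_2 n$, which the uniform distribution attains.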
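Entropy function
For two events, $H = -p_1 \log_2 p_1 - p_2 \log_2 p_2$. Setting $p_1 = p$ and $p_2 = 1 - p$ gives the entropy function

$$H(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)$$

[Figure: $H(p)$ plotted against $p$.]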
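Example: in source A one outcome has probability 0.6 (the other 0.4); in source B one outcome has probability 0.9 (the other 0.1).

$$H(A) = H(0.6) = H(0.4) \approx 0.971 \text{ bit}$$

$$H(B) = H(0.9) = H(0.1) \approx 0.469 \text{ bit}$$

The outcome of B is easier to predict than that of A, so B has the lower entropy.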
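A short sketch of the entropy function, useful for checking the two values above and the symmetry $H(p) = H(1 - p)$. The name `binary_entropy` is an illustrative choice.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy function H(p) = -p log2 p - (1 - p) log2 (1 - p), in bits."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must satisfy 0 <= p <= 1")
    # Terms with probability 0 are skipped (0 * log2 0 is treated as 0).
    return sum(q * math.log2(1.0 / q) for q in (p, 1.0 - p) if q > 0)


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.4, 0.5, 0.6, 0.9, 1.0):
        print(f"H({p:.1f}) = {binary_entropy(p):.3f} bit")
```

The function is 0 at $p = 0$ and $p = 1$, symmetric about $p = 1/2$, and reaches its maximum of 1 bit at $p = 1/2$.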
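Joint entropy
Consider two sources

$$A = \begin{pmatrix} a_1 & a_2 \\ p(a_1) & p(a_2) \end{pmatrix}, \qquad B = \begin{pmatrix} b_1 & b_2 \\ p(b_1) & p(b_2) \end{pmatrix}$$

Observing A and B together defines the joint source $AB$:

$$AB = \begin{pmatrix} (a_1, b_1) & (a_1, b_2) & (a_2, b_1) & (a_2, b_2) \\ p_{11} & p_{12} & p_{21} & p_{22} \end{pmatrix}$$

where $(a_i, b_j)$ is the event that $a_i$ and $b_j$ both occur and $p_{ij} = p(a_i b_j)$. The joint entropy of $AB$ is

$$H(AB) = -\sum_{i} \sum_{j} p_{ij} \log_2 p_{ij}$$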
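Conditional entropy
Expand $H(AB)$ using $p(a_i b_j) = p(a_i) p(b_j \mid a_i)$:

$$
\begin{aligned}
H(AB) &= -\sum_{i} \sum_{j} p(a_i b_j) \log_2 p(a_i b_j) \\
&= -\sum_{i} \sum_{j} p(a_i) p(b_j \mid a_i) \log_2 \left[ p(a_i) p(b_j \mid a_i) \right] \\
&= -\sum_{i} \sum_{j} p(a_i) p(b_j \mid a_i) \left\{ \log_2 p(a_i) + \log_2 p(b_j \mid a_i) \right\} \\
&= -\sum_{i} \sum_{j} p(a_i) p(b_j \mid a_i) \log_2 p(a_i) - \sum_{i} \sum_{j} p(a_i) p(b_j \mid a_i) \log_2 p(b_j \mid a_i) \\
&= -\sum_{i} p(a_i) \log_2 p(a_i) \sum_{j} p(b_j \mid a_i) - \sum_{i} p(a_i) \sum_{j} p(b_j \mid a_i) \log_2 p(b_j \mid a_i)
\end{aligned}
$$

Since $\sum_{j} p(b_j \mid a_i) = 1$, the first term equals $H(A)$.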
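In the second term of $H(AB)$, the quantity $-\sum_{j} p(b_j \mid a_i) \log_2 p(b_j \mid a_i)$ is the entropy of $b_j$ given that $a_i$ has occurred. Averaging it over $a_i$ gives the conditional entropy $H(B \mid A)$:

$$H(B \mid A) = -\sum_{i} p(a_i) \sum_{j} p(b_j \mid a_i) \log_2 p(b_j \mid a_i)$$

The relation between $H(B \mid A)$ and $H(AB)$ is therefore

$$H(AB) = H(A) + H(B \mid A)$$

and, because $H(AB) = H(BA)$,

$$H(AB) = H(B) + H(A \mid B)$$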
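The chain rule $H(AB) = H(A) + H(B \mid A)$ can be verified on a small joint table. In the sketch below the 2 x 2 joint probabilities are made-up values for illustration, and all function names are assumptions; the conditional entropy is computed directly from its definition rather than from the chain rule.

```python
import math
from typing import List

Table = List[List[float]]  # joint[i][j] = p(a_i, b_j)


def plog(p: float) -> float:
    """Return -p * log2(p), with the convention 0 * log2(0) = 0."""
    return p * math.log2(1.0 / p) if p > 0 else 0.0


def joint_entropy(joint: Table) -> float:
    """H(AB) = -sum_ij p_ij log2 p_ij."""
    return sum(plog(p) for row in joint for p in row)


def marginal_entropy_a(joint: Table) -> float:
    """H(A), using p(a_i) = sum_j p_ij."""
    return sum(plog(sum(row)) for row in joint)


def conditional_entropy_b_given_a(joint: Table) -> float:
    """H(B|A) = -sum_i p(a_i) sum_j p(b_j|a_i) log2 p(b_j|a_i)."""
    total = 0.0
    for row in joint:
        p_a = sum(row)
        if p_a > 0:
            total += p_a * sum(plog(p / p_a) for p in row)
    return total


def transpose(joint: Table) -> Table:
    """Swap the roles of A and B."""
    return [list(col) for col in zip(*joint)]


# Hypothetical joint distribution (rows: a_1, a_2; columns: b_1, b_2).
joint = [[0.4, 0.1],
         [0.2, 0.3]]

if __name__ == "__main__":
    h_ab = joint_entropy(joint)
    h_a = marginal_entropy_a(joint)
    h_b_given_a = conditional_entropy_b_given_a(joint)
    print(f"H(AB) = {h_ab:.4f}, H(A) + H(B|A) = {h_a + h_b_given_a:.4f}")
```

Applying the same functions to `transpose(joint)` gives $H(B)$ and $H(A \mid B)$, which checks the second form of the chain rule.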
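Shannon's fundamental inequality

$$H(A \mid B) \le H(A), \qquad H(B \mid A) \le H(B)$$

Knowing the outcome of one source never increases the uncertainty about the other. Combined with $H(AB) = H(A) + H(B \mid A)$, this gives

$$H(AB) = H(A) + H(B \mid A) \le H(A) + H(B)$$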
In $H(AB) = H(A) + H(B \mid A) \le H(A) + H(B)$, equality holds if and only if A and B are independent. Altogether,

$$0 \le H(A \mid B) \le H(A) \le H(AB)$$
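These inequalities can be spot-checked on random joint distributions. The sketch below is a numerical illustration, not a proof; the helper names and the random tables are assumptions, and $H(A \mid B)$ is obtained from the chain rule as $H(AB) - H(B)$.

```python
import math
import random
from typing import List, Sequence, Tuple

Table = List[List[float]]  # joint[i][j] = p(a_i, b_j)


def entropy(probs: Sequence[float]) -> float:
    """Entropy in bits; zero-probability terms contribute 0."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


def entropies(joint: Table) -> Tuple[float, float, float]:
    """Return (H(A), H(B), H(AB)) for a joint probability table."""
    h_a = entropy([sum(row) for row in joint])
    h_b = entropy([sum(col) for col in zip(*joint)])
    h_ab = entropy([p for row in joint for p in row])
    return h_a, h_b, h_ab


def random_joint(rows: int, cols: int, rng: random.Random) -> Table:
    """Random joint table with positive entries normalized to sum to 1."""
    w = [[rng.random() + 1e-12 for _ in range(cols)] for _ in range(rows)]
    total = sum(sum(row) for row in w)
    return [[x / total for x in row] for row in w]


def satisfies_inequalities(joint: Table, tol: float = 1e-12) -> bool:
    """Check 0 <= H(A|B) <= H(A) <= H(AB) <= H(A) + H(B)."""
    h_a, h_b, h_ab = entropies(joint)
    h_a_given_b = h_ab - h_b  # chain rule: H(AB) = H(B) + H(A|B)
    return (-tol <= h_a_given_b <= h_a + tol
            and h_a <= h_ab + tol
            and h_ab <= h_a + h_b + tol)


if __name__ == "__main__":
    rng = random.Random(0)
    print(all(satisfies_inequalities(random_joint(3, 4, rng)) for _ in range(1000)))

    # Independent sources: p(a_i, b_j) = p(a_i) p(b_j), so equality should hold.
    pa, pb = [0.6, 0.4], [0.9, 0.1]
    h_a, h_b, h_ab = entropies([[x * y for y in pb] for x in pa])
    print(f"H(AB) = {h_ab:.4f}, H(A) + H(B) = {h_a + h_b:.4f}")
```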
[Figure: diagram relating $H(AB)$, $H(A \mid B)$, $H(B \mid A)$, $H(A)$, and $H(B)$.]

$$H(AB) = H(A) + H(B \mid A) = H(B) + H(A \mid B)$$
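Exercises
5.1 Four outcomes have probabilities 45%, 35%, 12%, and 8%. Find the entropy $H$. [...] 2 [...]
5.2 [...] 48 [...] 3 [...]
5.3 [...] A [...] 3 [...] 75% [...] A [...] 3 [...] 30% [...]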