Progress of deep learning: image recognition (ImageNet classification), speech recognition, natural language processing, machine translation. In these fields, deep learning shows a particularly overwhelming advantage.
[Figure 4, cited from A. Krizhevsky et al., "ImageNet Classification with Deep Convolutional Neural Networks": (Left) Eight ILSVRC-2010 test images and the five labels considered most probable by the model. The correct label is written under each image, with a red bar if it happens to be in the top 5. (Right) Test images and the six training images that produce the most similar features. https://www.nvidia.cn/content/tesla/pdf/machine-learning/imagenet-classification-with-deep-convolutional-nn.pdf]
A neural network realizes a function f : R^n → R^m.
Writing its trainable parameters as Θ = {W_1, b_1, W_2, b_2, …}, the network is a parameterized map f_Θ : R^n → R^m.
Each layer transforms its input h into h′ = g(Wh + b), where W is a weight matrix, b is a bias vector, and g is a nonlinear activation function.
For an input x, the network outputs ŷ = f_Θ(x); training compares ŷ with the true label y through a loss function loss(y, ŷ).
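As a concrete illustration, here is a minimal sketch of such an f_Θ with one hidden layer in NumPy; the layer sizes, random initialization, and ReLU activation are illustrative choices, not values from the slides.

```python
import numpy as np

def relu(x):
    # g(h) = max(h, 0), applied elementwise
    return np.maximum(x, 0.0)

# Illustrative sizes and random parameters Theta = {W1, b1, W2, b2}
rng = np.random.default_rng(0)
n, hidden, m = 4, 8, 2
W1, b1 = rng.normal(size=(hidden, n)), np.zeros(hidden)
W2, b2 = rng.normal(size=(m, hidden)), np.zeros(m)

def f_theta(x):
    # one hidden layer h' = g(Wh + b), then a linear output layer
    h = relu(W1 @ x + b1)
    return W2 @ h + b2

y_hat = f_theta(rng.normal(size=n))  # an output vector in R^m
```

With zero biases, the zero input maps to the zero output, which gives a quick sanity check of the shapes.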
Training data: D = {(x_1, y_1), (x_2, y_2), …, (x_T, y_T)}
Mini-batch: B = {(x_{b1}, y_{b1}), (x_{b2}, y_{b2}), …, (x_{bK}, y_{bK})}
Mini-batch loss: G_B(Θ) = (1/K) Σ_{k=1}^{K} loss(y_{bk}, f_Θ(x_{bk}))
Step 1 (initialization): Θ := Θ_0
Step 2 (mini-batch selection): randomly draw a mini-batch B from D
Step 3 (gradient computation): g := ∇G_B(Θ)
Step 4 (parameter update): Θ := Θ − αg, where α is the learning rate
Step 5: go back to Step 2
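Steps 1 through 5 can be sketched for the simplest possible case, a linear model trained with squared loss on synthetic data; the model, data, batch size K, and learning rate α below are illustrative assumptions.

```python
import numpy as np

# Synthetic training data D = {(x_t, y_t)} for a linear model y = w.x
rng = np.random.default_rng(0)
T, n = 200, 3
w_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(T, n))
Y = X @ w_true

theta = np.zeros(n)            # Step 1: Theta := Theta_0
alpha, K = 0.1, 16             # learning rate alpha, mini-batch size K
for step in range(300):
    idx = rng.choice(T, size=K)        # Step 2: sample a mini-batch B
    err = X[idx] @ theta - Y[idx]
    g = X[idx].T @ err / K             # Step 3: gradient of G_B (squared loss / 2)
    theta = theta - alpha * g          # Step 4: Theta := Theta - alpha * g
# Step 5 is the loop itself: repeat from Step 2
```

Because the labels are noiseless, the iterate converges to w_true; with real data the mini-batch gradient is only a noisy estimate of the full gradient.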
β_{j→i} = λ_j + Σ_{k ∈ B(j)\i} α_{k→j}
α_{i→j} = 2 tanh⁻¹( Π_{k ∈ A(i)\j} tanh(β_{k→i}/2) )
Weighted version with trainable weights w_{k,j}:
β_{j→i} = λ_j + Σ_{k ∈ B(j)\i} w_{k,j} α_{k→j}
α_{i→j} = 2 tanh⁻¹( Π_{k ∈ A(i)\j} tanh(β_{k→i}/2) )
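The two message updates can be written as small helper functions; here the neighborhoods B(j)\i and A(i)\j are represented simply as arrays of incoming message values (an illustrative encoding, not a full decoder).

```python
import numpy as np

def variable_to_check(lam_j, incoming_alphas):
    # beta_{j->i} = lambda_j + sum of alpha_{k->j} over k in B(j)\{i}
    return lam_j + np.sum(incoming_alphas)

def check_to_variable(incoming_betas):
    # alpha_{i->j} = 2 atanh( prod of tanh(beta_{k->i}/2) over k in A(i)\{j} )
    b = np.asarray(incoming_betas, dtype=float)
    return 2.0 * np.arctanh(np.prod(np.tanh(b / 2.0)))

beta = variable_to_check(0.5, np.array([1.0, 2.0]))   # 3.5
alpha = check_to_variable([1.2, -0.7, 2.0])           # negative, small magnitude
```

The check-node output takes the sign of the product of incoming messages and a magnitude no larger than the weakest incoming message, as expected for sum-product decoding.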
Setup: y = Ax + w, where x ∈ R^N is the sparse signal to be recovered, A ∈ R^{M×N} is the measurement matrix, w is additive noise, and y ∈ R^M is the observation.
[Network: input layer → hidden layer → output layer.]
Fig. 2. Sparse signal recovery for a 6-sparse vector (top: the original sparse signal x; bottom: the output y = Φ_θ(x) from the trained neural network; n = 256, m = 120).
Lasso formulation: x̂ = argmin_x ‖y − Ax‖₂² + λ‖x‖₁
ISTA recursion:
r_t = s_t + βAᵀ(y − As_t)
s_{t+1} = η(r_t; τ)
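The ISTA recursion maps directly to NumPy; the soft-threshold shrinkage η, the step size β ≤ 1/‖A‖², and the toy 3-sparse test signal below are illustrative choices.

```python
import numpy as np

def soft_threshold(r, tau):
    # eta(r; tau) = sign(r) * max(|r| - tau, 0), applied elementwise
    return np.sign(r) * np.maximum(np.abs(r) - tau, 0.0)

def ista(y, A, beta, tau, n_iter):
    s = np.zeros(A.shape[1])
    for _ in range(n_iter):
        r = s + beta * A.T @ (y - A @ s)   # gradient step on ||y - As||^2
        s = soft_threshold(r, tau)         # shrinkage step
    return s

# Toy problem: recover a 3-sparse x from M = 25 noiseless measurements
rng = np.random.default_rng(0)
N, M = 50, 25
A = rng.normal(size=(M, N)) / np.sqrt(M)
x = np.zeros(N); x[[3, 17, 40]] = [1.0, -1.5, 2.0]
y = A @ x
beta = 1.0 / np.linalg.norm(A, 2) ** 2     # step size for stable iterations
x_hat = ista(y, A, beta, tau=0.01, n_iter=500)
```

With a small threshold and noiseless measurements, the three nonzero entries are recovered up to the usual soft-threshold bias.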
LISTA recursion (B, S, and τ_t are trainable):
r_t = Bs_t + Sy
s_{t+1} = η(r_t; τ_t)
TISTA recursion:
r_t = s_t + γ_t W(y − As_t)
s_{t+1} = η_MMSE(r_t; τ_t²)
v_t² = max{ (‖y − As_t‖₂² − Mσ²) / trace(AᵀA), ϵ }
τ_t² = (v_t²/N)(N + (γ_t² − 2γ_t)M) + (γ_t²σ²/N) trace(WWᵀ)
[Plot: output of the MMSE estimator η_MMSE over inputs in [−3, 3].]
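The two variance estimates v_t² and τ_t² translate directly into code; this is only a sketch of the formulas themselves, with arbitrary small matrices standing in for A, W, and the state s_t.

```python
import numpy as np

def tista_variances(y, A, W, s_t, gamma_t, sigma2, eps=1e-9):
    # v_t^2   = max{ (||y - A s_t||^2 - M sigma^2) / trace(A^T A), eps }
    # tau_t^2 = (v_t^2 / N)(N + (gamma_t^2 - 2 gamma_t) M)
    #           + (gamma_t^2 sigma^2 / N) trace(W W^T)
    M, N = A.shape
    res = np.linalg.norm(y - A @ s_t) ** 2
    v2 = max((res - M * sigma2) / np.trace(A.T @ A), eps)
    tau2 = (v2 / N) * (N + (gamma_t ** 2 - 2.0 * gamma_t) * M) \
        + (gamma_t ** 2 * sigma2 / N) * np.trace(W @ W.T)
    return v2, tau2

# Tiny sanity check with A = W = I, sigma^2 = 0, gamma_t = 0.5
v2, tau2 = tista_variances(np.array([1.0, 0.0]), np.eye(2), np.eye(2),
                           np.zeros(2), 0.5, 0.0)
```

For A = W = I, σ² = 0, and γ_t = 0.5, the formulas give v² = 0.5 and τ² = 0.125 by hand, matching the code.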
The goal is to reconstruct x from the observation y.
Number of trainable parameters for T iterations:
  TISTA: T
  LISTA: T(N² + MN + 1)
  LAMP:  T(NM + 2)
[2] M. Borgerding and P. Schniter, "Onsager-corrected deep learning for sparse linear inverse problems," 2016 IEEE Global Conf. Signal and Inf. Process. (GlobalSIP), Washington, DC, Dec. 2016, pp. 227-231.
[Plot: NMSE (dB) of TISTA, LISTA and AMP versus iteration (2-16); A_{i,j} ~ N(0, 1/M), N = 500, M = 250, SNR = 40 dB.]
[Plot: three sequences of learned parameters γ_t (TISTA1, TISTA2, TISTA3) versus iteration (0-10); A_{i,j} ~ N(0, 1/M), N = 500, M = 250, p = 0.1, SNR = 40 dB.]
[Plot: NMSE (dB) of TISTA and LISTA versus iteration (2-16); N = 500, M = 250, p = 0.1, A_{i,j} ∈ {−1, +1}, SNR = 40 dB.]
MIMO setup: y = Hx + w, with N transmit antennas, M receive antennas, channel matrix H ∈ R^{M×N}, transmitted symbol vector x, and noise w.
The TI-detector uses update formulas that are based on those of ISTA:
r_t = s_t + γ_t W(y − Hs_t)
s_{t+1} = tanh(r_t / θ_t)
Fig. 1. The t-th layer of the TI-detector (γ_t and θ_t are the trainable parameters).
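A sketch of this recursion for BPSK symbols x ∈ {−1, +1}^N; here W = Hᵀ and the fixed values of γ and θ are untrained stand-ins for the learned per-iteration parameters γ_t, θ_t, and the noiseless square channel is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 8, 8
H = rng.normal(size=(M, N)) / np.sqrt(M)
x = rng.choice([-1.0, 1.0], size=N)
y = H @ x                        # noiseless observation, for illustration

s = np.zeros(N)
W = H.T                          # stand-in for the trainable matrix W
gamma, theta = 0.1, 0.5          # stand-ins for learned gamma_t, theta_t
for t in range(100):
    r = s + gamma * W @ (y - H @ s)   # linear estimation step
    s = np.tanh(r / theta)            # soft projection toward {-1, +1}
x_hat = np.sign(s)               # hard symbol decision after T iterations
```

The tanh keeps every iterate inside [−1, 1], which is what makes the nonlinearity act as a soft projection onto the BPSK constellation.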
[Plot: BER versus SNR per receive antenna (0-25 dB) for TI-detector (T = 50), MMSE, and IW-SOAV (L = 1, 2, 5; K_itr = 50).]
Fig. 3. BER performance. Source: R. Hayakawa and K. Hayashi, "Convex optimization-based signal detection for massive overloaded MIMO systems," IEEE Trans. Wireless Commun., vol. 16, no. 11, pp. 7080-7091, Nov. 2017.
[Plot: learned parameter sequences γ_t (top) and θ_t (bottom) versus index t = 1, …, 50.]
[Network: y → W_1y + b_1 → ReLU → h_1 → … → W_ih_{i−1} + b_i → h_i → … → W_Th_{T−1} + b_T → αỹ, followed by the soft staircase function f(·; S, σ²).]
[Plot: f(r; S, σ²) for σ² = 0.0, 0.1, 0.5 over r ∈ [−2, 2]; σ² = 0 gives the hard staircase.]
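As an illustration of how a "soft staircase" can behave, the sketch below sums shifted tanh steps at uniformly spaced thresholds, with σ² controlling the softness and σ² = 0 reducing to hard sign steps. This construction is an assumption for illustration only, not necessarily the exact definition of f(·; S, σ²) used in the slides.

```python
import numpy as np

def soft_staircase(r, thresholds, sigma2, amp=0.25):
    # Sum of steps at the given thresholds; NOTE: illustrative smoothing,
    # not necessarily the slides' exact f(r; S, sigma^2).
    r = np.asarray(r, dtype=float)
    out = np.zeros_like(r)
    for s in thresholds:
        if sigma2 == 0.0:
            out = out + amp * np.sign(r - s)             # hard staircase limit
        else:
            out = out + amp * np.tanh((r - s) / sigma2)  # smoothed step
    return out

S = np.arange(-1.75, 2.0, 0.5)      # 8 uniformly spaced thresholds
hard = soft_staircase(np.array([0.0, 1.0]), S, 0.0)
soft = soft_staircase(np.array([0.0, 1.0]), S, 0.5)
```

Because tanh is differentiable for σ² > 0, this smoothed form allows gradients to flow through the quantizer during training, which is the point of softening the staircase.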
[Plot (left): bit error rate versus SNR (1.0-2.8 dB) for PEGReg252x504 and PEGReg504x1008 codes, comparing Max steps = 25, 100, 500 and no quantization. Plot (right): the learned quantization function y versus x over [−2, 2].]
Deep generative models: GAN, VAE, autoregressive models (NADE, WaveNet), flow-based models (NICE, Glow).
Computation relies on GPUs, which far outperform CPUs for deep learning workloads.
NVIDIA TESLA GPU (source: http://www.nvidia.co.jp/object/tesla-servers-jp.html); Google Tensor Processing Unit (TPU) (source: http://itpro.nikkeibp.co.jp/atcl/ncd/14/457163/052001464/)