A Data-Driven Implementation of Shape Adaptive DCT

1010431

February 5, 2001
Abstract

A Data-Driven Implementation of Shape Adaptive DCT

MASAKAZU HASHIMOTO

In recent years, picture compression and decompression systems such as MPEG, H.261, and H.263 have become standards. We are also improving the performance of the DDMP (Data-Driven Multimedia Processor) as a one-chip processor for image processing [1]. The DDMP is a programmable device, and it is flexible enough for the various application tools of MPEG4. This research proposes a data-driven parallel processing method for SA-DCT, which is one of the heavy tasks in image compression and decompression. The goal of this paper is to optimize the DDMP for MPEG4 functions. In addition to the ordinary DCT, SA-DCT requires shift operations on pixels. The proposed method solves some problems in the current DDMP instruction set so that the parallelism of SA-DCT can be exploited. To that end, the bottleneck of SA-DCT is investigated and a solution is examined. A speedup through the addition of new instructions is proposed, and its feasibility is shown by a description of simple hardware mechanisms. The newly added instructions should give more flexibility and can be applied in various fields. A rough estimate of the hardware cost is also described.

Key words: MPEG4, DDMP, data-driven, SA-DCT, parallel processing, instruction
1 MPEG H261,H263 TV DVD MPEG ISO/IEO (Moving Picture coding Expert Group) 1988 MPEG1MPEG2MPEG4 [2][3] MPEG1 CD-ROM MPEG2 MPEG4 TV MPEG4 MPEG4 MPEG4version2 MPEG4version2 3 MPEG4 1
MPEG4 MPEG1,2 DCT(Discrete Cosine Transform) 1 DCT () () DCT Shape Adaptive DCT (SA-DCT) SA-DCT MPEG4 SA-DCT DDMP (Data-Driven Multimedia Processors) 1 8600MOPS(Mega Operations Per Second) DDMP-4G MPEG4 VOP(Video Object Plane: ) 8 8 DCT DCT 1.1 DC AC 2
DCT 1.1 VOP VOP DCT 2 8 DC 8 1.1 1.2 VOP VOP VOP VOP VOP MPEG4-version2 DCT SA-DCT [4]SA-DCT DCT 3
DC DCT DCT DCT 0 VOP VOP VOP 1.2 SA-DCT 2 DCT 2 SA-DCT DDMP SA-DCT DDMP 3 DDMP DDMP-4G SA-DCT 4 3 4
2 SA-DCT 2.1 SA-DCT SA-DCT SA-DCT 8 8 2.1 3 1 () SA-DCT 3 1. VOP 2. DCT 3. 5
2.2 SA-DCT 2.1 SA-DCT 2.2 SA-DCT SA-DCT DCT SA-DCT DCT 2.2 VOP SA-DCT 1 DCT 1 DCT [5] DCT MPEG4 6
2.3 SA-DCT 188 DCT VOP + DCT + DCT 2.2 SA-DCT 2.3 SA-DCT 2.2 VOP SA-DCT 8 1 7
2.3 SA-DCT DCT 1 8 DCT DCT 2.3 1 1 1 DCT18 8 DCT DCT 8
2.3 SA-DCT 188 DCT VOP SA-DCT DCT DCT DCT + DCT SA-DCT DCT + DCT DCT DCT 2.3 SA-DCT 9
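The SA-DCT procedure of this chapter (shift the VOP pixels, apply a one-dimensional DCT along one direction, shift again, apply a one-dimensional DCT along the other direction) can be sketched as follows. This is a minimal sketch that assumes the commonly described form of SA-DCT: pixels are shifted to the top of each column, then to the left of each row, and each one-dimensional DCT has as many points as there are pixels. The function names `dct_1d` and `sa_dct`, the mask representation, and the orthonormal DCT-II scaling are illustrative assumptions, not taken from the thesis.

```python
import math


def dct_1d(values):
    """Orthonormal N-point DCT-II of a list of numbers, N = len(values)."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(v * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
                  for x, v in enumerate(values))
        out.append(scale * acc)
    return out


def sa_dct(block, mask):
    """Forward SA-DCT sketch for a square block (assumed layout).

    block[r][c] is a pixel value; mask[r][c] is truthy inside the VOP.
    Returns a list of coefficient rows whose lengths vary with the shape.
    """
    size = len(block)
    # Vertical pass: collecting only the VOP pixels of a column is the
    # "shift to the top"; the DCT length equals the pixel count.
    columns = []
    for c in range(size):
        col = [block[r][c] for r in range(size) if mask[r][c]]
        columns.append(dct_1d(col) if col else [])
    # Horizontal pass: collecting the r-th coefficient of every column that
    # has one is the "shift to the left"; again a variable-length DCT.
    rows = []
    for r in range(size):
        row = [col[r] for col in columns if len(col) > r]
        rows.append(dct_1d(row) if row else [])
    return rows
```

For a fully opaque block this reduces to an ordinary separable two-dimensional DCT. For an arbitrary shape the number of coefficients equals the number of pixels inside the VOP, and every row and column may need a DCT of a different length.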
2.4 DDMP 2.4 DDMP DDMP DDMP 1 8600MOPS DDMP 2 () 2 DDMP 1 2.4 DDMP x y z DDMP t 2.4 DDMP 10
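The data-driven principle behind the DDMP, in which an operation fires as soon as all of its input data have arrived, can be illustrated with a small sketch. The node table, the function name `run_dataflow`, and the two operations in the usage example are hypothetical. Only the names x, y, z, and t echo the program example of Figure 2.4, and the sketch does not model the DDMP pipeline itself.

```python
def run_dataflow(nodes, initial_tokens):
    """Fire every node once, as soon as all of its input tokens exist.

    nodes maps a node name to (function, input names, output name).
    initial_tokens maps a token name to its value.
    Returns the dictionary of all tokens produced.
    """
    tokens = dict(initial_tokens)
    fired = set()
    progress = True
    while progress:
        progress = False
        for name, (func, inputs, output) in nodes.items():
            ready = all(i in tokens for i in inputs)
            if name not in fired and ready:
                tokens[output] = func(*(tokens[i] for i in inputs))
                fired.add(name)
                progress = True
    return tokens


# Hypothetical graph: t = (x + y) * z. The listing order does not matter,
# because execution is driven by data availability alone.
example_nodes = {
    "mul": (lambda s, z: s * z, ["s", "z"], "t"),
    "add": (lambda x, y: x + y, ["x", "y"], "s"),
}
example_result = run_dataflow(example_nodes, {"x": 1, "y": 2, "z": 3})
```

Here `example_result["t"]` is 9: the multiplication is listed first, but it can only fire after the addition has produced its token.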
2.5 SA-DCT 2.5 SA-DCT DDMP DCT DDMP 2.3 DDMP-4G SA-DCT DDMP-4G 1 64 2.5 DDMP-4G OCP(Operation and Control Processor) DDMP-4G DCT PE MUL,INT(, ) IO() PE MUL(SYC) PE TBL ETM ( ) () INT ( ) GNT 2.5 DDMP-4G (OCP) DDMP 2 () 11
2.5 SA-DCT 1 1 /SA-DCT DCT DCT SA-DCT DCT DCT (N a N 2 ) 2 N 2 1 N 2 + N1N 2 2 [6] DCT SA-DCT SA-DCT DCT DCT [4] SA-DCT DCT DCT DCT 12
3 SA-DCT

3.1

DDMP-4G SA-DCT SA-DCT DDMP-4G DDMP-4G

Table 3.1 (DCT, SA-DCT)

DCT       1     2     3     4     5     6     7     8
DCT       1     1     1     1     1     1     1     1
SA-DCT    6.3   6.3   6.6   6.9   7.2   8.1   9.3   8.4

DCT SA-DCT 3.1 SA-DCT () DCT DCT SA-DCT DCT SA-DCT 6 10 2.5 DDMP-4G
3.2 3.2 SA-DCT (a) 1 SA-DCT SA-DCT (b) 2 DDMP DDMP 3 2 (c) 3 DDMP 1 1 1 and or 1. 2. 14
3.2 3.2.1 1 2.3 (a) DDMP DDMP-4G () DDMP-4G 1 DDMP 3.2.2 SA-DCT DCT DCT 8 1 1 (b),(c) 15
3.3 DDMP SA-DCT

3.3.1 (h-read, v-read)

(c) h-read, v-read

Figure 3.1: token := [line, pixel, data]. h-read (horizontally read) takes the inputs 0,0,0 and [i,j,n], reads n pixels (px) starting at (i,j), and outputs [i,j,a1], [i,j+1,a2], [i,j+2,a3], ..., [i,j+n-1,an]. v-read (vertically read) takes the inputs 0,0,0 and [i,j,n], reads n lines (ln) starting at (i,j), and outputs [i,j,a1], [i+1,j,a2], [i+2,j,a3], ..., [i+n-1,j,an].

• h-read: in Figure 3.1 (i = line, j = pixel), n values are read starting at position (data + j).
• v-read: v-read reads n values starting at position (data + i).

Figure 3.2 h-read: inputs (line, pixel, data) = (1,0,0) and (1,0,8); data + pixel = 0 + 0 = 0; the 8 values at positions 0 to 7 are read and output as (1,0,0x000000ff), (1,1,0x0000000f), ..., (1,7,0xffffffff).

In Figure 3.2, (1,0,0) = (line, pixel, data). Given (1,0,8), h-read reads the 8 values from 0 to 7.

Figure 3.3 v-read: inputs (line, pixel, data) = (1,0,0) and (1,0,8); data + line = 0 + 1 = 1; the 8 values at positions 1 to 8 are read and output as (1,0,0x000000ff), (2,0,0x00000fff), ..., (8,0,0x00000001).

In Figure 3.3, v-read reads 8 values starting from line 1, that is, 1 to 8 rather than 0 to 7.
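The behavior of h-read and v-read in Figures 3.2 and 3.3 can be mimicked in software as follows. This is a sketch only: the flat list used as memory, the function signatures, and the memory contents are assumptions made for illustration. The start positions (data + pixel for h-read, data + line for v-read) and the token layout (line, pixel, data) follow the figures.

```python
def h_read(memory, base, count):
    """Horizontally read n values; returns tokens (line, pixel + k, value).

    base  = (line, pixel, data): data + pixel gives the start position.
    count = (line, pixel, n): n is the number of values to read.
    """
    line, pixel, data = base
    n = count[2]
    start = data + pixel
    return [(line, pixel + k, memory[start + k]) for k in range(n)]


def v_read(memory, base, count):
    """Vertically read n values; returns tokens (line + k, pixel, value).

    base  = (line, pixel, data): data + line gives the start position.
    count = (line, pixel, n): n is the number of values to read.
    """
    line, pixel, data = base
    n = count[2]
    start = data + line
    return [(line + k, pixel, memory[start + k]) for k in range(n)]


# Hypothetical memory contents, positions 0 to 8.
demo_memory = [10, 11, 12, 13, 14, 15, 16, 17, 18]
h_tokens = h_read(demo_memory, (1, 0, 0), (1, 0, 8))  # positions 0 to 7
v_tokens = v_read(demo_memory, (1, 0, 0), (1, 0, 8))  # positions 1 to 8
```

With the inputs (1,0,0) and (1,0,8), `h_tokens` runs from pixel 0 to pixel 7 on line 1, while `v_tokens` runs from line 1 to line 8 at pixel 0. One instruction thus replaces a sequence of individual reads.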
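3.3.2 2 (brpx, brln)

(b) brpx, brln

Figure 3.4: brpx (branch by pixel num.) takes [i,j,n] and [i,j,a] and selects destination a+j among a, a+1, ..., a+m, where m = max(j). brln (branch by line num.) takes [i,j,n] and [i,j,a] and selects destination a+i among a, a+1, ..., a+m, where m = max(i).

• brpx: in Figure 3.4, a is added to the pixel number j, and the token n is sent to destination (a + j).
• brln: brln adds a to the line number i, and the token n is sent to destination (a + i).

Figure 3.5 brpx: inputs (line, pixel, data) = (1,0,1) and (1,0,2); pixel + data = 0 + 2 = 2; among the destinations 1, 2, 3, ... (DCT1, DCT2, ..., DCT8), the token (1,0,1) is sent to destination 2.

In Figure 3.5, (1,0,1) = (line, pixel, data). Given (1,0,2), brpx adds data 2 and pixel 0 and branches to destination 2.

Figure 3.6 brln: inputs (line, pixel, data) = (1,0,1) and (1,0,2); line + data = 1 + 2 = 3; among the destinations 1, 2, 3, ... (DCT1, DCT2, ..., DCT8), the token (1,0,1) is sent to destination 3.

In Figure 3.6, brln adds data 2 and line 1 and branches to destination 3 (DCT).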
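The destination selection of brpx and brln in Figures 3.5 and 3.6 can be written as a short sketch. The function signatures and the returned (destination, token) pair are assumptions made for illustration. The arithmetic (pixel + data for brpx, line + data for brln) follows the figures.

```python
def brpx(token, selector):
    """Branch by pixel number.

    token    = (line, pixel, data) is the token to be forwarded.
    selector = (line, pixel, a) supplies the offset a in its data field.
    Returns (destination, token) with destination = a + pixel.
    """
    _, pixel, _ = token
    return selector[2] + pixel, token


def brln(token, selector):
    """Branch by line number: destination = a + line."""
    line, _, _ = token
    return selector[2] + line, token


# Inputs of Figures 3.5 and 3.6: token (1,0,1) and selector (1,0,2).
px_destination, px_token = brpx((1, 0, 1), (1, 0, 2))  # 0 + 2 = 2
ln_destination, ln_token = brln((1, 0, 1), (1, 0, 2))  # 1 + 2 = 3
```

Because the destination is computed from the token's own pixel or line number, a single instruction can route each token to one of the destinations DCT1 to DCT8 shown in the figures without a chain of two-way branches.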
3.4 SA-DCT 3.3.3 4 SA-DCT 1 2 0 (brpx brln ) 3.4 SA-DCT 3.7 SA-DCT 2 DCT 2 DCT DCT 2 (brpx) DCT 1 DCT DCT 1 1 (v-read) 1 DCT 1 DCT 22
3.4 SA-DCT SA-DCT DCT (brpx) DCT DCT DCT DCT (h-read) SA-DCT DCT (brln) DCT DCT DCT DCT 3.7 SA-DCT () 23
4 4.1 DDMP-4G DDMP 1 1 DDMP-4G 12bit 4.2 ffl 1 DDMP () ffl 1 24
( DCT)

• DDMP 1 () ()

64 (8 × 8), 16 (4 × 4), 1024: 64 × 16 = 1024
30: 30 × 1024 = 30720
0, 4, 8, 16, 32, 64 7
4.3 4.3 (1) : () (2) : (3) : 1 (DCT =s) 1 (fps) 1 (s) (1) (2) SA-DCT (3) 1 SA-DCT 1 26
4.4 SA-DCT DDMP-4G

4.4.1 SA-DCT

SA-DCT SA-DCT DDMP-4G SA-DCT 4.1 SA-DCT 0 1 SA-DCT 1 DCT

Table 4.1 (SA-DCT)

        SA-DCT   SA-DCT
0       1.27     1
4       1.32     1.04
8       1.36     1.08
16      1.49     1.21
32      1.55     1.27
64      1.66     1.38
4.4.2 DCT

DCT DCT DCT DCT SA-DCT 4.2 DCT 1 DCT DCT 1 SA-DCT

Table 4.2 (DCT)

        DCT      SA-DCT
0       1        2.05
4       1        2.14
8       1        2.23
16      1        2.49
32      1        2.62
64      1        2.83
5 MPEG4 SA-DCT 2 SA-DCT DDMP SA-DCT 3 SA-DCT DDMP SA-DCT 4 SA-DCT SA-DCT DCT SA-DCT DCT DCT Java SA-DCT 29
DCT SA-DCT DDMP SA- DCT 30
[1] H. Terada, S. Miyata, and M. Iwata, "DDMP's: self-timed super-pipelined data-driven multimedia processors," Proc. of IEEE, 87(2), 282-296 (1999).
[2] "MPEG-4," (1999).
[3] K. R. Rao and J. J. Hwang, (1999).
[4] "MPEG-4," 53(4), 485-491 (1999).
[5] "MPEG-4 SA-DCT VLSI," Technical Report of IEICE, 2000(35), (2000).
[6] W. H. Chen, C. H. Smith, and S. C. Fralick, "A Fast Computational Algorithm for the Discrete Cosine Transform," IEEE Trans. Commun., 25(9), 1004-1009 (1977).