NAIST-IS-MT0151122 2004 2 6
( )
, C,, C, CASE,, NAIST-IS- MT0151122, 2004 2 6. i
A Tool for Refactoring C Programs with Preprocessor Directives Masakazu Yoshida Abstract Refactoring, a change made to improve the internal structure of software without altering its external behavior, has become an important technique for software developers and maintainers. Since refactoring applied by hand is error-prone, we need tools based on program analysis for automated refactoring when we use refactoring in the real world software development. Unfortunately, there are not such tools for programming languages, like programming language C, because preprocessor directives mixed in programs make preprocessed source code look different from unpreprocessed one. In this study, we propose a program representation model which consists of three interrelated representations of a program: unpreprocessed source code, preprocessed source code and abstract syntax tree for further program analysis. And we show how refactoring C programs with preprocessor directives can be supported by an automated tool based on our model. Keywords: refactoring, preprocessor, C language, CASE tools, program analysis Master s Thesis, Department of Information Processing, Graduate School of Information Science, Nara Institute of Science and Technology, NAIST-IS-MT0151122, February 6, 2004. ii
1. 1 1.1......................... 1 1.2............................. 5 2. 7 2.1...... 7 2.2....................... 8 2.3................................. 9 3. 10 3.1........................ 10 3.2.................... 16 3.3................................. 16 4. 17 4.1................... 17 4.2...................... 18 4.3...... 25 4.4....... 27 4.5................................. 28 5. 29 5.1.................................. 29 5.2....................... 30 5.3...................... 34 5.4................................. 40 6. 41 6.1............................. 41 6.2................................. 42 iii
6.3................................. 46 7. 49 7.1....... 49 7.2............ 51 7.3................................. 51 8. 52 55 56 iv
1........... 4 2........... 4 3. 4 4................... 11 5.............. 17 6............... 22 7.............. 24 8............. 25 9 CRef........................... 29 10 CRef............... 31 11....... 36 12.............. 47 13.............. 48 v
1............... 3 2.............. 15 3................. 16 4................. 20 5................. 20 6............... 21 7............. 21 8 typedef............ 21 9.. 23 10 C............................ 24 11............ 42 vi
1. 1.1 refactoring Fowler [5] Smalltalk[16] C++[14] Java[5] 3 [16] 1. 1
2. precondition 3. Smalltalk Java C C++ C C C cpp # 1 3 CASE 2
#include #define, #undef #if, #else, #endif 1 1 2 1 2 3 3 if C C C 3
#include <stdio.h> #include <assert.h> #define OK 0 int main(int argc, char *argv[]) { int c; while ( (c = getc(stdin))!= EOF ) { assert ( c < 0x100 ); putc(c, stdout); } return OK; } 1 # 2 "sample.c" 2 int main(int argc, char *argv[]) { int c; while ( (c = _IO_getc ( stdin ) )!= (-1) ) { ((void) (( c < 0x100 )? 0 : ( assert_fail ("c < 0x100", "sample.c", 12, PRETTY_FUNCTION ), 0))) ; _IO_putc ( c, stdout ) ; } return 0 ; } 2 #if defined( TARGET_A ) if ( driver_probe_a() == 0 ) { #else # if define( TARGET_B ) if ( driver_probe_b() == 0 ) { # else /* others */ if ( driver_probe_generic() == 0 ) { # endif #endif driver_init(config); } 3 4
C 1.2 3 5
4 3 5 3 4 CRef 6 CRef 7 6 8 6
2. 2 2.1 C C C CASE CASE CASE program analysis C C CASE CASE Fave [3, 4] Badros C Perl [1, 2] Perl Livadas [13] 7
token Kullbach [12] 2.2 Smalltalk Java Smalltalk Roberts Refactoring Browser [16] Refactoring Browser Smalltalk [15] Refactoring Browser precondition 8
Java JRefactory[11] JRefactory Java GPL Gnu Public License JRefactory Java UML Java Eclipse[10] Smalltalk Java C C C CASE Sapid[6] [7] Garrido C C [8] 2.3 2 9
3. C 3.1 4 5 1. 2. 3. 4. 5. lexical tokens C C C 5 5 10
token # define BUFSIZ 512 n = BUFSIZ + 1 ; macro definition macro expansion n = 512 + 1 ; assgin symbol plus constant constant 4 11
3.1.1 C [9] 4 4 original tokens duplicate tokens 12
actual argument tokens generated tokens # ## # C ## LINE 3.1.2 13
3.1.3 2 4 2 #line #pragma #error 2 #if #else #endif 14
#include 2 3.1.4 3.1.5 15
3 3.2 3 3.3 16
4. 4.1 5... 5 Opdyke [14] 17
3.2 4.2 C 2 3 1. C 2. 3. C C C 18
C C++ Opdyke [14] 4.2.1 C C C C Opdyke C++ 3 26 [14, pp. 54 55] Opdyke Opdyke C Opdyke typedef C 4 5 6 typedef 7 8 19
6 5 2 4 5 4.2.2 #line #pragma #error 20
6 7 typedef typedef typedef typedef typedef typedef 8 typedef 21
f dst-id 1. f ext ext dst-id 2. ext dst-id 1. f ext ext f lst lst (a) f dst-id 6 22
9 7 2 9 4.2.3 C C C 10 8 23
m dst-id 1. m ext ext dst-id C 2. m ext ext dst-id 3. ext defs defs it (a) it m it dst-id 1. m dst-id 2. m calls calls (a) dst-id 7 C C 10 C 24
C m 1. m calls calls it (a) it (b) it it 2. m 8 4.3 25
2 1. 2. 3. (a) i. A. lst lst lst lst 26
lst 4.4 1. op-lst op-lst op (a) op (b) op (c) op op 27
4.5 C 2 28
5. 3 CRef 5.1 C CRef 9 User Interface (front end) command result Refactoring Engine (back end) load refactoring defintions 9 CRef 9 CRef CRef GUI Emacs 29
CRef CRef CRef 5.2 CRef GUI CRef GUI Python Tk Python CRef GUI 10 CRef GUI File Edit Refactoring Help Refactoring CRef 30
File menu Refactoring menu Menu bar Text area Status bar 10 CRef CRef 1. CRef File Open... hogehoge.c OK 31
2. hogehoge.c foo 3. Refactoring Refactor functions Rename function... 32
4. bar OK 5. 6. hogehoge.c 33
(a) (b) (a) foo (b) foo bar 1 CRef 5.3 CRef ANSI Common Lisp Lisp 34
11 11 (A) (B) (C) Lisp C Flex GNU Bison 3 Lisp read 11 (a) #define GREETING "Hello, world!\n" void main(){ printf(greeting); } Lisp #S(TOKEN :TYPE :HASH :LINE (1 0 0) :SVAL "#") #S(TOKEN :TYPE :DEFINE :LINE (1 1 1) :SVAL "define") #S(TOKEN :TYPE :SP :LINE (1 7 7) :SVAL " ") #S(TOKEN :TYPE :NAME :LINE (1 8 8) :SVAL "GREETING") #S(TOKEN :TYPE :SP :LINE (1 16 16) :SVAL " ") #S(TOKEN :TYPE :STRING :LINE (1 17 17) :SVAL "\"Hello, world!\\n\"") #S(TOKEN :TYPE :NL :LINE (1 34 34) :SVAL "~%") 35
command result source file(s) Refactoring Engine (A) lexical analysis (a) token list (B) preprocessing (b-1) token list (b-2) preprocessing trace (C) syntax analysis (c) syntax tree (D) static analysis and transformation refactoring definitions source file(s) 11 36
#S(TOKEN :TYPE :VOID :LINE (2 0 35) :SVAL "void") #S(TOKEN :TYPE :SP :LINE (2 4 39) :SVAL " ") #S(TOKEN :TYPE :NAME :LINE (2 5 40) :SVAL "main") #S(TOKEN :TYPE :OPEN-PAREN :LINE (2 9 44) :SVAL "(") #S(TOKEN :TYPE :CLOSE-PAREN :LINE (2 10 45) :SVAL ")") #S(TOKEN :TYPE :OPEN-BRACE :LINE (2 11 46) :SVAL "{") #S(TOKEN :TYPE :NL :LINE (2 12 47) :SVAL "~%") #S(TOKEN :TYPE :SP :LINE (3 0 48) :SVAL " ") #S(TOKEN :TYPE :NAME :LINE (3 2 50) :SVAL "printf") #S(TOKEN :TYPE :OPEN-PAREN :LINE (3 8 56) :SVAL "(") #S(TOKEN :TYPE :NAME :LINE (3 9 57) :SVAL "GREETING") #S(TOKEN :TYPE :CLOSE-PAREN :LINE (3 17 65) :SVAL ")") #S(TOKEN :TYPE :SEMICOLON :LINE (3 18 66) :SVAL ";") #S(TOKEN :TYPE :NL :LINE (3 19 67) :SVAL "~%") #S(TOKEN :TYPE :CLOSE-BRACE :LINE (4 0 68) :SVAL "}") #S(TOKEN :TYPE :NL :LINE (4 1 69) :SVAL "~%") TYPE LINE SVAL 3 CRef C 11 (b-1) (b-2) Lisp Lisp C C 37
C Lisp Lisp ;T VOID 8 2 ;T NAME 10 2 ;T OPEN-PAREN 11 2 ;T CLOSE-PAREN 12 2 ;T OPEN-BRACE 13 2 ;T NAME 16 3 ;T OPEN-PAREN 17 3 ;T STRING 1000 1 ;T CLOSE-PAREN 19 3 ;T SEMICOLON 20 3 ;T CLOSE-BRACE 22 4 3 Lisp read 11 (c) (DCL-SPEC-LIST -2 8) (FUNCTION-TYPE -3 10 11 NIL 12) (DCLTR -4-3) (EXPR-LIST -5 1000) (FUNCTION-CALL -6 16 17-5 19) (EXPRESSION-STATEMENT -7-6 20) (LIST -8-7) (BLOCK -9 13-8 22) 38
(FUNCTION -10-2 -4 NIL -9) (PROGRAM -11-10) C 2 3 NIL Lisp Lisp (PROGRAM (FUNCTION (DCL-SPEC-LIST #<TOKEN VOID "void" 8>) (DCLTR (FUNCTION-TYPE #<TOKEN NAME "main" 10> #<TOKEN OPEN-PAREN "(" 11> NIL #<TOKEN CLOSE-PAREN ")" 12>)) NIL (BLOCK #<TOKEN OPEN-BRACE "{" 13> (LIST (EXPRESSION-STATEMENT (FUNCTION-CALL #<TOKEN NAME "printf" 16> #<TOKEN OPEN-PAREN "(" 17> (EXPR-LIST #<TOKEN STRING ""Hello, world!\n"" 1000 (6)>) #<TOKEN CLOSE-PAREN ")" 19>) #<TOKEN SEMICOLON ";" 20>)) #<TOKEN CLOSE-BRACE "}" 22>))) 39
5.4 3 4 CRef CRef 40
6. 6.1 1. 2. 3. 11 41
11 11 4 6.2 12 init push pop 3 CRef 1. stack_g + 1 /* sample.c */ 2 3 #include <stdio.h> 4 5 #define DEPTH 100 6 int *sp; 7 int stack[depth]; 42
+ 8 struct + 9 { + 10 } + 11 stack_g; 12 13 void 14 init(void) 15 { 16 sp = stack; 17 } 2. sp stack_g sp stack_g sp sp +! 1 /* sample.c */ 2 3 #include <stdio.h> 4 5 #define DEPTH 100! 6 7 int stack[depth]; 8 struct 9 { + 10 int *sp; 11 } 12 stack_g; 13 14 void 15 init(void) 16 {! 17 stack_g.sp = stack; 18 } 19 20 int 21 push(int val) 22 {! 23 if (stack_g.sp - stack >= DEPTH) { 24 fprintf(stderr, "error: push: stack overflowed\n"); 25 return 0; 26 }! 27 return *stack_g.sp++ = val; 28 } 29 30 int 31 pop(void) 32 {! 33 if (stack_g.sp <= stack) { 34 fprintf(stderr, "error: pop: stack underflowed\n"); 35 return 0; 36 }! 37 return *--stack_g.sp; 38 } 43
3. stack stack_g stack stack_g stack stack 1 /* sample.c */ 2 3 #include <stdio.h> 4 5 #define DEPTH 100 6! 7 8 struct 9 { 10 int *sp; + 11 int stack[depth]; 12 } 13 stack_g; 14 15 void 16 init(void) 17 {! 18 stack_g.sp = stack_g.stack; 19 } 20 21 int 22 push(int val) 23 {! 24 if (stack_g.sp - stack_g.stack >= DEPTH) { 25 fprintf(stderr, "error: push: stack overflowed\n"); 26 return 0; 27 } 28 return *stack_g.sp++ = val; 29 } 30 31 int 32 pop(void) 33 {! 34 if (stack_g.sp <= stack_g.stack) { 35 fprintf(stderr, "error: pop: stack underflowed\n"); 36 return 0; 37 } 38 return *--stack_g.sp; 39 } 4. stack_g.sp g_sp 44
8 struct 9 { 10 int *sp; 11 int stack[depth]; 12 } 13 stack_g; + 14 #define g_sp stack_g.sp 15 16 void 17 init(void) 18 {! 19 g_sp = stack_g.stack; 20 } 21 22 int 23 push(int val) 24 {! 25 if (g_sp - stack_g.stack >= DEPTH) { 26 fprintf(stderr, "error: push: stack overflowed\n"); 27 return 0; 28 }! 29 return *g_sp++ = val; 30 } 31 32 int 33 pop(void) 34 {! 35 if (g_sp <= stack_g.stack) { 36 fprintf(stderr, "error: pop: stack underflowed\n"); 37 return 0; 38 }! 39 return *--g_sp; 40 } 5. stack_g.stack g_stack 8 struct 9 { 10 int *sp; 11 int stack[depth]; 12 } 13 stack_g; 14 #define g_sp stack_g.sp + 15 #define g_stack stack_g.stack 16 17 void 18 init(void) 19 {! 20 g_sp = g_stack; 21 } 22 23 int 24 push(int val) 25 {! 26 if (g_sp - g_stack >= DEPTH) { 27 fprintf(stderr, "error: push: stack overflowed\n"); 28 return 0; 29 } 30 return *g_sp++ = val; 45
31 } 32 33 int 34 pop(void) 35 {! 36 if (g_sp <= g_stack) { 37 fprintf(stderr, "error: pop: stack underflowed\n"); 38 return 0; 39 } 40 return *--g_sp; 41 } 6. init push pop stack_ 3 13 6.3 CRef 46
1 /* sample.c */ 2 3 #include <stdio.h> 4 5 #define DEPTH 100 6 int *sp; 7 int stack[depth]; 8 9 void 10 init(void) 11 { 12 sp = stack; 13 } 14 15 int 16 push(int val) 17 { 18 if (sp - stack >= DEPTH) { 19 fprintf(stderr, "error: push: stack overflowed\n"); 20 return 0; 21 } 22 return *sp++ = val; 23 } 24 25 int 26 pop(void) 27 { 28 if (sp <= stack) { 29 fprintf(stderr, "error: pop: stack underflowed\n"); 30 return 0; 31 } 32 return *--sp; 33 } 34 35 #ifdef UNIT_TEST 36 int 37 main(void) 38 { 39 int *p, v, res = 0; 40 static int vals[] = { 1, 20, -1, -20, 1000, 2000, -5000, 0 }; 41 42 init(); 43 for (p = vals; *p; p++) { 44 printf("push(%d) ", push(*p)); 45 } 46 putchar( \n ); 47 for (--p; p >= vals; p--) { 48 printf("pop()=>%d ", v = pop()); 49 if (v!= *p) { 50 fprintf(stderr, "*error(expected %d)* ", *p); 51 res = -1; 52 } 53 } 54 putchar( \n ); 55 return res; 56 } 57 #endif /* UNIT_TEST */ 12 47
1 /* sample.c */ 2 3 #include <stdio.h> 4 5 #define DEPTH 100 6 7 8 struct 9 { 10 int *sp; 11 int stack[depth]; 12 } 13 stack_g; 14 #define g_sp stack_g.sp 15 #define g_stack stack_g.stack 16 17 void 18 stack_init(void) 19 { 20 g_sp = g_stack; 21 } 22 23 int 24 stack_push(int val) 25 { 26 if (g_sp - g_stack >= DEPTH) { 27 fprintf(stderr, "error: push: stack overflowed\n"); 28 return 0; 29 } 30 return *g_sp++ = val; 31 } 32 33 int 34 stack_pop(void) 35 { 36 if (g_sp <= g_stack) { 37 fprintf(stderr, "error: stack_pop: stack underflowed\n"); 38 return 0; 39 } 40 return *--g_sp; 41 } 42 43 #ifdef UNIT_TEST 44 int 45 main(void) 46 { 47 int *p, v, res = 0; 48 static int vals[] = { 1, 20, -1, -20, 1000, 2000, -5000, 0 }; 49 50 stack_init(); 51 for (p = vals; *p; p++) { 52 printf("push(%d) ", stack_push(*p)); 53 } 54 putchar( \n ); 55 for (--p; p >= vals; p--) { 56 printf("pop()=>%d ", v = stack_pop()); 57 if (v!= *p) { 58 fprintf(stderr, "*error(expected %d)* ", *p); 59 res = -1; 60 } 61 } 62 putchar( \n ); 63 return res; 64 } 65 #endif /* UNIT_TEST */ 13 48
7. 7.1 4 4 49
PostScript Ghostscript Motif Killer GUI Tk C 50
7.2 7.3 51
8. C C C C C 3 3 52
C C 2 (1) C (2) (3) C C CRef CRef CRef C C++ 53
CRef CRef CRef 54
55
[1] Greg J. Badros. Pcp 3 : A C front end for preprocessor analysis and transformation. Master s thesis, University of Washington, 1997. [2] Greg J. Badros and David Notkin. A framework for preprocessor-aware C source code analyses. Software Practice and Experience, Vol. 30, No. 8, pp. 907 924, July 2000. [3] Jean-Marie Favre. The cpp paradox. In Proceedings of Nineth European Workshop on Software Maintenance, 1995. [4] Jean-Marie Favre. Preprocessors from an abstract point of view. In Proceedings of the 1996 International Conference on Software Maintenance (ICSM 96), 1996. [5] Martin Fowler. Refactoring: Improving the Design of Existing Code. Addison-Wesley, 1999. [6],,. CASE Sapid., Vol. 39, No. 6, pp. 1990 1998, 1998. [7],,,.., Vol. 15, No. 4, pp. 78 81, 1998. [8] Alejandra Garrido and Ralph Johnson. Challenges of refactoring C programs. In Proceedings of the international workshop on Principles of software evolution, pp. 6 14. ACM Press, 2002. [9] Samuel P. Harbison and Guy L. Steele Jr. C: A Reference Mannual. Prentice-Hall, 5th edition, 2002. [10] Object Technology International. Eclipse. http://www.eclipse.org. 56
[11] JRefactory. http://jrefactory.sourceforge.net/. [12] Bernt Kullbach and Volker Riediger. Folding: An approach to enable program understanding of preprocessed languages. Technical Report 7 2001, Universität Koblenz-Landau, 2001. [13] Panos E. Livadas and David T. Small. Understanding code containing preprocessor constructs. In IEEE Third Workshop on Program Comprehension, pp. 89 97, November 1994. [14] William F. Opdyke. Refactoring Object-Oriented Frameworks. PhDthesis, University of Illinois, 1992. [15] Don Roberts. Eliminating Analysis in Refactoring. PhD thesis, University of Illinois at Urabana-Champaign, 1997. [16] Don Roberts, John Brant, and Ralph E. Joohnson. A refactoring tool for Smalltalk. Theory and Practice of Object Systems (TAPOS), 1997. 57