Advance Compiler 2004 Fall ====== 2004/09/15 ====== * parallelizing compilter => seq input, output parallel intrucstion * vector compiler * GCC 3.3.1 * make bootstrap * 3 stage, do the same thing. * Stage 1: old gcc -> new gcc(1) * Stage 2: new gcc(1) -> new gcc(2) * Stage 3: new gcc(2) -> new gcc(3) * Only if 2 & 3 same, do make all * Generate machine-dependent files: gen*.o, feed *.md, output insn-*.[ch] * cpp * libbackend * cc1, language frontend * xgcc -> (gcc) * cpp0 * Internals * C -> tree -> RTL -> assembly (Chap 7,8) * flow of parser in Chap6 * trace # TO tag FROM line in file/text 1 1 toplev_main 35 main.c 2 1 do_compile 5414 toplev.c 3 1 compile_file 5384 toplev.c 4 1 lang_hooks 2130 toplev.c 5 1 c_common_parse_file 57 c-lang.c 6 1 yyparse 159 c-lex.c 7 1 fndef 357 c-parse.y 8 1 finish_function 403 c-parse.y * requirement * clchen teaches half * other half, report one part of gcc (include binutils...) * no exam, grades from in class report / indivisual * homework * modify main() in main.c XD ====== Installation GCC 3.3.1 on FreeBSD 6-CURRENT (Sep 18) ====== * cd gcc-3.3.1 * mkdir build && cd build (the ports way) * ../configure --prefix=/home/rafan/tmp/gcc --disable-nls --with-system-zlib --enable-languages=c * make bootstrap (native compiler) * make install It's very fast (about 10mins) on my X31 (1.4G). To define a compile-time variable, I modified the CFLAGS, {BOOT,STAGE1}_CFLAGS. (The 3-stage bootstrap is very complicated, normal make -DBLAH doesn't work.) ====== 2004/09/22 ====== * gcc.c * use gccspec to read args * fork cpp (gccmain.c) * fork cc1 (main.c) * frontend * machine-independent, also language-dependent * backend * machine-dependent * main.c:5380 backend_init() * get token path # TO tag FROM line in file/text 1 1 toplev_main 43 gcc/main.c 2 1 do_compile 5414 gcc/toplev.c 3 1 compile_file 5384 gcc/toplev.c 4 1 yyparse 162 yyparse (); 5 1 yyparse 162 yyparse (); 6 1 YYLEX 4658 gcc/c-parse.c 7 1 yylex 2102 gcc/c-parse.c 8 1 _yylex 5918 gcc/c-parse.c 9 1 c_lex 5810 gcc/c-parse.c 9 1 last_token 5810 gcc/c-parse.c 10 1 cpp_ttype 5425 gcc/c-parse.c 11 1 TTYPE_TABLE 148 TTYPE_TABLE 12 1 TTYPE_TABLE 148 TTYPE_TABLE 10 1 cpp_get_token 680 tok = cpp_get_token (parse_in); * -fdump-transition-unit * finish_function: tree fndecl = current_function_decl; * finish_fname_decls, finish_stmt_tree modify tree * nested! in finish_function * c_expand_body generates rtl from tree * history: flag_inline_trees (tree_inline.c), good point to modify tree here * homework * trace yyparse(), not an easy task ====== 2004/09/29 ====== * tree (Internal Chap 7) * tree.def * DEFTREECODE, macro (tree.h) #define DEFTREECODE(SYM, STRING, TYPE, NARGS) SYM, enum tree_code { #include "tree.def" LAST_AND_UNUSED_TREE_CODE /* A convenient way to get a value for NUM_TREE_CODE. */ }; #undef DEFTREECODE * * * c-lang.c #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE, const char tree_code_type[] = { #include "tree.def" 'x', #include "c-common.def" }; #undef DEFTREECODE * * * statement is NOT handled by tree.def (language-indepenent), it's in c-common.def enum c_tree_code { C_DUMMY_TREE_CODE = LAST_AND_UNUSED_TREE_CODE, // "link" to tree_node !! #include "c-common.def" LAST_C_TREE_CODE }; * * tree_node (tree.h), union of 13 kinds of struct * array is put in the bottom in a structure. If we want more than 1, just allocate enough space. struct tree_exp GTY(()) { struct tree_common common; int complexity; tree GTY ((special ("tree_exp"), desc ("TREE_CODE ((tree) &%0)"))) operands[1]; }; * * tree.c * make_node * homeowork * precedence in yacc ====== 2004/10/06 ====== * tree op0 is the right one, op1 is the left one * build_stmt -> make_node -> ggc_alloc_tree build 'if' node * c_expand_start_cond puts 'cond' body by IF_COND (if_stmt) = cond; * c_common_truthvalue_conversion # convert true/false value here! * optimization, -d x, generates file.\d\d.xxxx (dump RTL) or -d a * RTL * c_expand_body * init_function_start * expand_function_start * walk_tree * expand_main_function * expand_stmt * *lang_expand_function_end * expand_function_end * RTX (type # ptr_prev# ptr_next# #define RTX_CODE enum rtx_code enum rtx_code { #define DEF_RTL_EXPR(ENUM, NAME, FORMAT, CLASS) ENUM , #include "rtl.def" /* rtl expressions are documented here */ #undef DEF_RTL_EXPR LAST_AND_UNUSED_RTX_CODE}; /* A convenient way to get a value for NUM_RTX_CODE. Assumes default enum value assignment. */ * rtl.c: rtx_format, rtl.h: rtl_def, macro, rtl.def * homework * precedence (yacc) Each state contains a ''action *'' which is a linked list. In ''remove_conflict()'' called by ''make_parser()'' utilizes the precdence and association information which comes from symbol to determime whethere this action is applied first (SHIFT action) In details, it uses a (''action *'') pref to record preferred action and (''action *'') p for current action. The loop is 2 level, first level is for each states, second is for each actions in that state. When it found an action with higher precedence, it modifies ''oldaction->suppressed'' to 2, i.e., the new action is dominated. (suppressed is used to indicated if this action is suppressed.) yydefred, ''defreds()-> sole_reduction()'', in this function, this handles '.' reduction, as the name ``sole'' implies. * * precedence (bison) Oh, bison's source is more clear than yacc. In main(), it calls conflicts_solve() -> set_conflicts(state), it calls resolve_sr_conflict(state) (only if the action has prec) to solve shift/reduce conflicts by precedence and associcatively. Inside it, it first finds the current reduction action for this state then if a shift has higher prec. Once the solution is out, it calls log_resolution(cur_reduction, which action, shift/reduce/right/left/...) to record the solution and records curresponding things. * * read rtx ====== 2004/10/13 ====== * c_expand_body -> expand_stmt(), tree travesel, linked list * expand_expr enters machine dependent routines here: (there is a op table in assembly) this_optab = ! unsignedp && flag_trapv && (GET_MODE_CLASS(mode) == MODE_INT) ? subv_optab : sub_optab; goto binop2; expand_binop() * i386.md * define_insn, a tree node -> a RTL expr (one rtx), also describes RTL -> assembly * define_expand, a tree node -> RTL exprs (many rtx) * define_split, rtx -> many rtx (optimization) * deinfe_peephole2, many rtx -> many rtx (optimization) * define_insn_and_split, tree -> one rtx, rtx -> assembly (many rtx) * homework * i386.md (20K lines!!) ====== 2004/10/20 ====== * gensupport.c, readctl.c support for common routines for gen* * gencodes.c: * pattern matters! * c_expand_body; rest_of_compilation() to do optimize timevar_push/pop to measure elapsed time in each optimization ====== 2004/10/27 ====== * Topics * yacc? XD * profiling? * 1st level RTX * INSN * JUMP_INSN * CALL_INSN * BARRIER * CODE_LABEL * NOTE * Basic block * data structure is the key * node, edge, traverse * no branch inside! ====== 2004/11/03 ====== (define_expand "addsi3" [(parallel [(set (match_operand:SI 0 "nonimmediate_operand" "") (plus:SI (match_operand:SI 1 "nonimmediate_operand" "") (match_operand:SI 2 "general_operand" ""))) (clobber (reg:CC 17))])] * peephole optimization * modify particular pattern in a small window (2-3 instructions) * constant folding, multiple -> shift, null op * use //decision tree// to match all patterns (i386 has 64 peephole2) * use basic block as the small window * recog.c, insn-recog.c, insn-emit.c ====== 2004/11/10 ====== * rtx -> asm * md describs how to convert a rtx pattern to a output asm pattern ====== 2004/11/24 ====== ===== if conversion ===== * When? * one of then or else part is empty * condition can be decided in compile time * same condition in nested if * How? * control dependency -> data dependency * Implemetation * use ce_if_block struct to store then, else part * convert until convengence * find if -> can convert? -> do convert ===== sibling call ===== * tail call / tail recursion * different to inline is that you don't do return ( return ( return ... ) ) ) * use call_placeholder to decide to use sibiling call or tail recursion optimization * parameter # is matter (since caller's frame size have to be larger than callee's) * RTFS: toplev.c: static const lang_independent_options f_options[] = ====== 2004/12/01 ====== ===== Loop Optimization ===== * terms * basic induction variable, X = X + c or X = X - c * induction variable A = a * B + b, B is basic induction variable. * strength-reduction * loop unrolling ===== Control Flow Graph (CFG) and Jump Optimization ===== * copy loop header, we can save one bb * no new DS, lots of condition (no pattern, dirty code) ====== 2004/12/08 profiling ====== * about 60mins * topics (tentative) * what's profiling * how to use / example * Chapter 9, implementation (rtx?) * import point * data structure * before/after rtx !! ====== 2004/12/29 ====== * porting to .Net * generate CIL code (intermediate language)