course:advancecompiler

Advance Compiler 2004 Fall

2004/09/15

  • parallelizing compilter ⇒ seq input, output parallel intrucstion
  • vector compiler
  • GCC 3.3.1
  • make bootstrap
    • 3 stage, do the same thing.
      • Stage 1: old gcc → new gcc(1)
      • Stage 2: new gcc(1) → new gcc(2)
      • Stage 3: new gcc(2) → new gcc(3)
      • Only if 2 & 3 same, do make all
    • Generate machine-dependent files: gen*.o, feed *.md, output insn-*.[ch]
    • cpp
    • libbackend
    • cc1, language frontend
    • xgcc → (gcc)
    • cpp0
  • Internals
    • C → tree → RTL → assembly (Chap 7,8)
    • flow of parser in Chap6
  • trace
  # TO tag         FROM line  in file/text
  1  1 toplev_main        35  main.c
  2  1 do_compile       5414  toplev.c
  3  1 compile_file     5384  toplev.c
  4  1 lang_hooks       2130  toplev.c
  5  1 c_common_parse_file    57  c-lang.c
  6  1 yyparse           159  c-lex.c
  7  1 fndef             357  c-parse.y
  8  1 finish_function   403  c-parse.y
  • requirement
    • clchen teaches half
    • other half, report one part of gcc (include binutils…)
    • no exam, grades from in class report / indivisual
  • homework
    • modify main() in main.c XD

Installation GCC 3.3.1 on FreeBSD 6-CURRENT (Sep 18)

  • cd gcc-3.3.1
  • mkdir build && cd build (the ports way)
  • ../configure –prefix=/home/rafan/tmp/gcc –disable-nls –with-system-zlib –enable-languages=c
  • make bootstrap (native compiler)
  • make install

It's very fast (about 10mins) on my X31 (1.4G). To define a compile-time variable, I modified the CFLAGS, {BOOT,STAGE1}_CFLAGS. (The 3-stage bootstrap is very complicated, normal make -DBLAH doesn't work.)

2004/09/22

  • gcc.c
    • use gccspec to read args
    • fork cpp (gccmain.c)
    • fork cc1 (main.c)
  • frontend
    • machine-independent, also language-dependent
  • backend
    • machine-dependent
    • main.c:5380 backend_init()
  • get token path
  # TO tag         FROM line  in file/text
  1  1 toplev_main        43  gcc/main.c
  2  1 do_compile       5414  gcc/toplev.c
  3  1 compile_file     5384  gcc/toplev.c
  4  1 yyparse           162  yyparse ();
  5  1 yyparse           162  yyparse ();
  6  1 YYLEX            4658  gcc/c-parse.c
  7  1 yylex            2102  gcc/c-parse.c
  8  1 _yylex           5918  gcc/c-parse.c
  9  1 c_lex            5810  gcc/c-parse.c
   9  1 last_token       5810  gcc/c-parse.c
   10  1 cpp_ttype        5425  gcc/c-parse.c
   11  1 TTYPE_TABLE       148  TTYPE_TABLE
   12  1 TTYPE_TABLE       148  TTYPE_TABLE
  10  1 cpp_get_token     680  tok = cpp_get_token (parse_in);
  • -fdump-transition-unit
    • finish_function: tree fndecl = current_function_decl;
    • finish_fname_decls, finish_stmt_tree modify tree
    • nested! in finish_function
    • c_expand_body generates rtl from tree
      • history: flag_inline_trees (tree_inline.c), good point to modify tree here
  • homework
    • trace yyparse(), not an easy task

2004/09/29

  • tree (Internal Chap 7)
    • tree.def
      • DEFTREECODE, macro (tree.h)
  #define DEFTREECODE(SYM, STRING, TYPE, NARGS)   SYM,
  enum tree_code {
  #include "tree.def"
    LAST_AND_UNUSED_TREE_CODE     /* A convenient way to get a value for
                                     NUM_TREE_CODE.  */
  };
  #undef DEFTREECODE
      • c-lang.c
  #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
  const char tree_code_type[] = {
  #include "tree.def"
    'x',
  #include "c-common.def"
  };                                                              
  #undef DEFTREECODE  
      • statement is NOT handled by tree.def (language-indepenent), it's in c-common.def
  enum c_tree_code {
   C_DUMMY_TREE_CODE = LAST_AND_UNUSED_TREE_CODE,  // "link" to tree_node !!
   #include "c-common.def"
   LAST_C_TREE_CODE
 };
    • tree_node (tree.h), union of 13 kinds of struct
      • array is put in the bottom in a structure. If we want more than 1, just allocate enough space.
 struct tree_exp GTY(())
 {             
   struct tree_common common; 
   int complexity;
   tree GTY ((special ("tree_exp"), 
              desc ("TREE_CODE ((tree) &%0)")))
     operands[1];
 };
    • tree.c
      • make_node
  • homeowork
    • precedence in yacc

2004/10/06

  • tree op0 is the right one, op1 is the left one
  • build_stmt → make_node → ggc_alloc_tree build 'if' node
    • c_expand_start_cond puts 'cond' body by IF_COND (if_stmt) = cond;
  • c_common_truthvalue_conversion # convert true/false value here!
  • optimization, -d x, generates file.\d\d.xxxx (dump RTL) or -d a
  • RTL
    • c_expand_body
    • init_function_start
    • expand_function_start
    • walk_tree
    • expand_main_function
    • expand_stmt
    • *lang_expand_function_end
    • expand_function_end
  • RTX (type # ptr_prev# ptr_next#
 #define RTX_CODE        enum rtx_code
 enum rtx_code  {
 
 #define DEF_RTL_EXPR(ENUM, NAME, FORMAT, CLASS)   ENUM ,
 #include "rtl.def"              /* rtl expressions are documented here */ 
 #undef DEF_RTL_EXPR
 
   LAST_AND_UNUSED_RTX_CODE};    /* A convenient way to get a value for
                                    NUM_RTX_CODE.
                                    Assumes default enum value assignment.  */
  • rtl.c: rtx_format, rtl.h: rtl_def, macro, rtl.def
  • homework
    • precedence (yacc)

Each state contains a action * which is a linked list. In remove_conflict() called by make_parser() utilizes the precdence and association information which comes from symbol to determime whethere this action is applied first (SHIFT action)

In details, it uses a (action *) pref to record preferred action and (action *) p for current action. The loop is 2 level, first level is for each states, second is for each actions in that state. When it found an action with higher precedence, it modifies oldaction→suppressed to 2, i.e., the new action is dominated. (suppressed is used to indicated if this action is suppressed.)

yydefred, defreds()→ sole_reduction(), in this function, this handles '.' reduction, as the name ``sole'' implies.

    • precedence (bison)

Oh, bison's source is more clear than yacc.

In main(), it calls conflicts_solve() → set_conflicts(state), it calls resolve_sr_conflict(state) (only if the action has prec) to solve shift/reduce conflicts by precedence and associcatively. Inside it, it first finds the current reduction action for this state then if a shift has higher prec. Once the solution is out, it calls log_resolution(cur_reduction, which action, shift/reduce/right/left/…) to record the solution and records curresponding things.

    • read rtx

2004/10/13

  • c_expand_body → expand_stmt(), tree travesel, linked list
  • expand_expr enters machine dependent routines here: (there is a op table in assembly)
      this_optab = ! unsignedp && flag_trapv
                   && (GET_MODE_CLASS(mode) == MODE_INT)
                   ? subv_optab : sub_optab;
  goto binop2;
  expand_binop()
  • i386.md
    • define_insn, a tree node → a RTL expr (one rtx), also describes RTL → assembly
    • define_expand, a tree node → RTL exprs (many rtx)
    • define_split, rtx → many rtx (optimization)
    • deinfe_peephole2, many rtx → many rtx (optimization)
    • define_insn_and_split, tree → one rtx, rtx → assembly (many rtx)
  • homework
    • i386.md (20K lines!!)

2004/10/20

  • gensupport.c, readctl.c support for common routines for gen*
  • gencodes.c:
  • pattern matters!
  • c_expand_body; rest_of_compilation() to do optimize timevar_push/pop to measure elapsed time in each optimization

2004/10/27

  • Topics
    • yacc? XD
    • profiling?
  • 1st level RTX
    • INSN
    • JUMP_INSN
    • CALL_INSN
    • BARRIER
    • CODE_LABEL
    • NOTE
  • Basic block
    • data structure is the key
    • node, edge, traverse
    • no branch inside!

2004/11/03

 (define_expand "addsi3"
   [(parallel [(set (match_operand:SI 0 "nonimmediate_operand" "")
                    (plus:SI (match_operand:SI 1 "nonimmediate_operand" "")
                             (match_operand:SI 2 "general_operand" "")))
               (clobber (reg:CC 17))])]
  • peephole optimization
    • modify particular pattern in a small window (2-3 instructions)
    • constant folding, multiple → shift, null op
    • use decision tree to match all patterns (i386 has 64 peephole2)
    • use basic block as the small window
    • recog.c, insn-recog.c, insn-emit.c

2004/11/10

  • rtx → asm
  • md describs how to convert a rtx pattern to a output asm pattern

2004/11/24

if conversion

  • When?
    • one of then or else part is empty
    • condition can be decided in compile time
    • same condition in nested if
  • How?
    • control dependency → data dependency
  • Implemetation
    • use ce_if_block struct to store then, else part
    • convert until convengence
    • find if → can convert? → do convert

sibling call

  • tail call / tail recursion
  • different to inline is that you don't do return ( return ( return … ) ) )
  • use call_placeholder to decide to use sibiling call or tail recursion optimization
  • parameter # is matter (since caller's frame size have to be larger than callee's)
  • RTFS: toplev.c: static const lang_independent_options f_options[] =

2004/12/01

Loop Optimization

  • terms
    • basic induction variable, X = X + c or X = X - c
    • induction variable A = a * B + b, B is basic induction variable.
    • strength-reduction
    • loop unrolling

Control Flow Graph (CFG) and Jump Optimization

  • copy loop header, we can save one bb
  • no new DS, lots of condition (no pattern, dirty code)

2004/12/08 profiling

  • about 60mins
  • topics (tentative)
    • what's profiling
    • how to use / example
    • Chapter 9, implementation (rtx?)
  • import point
    • data structure
    • before/after rtx !!

2004/12/29

  • porting to .Net
    • generate CIL code (intermediate language)
course/advancecompiler.txt · Last modified: 2007/03/06 13:54 (external edit)