Overview of Compiler Phases
The Flang compiler transforms Fortran source code into an executable file. This transformation proceeds in three high-level phases: analysis, lowering, and code generation/linking.
The first high-level phase (analysis) transforms Fortran source code into a decorated parse tree and a symbol table. During this phase, all user-related errors are detected and reported.
The second high-level phase (lowering) changes the decorated parse tree and symbol table into the Fortran Intermediate Representation (FIR), which is a dialect of LLVM’s Multi-Level Intermediate Representation (MLIR). It then runs a series of passes on the FIR code that verify its validity, perform optimizations, and finally transform it into LLVM’s Intermediate Representation (LLVM IR).
The third high-level phase generates machine code and invokes a linker to produce an executable file.
This document describes the first two high-level phases. Each is broken down into more detailed phases.
For each detailed phase, this document gives its inputs and outputs and shows how to produce a readable version of the outputs.
Each detailed phase produces either correct output or fatal errors.
The analysis phase validates that the program is correct and creates all of the information needed for lowering.
Input: Fortran source and header files, command line macro definitions, set of enabled compiler directives (to be treated as directives rather than comments).
Output:
A “cooked” character stream: the entire program as a contiguous stream of normalized Fortran source. Extraneous whitespace and comments are removed (except comments that are compiler directives that are not disabled) and case is normalized. Also, directives are processed and macros expanded.
Provenance information mapping each character back to the source it came from. This is used in subsequent phases that need source locations, for example in error messages, optimization reports, and debugging information.
flang-new -fc1 -E src.f90 dumps the cooked character stream
flang-new -fc1 -fdebug-dump-provenance src.f90 dumps provenance information
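The relationship between the cooked character stream and its provenance can be pictured with a small toy model. The sketch below is not Flang's prescanner: the function name cook and the (line, column) provenance format are assumptions made for illustration, and it ignores compiler directives, macros, continuation lines, and fixed-form source. It only lower-cases text outside character literals, drops ! comments and surplus blanks, and records where each surviving character came from.

```python
# Toy model only: NOT Flang's prescanner. It illustrates a "cooked" character
# stream paired with per-character provenance.
from typing import List, Tuple

# Hypothetical provenance format: (1-based line, 1-based column) in the source.
Provenance = Tuple[int, int]


def cook(source: str) -> Tuple[str, List[Provenance]]:
    """Return the cooked text and, for each cooked character, its origin."""
    cooked: List[str] = []
    origin: List[Provenance] = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        in_string = False        # inside a '...' character literal
        emitted_on_line = False  # has this line produced any output yet
        pending_space = False    # a run of blanks waiting to become one space
        for col_no, ch in enumerate(line, start=1):
            if ch == "'":
                in_string = not in_string
            elif ch == "!" and not in_string:
                break  # the rest of the line is a comment
            if ch in " \t" and not in_string:
                # Leading blanks are dropped; interior runs collapse to one.
                pending_space = emitted_on_line
                continue
            if pending_space:
                cooked.append(" ")
                origin.append((line_no, col_no - 1))
                pending_space = False
            # Case is normalized everywhere except inside character literals.
            cooked.append(ch if in_string else ch.lower())
            origin.append((line_no, col_no))
            emitted_on_line = True
        if emitted_on_line:
            cooked.append("\n")
            origin.append((line_no, len(line) + 1))
    return "".join(cooked), origin


text, where = cook("PROGRAM  Hello  ! greet\n  PRINT *, 'Hi There'\nEND PROGRAM\n")
print(text)
print(where[text.index("print")])  # source position of the PRINT keyword
```

Because the two lists have the same length, any later phase that holds an offset into the cooked text can look up the original source position directly, which is the property the provenance information provides.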
Input: Cooked character stream
Output: a parse tree for the program
flang-new -fc1 -fdebug-dump-parse-tree-no-sema src.f90 dumps the parse tree
flang-new -fc1 -fdebug-unparse src.f90 converts the parse tree to normalized Fortran
flang-new -fc1 -fdebug-dump-parsing-log src.f90 runs an instrumented parse and dumps the log
flang-new -fc1 -fdebug-measure-parse-tree src.f90 measures the parse tree
Input: the parse tree, the cooked character stream, and provenance information
Output:
a symbol table
a modified parse tree
module files (see: ModFiles.md)
the intrinsic procedure table
the target characteristics
the runtime derived type tables (see: RuntimeTypeInfo.md)
For more detail on semantic analysis, see: Semantics.md. Semantic processing performs several tasks:
validates labels (see: LabelResolution.md)
canonicalizes DO statements
canonicalizes OpenACC and OpenMP code
resolves names, building a tree of scopes and symbols
rewrites the parse tree to correct parsing mistakes (when needed) once semantic information is available to clarify the program’s meaning
checks the validity of declarations
analyzes expressions and statements, emitting error messages where appropriate
creates module files if the source code contains modules (see: ModFiles.md)
In the course of semantic analysis, the compiler:
creates the symbol table
decorates the parse tree with semantic information (such as pointers into the symbol table)
creates the intrinsic procedure table
folds constant expressions
At the end of semantic processing, all validation of the user’s program is complete. This is the last detailed phase of analysis processing.
flang-new -fc1 -fdebug-dump-parse-tree src.f90 dumps the parse tree after semantic analysis
flang-new -fc1 -fdebug-dump-symbols src.f90 dumps the symbol table
flang-new -fc1 -fdebug-dump-all src.f90 dumps both the parse tree and the symbol table
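Name resolution is described above as building a tree of scopes and symbols. A minimal sketch of that idea follows. It is not Flang's data structure: the class names Scope and Symbol and their fields are hypothetical, and the model covers only lookup through enclosing scopes, leaving out implicit typing, use association, generics, and everything else real semantic analysis handles.

```python
# Toy model only: NOT Flang's symbol table. It shows a tree of scopes in which
# a name is looked up in the current scope first, then in enclosing scopes.
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Symbol:
    name: str
    kind: str  # illustrative tag such as "variable" or "subprogram"


@dataclass
class Scope:
    name: str
    parent: Optional["Scope"] = None
    symbols: Dict[str, Symbol] = field(default_factory=dict)

    def declare(self, name: str, kind: str) -> Symbol:
        key = name.lower()  # Fortran names are case-insensitive
        if key in self.symbols:
            # A real compiler would report a diagnostic and keep going.
            raise ValueError(f"'{key}' is already declared in scope {self.name}")
        symbol = Symbol(key, kind)
        self.symbols[key] = symbol
        return symbol

    def resolve(self, name: str) -> Optional[Symbol]:
        key = name.lower()
        scope: Optional[Scope] = self
        while scope is not None:
            if key in scope.symbols:
                return scope.symbols[key]
            scope = scope.parent  # fall back to the enclosing scope
        return None


main = Scope("main")
x = main.declare("X", "variable")
sub = Scope("sub", parent=main)  # e.g. an internal subprogram of main
sub.declare("y", "variable")

print(sub.resolve("x") is x)  # the inner scope sees the outer declaration
print(main.resolve("y"))      # the outer scope does not see inner names
```

Decorating the parse tree with pointers into the symbol table then amounts to storing the result of a lookup like resolve on each name node.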
Lowering takes the parse tree and symbol table produced by analysis and produces LLVM IR.
Input:
The parse tree
The symbol table
The default KINDs for intrinsic types (specified by default or command line option)
The intrinsic procedure table (created in semantics processing)
The target characteristics (created during semantics processing)
The cooked character stream
The target triple: CPU type, vendor, operating system
The mapping from Fortran KIND values to FIR KIND values
The lowering bridge is a container that holds all of the information needed for lowering.
Output: A container with all of the information needed for lowering
Entry point: lower::LoweringBridge::create
Input: the lowering bridge
Output: A Fortran IR (FIR) representation of the program.
The compiler then takes the information in the lowering bridge and creates a pre-FIR tree (PFT). The PFT is a list of programs and modules. The programs and modules contain lists of function-like units. The function-like units contain a list of evaluations. All of these contain pointers back into the parse tree. The compiler walks the PFT to generate FIR.
flang-new -fc1 -fdebug-dump-pft src.f90 dumps the pre-FIR tree
flang-new -fc1 -emit-mlir src.f90 dumps the FIR to the file src.mlir
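The nesting of the pre-FIR tree described above (programs and modules, then function-like units, then evaluations, each pointing back into the parse tree) can be sketched as plain data classes. Everything here is hypothetical: the class names, the ParseNode stand-in, and the strings emitted by walk are placeholders for illustration and are neither Flang's C++ types nor real FIR.

```python
# Toy model only: NOT Flang's PFT. It mirrors the nesting described in the
# text and walks it, emitting placeholder strings where FIR would be built.
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParseNode:
    label: str  # stand-in for a node of the real parse tree


@dataclass
class Evaluation:
    parse_node: ParseNode  # pointer back into the parse tree


@dataclass
class FunctionLikeUnit:
    name: str
    parse_node: ParseNode
    evaluations: List[Evaluation] = field(default_factory=list)


@dataclass
class ProgramOrModule:
    name: str
    parse_node: ParseNode
    units: List[FunctionLikeUnit] = field(default_factory=list)


def walk(pft: List[ProgramOrModule]) -> List[str]:
    """Visit every evaluation in order and record what would be lowered."""
    ops: List[str] = []
    for container in pft:
        for unit in container.units:
            ops.append(f"begin {container.name}.{unit.name}")
            for evaluation in unit.evaluations:
                ops.append(f"  lower <{evaluation.parse_node.label}>")
            ops.append(f"end {container.name}.{unit.name}")
    return ops


pft = [
    ProgramOrModule(
        "m",
        ParseNode("module m"),
        [
            FunctionLikeUnit(
                "f",
                ParseNode("function f"),
                [Evaluation(ParseNode("y = x + 1")), Evaluation(ParseNode("return"))],
            )
        ],
    )
]
print("\n".join(walk(pft)))
```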
Input: initial version of the FIR code
Output: An LLVM IR representation of the program
The compiler then runs a series of passes over the FIR code. The first is a verification pass. It’s followed by a series of transformation passes that perform various optimizations and transformations. The final pass creates an LLVM IR representation of the program.
flang-new -mmlir --mlir-print-ir-after-all -S src.f90 dumps the FIR code after each pass to standard error
flang-new -fc1 -emit-llvm src.f90 dumps the LLVM IR to src.ll
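When inspecting several phases of the same file, it can help to gather the dump commands listed in this document in one place. The helper below is a convenience sketch, not part of Flang. The flags are copied from the commands above; the dictionary keys and the function names dump_command and run_dump are made up for illustration, and the script assumes a driver named flang-new is on PATH, doing nothing if it is not.

```python
# Convenience sketch: builds the dump commands listed in this document.
# The keys and function names are hypothetical; only the flags come from the text.
import shutil
import subprocess
from typing import Dict, List, Optional

DUMP_FLAGS: Dict[str, List[str]] = {
    "cooked": ["-fc1", "-E"],
    "provenance": ["-fc1", "-fdebug-dump-provenance"],
    "parse-tree-no-sema": ["-fc1", "-fdebug-dump-parse-tree-no-sema"],
    "unparse": ["-fc1", "-fdebug-unparse"],
    "parse-tree": ["-fc1", "-fdebug-dump-parse-tree"],
    "symbols": ["-fc1", "-fdebug-dump-symbols"],
    "pft": ["-fc1", "-fdebug-dump-pft"],
    "fir": ["-fc1", "-emit-mlir"],
    "fir-after-each-pass": ["-mmlir", "--mlir-print-ir-after-all", "-S"],
    "llvm-ir": ["-fc1", "-emit-llvm"],
}


def dump_command(what: str, source: str, driver: str = "flang-new") -> List[str]:
    """Return the argument list for one of the documented dump commands."""
    return [driver, *DUMP_FLAGS[what], source]


def run_dump(
    what: str, source: str, driver: str = "flang-new"
) -> Optional[subprocess.CompletedProcess]:
    """Run the command if the driver is installed; otherwise return None."""
    if shutil.which(driver) is None:
        return None
    return subprocess.run(
        dump_command(what, source, driver),
        capture_output=True,
        text=True,
        check=False,
    )


print(" ".join(dump_command("symbols", "src.f90")))
```

Note that the per-pass FIR dump is written to standard error, so with run_dump it would appear in the stderr field of the returned object rather than in stdout.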