Last Updated on October 30, 2023 by Prepbytes
The compilation process consists of a number of distinct phases. Each phase takes its input from the phase before it. This blog walks through the phases of a compiler, but first we need to understand what a compiler is.
What is a Compiler?
A program that translates code written in one programming language into another language is known as a compiler. More specifically, a compiler converts source code written in a high-level language into machine code. To learn more about this subject, you can also look into the difference between a compiler and an interpreter.
In this article, you will learn:
- What are the Phases of Compiler Design?
- Phase 1: Lexical Analysis
- Phase 2: Syntax Analysis
- Phase 3: Semantic Analysis
- Phase 4: Intermediate Code Generation
- Phase 5: Code Optimization
- Phase 6: Code Generation
- Symbol Table Management
- Error Handling Routine:
What are the Phases of Compiler Design?
Every phase of the compilation process transforms the source program from one representation to another. Each phase receives its input from the phase before it and feeds its output to the phase after it.
Each phase in this process helps translate the high-level language into machine code. A compiler’s phases are as follows:
- Lexical analysis
- Syntax analysis
- Semantic analysis
- Intermediate code generator
- Code optimizer
- Code generator
Phases of Compiler
Together, these phases convert the source code by breaking it up into tokens, building parse trees, and optimizing the code through various techniques.
Phase 1: Lexical Analysis
Lexical analysis is the first phase, in which the compiler scans the source code. The scan proceeds left to right, character by character, and groups the characters into tokens.
Here the character stream from the source program is grouped into meaningful sequences by identifying the tokens. The analyzer enters the corresponding identifiers into the symbol table and passes each token on to the next phase.
The primary functions of this phase are:
- Identify the lexical units in the source code.
- Classify lexical units into classes such as reserved words and constants and enter them in the appropriate tables. Comments in the source code are ignored.
- Identify tokens that are not part of the language.
Example of Lexical Analysis:
x = y + 10
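The statement above can be run through a small scanner to see the tokens it produces. The following is a minimal sketch in Python, not a production lexer; the token categories (IDENTIFIER, ASSIGN, OPERATOR, NUMBER) and the function name tokenize are assumptions chosen for illustration.

```python
import re

# Token categories are illustrative; a real language defines its own set.
TOKEN_SPEC = [
    ("NUMBER",     r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("ASSIGN",     r"="),
    ("OPERATOR",   r"[+\-*/]"),
    ("SKIP",       r"\s+"),   # whitespace is discarded
    ("MISMATCH",   r"."),     # anything else is not part of the language
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Scan left to right and return a list of (kind, lexeme) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"illegal character {lexeme!r} at position {match.start()}")
        tokens.append((kind, lexeme))
    return tokens

for kind, lexeme in tokenize("x = y + 10"):
    print(f"{lexeme:>3}  {kind}")
```

Each lexeme is paired with its class, so x and y come out as identifiers, = as the assignment symbol, + as an operator, and 10 as a number.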
Phase 2: Syntax Analysis
Syntax analysis is about discovering structure in code. It determines whether the text follows the expected format. The main aim of this phase is to check whether the source code written by the programmer is syntactically correct.
Syntax analysis builds the parse tree from the tokens, following the grammar rules of the programming language being used. The parse tree exposes the structure of the source program according to that grammar.
Here is a summary of the duties carried out during this phase:
- Obtain tokens from the lexical analyzer.
- Check whether the expression is syntactically correct.
- Report all syntax errors.
- Construct a hierarchical structure known as a parse tree.
Example of Syntax Analysis:
Any identifier/number is an expression
If x is an identifier and y+10 is an expression, then x= y+10 is a statement.
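The two rules above can be turned into a tiny recursive-descent parser. This is only a sketch under simplifying assumptions: the input is already split into tokens on whitespace, the only operator is +, and the class names Leaf and Node are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    kind: str    # "id" or "num"
    value: str   # the token text

@dataclass
class Node:
    op: str      # operator stored at an interior node
    left: object
    right: object

def parse_statement(tokens):
    """statement -> identifier '=' expression"""
    if len(tokens) < 3 or not tokens[0].isidentifier() or tokens[1] != "=":
        raise SyntaxError("expected: identifier = expression")
    expr, pos = parse_expression(tokens, 2)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return Node("=", Leaf("id", tokens[0]), expr)

def parse_expression(tokens, pos):
    """expression -> operand ('+' operand)*"""
    left, pos = parse_operand(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_operand(tokens, pos + 1)
        left = Node("+", left, right)
    return left, pos

def parse_operand(tokens, pos):
    """Any identifier or number is an expression."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token.isidentifier():
        return Leaf("id", token), pos + 1
    if token.isdigit():
        return Leaf("num", token), pos + 1
    raise SyntaxError(f"unexpected token {token!r}")

tree = parse_statement("x = y + 10".split())
print(tree)
```

A malformed input such as "x = y +" raises SyntaxError instead of producing a tree, which is how this sketch reports a syntax error.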
Consider the parse tree for the statement x = y + 10.
In the parse tree:
- Interior node: a record with an operator field and two fields for children
- Leaf: a record with two or more fields; one for the token and the others for information about the token
The tree is then handed to the next phase, which:
- Ensures that the components of the program fit together meaningfully
- Gathers type information and checks for type compatibility
- Checks that operands are permitted by the source language
Phase 3: Semantic Analysis
Semantic analysis checks the semantic consistency of the code. It uses the syntax tree from the previous phase together with the symbol table to verify that the given source code is semantically consistent. It also checks whether the code conveys a valid meaning.
The semantic analyzer checks for type mismatches, incompatible operands, functions called with the wrong arguments, undeclared variables, and so on.
The duties of the semantic analysis phase include:
- Storing the type information it gathers in the symbol table or syntax tree.
- Performing type checking.
- Reporting a semantic error when there is a type mismatch and no type conversion rule exists that satisfies the desired operation.
- Collecting type information and checking for type compatibility.
- Checking whether the operands are permitted by the source language.
Example of Semantic Analysis:
float x = 20.2;
float y = x*30;
In the code above, the semantic analyzer will convert the integer 30 to the float 30.0 before the multiplication.
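The conversion described above can be sketched as a small type checker that walks an expression and wraps the integer operand in a conversion node. The tuple-based expression format, the symbols dictionary, and the int_to_float node name are all assumptions made for this illustration.

```python
# Hypothetical symbol table: declared names and their types.
symbols = {"x": "float"}

def check(expr):
    """Return (typed_expr, type) for a nested tuple expression.

    Expressions are ("num", value), ("var", name) or (operator, left, right).
    """
    kind = expr[0]
    if kind == "num":
        return expr, ("float" if isinstance(expr[1], float) else "int")
    if kind == "var":
        if expr[1] not in symbols:
            raise NameError(f"undeclared variable {expr[1]!r}")
        return expr, symbols[expr[1]]
    left, left_type = check(expr[1])
    right, right_type = check(expr[2])
    if left_type != right_type:
        # Mixed int/float operands: widen the int side to float.
        if left_type == "int":
            left = ("int_to_float", left)
        else:
            right = ("int_to_float", right)
        return (kind, left, right), "float"
    return (kind, left, right), left_type

typed, result_type = check(("*", ("var", "x"), ("num", 30)))
print(typed)
print(result_type)
```

The returned tree has the literal 30 wrapped in an int_to_float node, and the whole expression is typed as float. Referring to a name that is missing from the symbol table raises NameError, which stands in for an undeclared-variable error.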
Phase 4: Intermediate Code Generation
Once semantic analysis is complete, the compiler generates intermediate code for the target machine. It represents a program for some abstract machine.
Intermediate code sits between the high-level language and machine language. It needs to be generated in a way that makes it easy to translate into the target machine code.
Functions of Intermediate Code Generation:
- It should be generated from the semantic representation of the source program.
- It holds the values computed during translation and helps translate the intermediate code into the target language.
- It preserves the operator precedence of the source language.
- It holds the correct number of operands for each instruction.
Example of Intermediate Code Generation
total = count + rate * 5
The intermediate code, using the three-address code method, is:
t1 := int_to_float(5)
t2 := rate * t1
t3 := count + t2
total := t3
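One way to produce such code is to walk the expression tree bottom-up and introduce a fresh temporary for every intermediate result. The sketch below assumes that all variables are floats (so every integer literal is converted) and uses a hypothetical nested-tuple form for the expression; the function name to_three_address is also an assumption.

```python
import itertools

def to_three_address(target, expr):
    """Emit three-address code for `target = expr`.

    `expr` is a nested tuple such as ("+", "count", ("*", "rate", 5));
    names are strings and integer literals are Python ints.
    """
    code = []
    temps = (f"t{n}" for n in itertools.count(1))

    def new_temp(rhs):
        name = next(temps)
        code.append(f"{name} := {rhs}")
        return name

    def walk(node):
        if isinstance(node, int):
            # Assumption: variables are floats, so literals are converted.
            return new_temp(f"int_to_float({node})")
        if isinstance(node, str):
            return node
        op, left, right = node
        left_place = walk(left)
        right_place = walk(right)
        return new_temp(f"{left_place} {op} {right_place}")

    code.append(f"{target} := {walk(expr)}")
    return code

for line in to_three_address("total", ("+", "count", ("*", "rate", 5))):
    print(line)
```

Because the multiplication sits deeper in the tree than the addition, it is emitted first, which is how operator precedence from the source survives in the flat instruction list.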
Phase 5: Code Optimization
The next phase is optimization of the intermediate code. This phase removes unnecessary lines of code and rearranges the sequence of statements so that the program runs faster without wasting resources. Its primary objective is to improve the intermediate code so that the resulting code runs faster and occupies less space.
The primary functions of this phase are
- It helps establish a trade-off between execution speed and compilation speed.
- It improves (reduces) the running time of the target program.
- It produces streamlined code that is still in intermediate representation.
- It removes unused variables and unreachable code.
- It moves statements that do not change (loop-invariant statements) out of the loop.
Example of Code Optimization:
Consider the following code:
a = intofloat(10)
b = c * a
d = e + b
f = d
After optimization it can become:
b = c * 10.0
f = e + b
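The transformation in this example can be reproduced with three simple passes over straight-line code: constant folding and propagation, merging a temporary into the copy that follows it, and dead-code elimination. This is a sketch only; the tuple format (dest, op, arg1, arg2), the "copy" operation name, and the live_out parameter (the names still needed after this block) are assumptions for illustration.

```python
def optimize(code, live_out):
    """Optimize a list of (dest, op, arg1, arg2) instructions."""
    # Pass 1: fold intofloat(literal) and propagate known constants.
    constants, folded = {}, []
    for dest, op, a1, a2 in code:
        a1, a2 = constants.get(a1, a1), constants.get(a2, a2)
        if op == "intofloat" and isinstance(a1, int):
            op, a1 = "copy", float(a1)
        if op == "copy" and isinstance(a1, float):
            constants[dest] = a1
        else:
            constants.pop(dest, None)
        folded.append((dest, op, a1, a2))

    # Pass 2: merge "t = expr; x = t" into "x = expr" when t is not needed again.
    merged = []
    for i, (dest, op, a1, a2) in enumerate(folded):
        used_later = any(a1 in later[2:] for later in folded[i + 1:])
        if (op == "copy" and merged and merged[-1][0] == a1
                and a1 not in live_out and not used_later):
            merged[-1] = (dest,) + merged[-1][1:]
        else:
            merged.append((dest, op, a1, a2))

    # Pass 3: scan backwards and drop assignments whose result is never used.
    live, result = set(live_out), []
    for dest, op, a1, a2 in reversed(merged):
        if dest in live:
            live.discard(dest)
            live.update(arg for arg in (a1, a2) if isinstance(arg, str))
            result.append((dest, op, a1, a2))
    return result[::-1]

code = [
    ("a", "intofloat", 10, None),
    ("b", "*", "c", "a"),
    ("d", "+", "e", "b"),
    ("f", "copy", "d", None),
]
for dest, op, a1, a2 in optimize(code, live_out={"b", "f"}):
    print(f"{dest} = {a1} {op} {a2}")
```

With b and f marked as the only values needed afterwards, the four instructions shrink to the two shown in the example. Real optimizers work on control-flow graphs rather than a single straight-line block, so this only conveys the idea.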
Phase 6: Code Generation
Code generation is the final phase of a compiler. It takes its input from the code optimization phase and produces the target code or object code. The goals of this phase are to allocate storage and generate relocatable machine code.
It also allocates memory locations for the variables. The instructions in the intermediate code are converted into machine instructions; in other words, this phase translates the optimized intermediate code into the target language.
The target language is machine code. Therefore, all memory locations and registers are selected and allotted during this phase. The code produced by this phase is then executed to take inputs and generate the expected outputs.
Example of Code Generation:
a = b + 60.0
This could be translated into instructions that use registers, for example:
MOVF b, R1
ADDF #60.0, R1
MOVF R1, a
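A code generator for a single three-address instruction can be sketched as a template that loads one operand into a register, applies the operation, and stores the result. The instruction set here is hypothetical: it assumes a two-operand machine with the form OP source, destination, where # marks an immediate value, and the function name emit_assembly is made up for this example.

```python
# Hypothetical floating-point opcodes for a two-operand machine.
OPCODES = {"+": "ADDF", "-": "SUBF", "*": "MULF", "/": "DIVF"}

def operand(value):
    """Literals become immediates (#60.0); names are memory operands."""
    return f"#{value}" if isinstance(value, float) else value

def emit_assembly(dest, op, left, right, register="R1"):
    """Translate `dest = left op right` into three instructions."""
    return [
        f"MOVF {operand(left)}, {register}",           # load the left operand
        f"{OPCODES[op]} {operand(right)}, {register}",  # combine with the right operand
        f"MOVF {register}, {dest}",                     # store the result
    ]

print("\n".join(emit_assembly("a", "+", "b", 60.0)))
```

For a = b + 60.0 this loads b into R1, adds the immediate 60.0 to R1, and stores R1 into a. A real code generator would also decide which register to use; here the register is simply passed in.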
Symbol Table Management
A symbol table contains a record for each identifier, with fields for the identifier’s attributes. This component makes it easier and faster for the compiler to find an identifier’s record. The symbol table also helps with scope management. The symbol table and the error handler interact with all the phases, and the symbol table is updated accordingly.
- It stores literal constants and strings.
- It stores function names.
- It stores variable names and constants.
- It stores the labels used in the source language.
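A common way to combine identifier records with scope management is a stack of dictionaries, one per scope. The sketch below is one possible layout, not a description of any particular compiler; the class name SymbolTable and the attribute names kind and type are assumptions.

```python
class SymbolTable:
    """A stack of scopes; each scope maps an identifier to its attributes."""

    def __init__(self):
        self.scopes = [{}]  # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, **attributes):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name!r} is already declared in this scope")
        scope[name] = attributes

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name!r} is not declared")

table = SymbolTable()
table.declare("rate", kind="variable", type="float")
table.enter_scope()
table.declare("rate", kind="variable", type="int")  # shadows the outer rate
print(table.lookup("rate")["type"])
table.exit_scope()
print(table.lookup("rate")["type"])
```

Inside the inner scope the lookup finds the int declaration; after the scope is closed it finds the outer float declaration again. Declaring the same name twice in one scope raises an error, which corresponds to the multiply-declared-identifier error a symbol table can detect.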
Error Handling Routine:
In the compiler design process, an error may occur in any of the phases given below:
- Lexical analyzer: Wrongly spelled tokens
- Syntax analyzer: Missing parenthesis
- Intermediate code generator: Mismatched operands for an operator
- Code Optimizer: When the statement is not reachable
- Code Generator: When the memory is full or proper registers are not allocated
- Symbol tables: Error of multiple declared identifiers
The most common errors are invalid character sequences in scanning, invalid token sequences in parsing, and scope and type errors in semantic analysis.
An error may be encountered in any of the phases above. After finding an error, the phase must deal with it so that compilation can continue. These errors need to be reported to the error handler, which handles them so that the compilation process can proceed. In most cases, errors are reported in the form of messages.
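The idea of reporting an error and then carrying on can be sketched with a shared handler object that phases write messages to. Everything here is a simplified assumption: the ErrorHandler class, the message format, and the toy check_lines function, which stands in for the real lexical and syntax phases.

```python
from dataclasses import dataclass, field

@dataclass
class ErrorHandler:
    """Collects messages from any phase so compilation can keep going."""
    messages: list = field(default_factory=list)

    def report(self, phase, line, text):
        self.messages.append(f"{phase} error, line {line}: {text}")

def check_lines(lines, errors):
    """Toy checks: report problems but keep scanning the remaining lines."""
    for number, line in enumerate(lines, start=1):
        for ch in line:
            if not (ch.isalnum() or ch in "_=+-*/(). "):
                errors.report("lexical", number, f"illegal character {ch!r}")
        if line.count("(") != line.count(")"):
            errors.report("syntax", number, "unbalanced parentheses")

errors = ErrorHandler()
check_lines(["x = (y + 10", "z = x $ 2"], errors)
for message in errors.messages:
    print(message)
```

Both problems are reported in a single run because the first error does not stop the scan, which is the behaviour the paragraph above describes.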
Summary
- Each phase of the compiler converts the source program from one representation to another. There are six phases in compiler design: 1) Lexical analysis 2) Syntax analysis 3) Semantic analysis 4) Intermediate code generation 5) Code optimization 6) Code generation
- Lexical analysis is the first phase, in which the compiler scans the source code.
- Syntax analysis is about discovering structure in the text.
- Semantic analysis checks the semantic consistency of the code.
- Once semantic analysis is complete, the compiler generates intermediate code for the target machine.
- The code optimization phase removes unnecessary lines of code and rearranges the sequence of statements.
- The code generation phase takes its input from the code optimization phase and produces the target code or object code.
- A symbol table contains a record for each identifier, with fields for the attributes of the identifier.
- The error handling routine handles and reports errors during the various phases.
FAQs related to Phases of Compiler:
Below are some frequently asked questions regarding phases of compiler
1. How many phases are there in compiler design?
Lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization, and code generation are the six stages of the compiler.
2. What are the phases of compilation in C?
The C compilation process has four stages: pre-processing, compilation, assembly, and linking. The preprocessor handles the removal of comments, the expansion of macros, the inclusion of files, and conditional compilation. These preprocessor directives are processed in the first stage of the compilation process.
3. Why is compiler design divided into phases?
Each phase of the compiler transforms the source program from one representation to another. Each phase takes its input from the previous phase and passes its output to the next, so every phase can focus on a single task.
4. What are the pass and phase in the compiler?
A pass is one complete traversal of the program by the compiler. A phase is a distinct stage of the compiler that takes input from the previous stage, processes it, and produces output that can be used as input for the next stage. A pass can include more than one phase.
5. What is a parser in compiler design?
Parsing is the process of analyzing a sequence of tokens to determine its grammatical structure. This is the job of the parser, the component of the compiler that organizes linear text into a structure according to the grammar, which is a set of defined rules.
6. What is the difference between phases and passes?
Phases are the units or stages of the compilation process. Passes, by contrast, refer to the number of times the compiler traverses the source code before generating the target machine code. This is the primary distinction between compiler phases and passes.
7. What are the functions of a compiler?
Compilers are very complex programs with features such as error checking. Some compilers translate high-level languages into an intermediate assembly language, which is then translated (assembled) into machine code by an assembler. Other compilers generate machine code directly.