.NET compilation process explained (C#)
You've heard of the CLR, but you're not quite sure what it does or how it does it? The JIT compiler sounds familiar, but where does it fit in the execution process? Keep reading to find out 🙂
- A developer writes C# code
- C# compiler checks the syntax and analyzes the source code
- Microsoft intermediate language (MSIL) is generated as a result (EXE or DLL)
- The CLR is initialized inside a process and runs the entry point method (Main)
- MSIL is converted to native code by the JIT compiler
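To make the steps above concrete, here is a minimal sketch of a program that goes through that pipeline. The class name `Program` and the helper `BuildGreeting` are only illustrative names chosen for this example.

```csharp
using System;

public static class Program
{
    // Entry point: the CLR calls this method once the assembly is loaded.
    public static void Main()
    {
        Console.WriteLine(BuildGreeting("CLR"));
    }

    // Hypothetical helper. The C# compiler turns it into MSIL like any other
    // method, and the JIT compiler converts it to native code on first call.
    public static string BuildGreeting(string name) => $"Hello from {name}";
}
```

Compiling this file produces an EXE or DLL containing MSIL and metadata. Nothing in it is native code until the runtime executes it.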
The common language runtime (CLR, or just the runtime) is an environment that runs the code and provides services that make the development process easier.
Sure, yet another definition that is not quite clear. Let's define it using an example:
You write code and compile it. The next step in the process is the responsibility of the CLR: it compiles the MSIL (DLL or EXE) and creates an environment in which your code can be executed.
The runtime also provides the following benefits:
- Memory management
- Security boundaries
- Type safety
- Exception handling
- Garbage collection
- Performance improvements
You can develop your code in any programming language you desire as long as the compiler (e.g. C++/CLI, C#, Visual Basic, F#) you use to compile your code targets the runtime.
When working with .NET you'll often encounter the term "managed code", which is code whose execution is managed by a runtime. The runtime is in charge of taking the managed code, compiling it into native code, and then executing it.
More on the runtime's role in the execution process follows in the next sections.
When compiling to managed code, the compiler translates your source code into Microsoft intermediate language (MSIL), which is a CPU-independent set of instructions that can be efficiently converted to native code. Regardless of which compiler you use, the result is a managed module, which is a standard 32-bit Windows portable executable (PE32) file or a standard 64-bit Windows portable executable (PE32+) file that requires the runtime to execute.
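The MSIL stored in a managed module can be looked at from inside a running program. The following is a small sketch using reflection. The class `IlDump` and its `Add` method are hypothetical names for this example, and the exact bytes printed depend on the compiler settings (Debug vs. Release).

```csharp
using System;
using System.Reflection;

public static class IlDump
{
    public static int Add(int a, int b) => a + b;

    // Returns the raw MSIL bytes the compiler emitted for the named method.
    public static byte[] GetIl(string methodName)
    {
        MethodInfo method = typeof(IlDump).GetMethod(methodName)!;
        return method.GetMethodBody()!.GetILAsByteArray()!;
    }

    public static void Main()
    {
        byte[] il = GetIl(nameof(Add));
        Console.WriteLine($"Add is {il.Length} bytes of MSIL: {BitConverter.ToString(il)}");
    }
}
```

The bytes are the same on every CPU, which is the point of an intermediate language: only the JIT-compiled output is specific to the machine.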
In addition to emitting MSIL, a compiler targeting the CLR is required to emit full metadata into every managed module. In brief, metadata is a set of data tables that describe what is defined in the module (such as types and their members) and what the module references (such as imported types and their members).
Metadata is always embedded in the same EXE/DLL as the code (MSIL), making it impossible to separate the two.
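Reflection is the easiest way to see this metadata at work, since it reads those tables at run time. Below is a minimal sketch. `Customer` and `MetadataDemo` are made-up types used only for illustration.

```csharp
using System;
using System.Linq;
using System.Reflection;

public class Customer
{
    public string Name { get; set; } = "";
    public int Orders { get; set; }
    public bool IsActive() => Orders > 0;
}

public static class MetadataDemo
{
    // Lists the public instance members declared on a type,
    // as recorded in the assembly's metadata.
    public static string[] DescribeMembers(Type type) =>
        type.GetMembers(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
            .Select(m => $"{m.MemberType} {m.Name}")
            .ToArray();

    public static void Main()
    {
        foreach (string line in DescribeMembers(typeof(Customer)))
            Console.WriteLine(line);

        // References to other assemblies are recorded in metadata as well.
        foreach (AssemblyName reference in typeof(Customer).Assembly.GetReferencedAssemblies())
            Console.WriteLine($"references {reference.Name}");
    }
}
```

Note that the property accessors (`get_Name`, `set_Name`, and so on) show up as separate methods, because that is how properties are described in metadata.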
PE header - if the header uses the PE32 format, the file can run on a 32-bit or 64-bit version of Windows. If the header uses the PE32+ format, the file requires a 64-bit version of Windows to run. This header also indicates the type of file: GUI, CUI, or DLL, and contains a timestamp indicating when the file was built.
CLR header - includes the version of the CLR required, entry point method (Main method), location/size of the module’s metadata, resources, strong name, flags, etc.
Metadata table - describes the types in your code, including the definition of each type, the signatures of each type's members, the members that your code references, and other data that the runtime uses at execution time.
Managed code (MSIL) - the code the compiler produced as it compiled the source code.
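These parts of a managed module can be read programmatically. The sketch below uses `PEReader` from `System.Reflection.PortableExecutable`, which ships with modern .NET (on .NET Framework it comes from the System.Reflection.Metadata NuGet package). `PeInspector` and `Describe` are hypothetical names, and the version printed is the runtime version field stored in the CLR header.

```csharp
using System;
using System.IO;
using System.Reflection.PortableExecutable;

public static class PeInspector
{
    // Summarizes the PE header and CLR header of the file at the given path.
    public static string Describe(string path)
    {
        using FileStream stream = File.OpenRead(path);
        using var reader = new PEReader(stream);
        PEHeaders headers = reader.PEHeaders;

        string format = headers.PEHeader!.Magic == PEMagic.PE32Plus ? "PE32+" : "PE32";
        string kind = headers.IsDll ? "DLL" : headers.IsConsoleApplication ? "CUI" : "GUI";
        var clr = headers.CorHeader; // null when the file has no CLR header

        return clr is null
            ? $"{format} {kind}, unmanaged file"
            : $"{format} {kind}, CLR header {clr.MajorRuntimeVersion}.{clr.MinorRuntimeVersion}, " +
              $"flags {clr.Flags}, has metadata: {reader.HasMetadata}";
    }

    public static void Main()
    {
        // Inspect the assembly that contains this code.
        Console.WriteLine(Describe(typeof(PeInspector).Assembly.Location));
    }
}
```

Pointing `Describe` at a native DLL instead should take the "unmanaged file" branch, since such a file has a PE header but no CLR header.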
When running an executable file, Windows examines the EXE file's header to determine whether to create a 32-bit or 64-bit process. Once the process is created, Windows also checks the CPU architecture information embedded in the header and loads the matching version of MSCorEE.dll into the process's address space.
Depending on the CPU type of the computer, Windows loads the x86, x64, or ARM version of MSCorEE.dll.
The process's primary thread calls a method inside MSCorEE.dll that initializes the CLR, loads the EXE assembly, and calls its entry point method (Main). At that point, the managed application is up and running.
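Once the application is running, you can ask the runtime what kind of process it ended up in. This is a small sketch with a hypothetical `ProcessInfo` class. The values printed depend entirely on the machine and on how the application was built.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ProcessInfo
{
    // Reports the bitness and architecture of the current process.
    public static string Describe() =>
        $"64-bit OS: {Environment.Is64BitOperatingSystem}, " +
        $"64-bit process: {Environment.Is64BitProcess}, " +
        $"architecture: {RuntimeInformation.ProcessArchitecture}, " +
        $"runtime: {RuntimeInformation.FrameworkDescription}";

    public static void Main() => Console.WriteLine(Describe());
}
```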
JIT (just-in-time) compilation converts MSIL to native code on demand at application run time when the contents of an assembly are loaded and executed. Because the MSIL is being compiled "just in time", this component of the CLR is frequently referred to as a JITter or a JIT compiler.
JIT compilation takes into account the possibility that some code might never be called during execution. Instead of using time and memory to convert all the MSIL in a PE file to native code, it converts the MSIL as needed during execution and stores the resulting native code in memory so that it is accessible for subsequent calls in the context of that process.
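One way to see MSIL being turned into native code at run time is to emit a few instructions yourself and let the runtime compile them. The sketch below uses `DynamicMethod` for this. It is not how your regular methods are loaded, just a compact way to watch IL become callable code. `OnDemandIl` and `BuildAdder` are names invented for this example.

```csharp
using System;
using System.Reflection.Emit;

public static class OnDemandIl
{
    // Emits MSIL for "return a + b;" at run time and wraps it in a delegate.
    public static Func<int, int, int> BuildAdder()
    {
        var method = new DynamicMethod("Add", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // push the first argument
        il.Emit(OpCodes.Ldarg_1); // push the second argument
        il.Emit(OpCodes.Add);     // add the two values on the stack
        il.Emit(OpCodes.Ret);     // return the result
        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }

    public static void Main()
    {
        Func<int, int, int> add = BuildAdder();
        Console.WriteLine(add(2, 3)); // prints 5
    }
}
```

The four emitted instructions are plain MSIL. The runtime compiles them to native code for the current CPU before the delegate runs.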
Once the Main method of Program.exe is called, and WriteLine is set to be executed next, the JITCompiler function searches the metadata of the assembly (Console.dll) for the MSIL of the called method (WriteLine). JITCompiler then verifies the MSIL, compiles it into native code, and saves it in a dynamically allocated block of memory.
Then, the JIT replaces the reference to the called method (WriteLine) with the address of the memory block containing the native code it just compiled.
Finally, the JIT jumps to the code in the memory block (this code is the actual implementation of the WriteLine(string) method), executes it, and returns to Main, where execution continues as normal.
The second time WriteLine is executed, its code has already been verified and compiled, so the call goes directly to the block of memory where the native code is stored.
Subsequent calls to the JIT-compiled method go directly to the native code.
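The difference between the first and later calls can be observed, roughly, by asking the runtime how many methods it has JIT-compiled so far. The sketch below assumes .NET 6 or later, where `System.Runtime.JitInfo` is available. `JitCounter`, `Square`, and `CountCompiledDuring` are hypothetical names. The exact numbers vary between runtime versions and settings (tiered compilation, precompiled code), so treat the output as an indication only.

```csharp
using System;
using System.Runtime;
using System.Runtime.CompilerServices;

public static class JitCounter
{
    // NoInlining keeps Square a separate method with its own JIT compilation.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Square(int x) => x * x;

    // Returns how many methods were JIT-compiled on this thread while the action ran.
    public static long CountCompiledDuring(Action action)
    {
        long before = JitInfo.GetCompiledMethodCount(currentThread: true);
        action();
        return JitInfo.GetCompiledMethodCount(currentThread: true) - before;
    }

    public static void Main()
    {
        // Reuse one delegate so both measurements run exactly the same code.
        Action call = () => Square(4);
        Console.WriteLine($"compiled during first call:  {CountCompiledDuring(call)}");
        Console.WriteLine($"compiled during second call: {CountCompiledDuring(call)}");
    }
}
```

On a typical run the first measurement is greater than zero, because the lambda and `Square` have to be compiled, while the second is usually zero because the native code is already in memory.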
While compiling MSIL into native code, the CLR performs a process called verification which ensures that everything the code does is safe. For example, verification checks that every method is called with the correct number of parameters, that each parameter passed is of the correct type, that every method’s return value is used properly, that every method has a return statement, etc.
I hope this summary helped you get at least an overview of the CLR and the .NET compilation process in general. If you have any constructive feedback, please let me know in the comment section 🙂
Below you can find additional resources if you want to dig deeper into terms that were mentioned but did not get enough attention in this article.