Walk-through flang – Part 1
Flang is an open source project to create a Fortran compiler for LLVM. It is based on NVIDIA/PGI Fortran and it has been released under Apache License 2.0. In this series we will walk through the code of this compiler and see how it has been integrated into the existing LLVM infrastructure.
Introduction
Fortran is a very old programming language. Developed in the mid-1950s by a team at IBM led by John Backus, with its first compiler delivered in 1957, Fortran was the first successful high-level programming language system that generated realistically efficient code. Much has happened since then: Fortran was first standardized as FORTRAN 66 and then revised as FORTRAN 77, Fortran 90, Fortran 95, Fortran 2003 and Fortran 2008. Most commercial offerings nowadays support Fortran 2003 and some subset of Fortran 2008. Today Fortran is a niche language, mostly used for numerical computing and high-performance computing (HPC), though newer languages like Julia aim at replacing it.
The GNU project has long had a Fortran compiler in its GNU Compiler Collection (GCC): first g77 (Fortran 77 only) and then gfortran (Fortran 95 and later versions). The LLVM project, being a newer project, never had such a compiler. I believe that at some point the flang project will fill this gap, although this may not happen in the short term.
The LLVM project
The LLVM project is an umbrella project for the development of compilers using open-source modular components. In contrast to GCC, LLVM uses a permissive license which, in principle, is more appealing for companies that develop commercial products on top of those components. The LLVM infrastructure is built around the LLVM IR, which is basically a common representation for the middle-end of the compiler, where most target-independent transformations happen. The project also includes a C/C++ compiler (clang), an implementation of the C++ Standard Library (libcxx), a linker (lld), a debugger (lldb) and other compilation-related components.
Installation of flang
Before we dig into the code of flang, we will want to install it. Fortunately this is documented in the upstream README.md, but I will repeat the steps here.
First choose a path where you will install LLVM, clang and flang. Let's call this path the INSTALLDIR directory and put it in an environment variable. Make sure it is an absolute path.
Also choose a directory where you will fetch the source code and build the components; it should be different from INSTALLDIR. I will call this directory STAGEDIR and inside of it I will create STAGEDIR/build, which I will call the BUILDDIR.
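The layout just described can be sketched as follows. The concrete paths (flang-stage and flang-install under the current directory) are placeholders of mine, not something the flang README mandates, so pick whatever suits your system.

```shell
# Hypothetical locations: adjust them to your system.
export STAGEDIR="$PWD/flang-stage"      # sources and build trees live here
export INSTALLDIR="$PWD/flang-install"  # final installation ($PWD is absolute)
export BUILDDIR="$STAGEDIR/build"       # build tree for llvm, clang and OpenMP

# Create the directories up front; -p makes this safe to repeat.
mkdir -p "$INSTALLDIR" "$BUILDDIR"
```

Exporting the variables keeps them available to the later steps run from the same shell.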
Inside STAGEDIR, let's first fetch the code of LLVM 4.0. This will create a directory STAGEDIR/llvm.
Now, inside STAGEDIR/llvm/tools, check out the code of a clang that has been modified to be able to invoke the flang components. This will create a directory STAGEDIR/llvm/tools/clang.
One dependency of flang is the Intel OpenMP Runtime Library (OpenMP RTL), so we have to check it out inside STAGEDIR/llvm/projects. This will create a directory STAGEDIR/llvm/projects/openmp.
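The three checkouts above could look roughly like this. The repository URLs and branch names are my assumption of what was current for LLVM 4.0, so please verify them against the upstream README.md before running anything. I wrapped the steps in a hypothetical function, fetch_llvm_sources, so that pasting the snippet does not start any download by itself.

```shell
# STAGEDIR should be the staging directory chosen earlier (assumed default: here).
STAGEDIR="${STAGEDIR:-$PWD}"

# Assumption: repository URLs of the LLVM 4.0 era; check the flang README.
LLVM_REPO="https://github.com/llvm-mirror/llvm.git"
CLANG_REPO="https://github.com/flang-compiler/clang.git"
OPENMP_REPO="https://github.com/llvm-mirror/openmp.git"

fetch_llvm_sources() {
  # LLVM itself -> STAGEDIR/llvm
  git clone -b release_40 "$LLVM_REPO" "$STAGEDIR/llvm" || return 1
  # clang modified for flang -> STAGEDIR/llvm/tools/clang
  git clone -b flang_release_40 "$CLANG_REPO" "$STAGEDIR/llvm/tools/clang" || return 1
  # OpenMP runtime -> STAGEDIR/llvm/projects/openmp
  git clone -b release_40 "$OPENMP_REPO" "$STAGEDIR/llvm/projects/openmp"
}
```

Run fetch_llvm_sources once you are happy with the URLs and branches.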
At this point we can already build llvm, clang and the OpenMP RTL. We configure the build with cmake using -G Ninja; you can remove that option if you prefer the Makefile generator instead of the ninja one. We pass the option -DBUILD_SHARED_LIBS=ON because we are building the Debug version of llvm/clang: otherwise, because of the debug information, the static binaries are huge and linking them takes a lot of time and memory. This should not make any difference in the final setup. If you do not want to build with debug information you can pass the option -DCMAKE_BUILD_TYPE=Release in all the invocations of cmake.
The ninja build step will take several minutes depending on your machine. Be patient. A few warnings may appear, in particular when building the OpenMP RTL. Once this step completes we have llvm and clang, but not flang yet.
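As a sketch, the configure, build and install steps for llvm, clang and the OpenMP RTL might be written like this. The function name build_llvm and the default paths are mine. I also spell out -DCMAKE_BUILD_TYPE=Debug to make the Debug build explicit, and I assume the install prefix is passed through -DCMAKE_INSTALL_PREFIX.

```shell
# Assumed defaults; override them if you chose different directories.
STAGEDIR="${STAGEDIR:-$PWD}"
BUILDDIR="${BUILDDIR:-$STAGEDIR/build}"
INSTALLDIR="${INSTALLDIR:-$STAGEDIR/install}"

build_llvm() {
  mkdir -p "$BUILDDIR" || return 1
  (
    # The subshell keeps the caller's working directory unchanged.
    cd "$BUILDDIR" || exit 1
    cmake -G Ninja "$STAGEDIR/llvm" \
      -DCMAKE_INSTALL_PREFIX="$INSTALLDIR" \
      -DCMAKE_BUILD_TYPE=Debug \
      -DBUILD_SHARED_LIBS=ON || exit 1
    ninja || exit 1    # build llvm, clang and the OpenMP runtime
    ninja install      # copy the results into INSTALLDIR
  )
}
```

Calling build_llvm runs the whole sequence and stops at the first step that fails.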
Now let's check out the flang code inside STAGEDIR. This will create a directory STAGEDIR/flang.
Now create a build-flang directory where we will build flang only.
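These two steps might look like the snippet below. The repository URL is an assumption on my side (check the upstream README.md), fetch_flang is a hypothetical helper name, and I assume build-flang lives directly under STAGEDIR.

```shell
STAGEDIR="${STAGEDIR:-$PWD}"   # assumed default: the current directory

# Assumption: upstream location of the flang sources.
FLANG_REPO="https://github.com/flang-compiler/flang.git"

fetch_flang() {
  # flang sources -> STAGEDIR/flang
  git clone "$FLANG_REPO" "$STAGEDIR/flang" || return 1
  # Separate build tree used only for flang.
  mkdir -p "$STAGEDIR/build-flang"
}
```

As before, nothing is downloaded until you call fetch_flang.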
Building flang has to be done using the just-compiled clang, so we have to pass more flags to cmake. Unfortunately the ninja generator of cmake still does not support Fortran, so we will have to use the regular Makefile generator. We cannot use -DBUILD_SHARED_LIBS=ON either, due to some problems in the layering of the code. The option -DCMAKE_INSTALL_RPATH=$INSTALLDIR/lib works around a problem when linking Fortran programs in flang. Also, compiling in parallel may fail; if that happens, simply invoking make again without the -j option should do.
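A sketch of the flang build under these constraints follows. I assume that the compilers installed in the previous step are INSTALLDIR/bin/clang, INSTALLDIR/bin/clang++ and INSTALLDIR/bin/flang, and that the build tree is STAGEDIR/build-flang. The function name build_flang and the JOBS variable are mine.

```shell
# Assumed defaults; override them if you chose different directories.
STAGEDIR="${STAGEDIR:-$PWD}"
INSTALLDIR="${INSTALLDIR:-$STAGEDIR/install}"
JOBS="${JOBS:-4}"   # hypothetical knob: number of parallel make jobs

build_flang() {
  mkdir -p "$STAGEDIR/build-flang" || return 1
  (
    cd "$STAGEDIR/build-flang" || exit 1
    # Default (Makefile) generator, using the compilers we just installed.
    cmake "$STAGEDIR/flang" \
      -DCMAKE_INSTALL_PREFIX="$INSTALLDIR" \
      -DCMAKE_C_COMPILER="$INSTALLDIR/bin/clang" \
      -DCMAKE_CXX_COMPILER="$INSTALLDIR/bin/clang++" \
      -DCMAKE_Fortran_COMPILER="$INSTALLDIR/bin/flang" \
      -DCMAKE_INSTALL_RPATH="$INSTALLDIR/lib" || exit 1
    # The parallel build may fail; retry serially before giving up.
    make -j "$JOBS" || make || exit 1
    make install
  )
}
```

The `make -j "$JOBS" || make` line encodes the retry advice from the paragraph above: a serial make simply continues from where the parallel one stopped.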
Once this process completes we should have flang installed. Let's run a smoke test. Create a file test.f90 with the following contents.
PROGRAM MAIN
  IMPLICIT NONE
  INTEGER :: X

  PRINT *, "Please, enter a number"
  READ (*, *) X
  ! SQRT does not accept INTEGER arguments, so convert X to REAL first
  PRINT *, "The square root of ", X, " is ", SQRT(REAL(X))
END PROGRAM MAIN
Now build it using flang.
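A minimal sketch of the compile step, assuming the flang driver ended up in INSTALLDIR/bin and that test.f90 is in the current directory. The output name sqrt_test is just my choice.

```shell
# Assumed install prefix; override INSTALLDIR if yours differs.
INSTALLDIR="${INSTALLDIR:-$PWD/install}"
FLANG="$INSTALLDIR/bin/flang"

if [ -x "$FLANG" ]; then
  # Compile and link test.f90 into an executable called sqrt_test.
  "$FLANG" -o sqrt_test test.f90
else
  echo "flang driver not found at $FLANG; check INSTALLDIR" >&2
fi
```

If the compilation succeeds, the program can be started with ./sqrt_test.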
If you try to run it, however, it will fail with an error.
This is kind of expected: the program is not finding the libraries of the Fortran runtime. There are several solutions to this problem, but the simplest one is to tell the dynamic loader where they are by adding INSTALLDIR/lib to the LD_LIBRARY_PATH environment variable.
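One way to let the program find the runtime, assuming the Fortran runtime libraries were installed under INSTALLDIR/lib, is to prepend that directory to LD_LIBRARY_PATH:

```shell
# Assumed install prefix; override INSTALLDIR if yours differs.
INSTALLDIR="${INSTALLDIR:-$PWD/install}"

# Prepend the flang runtime directory. The ${VAR:+...} form avoids leaving
# a trailing ':' (which would mean "current directory") when the variable is unset.
export LD_LIBRARY_PATH="$INSTALLDIR/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

This only affects the current shell session and the programs started from it.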
Now it should work: run the program again, type a number and press Enter (I used 17).
Environment script
It may be convenient to create a small script to configure the environment correctly before using flang.
Before using flang, simply load this script into your shell.
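Such a script could be as small as the sketch below. The file name flang-env.sh and the default install prefix are placeholders of mine. The snippet writes the script to the current directory and then loads it.

```shell
# Write the environment script (hypothetical name: flang-env.sh).
cat > flang-env.sh <<'EOF'
# Source this file; do not execute it.
INSTALLDIR="${INSTALLDIR:-$HOME/flang-install}"   # assumed install prefix
export PATH="$INSTALLDIR/bin:$PATH"
export LD_LIBRARY_PATH="$INSTALLDIR/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
EOF

# Load it into the current shell.
. ./flang-env.sh
```

The script has to be sourced (with `.` or `source`) rather than executed, otherwise the variables would be set in a child process and lost immediately.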
Installation script
I made a simple installation script that does the steps shown above. This script should be run inside an empty directory and it will use the current directory as STAGEDIR and STAGEDIR/install as INSTALLDIR. Also note that this script does not check if you have already cloned the git repositories, so if you want to replay part of the process you may need to comment out parts of it.
Ok, that's enough for today. In the next chapter we will see the high-level workflow when compiling with flang.