Appendix A — Essential and useful other programs under a Unix-alike

This appendix gives details of programs you will need to build R on Unix-like platforms, or which will be used by R if found by configure.

Remember that some package management systems (such as RPM and Debian/Ubuntu’s) make a distinction between the user version of a package and the development version. The latter usually has the same name but with the extension -devel or -dev: you need both versions installed.

A.1 Essential programs and libraries

You need a means of compiling C and Fortran 90 (see Using Fortran). Your C compiler should be ISO/IEC 60559¹, POSIX 1003.1 and C99-compliant.² R tries to choose suitable flags³ for the C compilers it knows about, but you may have to set CC or CFLAGS suitably. (Note that options essential to run the compiler even for linking, such as those to set the architecture, should be specified as part of CC rather than in CFLAGS.)

1 also known as IEEE 754

2 Note that C11 compilers need not be C99-compliant: R requires support for double complex and variable-length arrays which are optional in C11 but are mandatory in C99. C17 (also known as C18 as it was published in 2018) is a ‘bugfix release’ of C11, clarifying the standard. However, all known recent compilers in C11 or C17 mode are C99-compliant, and most default to C17.

3 Examples are -std=gnu99, -std=c99 and -c99.
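To illustrate the note above about CC and CFLAGS, a config.site fragment might look like the following. The compiler name and the flag values (gcc, -m64, -g -O2) are assumptions chosen for the example, not recommendations.

```shell
# config.site fragment (values are illustrative assumptions).
# An option needed every time the compiler runs, including when linking,
# such as one selecting the architecture, goes into CC itself ...
CC="gcc -m64"
# ... whereas ordinary compilation flags belong in CFLAGS.
CFLAGS="-g -O2"
```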

Unless you do not want to view graphs on-screen (or use macOS) you need X11 installed, including its headers and client libraries. For recent Fedora/RedHat distributions it means (at least) RPMs libX11, libX11-devel, libXt and libXt-devel. On Debian/Ubuntu we recommend the meta-package xorg-dev. If you really do not want these you will need to explicitly configure R without X11, using --with-x=no.

The command-line editing (and command completion) depends on the GNU readline library (including its headers): version 6.0 or later is needed for all the features to be enabled. Otherwise you will need to configure with --with-readline=no (or equivalent).

A suitably comprehensive iconv function is essential. The R usage requires iconv to be able to translate between "latin1" and "UTF-8", to recognize "" (as the current encoding) and "ASCII", and to translate to and from the Unicode wide-character formats "UCS-[24][BL]E" — this is true by default for glibc4 but not of most commercial Unixes. However, you can make use of GNU libiconv (as used on macOS: see https://www.gnu.org/software/libiconv/).

4 However, it is possible to break the default behaviour of glibc by re-specifying the gconv modules to be loaded.
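A rough way to probe the conversions listed above is to try them with the iconv command-line program, as sketched below. This is only an approximation: it tests the iconv program on the PATH, which need not be the same implementation as the iconv function R is linked against, and it does not cover the "" (current encoding) case. The function names iconv_can and report_iconv are made up for this example.

```shell
# iconv_can FROM TO: succeed if the iconv program can convert FROM -> TO.
iconv_can() {
    printf 'abc' | iconv -f "$1" -t "$2" >/dev/null 2>&1
}

# report_iconv FROM TO: print one line saying whether the conversion works.
report_iconv() {
    if iconv_can "$1" "$2"; then
        echo "ok: $1 -> $2"
    else
        echo "MISSING: $1 -> $2"
    fi
}

report_iconv latin1 UTF-8
report_iconv UTF-8 latin1
report_iconv ASCII UTF-8
report_iconv UTF-8 UCS-2LE
report_iconv UTF-8 UCS-4BE
```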

5 specifically, the C99 functionality of headers wchar.h and wctype.h, types wctrans_t and mbstate_t and functions mbrtowc, mbstowcs, wcrtomb, wcscoll, wcstombs, wctrans, wctype, and iswctype.

6 including expm1, hypot, log1p, nearbyint and va_copy.

7 including opendir, readdir, closedir, popen, stat, glob, access, getcwd and chdir system calls, select on a Unix-alike, and either putenv or setenv.

8 such as realpath, symlink.

The OS needs to have enough support5 for wide-character types: this is checked at configuration. Some C99 functions6 are required and checked for at configuration. A small number of POSIX functions7 are essential, and others8 will be used if available.

Installations of zlib (version 1.2.5 or later), libbz2 (version 1.0.6 or later: called bzip2-libs/bzip2-devel or libbz2-1.0/libbz2-dev by some Linux distributions) and liblzma9 version 5.0.3 or later are required.

9 most often distributed as part of xz: possible names in Linux distributions include xz-devel/xz-libs and liblzma-dev.

Either PCRE1 (version 8.32 or later, formerly known as just PCRE) or PCRE2 is required: PCRE2 is preferred and using PCRE1 requires configure option --with-pcre1. Only the 8-bit library and headers are needed if these are packaged separately. JIT support (optional) is desirable for the best performance. For PCRE2 >= 10.30 (which is desirable as matching has been re-written not to use recursion and the Unicode tables were updated to version 10)

./configure --enable-jit

suffices. If building PCRE1 for use with R a suitable configure command might be

./configure --enable-utf --enable-unicode-properties --enable-jit --disable-cpp

The --enable-jit flag is supported for most common CPUs but does not work (well or at all) for arm64 macOS.

Some packages require the ‘Unicode properties’ which are optional for PCRE1: support for this and JIT can be checked at run-time by calling pcre_config().

Library libcurl (version 7.28.0 or later) is required. Information on libcurl is found from the curl-config script: if that is missing or needs to be overridden10 there are macros to do so described in file config.site.

10 for example to specify static linking with a build which has both shared and static libraries.
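Several of the requirements above are minimum versions. A minimal sketch of checking one of them from the shell, here libcurl via curl-config, is given below. The helper name version_ge is made up for this example, and it relies on sort -V, which is available in GNU coreutils and recent BSDs but is not guaranteed everywhere.

```shell
# version_ge A B: succeed if dotted version A is at least B.
# Relies on 'sort -V', which is an assumption about the platform.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# curl-config --version prints something like "libcurl 8.5.0";
# keep only the second field.
if command -v curl-config >/dev/null 2>&1; then
    curl_ver=$(curl-config --version | awk '{print $2}')
    if version_ge "$curl_ver" 7.28.0; then
        echo "libcurl $curl_ver is new enough"
    else
        echo "libcurl $curl_ver is too old"
    fi
else
    echo "curl-config not found"
fi
```

The same helper could be applied to the version strings of the other libraries, however they are obtained.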

11 Such as GNU tar 1.15 or later, bsdtar (from https://github.com/libarchive/libarchive/, used as tar by FreeBSD and macOS 10.6 and later) or tar from the Heirloom Toolchest (https://heirloom.sourceforge.net/tools.html), although the latter does not support xz compression.

A tar program is needed to unpack the sources and packages (including the recommended packages). A version11 that can automagically detect compressed archives is preferred for use with untar(): the configure script looks for gtar and gnutar before tar – use environment variable TAR to override this. (On NetBSD/OpenBSD systems set this to bsdtar if that is installed.)
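The search order just described can be mimicked in a few lines of shell, which may help in predicting which tar will be picked up. This is a sketch of the rule as stated in the text, not the code configure actually runs, and the function name pick_tar is made up for this example.

```shell
# pick_tar: print the tar program chosen under the rule
# "TAR if set, else the first of gtar, gnutar, tar found on the PATH".
pick_tar() {
    if [ -n "${TAR:-}" ]; then
        printf '%s\n' "$TAR"
        return 0
    fi
    for t in gtar gnutar tar; do
        if command -v "$t" >/dev/null 2>&1; then
            command -v "$t"
            return 0
        fi
    done
    return 1
}

pick_tar || echo "no tar program found"
```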

There need to be suitable versions of the tools grep and sed: the problems are usually with old AT&T and BSD variants. configure will try to find suitable versions (including looking in /usr/xpg4/bin which is used on some commercial Unixes).

You will not be able to build most of the manuals unless you have texi2any version 5.1 or later installed (which requires perl), and if not, most of the HTML manuals will be linked to a version on CRAN. To make PDF versions of the manuals you will also need file texinfo.tex installed (which is part of the GNU texinfo distribution but is often made part of the TeX package in re-distributions) as well as texi2dvi.12 Further, the versions of texi2dvi and texinfo.tex need to be compatible: we have seen problems with older TeX distributions.

12 texi2dvi is normally a shell script. Some of the issues which have been observed with broken versions of texi2dvi can be circumvented by setting the environment variable R_TEXI2DVICMD to the value emulation.

If you want to build from the R Subversion repository then texi2any is highly recommended as it is used to create files which are in the tarball but not stored in the Subversion repository.

Building the PDF documentation (including doc/NEWS.pdf) and vignettes needs pdftex and pdflatex. We require LaTeX version 2005/12/01 or later (for UTF-8 support). Building PDF package manuals (including the R reference manual) and vignettes is sensitive to the version of the LaTeX package hyperref and we recommend that the TeX distribution used is kept up-to-date. A number of standard LaTeX packages are required for the PDF manuals (including url and some of the font packages such as times and helvetic and also amsfonts) and others such as hyperref and inconsolata are desirable (and without them you may need to change R’s defaults: see Making the manuals). Note that package hyperref (currently) requires packages kvoptions, ltxcmds and refcount, and inconsolata requires xkeyval. Building the base vignettes requires fancyvrb, natbib, parskip (which currently requires etoolbox) and listings. For distributions based on TeX Live the simplest approach may be to install collections collection-latex, collection-fontsrecommended, collection-latexrecommended, collection-fontsextra and collection-latexextra (assuming they are not installed by default): Fedora uses names like texlive-collection-fontsextra and Debian/Ubuntu like texlive-fonts-extra.

Programs qpdf and Ghostscript (gs) are desirable as these will be used to compact the installed PDF vignettes and any PDF manuals.

The essential programs should be in your PATH at the time configure is run: this will capture the full paths.
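Before running configure it can be worth confirming which of the programs mentioned in this section are visible on the PATH. The sketch below prints the names that are not found; the function name missing_programs is made up for this example, and the list passed to it is only a selection of the programs named above.

```shell
# missing_programs NAME...: print each NAME that is not found on the PATH.
missing_programs() {
    for p in "$@"; do
        command -v "$p" >/dev/null 2>&1 || printf '%s\n' "$p"
    done
}

missing_programs tar grep sed texi2any texi2dvi pdftex pdflatex qpdf gs
```

An empty result means every listed program was found; it says nothing about whether the versions found are suitable.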

For date-times to work correctly it is essential that the tables defining time zones are installed: these are usually in an OS component named something like tzdata. On most OSes they are required but installations of Alpine Linux have been seen without them. There is a configure check that recent date-times work correctly in different time zones, which catches this when installing from source (but not for binary distributions).

Those distributing binary versions of R may need to be aware of the licences of the external libraries it is linked to (including ‘useful’ libraries from the next section). The liblzma library is in the public domain and X11, libbzip2, libcurl and zlib have MIT-style licences. PCRE and PCRE2 have a BSD-style licence which requires distribution of the licence (included in R’s COPYRIGHTS file) in binary distributions. GNU readline is licensed under GPL (which version(s) of GPL depends on the readline version).

A.2 Useful libraries and programs

The ability to use translated messages makes use of gettext and most likely needs GNU gettext: you do need this to work with new translations, but otherwise the version of the gettext runtime contained in the R sources will be used if no suitable external gettext is found.

The ‘modern’ version of the X11(), jpeg(), png() and tiff() graphics devices uses the Cairo and Pango libraries. Cairo version 1.2.0 or later and Pango version 1.10 or later are required (but much later versions are current). R checks for pkg-config, and uses that to check first that the pangocairo package is installed (and if not, cairo) then if suitable code can be compiled. These tests will fail if pkg-config is not installed13, and might fail if cairo was built statically unless configure option --with-static-cairo is used. Most systems with Gtk+ 2.8 or later installed will have suitable libraries: for Fedora users the pango-devel RPM and its dependencies suffice. It is possible (but very unusual on a platform with X11) to build Cairo without its cairo-xlib module in which case X11(type = "cairo") will not be available. Pango is optional but highly desirable as it is likely to give much better text rendering, including kerning.

13 If necessary the path to pkg-config can be specified by setting PKG_CONFIG in config.site, on the configure command line or in the environment. There is a compatible re-implementation of pkg-config called pkgconf which can be used in the unlikely event that it is installed but not linked to pkg-config.

14 also known as ttf-mscorefonts-installer in the Debian/Ubuntu world: see also https://en.wikipedia.org/wiki/Core_fonts_for_the_Web.

15 ttf-liberation in Debian/Ubuntu.

For the best font experience with these devices you need suitable fonts installed: Linux users will want the urw-fonts package. On platforms which have it available, the msttcorefonts package14 provides TrueType versions of Monotype fonts such as Arial and Times New Roman. Another useful set of fonts is the ‘liberation’ TrueType fonts available at https://pagure.io/liberation-fonts,15 which cover the Latin, Greek and Cyrillic alphabets plus a fair range of signs. These share metrics with Arial, Times New Roman and Courier New, and contain fonts rather similar to the first two (https://en.wikipedia.org/wiki/Liberation_fonts). Then there is the ‘Free UCS Outline Fonts’ project (https://www.gnu.org/software/freefont/) which are OpenType/TrueType fonts based on the URW fonts but with extended Unicode coverage. See the R help on X11 on selecting such fonts.

The bitmapped graphics devices jpeg(), png() and tiff() need the appropriate headers and libraries installed: jpeg (version 6b or later, or libjpeg-turbo) or libpng (version 1.2.7 or later) and zlib or libtiff respectively. pkg-config is used if available and so needs the appropriate .pc file (which requires libtiff version 4.x and is not available on all platforms for jpeg before version 9c). They also need support for either X11 or cairo (see above). Should support for these devices not be required or broken system libraries need to be avoided there are configure options --without-libpng, --without-jpeglib and --without-libtiff. The TIFF library has many optional features such as jpeg, libz, zstd, lzma, webp, jbig and jpeg12, none of which is required for the tiff() devices but may need to be present to link the library (usually only an issue for static linking). pkg-config can tell you what other libraries are required for linking, for example by pkg-config libtiff-4 --static --libs.
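Since configure relies on pkg-config for these libraries, a quick way to see what it is likely to find is to query pkg-config directly, as in the sketch below. The module names pangocairo, cairo and libtiff-4 appear in the text above; libpng and libjpeg are assumed to be the names of the corresponding .pc files and may differ on some platforms. The function name pc_status is made up for this example.

```shell
# pc_status MODULE: print the module's version, or say that it is missing.
pc_status() {
    if pkg-config --exists "$1" 2>/dev/null; then
        echo "$1 $(pkg-config --modversion "$1")"
    else
        echo "$1 missing"
    fi
}

for m in pangocairo cairo libpng libjpeg libtiff-4; do
    pc_status "$m"
done
```

A module reported as missing may still be installed without a .pc file, as noted above for older versions of jpeg.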

Option --with-system-tre is also available: it needs a recent version of TRE. (The latest sources are in the git repository at https://github.com/laurikari/tre/, but at the time of writing the resulting build did not complete its checks, nor did R built against the version supplied by Fedora.)

An implementation of XDR is required, and the R sources contain one which is likely to suffice (although a system version may have higher performance). XDR is part of RPC and historically has been part of libc on a Unix-alike. (In principle man xdr_string should tell you which library is needed, but it often does not: on some OSes it is provided by libnsl.) However some builds16 of glibc omit or hide it with the intention that the TI-RPC library be used, in which case libtirpc (and its development version) should be installed, and its headers17 need to be on the C include path or under /usr/include/tirpc.

16 Including that used by Fedora 28 and later

17 R uses rpc/xdr.h but that includes netconfig.h from the top tirpc directory.

Library libdeflate (https://github.com/ebiggers/libdeflate) is used by memCompress() and memDecompress() if available.

Use of the X11 clipboard selection requires the Xmu headers and libraries. These are normally part of an X11 installation (e.g. the Debian meta-package xorg-dev), but some distributions have split this into smaller parts, so for example recent versions of Fedora require the libXmu and libXmu-devel RPMs.

Some systems (notably macOS and at least some FreeBSD systems) have inadequate support for collation in multibyte locales. It is possible to replace the OS’s collation support by that from ICU (International Components for Unicode, https://icu.unicode.org/), and this provides much more precise control over collation on all systems. ICU is available as sources and as binary distributions for (at least) most Linux distributions, FreeBSD, macOS and AIX, usually as libicu or icu4c. It will be used by default where available: should a very old or broken version of ICU be found this can be suppressed by --without-ICU.

The bitmap and dev2bitmap devices and function embedFonts() use Ghostscript (https://www.ghostscript.com/). This should either be in your path when the command is run, or its full path specified by the environment variable R_GSCMD at that time.

At the time of writing a full installation on Fedora Linux used the following packages and their development versions, and this may provide a useful checklist for other systems:

bzip2 cairo fontconfig freetype fribidi gcc gcc-gfortran gcc-c++ glib2
glibc harfbuzz lapack libX11 libXext libXt libcurl libdeflate libicu
libjpeg libpng libtiff libtirpc libxcrypt ncurses pango
pkgconf-pkg-config pcre2 readline tcl tk xz zlib

plus, preferably, a TeX installation and Java.

A.2.1 Tcl/Tk

The tcltk package needs Tcl/Tk ≥ 8.4 installed: the sources are available at https://www.tcl.tk/. To specify the locations of the Tcl/Tk files you may need the configuration options

--with-tcltk
    use Tcl/Tk, or specify its library directory

--with-tcl-config=TCL_CONFIG
    specify location of tclConfig.sh

--with-tk-config=TK_CONFIG
    specify location of tkConfig.sh

or use the configure variables TCLTK_LIBS and TCLTK_CPPFLAGS to specify the flags needed for linking against the Tcl and Tk libraries and for finding the tcl.h and tk.h headers, respectively. If you have both 32- and 64-bit versions of Tcl/Tk installed, specifying the paths to the correct config files may be necessary to avoid confusion between them.
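As a sketch of how the two config-file options might be filled in, the following looks for tclConfig.sh and tkConfig.sh in a few candidate directories and prints a matching configure command. The directory list is a guess and will need adjusting for the system at hand; the function name find_config is made up for this example.

```shell
# find_config FILE DIR...: print the path of FILE in the first DIR that has it.
find_config() {
    file=$1
    shift
    for d in "$@"; do
        if [ -f "$d/$file" ]; then
            printf '%s\n' "$d/$file"
            return 0
        fi
    done
    return 1
}

# Candidate directories are assumptions, deliberately left unquoted below
# so that the shell splits the list into separate arguments.
dirs="/usr/lib64 /usr/lib /usr/local/lib /usr/lib/tcl8.6 /usr/lib/tk8.6"
if tcl=$(find_config tclConfig.sh $dirs) && tk=$(find_config tkConfig.sh $dirs); then
    echo "./configure --with-tcl-config=$tcl --with-tk-config=$tk"
else
    echo "tclConfig.sh or tkConfig.sh not found in: $dirs"
fi
```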

Versions of Tcl/Tk up to 8.5.19 and 8.6.12 have been tested (including most versions of 8.4.x, but not recently).

Note that the tk.h header includes18 X11 headers, so you will need X11 and its development files installed.

18 This is true even for the ‘Aqua’ version of Tk on macOS, but distributions of that include a copy of the X11 files needed.

A.2.2 Java support

The build process looks for Java support on the host system, and if it finds it, records some settings which are useful for Java-using packages (such as rJava and JavaGD: these require a full JDK). This check can be suppressed by configure option --disable-java. Configure variable JAVA_HOME can be set to point to a specific JRE/JDK, on the configure command line or in the environment.

Principal amongst these settings are some paths to the Java libraries and JVM, which are stored in environment variable R_JAVA_LD_LIBRARY_PATH in file R_HOME/etc/ldpaths (or a sub-architecture-specific version). A typical setting for x86_64 Linux is

JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.322.b06-6.fc34.x86_64/jre
R_JAVA_LD_LIBRARY_PATH=${JAVA_HOME}/lib/amd64/server

Unfortunately this depends on the exact version of the JRE/JDK installed, and so may need updating if the Java installation is updated. This can be done by running R CMD javareconf which updates settings in both R_HOME/etc/Makeconf and R_HOME/etc/ldpaths. See R CMD javareconf --help for details: note that this needs to be done by the account owning the R installation.

Another way of overriding those settings is to set the environment variable R_JAVA_LD_LIBRARY_PATH (before R is started, hence not in ~/.Renviron), which suffices to run already-installed Java-using packages. For example

R_JAVA_LD_LIBRARY_PATH=/usr/lib/jvm/java-1.8.0/jre/lib/amd64/server

It may be possible to avoid this by specifying an invariant link as the path when configuring. For example, on that system any of

JAVA_HOME=/usr/lib/jvm/java
JAVA_HOME=/usr/lib/jvm/java-1.8.0
JAVA_HOME=/usr/lib/jvm/java-1.8.0/jre
JAVA_HOME=/usr/lib/jvm/jre-1.8.0

worked (since the ‘auto’ setting of /etc/alternatives chose Java 8 aka 1.8.0).
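When the Java installation has moved, one way to find a candidate value for R_JAVA_LD_LIBRARY_PATH is to look for libjvm.so under JAVA_HOME, as sketched below. This assumes a Linux layout (the library is named libjvm.so) and simply takes the first match, which may not be the right one if several JVM variants are installed. The function name jvm_lib_dir and the default path are illustrative.

```shell
# jvm_lib_dir JAVA_HOME: print the directory under JAVA_HOME holding
# libjvm.so, following symbolic links; fail if none is found.
jvm_lib_dir() {
    lib=$(find -L "$1" -name libjvm.so -print 2>/dev/null | head -n 1)
    [ -n "$lib" ] && dirname "$lib"
}

# Fall back to an invariant link of the kind shown above if JAVA_HOME is unset.
JAVA_HOME=${JAVA_HOME:-/usr/lib/jvm/java}
if dir=$(jvm_lib_dir "$JAVA_HOME"); then
    echo "R_JAVA_LD_LIBRARY_PATH=$dir"
else
    echo "no libjvm.so found under $JAVA_HOME"
fi
```

For a permanent change, R CMD javareconf remains the documented route.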

‘Non-server’ Oracle distributions of Java as from version 11 are of a full JDK. However, Linux distributions can be confusing: for example Fedora 34 had

java-1.8.0-openjdk
java-1.8.0-openjdk-devel
java-openjdk
java-openjdk-devel
java-11-openjdk
java-11-openjdk-devel
java-17-openjdk
java-17-openjdk-devel
java-latest-openjdk
java-latest-openjdk-devel

where the -devel RPMs are needed to complete the JDK. Debian/Ubuntu use -jre and -jdk, e.g.

sudo apt install default-jdk

A.2.3 Other compiled languages

Some add-on packages need a C++ compiler. This is specified by the configure variables CXX, CXXFLAGS and similar. configure will normally find a suitable compiler. It is possible to specify an alternative C++17 compiler by the configure variables CXX17, CXX17STD, CXX17FLAGS and similar (see C++ Support). Again, configure will normally find a suitable value for CXX17STD if the compiler given by CXX is capable of compiling C++17 code, but it is possible that a completely different compiler will be needed. (Similar macros are provided for C++20.)

For source files with extension .f90 or .f95 containing free-form Fortran, the compiler defined by the macro FC is used by R CMD INSTALL. Note that it is detected by the name of the command without a test that it can actually compile Fortran 90 code. Set the configure variable FC to override this if necessary: variables FCFLAGS and FCLIBS_XTRA might also need to be set.

See file config.site in the R source for more details about these variables.
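By way of illustration, a config.site fragment setting the C++17 and Fortran variables named in this section might read as follows. The compiler names and flag values are assumptions for the example only; configure will normally find suitable values itself.

```shell
# config.site fragment (all values are illustrative assumptions).
# A separate compiler and standard flag for C++17 code:
CXX17="g++"
CXX17STD="-std=gnu++17"
CXX17FLAGS="-g -O2"
# The compiler and flags used for free-form Fortran (.f90, .f95) sources:
FC="gfortran"
FCFLAGS="-g -O2"
```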

A.3 Linear algebra

The linear algebra routines in R make use of BLAS (Basic Linear Algebra Subprograms, https://netlib.org/blas/faq.html) routines, and most make use of routines from LAPACK (Linear Algebra PACKage, https://netlib.org/lapack/). The R sources contain reference (Fortran) implementations of these, but they can be replaced by external libraries, usually those tuned for speed on specific CPUs. These libraries normally contain all of the BLAS routines and some tuned LAPACK routines and perhaps the rest of LAPACK from the reference implementation. Because of the way linking works, using an external BLAS library may necessitate using the version of LAPACK it contains.

Note that the alternative implementations will not give identical numeric results. Some differences may be benign (such as the signs of SVDs and eigenvectors), but the optimized routines can be less accurate and (particularly for LAPACK) can be from older versions with fewer corrections. However, R relies on ISO/IEC 60559 compliance. This can be broken if for example the code assumes that terms with a zero factor are always zero and do not need to be computed, whereas x*0 can be NaN. The internal BLAS has been extensively patched to avoid this whereas MKL’s documentation has warned

LAPACK routines assume that input matrices do not contain IEEE 754 special values such as INF or NaN values. Using these special values may cause LAPACK to return unexpected results or become unstable.

Some of the external libraries are multi-threaded. One issue is that R profiling (which uses the SIGPROF signal) may cause problems, and you may want to disable profiling if you use a multi-threaded BLAS. Note that using a multi-threaded BLAS can result in taking more CPU time and even more elapsed time (occasionally dramatically so) than using a similar single-threaded BLAS. On a machine running other tasks, there can be contention for CPU caches that reduces the effectiveness of the optimization of cache use by a BLAS implementation: some people warn that this is especially problematic for hyper-threaded CPUs.

BLAS and LAPACK routines may be used inside threaded code, for example in OpenMP sections in packages such as mgcv. The reference implementations are thread-safe but external ones may not be (even single-threaded ones): this can lead to hard-to-track-down incorrect results or segfaults.

There is a tendency for re-distributors of R to use ‘enhanced’ linear algebra libraries without explaining their downsides.

A.3.1 BLAS

An external BLAS library has to be explicitly requested at configure time.

19 The search order is currently OpenBLAS, BLIS, ATLAS, platform-specific choices (see below) and finally a generic libblas.

You can specify a particular BLAS library via a value for the configuration option --with-blas. If this is given with no =, its value is taken from the environment variable BLAS_LIBS, set for example in config.site. If neither the option nor the environment variable supply a value, a search is made for a suitable19 BLAS. If the value is not obviously a linker command (starting with a dash or giving the path to a library), it is prefixed by -l, so

--with-blas="foo"

is an instruction to link against -lfoo to find an external BLAS (which needs to be found both at link time and run time).
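The prefixing rule for --with-blas values can be written out as a small shell function. This is a sketch of the rule as described in the text, not the code configure uses; in particular, treating any value containing a slash as a path to a library is an assumption. The function name blas_link_arg is made up for this example.

```shell
# blas_link_arg VALUE: print the linker argument implied by VALUE.
# Values starting with a dash or containing a slash are passed through;
# anything else is taken to be a library name and prefixed by -l.
blas_link_arg() {
    case "$1" in
        -*|*/*) printf '%s\n' "$1" ;;
        *)      printf '%s\n' "-l$1" ;;
    esac
}

blas_link_arg foo
blas_link_arg "-L/usr/lib64/atlas -lsatlas"
```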

The configure code checks that the external BLAS is complete (as of LAPACK 3.9.1: it must include all double precision and double complex routines, as well as LSAME), and appears to be usable. However, an external BLAS has to be usable from a shared object (so must contain position-independent code), and that is not checked. Also, the BLAS can be switched after configure is run, either as a symbolic link or by the mechanisms mentioned below, and this can defeat the completeness check.

Some enhanced BLASes are compiler-system-specific (Accelerate on macOS, sunperf on Solaris20, libessl on IBM). The correct incantation for these is often found via --with-blas with no value on the appropriate platforms.

20 Using the Oracle Developer Studio cc and f95 compilers

Note that under Unix (but not under Windows) if R is compiled against a non-default BLAS and --enable-BLAS-shlib is not used (it is the default on all platforms except AIX), then all BLAS-using packages must also be. So if R is re-built to use an enhanced BLAS then packages such as quantreg will need to be re-installed.

Debian/Ubuntu systems provide a system-specific way to switch the BLAS in use: Build R with --with-blas to select the OS version of the reference BLAS, and then use update-alternatives to switch between the available BLAS libraries. See https://wiki.debian.org/DebianScience/LinearAlgebraLibraries.

Fedora 33 and later offer ‘FlexiBLAS’, a similar mechanism for switching the BLAS in use (https://www.mpi-magdeburg.mpg.de/projects/flexiblas). However, rather than overriding libblas, this requires configuring R with option --with-blas=flexiblas. ‘Backend’ wrappers are available for the reference BLAS, ATLAS and serial, threaded and OpenMP builds of OpenBLAS and BLIS, and perhaps others21. This can be controlled from a running R session by package flexiblas.

21 for example, Intel MKL not packaged by Fedora.

BLAS implementations which use parallel computations can be non-deterministic: this is known for ATLAS.

A.3.2 ATLAS

ATLAS (https://math-atlas.sourceforge.net/) is a “tuned” BLAS that runs on a wide range of Unix-alike platforms. Unfortunately it is built by default as a static library that on some platforms may not be able to be used with shared objects such as are used in R packages. Be careful when using pre-built versions of ATLAS static libraries (they seem to work on ix86 platforms, but not always on x86_64 ones).

ATLAS contains replacements for a small number of LAPACK routines, but can be built to merge these with the reference LAPACK sources to include a full LAPACK library.

Recent versions of ATLAS can be built as a single shared library, either libsatlas or libtatlas (serial or threaded respectively): these may even contain a full LAPACK. Such builds can be used by one of

--with-blas=satlas
--with-blas=tatlas

or, as on x86_64 Fedora where a path needs to be specified,

--with-blas="-L/usr/lib64/atlas -lsatlas"
--with-blas="-L/usr/lib64/atlas -ltatlas"

Distributed ATLAS libraries cannot be tuned to your machine and so are a compromise: for example Fedora tunes22 x86_64 RPMs for CPUs with SSE3 extensions, and separate RPMs may be available for specific CPU families.

22 The only way to see exactly which CPUs the distributed libraries have been tuned for is to read the atlas.spec file.

Note that building R on Linux against distributed shared libraries may need -devel or -dev packages installed.

Linking against multiple static libraries requires one of

--with-blas="-lf77blas -latlas"
--with-blas="-lptf77blas -lpthread -latlas"
--with-blas="-L/path/to/ATLAS/libs -lf77blas -latlas"
--with-blas="-L/path/to/ATLAS/libs -lptf77blas -lpthread -latlas"

Consult its installation guide23 for how to build ATLAS as a shared library or as a static library with position-independent code (on platforms where that matters).

According to the ATLAS FAQ24 the maximum number of threads used by multi-threaded ATLAS is set at compile time. Also, the author advises against using multi-threaded ATLAS on hyper-threaded CPUs without restricting affinities at compile-time to one virtual core per physical CPU. (For the Fedora libraries the compile-time flag specifies 4 threads.)

A.3.3 OpenBLAS and BLIS

Dr Kazushige Goto wrote a tuned BLAS for several processors and OSes, which was frozen in 2010. OpenBLAS (https://www.openblas.net/) is a descendant project with support for some later CPUs.

This can be used by configuring R with something like

--with-blas="openblas"

See Shared BLAS for an alternative (and in many ways preferable) way to use them.

Some platforms provide multiple builds of OpenBLAS: for example Fedora has RPMs25

25 (and more, e.g. for 64-bit ints and static versions).

openblas
openblas-threads
openblas-openmp

providing shared libraries

libopenblas.so
libopenblasp.so
libopenblaso.so

respectively, each of which can be used as a shared BLAS. For the second and third the number of threads is controlled by OPENBLAS_NUM_THREADS and OMP_NUM_THREADS (as usual for OpenMP) respectively.
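The correspondence between the Fedora library names and the variable controlling their thread count can be captured as below. The function name blas_thread_var and the choice of 2 threads are assumptions for the example.

```shell
# blas_thread_var LIB: print the environment variable that limits the
# number of threads for the given OpenBLAS library; fail for the
# serial build, which has no such control.
blas_thread_var() {
    case "$1" in
        libopenblasp.so) echo OPENBLAS_NUM_THREADS ;;
        libopenblaso.so) echo OMP_NUM_THREADS ;;
        *)               return 1 ;;
    esac
}

# Limit the threaded (pthreads) build to 2 threads for this shell
# and anything started from it.
var=$(blas_thread_var libopenblasp.so) && export "$var=2"
echo "$var=$OPENBLAS_NUM_THREADS"
```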

These and their Debian equivalents contain a complete LAPACK implementation.

Note that building R on Linux against distributed libraries may need -devel or -dev packages installed.

For ix86 and x86_64 CPUs most distributed libraries contain several alternatives for different CPU microarchitectures with the choice being made at run time.

Another descendant project is BLIS (https://github.com/flame/blis). This has (in Fedora) shared libraries

libblis.so
libblisp.so
libbliso.so

(p for ‘threads’, o for OpenMP as for OpenBLAS) which can also be used as a shared BLAS. The Fedora builds do not include LAPACK in the BLIS libraries.

A.3.4 Intel MKL

For Intel processors (and perhaps others) and some distributions of Linux, there is Intel’s Math Kernel Library26. You are encouraged to read the documentation which is installed with the library before attempting to link to MKL. This includes a ‘link line advisor’ which will suggest appropriate incantations: its use is recommended. Or see https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-link-line-advisor.html#gs.vpt6qp (which at the time of writing selected the Intel library for linking with GCC).

26 Nowadays known as ‘Intel oneAPI Math Kernel Library’ or even ‘oneMKL’.

27 The issue for macOS has been the use of double-complex routines.

There are also versions of MKL for macOS27 and Windows, but when these have been tried they did not work with the default compilers used for R on those platforms.

The following examples have been used with MKL versions 10.3 to 2023.2.0, for GCC compilers on x86_64 CPUs. (See also Intel compilers.)

To use a sequential version of MKL we used

MKL_LIB_PATH=/path/to/intel_mkl/mkl/lib/intel64
export LD_LIBRARY_PATH=$MKL_LIB_PATH
MKL="-L${MKL_LIB_PATH} -lmkl_gf_lp64 -lmkl_core -lmkl_sequential"
./configure --with-blas="$MKL" --with-lapack

The option --with-lapack is used since MKL contains a tuned copy of LAPACK (often older than the current version) as well as the BLAS (see LAPACK), although this can be omitted.

Threaded MKL may be used by replacing the line defining the variable MKL by

MKL="-L${MKL_LIB_PATH} -lmkl_gf_lp64 -lmkl_core \
     -lmkl_gnu_thread -ldl -fopenmp"

R can also be linked against a single shared library, libmkl_rt.so, for both BLAS and LAPACK, but the correct OpenMP and MKL interface layer then has to be selected via environment variables. With 64-bit builds and the GCC compilers, we used

export MKL_INTERFACE_LAYER=GNU,LP64 
export MKL_THREADING_LAYER=GNU

On Debian/Ubuntu, MKL is provided by package intel-mkl-full and one can set libmkl_rt.so as the system-wide implementation of both BLAS and LAPACK during installation of the package, so that also R installed from Debian/Ubuntu package r-base would use it. It is, however, still essential to set MKL_INTERFACE_LAYER and MKL_THREADING_LAYER before running R, otherwise MKL computations will produce incorrect results. R does not have to be rebuilt to use MKL, but configure includes tests which may discover some errors such as a failure to set the correct OpenMP and MKL interface layer.
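Because forgetting these variables can silently give wrong results, it may be worth guarding the start of R with a check such as the following. This is a minimal sketch: the function name mkl_env_ok is made up for this example, and the values are those quoted above for 64-bit builds with the GCC compilers.

```shell
# mkl_env_ok: succeed only if both layer variables are set and non-empty.
mkl_env_ok() {
    [ -n "${MKL_INTERFACE_LAYER:-}" ] && [ -n "${MKL_THREADING_LAYER:-}" ]
}

# Values as used in the text for 64-bit builds with the GCC compilers.
export MKL_INTERFACE_LAYER=GNU,LP64
export MKL_THREADING_LAYER=GNU

if mkl_env_ok; then
    echo "MKL layers set: $MKL_INTERFACE_LAYER / $MKL_THREADING_LAYER"
else
    echo "MKL layer variables are not set" >&2
fi
```

In a wrapper script the success branch would go on to start R; the check only confirms that the variables are set, not that their values suit the build.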

Note that the Debian/Ubuntu distribution can be quite old (for example 2020.4 in mid-2023 when 2023.1 was current): this can be important for the LAPACK version included.

The default number of threads will be chosen by the OpenMP software, but can be controlled by setting OMP_NUM_THREADS or MKL_NUM_THREADS, and in recent versions seems to default to a sensible value for sole use of the machine. (Parallel MKL has not always passed make check-all, but did with MKL 2019.4 and later.)

MKL includes a partial implementation of FFTW3, which causes trouble for applications that require some of the FFTW3 functionality unsupported in MKL. Please see the MKL manuals for description of these limitations and for instructions on how to create a custom version of MKL which excludes the FFTW3 wrappers.

There is Intel documentation for building R with MKL at https://www.intel.com/content/www/us/en/developer/articles/technical/using-onemkl-with-r.html: that includes

-Wl,--no-as-needed

which we have not found necessary.

A.3.5 Shared BLAS

The BLAS library will be used for many of the add-on packages as well as for R itself. This means that it is better to use a shared/dynamic BLAS library, as most of a static library will be compiled into the R executable and each BLAS-using package.

R offers the option of compiling the BLAS into a dynamic library libRblas stored in R_HOME/lib and linking both R itself and all the add-on packages against that library.

This is the default on all platforms except AIX unless an external BLAS is specified and found: in that case it can still be used by specifying the option --enable-BLAS-shlib, and it can always be disabled via --disable-BLAS-shlib.
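To see whether an existing installation was built this way, one can look for libRblas under R_HOME/lib. The following is a minimal sketch: the function name has_shared_rblas is hypothetical, and checking the .so and .dylib extensions is an assumption that covers Linux and macOS only.

```shell
# Illustrative check (function name is hypothetical): does this R
# installation have a separate libRblas in R_HOME/lib?
has_shared_rblas() {
    rhome="$1"
    for ext in so dylib; do
        if [ -e "$rhome/lib/libRblas.$ext" ]; then
            echo "shared BLAS: $rhome/lib/libRblas.$ext"
            return 0
        fi
    done
    echo "no libRblas found under $rhome/lib"
    return 1
}
```

It could be called as has_shared_rblas "$(R RHOME)", since R RHOME prints the R home directory.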

This has both advantages and disadvantages.

  • It saves space by having only a single copy of the BLAS routines, which is helpful if there is an external static BLAS (as used to be standard for ATLAS).
  • There may be performance disadvantages in using a shared BLAS. The most likely case is probably when R’s internal BLAS is used and R is not built as a shared library, since it is then possible to build the BLAS into R.bin (and libR.a) without using position-independent code. However, experiments showed that in many cases using a shared BLAS was as fast, provided high levels of compiler optimization are used.
  • It is easy to change the BLAS without needing to re-install R and all the add-on packages, since all references to the BLAS go through libRblas, and that can be replaced. Note though that any dynamic libraries the replacement links to will need to be found by the linker: this may need the library path to be changed in R_HOME/etc/ldpaths.

Another option to change the BLAS in use is to symlink a single dynamic BLAS library to R_HOME/lib/libRblas.so. For example, just

mv R_HOME/lib/libRblas.so R_HOME/lib/libRblas.so.keep
ln -s /usr/lib64/libopenblasp.so.0 R_HOME/lib/libRblas.so

on x86_64 Fedora will change the BLAS used to multithreaded OpenBLAS. A similar link works for most versions of the OpenBLAS (provided the appropriate lib directory is in the run-time library path or ld.so cache). It can also be used for a single-library ATLAS, so on x86_64 Fedora either of

ln -s /usr/lib64/atlas/libsatlas.so.3 R_HOME/lib/libRblas.so
ln -s /usr/lib64/atlas/libtatlas.so.3 R_HOME/lib/libRblas.so

can be used with its distributed ATLAS libraries. (If you have the -devel RPMs installed you can omit the .0/.3.)
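The symlink approach above can be made reversible with a pair of small shell functions. This is a sketch under stated assumptions: the names swap_rblas and restore_rblas are hypothetical, R_HOME is passed as the first argument, and the .keep suffix follows the example above.

```shell
# Illustrative helpers (names are hypothetical). R_HOME is passed as the
# first argument; the original library is kept as libRblas.so.keep.
swap_rblas() {
    lib="$1/lib/libRblas.so"
    target="$2"
    [ -e "$target" ] || { echo "no such library: $target" >&2; return 1; }
    # Keep the original only once, so repeated swaps do not overwrite it.
    if [ ! -e "$lib.keep" ]; then
        mv "$lib" "$lib.keep" || return 1
    fi
    ln -sf "$target" "$lib"
}

restore_rblas() {
    lib="$1/lib/libRblas.so"
    [ -e "$lib.keep" ] || { echo "no saved copy: $lib.keep" >&2; return 1; }
    rm -f "$lib"
    mv "$lib.keep" "$lib"
}
```

For example, swap_rblas "$(R RHOME)" /usr/lib64/libopenblasp.so.0 would correspond to the Fedora commands above, and restore_rblas "$(R RHOME)" would undo it. In recent versions of R, sessionInfo() reports the BLAS and LAPACK libraries in use, which gives a quick check that the change took effect.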

Note that rebuilding or symlinking libRblas.so may not suffice if the intention is to use a modified LAPACK contained in an external BLAS: the latter could even cause conflicts. However, on Fedora where the OpenBLAS distribution contains a copy of LAPACK, it is the latter which is used.

A.3.6 LAPACK

If when configuring R a system LAPACK library is found of version 3.9.0 or later (and does not contain BLAS routines) it will be used instead of compiling the LAPACK code in the package sources. This can be prevented by configuring R with --without-lapack. Using a static liblapack.a is not supported.
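A version threshold such as the one above is easy to get wrong with a plain string comparison (3.10.0 sorts before 3.9.0 as text). The following sketch compares version strings numerically; the function name version_ge is hypothetical, and it assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
# Illustrative helper (name is hypothetical): succeeds if version $1 is
# at least version $2. Relies on 'sort -V' (version sort), which is an
# assumption about the local sort: GNU coreutils provides it.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

With an installed R, something like version_ge "$(Rscript -e 'cat(La_version())')" 3.9.0 would test the LAPACK that R is currently using. Note that this reports on an existing installation, not on what configure would find.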

It is assumed that -llapack is the reference LAPACK library but on Debian/Ubuntu it can be switched, including after R is installed. On such a platform it is better to use --without-lapack or --with-blas --with-lapack (see below) explicitly. The known examples28 of a non-reference LAPACK library found at installation all contain BLAS routines so are not used by a default configure run.

28 ATLAS, OpenBLAS and Accelerate.

Provision is made for specifying an external LAPACK library with option --with-lapack, principally to cope with BLAS libraries which contain a copy of LAPACK (such as Accelerate on macOS and some builds of ATLAS, FlexiBLAS, MKL and OpenBLAS on ix86/x86_64 Linux). At least LAPACK version 3.2 is required. This can only be done if --with-blas has been used.

However, the likely performance gains are thought to be small (and may be negative). The default is not to search for a suitable LAPACK library, and this is definitely not recommended. You can specify a particular LAPACK library by giving --with-lapack a value, or request a search for a generic library by using --with-lapack without a value. The default for --with-lapack is to check the BLAS library (for function DPSTRF) and then look for an external library -llapack. Sites searching for the fastest possible linear algebra may want to build a LAPACK library using the ATLAS-optimized subset of LAPACK. Similarly, OpenBLAS can be built to contain an optimized subset of LAPACK or a full LAPACK (the latter seeming to be the default).

A value for --with-lapack can be set via the environment variable LAPACK_LIBS, but this will only be used if --with-lapack is specified and the BLAS library does not contain LAPACK.
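As a sketch of how this might look in practice, the fragment below sets LAPACK_LIBS and then runs configure with both options. The directory /opt/lapack/lib is purely hypothetical, and the guard around ./configure is only there so that the fragment does nothing harmful when run outside the R sources.

```shell
# Hypothetical install location of a shared liblapack; adjust as needed.
LAPACK_DIR=/opt/lapack/lib
LAPACK_LIBS="-L${LAPACK_DIR} -llapack"
export LAPACK_LIBS

# LAPACK_LIBS is only consulted because --with-lapack is given, and only
# if the BLAS found by --with-blas does not itself contain LAPACK.
if [ -x ./configure ]; then
    ./configure --with-blas --with-lapack
else
    echo "run this from the top-level directory of the R sources"
fi
```

If the library is not in a standard location, the run-time linker will also need to be able to find it, for example via the library path.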

Please bear in mind that using --with-lapack is provided only because it is necessary on some platforms and because some users want to experiment with claimed performance improvements. In practice its main uses are without a value,

  • with an ‘enhanced’ BLAS such as ATLAS, FlexiBLAS, MKL or OpenBLAS which contains a full LAPACK (to avoid possible conflicts), or
  • on Debian/Ubuntu systems to select the system liblapack which can be switched by the ‘alternatives’ mechanism.

If building LAPACK from its Netlib sources, be aware that make with its supplied Makefile will make a static library and R requires a shared/dynamic one. To get one, use cmake as documented briefly in README.md. For example, to build only the double and double complex subroutines with 32-bit array indices, use something like

mkdir build
cd build
cmake \
-DCMAKE_INSTALL_PREFIX=/where/you/want/to/install \
-DCMAKE_BUILD_TYPE:STRING=Release \
-DBUILD_DEPRECATED=ON -DBUILD_SHARED_LIBS=ON \
-DBUILD_INDEX64_EXT_API:BOOL=OFF \
-DBUILD_SINGLE:BOOL=OFF -DBUILD_COMPLEX:BOOL=OFF \
-DLAPACKE=OFF -DCBLAS=OFF \
-S ..
make -j10

This builds the reference BLAS and the reference LAPACK linked to it.

Note that cmake files do not provide an uninstall target, but build/install_manifest.txt is a list of the files installed, so you can remove them via shell commands or from R.
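A removal via shell commands might look like the following sketch. The function name remove_installed is hypothetical; it assumes the manifest lists one file path per line, and it reads line by line so that paths containing spaces are handled.

```shell
# Illustrative helper (name is hypothetical): remove every file listed
# in a CMake install manifest, one path per line.
remove_installed() {
    manifest="$1"
    [ -r "$manifest" ] || { echo "cannot read $manifest" >&2; return 1; }
    # The last line of the manifest may lack a trailing newline.
    while IFS= read -r f || [ -n "$f" ]; do
        [ -n "$f" ] && rm -f -- "$f"
    done < "$manifest"
    return 0
}
```

It would be called as remove_installed build/install_manifest.txt, with whatever privileges were needed for the installation. Only files are removed, so any directories created by the installation are left in place.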

If using --with-lapack to get a generic LAPACK (or allowing the default to select one), consider also using --with-blas (with a path if an enhanced BLAS is installed).

A.3.7 Caveats

As with all libraries, you need to ensure that they and R were compiled with compatible compilers and flags. For example, this has meant that on Sun Sparc using the Oracle compilers the flag -dalign is needed if sunperf is to be used.

On some systems it has been necessary that an external BLAS/LAPACK was built with the same Fortran compiler used to build R.

BLAS and LAPACK libraries built with recent versions of gfortran require calls from C/C++ to handle ‘hidden’ character lengths: R itself does so, but many packages used not to, and some have segfaulted. This was largely circumvented by using the Fortran flag -fno-optimize-sibling-calls (formerly set by configure if it detected gfortran 7 or later): however, use of the R headers which include those character-length arguments is no longer optional in packages.

LAPACK 3.9.0 (and probably earlier) had a bug in which the DCOMBSSQ subroutine could cause NA to be interpreted as zero. This is fixed in the R 3.6.3 and later sources, but if you use an external LAPACK, you may need to fix it there. (The bug was corrected in 3.9.1 and the routine removed in 3.10.1.)

The code (in dlapack.f) should read

*     ..
*     .. Executable Statements ..
*
      IF( V1( 1 ).GE.V2( 1 ) ) THEN
         IF( V1( 1 ).NE.ZERO ) THEN
            V1( 2 ) = V1( 2 ) + ( V2( 1 ) / V1( 1 ) )**2 * V2( 2 )
         ELSE
            V1( 2 ) = V1( 2 ) + V2( 2 )
         END IF
      ELSE
         V1( 2 ) = V2( 2 ) + ( V1( 1 ) / V2( 1 ) )**2 * V1( 2 )
         V1( 1 ) = V2( 1 )
      END IF
      RETURN

(The inner ELSE clause was missing in LAPACK 3.9.0.)

If you do use an external LAPACK, be aware of potential problems with other bugs in the LAPACK sources (or in the posted corrections to those sources), seen several times in Linux distributions over the years. We have even seen distributions with missing LAPACK routines from their liblapack.

We rely on limited support in LAPACK for matrices with 2^{31} or more elements: it is possible that an external LAPACK will not have that support.

Footnotes