For faster R use OpenBLAS instead: better than ATLAS, trivial to switch to on Ubuntu

R speeds up when the Basic Linear Algebra Subprograms (BLAS) library it uses is well tuned. The reference BLAS that ships with R and Ubuntu isn't very fast. On my machine, a well-known R benchmarking script takes 9 minutes to run against it. With ATLAS, an optimized BLAS that can be easily installed, the same script takes 3.5 minutes. With OpenBLAS, yet another optimized BLAS that is equally easy to install, the same script takes 2 minutes. That's a pretty big improvement!

In this post, I'll show you how to install ATLAS and OpenBLAS, demonstrate how you can switch between them, and let you pick which you would like to use based on benchmark results. Before we get started, one quick shout out to Felix Riedel: thanks for encouraging me to look at OpenBLAS instead of ATLAS in your comment on my previous post.

Update for Mac OS X users: Zachary Mayer's comment gives bare-bones details on how to accomplish a similar BLAS switch. He has a few more details on his blog. Thanks Zachary!

Update for R multicore users: According to this comment and this comment, OpenBLAS's multithreading does not play well with R's multicore functionality. It appears to be a bug, so perhaps it will get fixed in the future. See the comment stream for further details.
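In the same spirit as the workaround reported in the comments (safish pins GotoBLAS2 to one thread with GOTO_NUM_THREADS), you can pin OpenBLAS to a single thread via its OPENBLAS_NUM_THREADS environment variable before launching multicore R jobs. A minimal sketch; the script name is just a placeholder:

$ OPENBLAS_NUM_THREADS=1 R --slave -f my_parallel_script.R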

Update for the adventurous: According to Joe Herman: "OpenBLAS isn't faster than ATLAS, but it is much easier to install OpenBLAS via apt-get than it is to compile ATLAS and R manually from source." See Joe's comment for details on the benefits of compiling ATLAS and R from scratch.

Installing additional BLAS libraries on Ubuntu

For Ubuntu, there are currently three BLAS options that are easy to switch between: "libblas", the reference BLAS; "libatlas", the ATLAS BLAS; and "libopenblas", OpenBLAS. Their package names are:

$ apt-cache search libblas
libblas-dev - Basic Linear Algebra Subroutines 3, static library
libblas-doc - Basic Linear Algebra Subroutines 3, documentation
libblas3gf - Basic Linear Algebra Reference implementations, shared library
libatlas-base-dev - Automatically Tuned Linear Algebra Software, generic static
libatlas3gf-base - Automatically Tuned Linear Algebra Software, generic shared
libblas-test - Basic Linear Algebra Subroutines 3, testing programs
libopenblas-base - Optimized BLAS (linear algebra) library based on GotoBLAS2
libopenblas-dev - Optimized BLAS (linear algebra) library based on GotoBLAS2

Since libblas already comes with Ubuntu, we only need to install the other two for our tests. (Note: omit 'libatlas3gf-base' from the following command if you don't want to experiment with ATLAS.)

$ sudo apt-get install libopenblas-base libatlas3gf-base
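To confirm the packages installed cleanly, you can query dpkg (a quick sanity check; the version numbers will vary by release):

$ dpkg -l libopenblas-base libatlas3gf-base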

Switching between BLAS libraries

Now we can switch between the different BLAS options that are installed:

$ sudo update-alternatives --config libblas.so.3gf
There are 3 choices for the alternative libblas.so.3gf (providing /usr/lib/libblas.so.3gf).

  Selection    Path                                        Priority   Status
------------------------------------------------------------
* 0            /usr/lib/openblas-base/libopenblas.so.0     40         auto mode
  1            /usr/lib/atlas-base/atlas/libblas.so.3gf    35         manual mode
  2            /usr/lib/libblas/libblas.so.3gf             10         manual mode
  3            /usr/lib/openblas-base/libopenblas.so.0     40         manual mode

Press enter to keep the current choice[*], or type selection number:
    Side note: If the above returned:

    update-alternatives: error: no alternatives for libblas.so.3gf

    Try

    $ sudo update-alternatives --config libblas.so.3

    instead. See the comments at the end of the post for further details.
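If you would rather script the switch than answer the interactive menu, update-alternatives also has a non-interactive form. A sketch, using the OpenBLAS path from the menu above (substitute a path from your own menu output):

$ sudo update-alternatives --set libblas.so.3gf /usr/lib/openblas-base/libopenblas.so.0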

From the selection menu, I picked 3. Re-running the command shows that choice 3 (OpenBLAS) is now selected:

$ sudo update-alternatives --config libblas.so.3gf
There are 3 choices for the alternative libblas.so.3gf (providing /usr/lib/libblas.so.3gf).

  Selection    Path                                        Priority   Status
------------------------------------------------------------
  0            /usr/lib/openblas-base/libopenblas.so.0     40         auto mode
  1            /usr/lib/atlas-base/atlas/libblas.so.3gf    35         manual mode
  2            /usr/lib/libblas/libblas.so.3gf             10         manual mode
* 3            /usr/lib/openblas-base/libopenblas.so.0     40         manual mode

And we can pull the same trick to choose between LAPACK implementations. From the output we can see that OpenBLAS does not provide a new LAPACK implementation, but ATLAS does:

$ sudo update-alternatives --config liblapack.so.3gf
There are 2 choices for the alternative liblapack.so.3gf (providing /usr/lib/liblapack.so.3gf).

  Selection    Path                                         Priority   Status
------------------------------------------------------------
* 0            /usr/lib/atlas-base/atlas/liblapack.so.3gf   35         auto mode
  1            /usr/lib/atlas-base/atlas/liblapack.so.3gf   35         manual mode
  2            /usr/lib/lapack/liblapack.so.3gf             10         manual mode

Since OpenBLAS does not provide its own LAPACK, in this case I picked 2, the reference LAPACK. (Pairing the OpenBLAS BLAS with ATLAS's LAPACK can trigger the 'undefined symbol: ATL_chemv' errors discussed in the comments below.)

    Side note: If the above returned:

    update-alternatives: error: no alternatives for liblapack.so.3gf

    Try

    $ sudo update-alternatives --config liblapack.so.3

    instead. See the comments at the end of the post for further details.
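As with the BLAS, the LAPACK choice can be scripted with the non-interactive form. A sketch, using the reference LAPACK path from the menu above:

$ sudo update-alternatives --set liblapack.so.3gf /usr/lib/lapack/liblapack.so.3gf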

Checking that R is using the right BLAS

Now we can check that everything is working by starting R in a new terminal:

$ R

R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
...snip...
Type 'q()' to quit R.

>

Great. Let's see if R is using the BLAS and LAPACK libraries we selected. To do so, we open another terminal so that we can run a few more shell commands. First, we find the PID of the R process we just started. Your output will look something like this:

$ ps aux | grep exec/R
1000 18065 0.4 1.0 200204 87568 pts/1 Sl+ 09:00 0:00 /usr/lib/R/bin/exec/R
root 19250 0.0 0.0 9396 916 pts/0 S+ 09:03 0:00 grep --color=auto exec/R

The PID is the second number on the '/usr/lib/R/bin/exec/R' line. To see which BLAS and LAPACK libraries are loaded with that R session, we use the "list open files" command:

$ lsof -p 18065 | grep 'blas\|lapack'
R 18065 nathanvan mem REG 8,1 9342808 12857980 /usr/lib/lapack/liblapack.so.3gf.0
R 18065 nathanvan mem REG 8,1 19493200 13640678 /usr/lib/openblas-base/libopenblas.so.0

As expected, the R session is using the reference LAPACK (/usr/lib/lapack/liblapack.so.3gf.0) and OpenBLAS (/usr/lib/openblas-base/libopenblas.so.0).
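For a quick smoke test from inside R itself, time a BLAS-heavy operation. Exact timings will vary by machine, but an optimized BLAS should finish this several times faster than the reference BLAS:

> a <- matrix(rnorm(2000 * 2000), 2000, 2000)  # a 2000x2000 random matrix
> system.time(b <- crossprod(a))               # t(a) %*% a, computed by the BLAS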

Testing the different BLAS/LAPACK combinations

I used Simon Urbanek's most recent benchmark script. To follow along, first download it to your current working directory:

$ curl http://r.research.att.com/benchmarks/R-benchmark-25.R -O

And then run it:

$ cat R-benchmark-25.R | time R --slave
Loading required package: Matrix
Loading required package: lattice
Loading required package: SuppDists
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
there is no package called ‘SuppDists’
...snip...

Oops. I don't have the SuppDists package installed. I can easily install it via Michael Rutter's Ubuntu PPA:

$ sudo apt-get install r-cran-suppdists
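If you are not on Ubuntu, or would rather not add the PPA, installing from CRAN inside R should work just as well:

> install.packages("SuppDists")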

Now Simon's script works wonderfully. Full output:

$ cat R-benchmark-25.R | time R --slave
Loading required package: Matrix
Loading required package: lattice
Loading required package: SuppDists
Warning messages:
1: In remove("a", "b") : object 'a' not found
2: In remove("a", "b") : object 'b' not found

R Benchmark 2.5
===============
Number of times each test is run__________________________: 3

I. Matrix calculation
---------------------
Creation, transp., deformation of a 2500x2500 matrix (sec): 1.36566666666667
2400x2400 normal distributed random matrix ^1000____ (sec): 0.959
Sorting of 7,000,000 random values__________________ (sec): 1.061
2800x2800 cross-product matrix (b = a' * a)_________ (sec): 1.777
Linear regr. over a 3000x3000 matrix (c = a \ b')___ (sec): 1.00866666666667
--------------------------------------------
Trimmed geom. mean (2 extremes eliminated): 1.13484335940626

II. Matrix functions
--------------------
FFT over 2,400,000 random values____________________ (sec): 0.566999999999998
Eigenvalues of a 640x640 random matrix______________ (sec): 1.379
Determinant of a 2500x2500 random matrix____________ (sec): 1.69
Cholesky decomposition of a 3000x3000 matrix________ (sec): 1.51366666666667
Inverse of a 1600x1600 random matrix________________ (sec): 1.40766666666667
--------------------------------------------
Trimmed geom. mean (2 extremes eliminated): 1.43229160585452

III. Programmation
------------------
3,500,000 Fibonacci numbers calculation (vector calc)(sec): 1.10533333333333
Creation of a 3000x3000 Hilbert matrix (matrix calc) (sec): 1.169
Grand common divisors of 400,000 pairs (recursion)__ (sec): 2.267
Creation of a 500x500 Toeplitz matrix (loops)_______ (sec): 1.213
Escoufier's method on a 45x45 matrix (mixed)________ (sec): 1.32600000000001
--------------------------------------------
Trimmed geom. mean (2 extremes eliminated): 1.23425893178325

Total time for all 15 tests_________________________ (sec): 19.809
Overall mean (sum of I, II and III trimmed means/3)_ (sec): 1.26122106386747
--- End of test ---

134.75user 16.06system 1:50.08elapsed 137%CPU (0avgtext+0avgdata 1949744maxresident)k
448inputs+0outputs (3major+1265968minor)pagefaults 0swaps

The elapsed time at the very bottom is the part that we care about. With OpenBLAS and the reference LAPACK, the script took 1 minute and 50 seconds to run. By changing the selections with update-alternatives, we can test R with ATLAS (3:21) or with the reference BLAS (9:13). For my machine, OpenBLAS is a clear winner.
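If you would like to automate the comparison instead of flipping selections by hand, a sketch along these lines works. The three paths are the entries from the update-alternatives menu above, so adjust them to match your own menu:

$ for lib in /usr/lib/libblas/libblas.so.3gf \
             /usr/lib/atlas-base/atlas/libblas.so.3gf \
             /usr/lib/openblas-base/libopenblas.so.0; do
    sudo update-alternatives --set libblas.so.3gf "$lib"  # switch the BLAS
    echo "== $lib =="
    # discard the benchmark's stdout; time reports on stderr
    cat R-benchmark-25.R | time R --slave > /dev/null
  done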

Give it a shot yourself. If you find something different, let me know.

63 thoughts on “For faster R use OpenBLAS instead: better than ATLAS, trivial to switch to on Ubuntu”

  1. Pingback: My Stat Bytes talk, with slides and code | Nathan VanHoudnos

  2. thiagogm

    Great post, VanHoudnos! I tried to quickly follow your instructions but got the following error when starting R after selecting option 3 in

    'sudo update-alternatives --config libblas.so.3gf'. Any thoughts?

    Error in dyn.load(file, DLLpath = DLLpath, ...) :
    unable to load shared object '/usr/lib/R/library/stats/libs/stats.so':
    /usr/lib/liblapack.so.3gf: undefined symbol: ATL_chemv
    During startup - Warning message:
    package ‘stats’ in options("defaultPackages") was not found

    1. nmv Post author

      The error that it's throwing is related to your LAPACK selection, not your BLAS selection. Do you have ATLAS selected for LAPACK and OpenBLAS selected for the BLAS?

          1. nmv Post author

            @kav Make sure that you select the matching option for

            $ sudo update-alternatives --config libblas.so.3gf

            and

            $ sudo update-alternatives --config liblapack.so.3gf

  3. Zachary Mayer

    For reference, Mac users can use Apple's version of BLAS in the Accelerate framework using:

    cd /Library/Frameworks/R.framework/Resources/lib

    ln -sf /System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Versions/Current/libBLAS.dylib libRblas.dylib

    You can go back to the default BLAS using:

    cd /Library/Frameworks/R.framework/Resources/lib

    ln -sf libRblas.0.dylib libRblas.dylib

    For me (on R 3):
    Regular BLAS: 141 seconds (2.35 minutes)
    Apple's BLAS: 43 seconds (0.71 minutes)

    For more info, read here:
    http://r.research.att.com/man/RMacOSX-FAQ.html#Which-BLAS-is-used-and-how-can-it-be-changed_003f

    and here:
    https://groups.google.com/forum/#!topic/r-sig-mac/k4rDRRdtNwE

    Note that R 3.0 no longer includes libRblas.vecLib.dylib, but you can still link against the system version of libBLAS.

      1. SVS

        Please see the gcbd reference manual, http://cran.r-project.org/web/packages/gcbd/index.html. It also compares several BLAS implementations, although gcbd is several years old by now. It found that GotoBLAS is ahead of ATLAS, consistent with your findings (in my understanding, GotoBLAS has been superseded by OpenBLAS). However, gcbd includes Intel MKL in the set of compared BLAS implementations. It would be interesting to know how OpenBLAS performance compares to the latest Intel MKL.

        1. nmv Post author

          I think you are correct.

          The other takeaway from gcbd is that if you compile the BLAS yourself, you can likely get even better improvements.

          To the best of my knowledge, since MKL needs to be compiled on Ubuntu, a fair comparison of ATLAS, OpenBLAS, and MKL would need to compile all three. That's a bit more work than I think most want to put in to squeeze a bit more performance out of R.

          However, if you would like to do it, or know of anyone who has, let me know and I'll add a link to the body of the post.

  4. Pingback: Optimizing R with Multi-threaded OpenBLAS | Thiago G. Martins

  5. philchalmers

    This seems really cool and extremely straightforward, but I'm having some issues getting it set up with Ubuntu 13.10. After installing libopenblas-base and libatlas3gf-base when I try to set the alternatives I get

    $ sudo update-alternatives --config libblas.so.3gf
    update-alternatives: error: no alternatives for libblas.so.3gf

    I've tried installing libopenblas-dev, but the same issue occurs. Any idea why this might be or how to fix it? Thanks so much.

    1. nmv Post author

      Unfortunately I haven't made the jump to 13.10; my plan is to wait until 14.04 LTS comes out.

      My guess is that the 13.10 version of the packages isn't properly updating the symlinks in /etc/alternatives. What does

      $ ls /etc/alternatives/lib*

      give you?

      On 12.04 I get (with extra things removed):

      $ ls /etc/alternatives/lib*
      /etc/alternatives/libblas.a
      /etc/alternatives/liblapack.so.3gf
      /etc/alternatives/libblas.so
      /etc/alternatives/liblapack.a
      /etc/alternatives/libblas.so.3gf
      /etc/alternatives/liblapack.so

      My assumption is that you won't see the '/etc/alternatives/libblas.so.3gf' line. At least I think that is why you are getting that error message.

      You might try contacting the package maintainers. It seems like a bug (and one they would want to know about!) Please report back on what you find.

      1. rphilipchalmers

        Actually I do see it there. Here's what is in the directory:

        $ ls /etc/alternatives/lib*
        /etc/alternatives/libblas.a
        /etc/alternatives/libblas.so.3gf
        /etc/alternatives/liblapack.so
        /etc/alternatives/libtxc-dxtn-i386-linux-gnu
        /etc/alternatives/libblas.so
        /etc/alternatives/libgksu-gconf-defaults
        /etc/alternatives/liblapack.so.3
        /etc/alternatives/libtxc-dxtn-x86_64-linux-gnu
        /etc/alternatives/libblas.so.3
        /etc/alternatives/liblapack.a
        /etc/alternatives/liblapack.so.3gf
        /etc/alternatives/libxnvctrl.a

        So it's there. Something else might be going on then... Thanks for the quick reply, and let me know if you can think of anything else. Otherwise I'll look into filing a bug report. Cheers.

        1. nmv Post author

          That is strange! Unfortunately we have reached the limit of my expertise. Best of luck with getting it fixed.

  6. safish

    Hi,

    Thanks for this helpful post. I am not using OpenBLAS, but I directly use GotoBLAS2, which OpenBLAS is also based on, and I have been stuck on an issue for the last several days; Google searches and asking on forums were not helpful at all. I hope you might have a response.

    I am trying to use GotoBLAS2 with R 3.0 on Unix. I downloaded the GotoBLAS2 source code from the TACC web site, compiled it, and replaced libRblas.so with libgoto2.so, following the instructions at http://www.rochester.edu/college/gradstudents/jolmsted/files/computing/BLAS.pdf. Simple matrix operations in R like "determinant" are 20 times faster than before (I am using huge matrices), which is good. However, I cannot use many cores in parallel now.

    You may say: "You do not need to use multiple cores while using GotoBLAS2; it already uses multiple threads, and even multiple cores." But I still need to use multiple cores, not for simple matrix operations, but for performing some tasks on many different files independently, which is a great reason for parallelism. What's horrible is that after replacing libRblas.so with libgoto2.so, I cannot use %dopar% any more in any script. Operations using %dopar% take forever.

    Below is an example showing that GotoBLAS2 gets stuck when I use %dopar% (that's not my aim in using multiple cores, it is just an example). This code was still running after 24 hours, when I finally killed it. But if I use %do% instead of %dopar%, it takes just a second. When I was using R's default BLAS library, I could get the result from the below code with %dopar% in a few seconds. (Btw, my machine has 24 cores.)

    library("foreach")
    library("doParallel")

    registerDoParallel(cores=2)
    set.seed(100)

    foreach (i = 1:2) %dopar% {
      a = replicate(1000, rnorm(1000))
      d = determinant(a)
    }

    So, is it possible to use many cores at the same time with GotoBLAS2? Do you have any ideas?

    Thanks a lot in advance.

    1. nmv Post author

      Hi safish,

      Unfortunately, this seems to be a bug in GotoBLAS2 / OpenBLAS. This comment has a few more details.

      My apologies that I cannot be of more use. Perhaps add a +1 to the bug tracking so that the developers would consider addressing the issue?

  7. safish

    Thanks, I think this is a bug. I ended up running BLAS single-threaded by setting the GOTO_NUM_THREADS environment variable whenever I use R multicore.
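    For example, whenever I launch a multicore R job I now do something like this (the script name is just a placeholder):

    $ GOTO_NUM_THREADS=1 R --slave -f my_parallel_script.R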

  8. Mauricio Zambrano-Bigiarini

    Thank you very much, Nathan, for this very useful post.

    I tried it in LinuxMint 16 (MATE), but when I run:

    sudo update-alternatives --config libblas.so.3gf

    I got the following error message:

    "update-alternatives: error: no alternatives for libblas.so.3gf"

    However, thanks to one of your replies to a previous post, I was able to find a solution with:

    sudo update-alternatives --config libblas.so.3

  9. Mauricio Zambrano-Bigiarini

    For LinuxMint 16 (MATE), I forgot to include the command:

    sudo update-alternatives --config liblapack.so.3

    for choosing between LAPACK implementations.

  10. Pingback: Computational Prediction - BLAS 설정으로 R, numpy 성능 높히기

  11. Matthew Bromberg

    The parallel bug for OpenBLAS will bite you in Linux Mint 16 if you use Python and NumPy. I'm seeing the same nasty effect: all my cores pegged at 100% while linear algebra slows to a crawl.

  12. Pingback: The performance gains from switching R’s linear algebra libraries | On the lambda

  13. Micha M

    I've tried this, and have had no problem setting up the libs on Ubuntu 13.10. However, I'm quite surprised by the results I'm getting. I don't use R, so instead I wrote a very rudimentary Octave benchmark script:

    rand("seed",5)
    rm1 = rand(1000,1000);
    tic
    for k=1:200
      rm1 = rm1 * inv(rm1);
    end
    toc

    Essentially I multiply the matrix by its inverse, store the result back in the original, and repeat this operation 200 times. Now here's what surprises me: when I try it with OpenBLAS, I see the CPU percentage go to ~400% (all cores utilized) and it takes some 42.5 seconds. When I use ATLAS, CPU usage is limited to 100% and it takes 109 seconds. BUT, when I use the default libblas I also get only 100% CPU - but the script takes only 23.5 seconds! I ran the script several times with basically the same results. How come the optimized libs are getting spanked so thoroughly by the default implementation?

    1. nmv Post author

      Hi Micha,

      My hunch is that it has something to do with rm1 being set to the identity matrix after the first iteration. Perhaps the standard lib is smarter with identity matrix inversion than the optimized libs.

      Try the same script, but move the random number generation inside of the loop. Does the standard library still get spanked?

  14. Micha M

    Thanks for your suggestion. Of course, it was foolish of me to let the operations run on the identity rather than on a random matrix. However, I didn't actually want to move the random generation into the loop - I don't know how long it takes to run, so it might skew the results. So instead, I changed the operation inside the loop to:

    rm1 = rm1 * (inv(rm1) - eye1000);

    with eye1000 being an identity matrix of rank 1000. This has had a dramatic effect on results: the standard BLAS now runs the benchmark (using one core, as previously) in 554 seconds, while OpenBLAS takes just 66 (and utilizes 4 cores, or actually 2 hyperthreaded physical cores). Amazing.

    One thing that still bothers me, though, is understanding what's going on. You've suggested that the standard BLAS is smarter with the identity matrix. While this could definitely be the case, in order to do that it would need to know in the first place that it IS an identity matrix that it's working on. Now, this would require two things - one is that the inversion of rm1 and the subsequent multiplication are both numerically accurate enough that ALL non-diagonal values can be ignored; the other is that a routine is run to check whether the matrix is an identity matrix before the calculation begins. I think that the chances of each of these happening are small. It seems more plausible to me that the matrices are checked not for being identity matrices but for being sparse. In the original implementation of the script, even with non-perfect numerical accuracy a large portion of values would be zero after a couple of iterations. In the new implementation, the matrices are never sparse. So, I now know that OpenBLAS performs way better on dense matrices but I still have to test it on sparse ones.

    As I'm writing this, I've now made a small test to see how quickly we get numerical convergence towards the identity matrix. It seems to converge nicely: after one rm1=rm1*inv(rm1), there are only 949 values in the whole matrix that are actually 0 and only 1 is actually 1. After 2 iterations there are 42,900 zeros in the matrix and all 1000 1's are correct, after 3 iterations there are 330,000 zeros, after 4 for some reason it's back to 87,500, after 5 it's 287,000 and after the 6th iteration finally we have all 999,000 correctly identified. So, this leaves open the question whether matrices are checked by the standard lib only for sparsity or also for being identity matrices.

  15. vishalbelsare

    While comparing ATLAS and OpenBLAS on Ubuntu, has ATLAS been compiled on the target machine, or has it been installed with an 'apt-get install' incantation?
    My experience is that the apt-get route gets us an ATLAS that is not geared to exploit multicore CPUs, whereas when compiling OpenBLAS one is likely to set the thread count to the machine's specification.
    While I have both OpenBLAS and ATLAS installed, I compiled ATLAS by getting the source via 'apt-get source' and then building the package on a 12-core target machine.

  16. Pudding

    Thanks for sharing.
    I wonder whether there is any way to specify the number of threads? My machine has 2 cores, and the thread info tells me I'm using 2 threads when I run an R command.
    Can it be set to 4 threads for R computation?

  17. Pingback: 同时通过OpenBLAS和mclapply加速R运算 | f(Program,Poet)=Programet

  18. dan

    ** Memory and CPU comparison **
    Hi Nathan,
    Many thanks for this extremely useful post - I would recommend anyone who is using lapack to spend 5 minutes reading this. A few observations:
    (a) In the code you have posted in your article
    sudo update-alternatives –config liblapack.so.3
    should be
    sudo update-alternatives --config liblapack.so.3
    (b) I do not use R but call lapack from a Fortran program using Linux (Mint (mate) 16)
    For Linux users, a simple way to time your programs is given by
    /usr/bin/time -v [./program.exe]
    where [./program.exe] is whatever you normally type on the command line to run your program (e.g. ./fort.exe, python program.py, ...)
    (c) For generating and solving an Ax = b matrix equation for a matrix of size 10,000x10,000, my results were (min time of 3 runs, performed on a quiet Core i7-3930K, 64GB RAM):

                                      elapsed (wall clock)   memory
        default blas, default lapack: 2:42.94                736MB
        atlas blas, atlas lapack:     1:21.00                742MB
        open blas, default lapack:    0:55.28                1200MB

    As you can see, openblas was the fastest of the three, almost 3x faster than the default; however, openblas also requires the largest amount of RAM, so if you have restricted memory atlas might be a good choice - giving a 2x speed-up while only requiring slightly more memory than the default.

    1. nmv Post author

      Hi Dan,

      Thanks for the memory comparison. Since my problems are usually CPU bound, it didn't occur to me to dig into memory usage.

      I have one quick question about your comment. In part a) of your comment, the two update-alternatives lines are the same. Did you intend to point out the difference between liblapack.so.3 and liblapack.so.3gf that I (attempted to) document in the "Side Notes"?

  19. dan

    Hi Nathan,
    Yes, apologies: my correction has been rendered the same as the original. I was trying to say that you need a double hyphen (two minus signs) before 'config' rather than just one. I discovered the problem by attempting to copy-paste from the code included in your original post above.
    The code you have for the update-alternatives for libblas.so.3 is correct (it has a double hyphen before the config), and since in your guide this comes before the lapack update-alternatives, I think most people will work out what the problem is with the lapack line as I did, but anyway, just to let you know. Thanks again for this extremely useful posting.
    Cheers, Dan.

  20. Pingback: Optimizing R | Jeffrey Chin Tang Wong

  21. Pingback: Cuando tus herramientas fallan: Ubuntu, R, Atlas y fallos bizantinos

  22. Jeremy Duncan

    On Ubuntu 12.04 I am getting clashes between octave3.2 and R packages: R packages cannot find -lblas.

    I am using Rstudio Version 0.98.501 and R

    R version 3.1.1 (2014-07-10) -- "Sock it to Me"
    Copyright (C) 2014 The R Foundation for Statistical Computing
    Platform: i686-pc-linux-gnu (32-bit)

    So when I apt-get octave3.2, it removes libblas.

    I have removed Octave and rebuilt all my R packages, which may help.

    I am also going to try and build Octave 3.8 from sources.

    I don't understand update-alternatives, which may be a good part of the problem.

    Any suggestions greatly appreciated.

    Thanks, Jeremy

  23. Pingback: Compile R and OpenBLAS from Source Guide - Lindons Log

  24. lindonslog

    I compiled openblas with NO_AFFINITY=1 but my R process is always at 100% CPU usage, whereas it should be at 800%. It's not working for me. Using cat /proc/PID/status I can see that R has 8 threads, and if I use lsof -p PID I can see that the openblas library is open, but still, only 100%.

      1. lindonslog

        I'm using Red Hat on my office computer and I do not have root privileges to use yum, so I'm building R and openblas from source in my home directory. Don't worry though, I managed to get things working correctly since the last post :)

  25. Joe Herman

    Very thorough post -- thanks. However, I think the main conclusion (OpenBLAS is faster than ATLAS) is dependent on the specific way you set up ATLAS.

    Firstly, let's focus on the key part of the R benchmark from the perspective of testing the BLAS & LAPACK libraries, which is the 'Matrix functions' section. When I run the test on my system (running R 3.1.1) using OpenBLAS and the apt-get version of ATLAS, I get the following:

    # OpenBLAS
    FFT over 2,400,000 random values____________________ (sec): 0.221333333333333
    Eigenvalues of a 640x640 random matrix______________ (sec): 0.796999999999999
    Determinant of a 2500x2500 random matrix____________ (sec): 0.19
    Cholesky decomposition of a 3000x3000 matrix________ (sec): 0.161666666666665
    Inverse of a 1600x1600 random matrix________________ (sec): 0.188333333333335
    Trimmed geom. mean (2 extremes eliminated): 0.199331471542089
    Overall time: 64.95user 21.82system 0:30.80elapsed 281%CPU

    # ATLAS:
    FFT over 2,400,000 random values____________________ (sec): 0.231666666666667
    Eigenvalues of a 640x640 random matrix______________ (sec): 0.333666666666666
    Determinant of a 2500x2500 random matrix____________ (sec): 0.707333333333334
    Cholesky decomposition of a 3000x3000 matrix________ (sec): 0.610333333333332
    Inverse of a 1600x1600 random matrix________________ (sec): 0.595333333333336
    Trimmed geom. mean (2 extremes eliminated): 0.494933333222859
    Overall time: 43.97user 0.32system 0:44.31elapsed 99%CPU

    In this case, OpenBLAS is faster for everything except eigenvalue computation, but ATLAS is clearly only using one core.

    However, installing ATLAS properly (downloading and compiling from http://sourceforge.net/projects/math-atlas), and rebuilding R --with-blas and --with-lapack, using static libraries, I get the following:

    # ATLAS:
    FFT over 2,400,000 random values____________________ (sec): 0.220333333333333
    Eigenvalues of a 640x640 random matrix______________ (sec): 0.35
    Determinant of a 2500x2500 random matrix____________ (sec): 0.196666666666668
    Cholesky decomposition of a 3000x3000 matrix________ (sec): 0.152
    Inverse of a 1600x1600 random matrix________________ (sec): 0.199999999999999
    Trimmed geom. mean (2 extremes eliminated): 0.205406249285989
    Overall time: 46.73user 1.51system 0:29.21elapsed 165%CPU

    which is essentially the same as OpenBLAS on all accounts (except for eigenvalues, where ATLAS is still much quicker), despite using less CPU power.

    The main conclusion of my testing is that OpenBLAS isn't faster than ATLAS, but it is much easier to install OpenBLAS via apt-get than it is to compile ATLAS and R manually from source. Hence, for a 'quick fix' on Ubuntu to improve R from its default, OpenBLAS may still be the best option. However, for optimal performance (requiring a bit more effort to set up), ATLAS may be better.

    1. nmv Post author

      Thanks for sharing. When I transition my laptop to Ubuntu 14.04 I'll make some time to build ATLAS and give it a shot.

  26. Hong Liu

    Thanks for sharing!

    I have spent a huge amount of time building ATLAS from source on OpenSUSE 13.1... Is it just a waste of time?

    1. Does the easy-install "libatlas" from the repository (without tuning on your computer) really improve computation performance?

    2. Is OpenBLAS better than ATLAS in general, or only better than the easy-install "libatlas" in the repositories of Ubuntu and OpenSUSE?

    1. nmv Post author

      1) I don't know. I have not run the comparison with a tuned version of ATLAS.

      2) It depends. See this comment by Joe Herman for a partial counter-example.

  27. Pingback: Numpy with ATLAS or OpenBLAS?
