How I set up my Mac for Science

Background

I prefer to use brew to install the ‘missing’ libraries on macOS that come in handy, like wget, imagemagick, or rename. In addition, I work on the analysis of histological slides, and for that I use slideToolkit, which also requires brew on macOS.

So, I thought, “might as well go full monty” and install R and RStudio using brew. This provides me with a powerful method to control many packages/libraries that are needed for macOS and at the same time creates a very clean R environment on my Mac.

Again, Google was my friend, and I found many websites that helped. Below are the sites that inspired my workflow.

Step-by-step

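First, I make the directories I need: .ssh and bin, both in my home directory.

  • mkdir -v ~/.ssh - the -v flag makes the mkdir command verbose; it’s not strictly needed.

  • chmod -vR 0700 ~/.ssh

  • mkdir -v ~/bin

  • chmod -vR 0777 ~/bin - note that 0777 makes the directory writable by every user; 0755 is usually enough for a personal bin directory.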
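The directory steps above can be wrapped in a small script that is safe to run more than once. This is only a minimal sketch: SETUP_ROOT and make_dir are names I made up for illustration, and the sketch assumes 0755 is sufficient for bin rather than 0777.

```shell
#!/bin/sh
# Sketch: create the two directories idempotently and set their permissions.
# SETUP_ROOT and make_dir are hypothetical names, not part of the original steps.
SETUP_ROOT="${SETUP_ROOT:-$HOME}"

make_dir() {
  # $1 = directory path, $2 = octal permission mode
  mkdir -p "$1" && chmod "$2" "$1"
}

make_dir "$SETUP_ROOT/.ssh" 0700   # private: only the owner can read or enter
make_dir "$SETUP_ROOT/bin" 0755    # assumption: 0755 instead of 0777

# Show the resulting permissions
ls -ld "$SETUP_ROOT/.ssh" "$SETUP_ROOT/bin"
```

Because mkdir -p does not complain about existing directories, re-running the script on an already configured Mac only resets the permissions.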

Second, we will proceed with installing brew, Command Line Tools, XQuartz, Python, some Perl modules, and finally R with RStudio. Again, here Python is probably not strictly needed.

  • Install Command Line Tools for macOS El Capitan+ using the instructions here: http://railsapps.github.io/xcode-command-line-tools.html.

  • Install some useful Perl statistics modules:

    • sudo cpan YAML Getopt::Long Statistics::Distributions - I needed these to get proper p-values calculated in our MetaGWASToolKit

  • Install brew; check out http://www.brew.sh or https://github.com/homebrew.

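    • /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" - this was the installer command when I wrote this; Homebrew has since moved to a bash-based installer, so check the brew website for the current command.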

  • Now we are ready to brew, and install the following packages:

    • brew install coreutils gnu-sed wget rename git gd libharu imagemagick lzo hdf5 bison - these are needed for slideToolkit, and several of them are very useful general-purpose tools and libraries in their own right.

    • brew install libxml2 gdal, for some R packages to work.

    • brew install mariadb-connector-c, for some R packages that require RMySQL to work.

    • brew install findutils.

    • brew install samtools bcftools vcftools, a very easy way to install these programs, which are essential for genetic studies nowadays.

  • For FastQTL and fastQTLToolKit to work we need to install GNU scientific libraries.

    • brew install zlib boost gsl

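  • We need cask to install XQuartz. On current Homebrew versions cask support is built in, and the equivalent command is brew install --cask xquartz.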

    • brew tap caskroom/cask && brew install cask

    • brew cask install xquartz && brew cask install java
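The brew install steps above can also be looped over, so that packages already present are skipped. What follows is a rough sketch, not a tested installer: FORMULAE simply collects the formula names from the list above, while install_formula and the DRY_RUN switch are my own hypothetical additions. DRY_RUN defaults to 1 so that nothing is installed unless you ask for it.

```shell
#!/bin/sh
# Formula names collected from the brew install steps above.
FORMULAE="coreutils gnu-sed wget rename git gd libharu imagemagick lzo hdf5 bison libxml2 gdal mariadb-connector-c findutils samtools bcftools vcftools zlib boost gsl"

# DRY_RUN=1 (the default, an assumption made for safety) only prints commands.
DRY_RUN="${DRY_RUN:-1}"

install_formula() {
  # $1 = formula name
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: brew install $1"
  elif brew list --versions "$1" >/dev/null 2>&1; then
    # brew list --versions exits non-zero when the formula is not installed
    echo "already installed: $1"
  else
    brew install "$1"
  fi
}

PLANNED=0
for formula in $FORMULAE; do
  install_formula "$formula"
  PLANNED=$((PLANNED + 1))
done
echo "processed $PLANNED formulae"
```

Run it as is to preview the commands, or with DRY_RUN=0 to actually install whatever is missing.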

Use brew doctor to diagnose problems in between installations; if there is a problem with ownership, use chown -vR <MYUSERNAME> <A_FOLDER_NAME> to take ownership back recursively (that is what the -R flag does).
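Before reaching for chown, it can help to see which files are actually owned by someone else. Here is a small sketch of that check; list_foreign_owned and CHECK_DIR are hypothetical names, and falling back to the current directory when brew is not installed is an assumption made only so the script runs anywhere.

```shell
#!/bin/sh
# List entries under a directory that are NOT owned by the current user.
list_foreign_owned() {
  # $1 = directory to scan; errors such as unreadable folders are ignored
  find "$1" ! -user "$(id -u)" -print 2>/dev/null || true
}

# Scan the Homebrew prefix when brew is available.
# Assumption: fall back to the current directory otherwise.
if command -v brew >/dev/null 2>&1; then
  CHECK_DIR="$(brew --prefix)"
else
  CHECK_DIR="."
fi

COUNT="$(list_foreign_owned "$CHECK_DIR" | wc -l | tr -d ' ')"
echo "$COUNT entries under $CHECK_DIR are owned by another user"
```

If the count is not zero, list_foreign_owned "$CHECK_DIR" shows the paths that the chown command would need to fix.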

  • Install Python and related tools:

    • brew install python

    • brew install llvm — for LDSTORE to work. To get LDSTORE working on my Mac I also had to jump through some hoops - I’ll tell you all about that later.

    • pip2 install argparse numpy scipy scikit-learn pandas openpyxl xlrd — the latter two are needed for slideToolkitTools, which is a private repository I use for slideToolkit

  • So now, finally, I’m ready to install R.

    • brew install r - I needed R installed with tcl-tk, so I figured out how to do that in another post. You might not need that.

    • brew cask install rstudio - because I dislike the terminal R.
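Once everything is installed, a quick sanity check is to see which commands are on the PATH. This is just a sketch: check_tool is a hypothetical helper, and the list of tools is my own selection from the steps above.

```shell
#!/bin/sh
# Report whether a command can be found on the PATH.
check_tool() {
  # $1 = command name
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

# Assumption: a hand-picked subset of the tools installed in the steps above.
for tool in brew wget git samtools python R; do
  check_tool "$tool"
done
```

Anything reported as missing either failed to install or lives in a directory that is not on the PATH yet.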

Finally, as you know, I’m all into genetics, analyses of methylation data, etc. I’ve written a couple of R scripts that might be useful for you too. They help install some commonly used (interdependent) packages in R, almost automatically. You can find them in the HerculesToolKit, alongside some other useful scripts.
