Using the Bioconductor AMI

The current Bioconductor release is 3.3, for which there is an Amazon AMI (ami-64d43409). I’m not sure why, but the AMI lives in the us-east-1 region (N. Virginia), which isn’t convenient for me, and we do not have permission to copy it. So everything has to run in us-east-1, and the EBS volume has to be in the same availability zone (AZ) as the instance. If there’s time, I’ll roll my own AMI to have something closer to home.

I'm not sure whether my analysis is more memory-heavy or compute-heavy, so I'm going with a memory-optimized instance. Hourly costs:

  • r3.2xlarge $0.66/hour
  • EBS $0.05/hour
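
As a rough worked example of what those two rates add up to, the sketch below multiplies the combined hourly rate by a number of hours. The rates are the ones listed above; the helper name estimate_cost is made up for illustration, and it ignores data transfer and any other charges.

```shell
#!/bin/sh
# Hourly rates copied from the list above (USD per hour).
INSTANCE_RATE=0.66   # r3.2xlarge
EBS_RATE=0.05        # EBS volume

# estimate_cost HOURS
# Prints the combined instance + EBS cost for HOURS, to two decimals.
# (Hypothetical helper; ignores data transfer and any other charges.)
estimate_cost() {
    awk -v h="$1" -v i="$INSTANCE_RATE" -v e="$EBS_RATE" \
        'BEGIN { printf "%.2f\n", (i + e) * h }'
}

estimate_cost 8     # one working day
estimate_cost 24    # a full day left running
```

Keep in mind that the EBS volume is billed for as long as it exists, even while the instance is stopped.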

Security group: enable SSH and HTTP (ports 22 and 80)
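
The same rules can be added from the command line. This is a minimal sketch assuming the AWS CLI is installed and configured; the group ID and address range in the usage lines are placeholders, and open_port is a hypothetical helper name.

```shell
#!/bin/sh
# open_port GROUP_ID PORT CIDR
# Allows inbound TCP traffic on PORT from CIDR for the given security group.
# Hypothetical helper; wraps `aws ec2 authorize-security-group-ingress`.
open_port() {
    aws ec2 authorize-security-group-ingress \
        --group-id "$1" \
        --protocol tcp \
        --port "$2" \
        --cidr "$3"
}

# Usage (placeholder group ID and address range, replace with your own):
#   open_port sg-0123456789abcdef0 22 203.0.113.0/24   # SSH
#   open_port sg-0123456789abcdef0 80 203.0.113.0/24   # HTTP (RStudio)
```

Limiting the CIDR range to your own address, rather than opening the ports to 0.0.0.0/0, is the safer choice here since RStudio is served over plain HTTP.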

Connect to instance
ssh -i /path/to/key.pem ubuntu@ip-from-aws-console

Attach an EBS (console)

  • go to EC2, select Volumes, create a 250 GB gp2 volume
  • make sure it's in the same availability zone as the EC2 instance
  • select the volume, select Attach (the instance does not have to be stopped; EBS volumes can be attached to a running instance too)
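
The console steps above can also be sketched with the AWS CLI. This assumes the CLI is configured for us-east-1; the instance and volume IDs are placeholders, and the function names are made up for illustration.

```shell
#!/bin/sh
# instance_az INSTANCE_ID
# Prints the availability zone the instance runs in.
instance_az() {
    aws ec2 describe-instances \
        --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].Placement.AvailabilityZone' \
        --output text
}

# create_data_volume AZ
# Creates a 250 GB gp2 volume in the given availability zone.
create_data_volume() {
    aws ec2 create-volume \
        --availability-zone "$1" \
        --size 250 \
        --volume-type gp2
}

# attach_data_volume VOLUME_ID INSTANCE_ID
# Attaches the volume as /dev/sdf (assumed to show up as /dev/xvdf on the instance).
attach_data_volume() {
    aws ec2 attach-volume \
        --volume-id "$1" \
        --instance-id "$2" \
        --device /dev/sdf
}

# Usage (placeholder IDs):
#   az=$(instance_az i-0123456789abcdef0)
#   create_data_volume "$az"          # note the VolumeId in the output
#   attach_data_volume vol-0123456789abcdef0 i-0123456789abcdef0
```

Looking up the instance's zone first avoids creating the volume in a different zone, where it could not be attached.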

A new (never used) EBS volume has no filesystem yet. List the block devices to see the device name, size, and mount point:

lsblk

# if this prints just "data", there is no filesystem and we need to create one
sudo file -s /dev/xvdf
# format (this erases anything already on the device)
sudo mkfs -t ext4 /dev/xvdf
# create the mount point and mount the volume
sudo mkdir /data
sudo mount /dev/xvdf /data

Details are here
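
The check-then-format steps can be wrapped so that mkfs only runs on a volume that really is empty. A minimal sketch, assuming the device is /dev/xvdf and the mount point is /data as in these notes; the function names are hypothetical.

```shell
#!/bin/sh
# is_unformatted FILE_OUTPUT
# Succeeds if the text printed by `file -s DEVICE` is just "<device>: data",
# which is what it reports for a device without a filesystem.
is_unformatted() {
    case "$1" in
        *": data") return 0 ;;
        *)         return 1 ;;
    esac
}

# prepare_volume DEVICE MOUNT_POINT
# Formats DEVICE as ext4 only if it has no filesystem, then mounts it.
prepare_volume() {
    dev="$1"
    mnt="$2"
    if is_unformatted "$(sudo file -s "$dev")"; then
        sudo mkfs -t ext4 "$dev"
    fi
    sudo mkdir -p "$mnt"
    sudo mount "$dev" "$mnt"
}

# Usage:
#   prepare_volume /dev/xvdf /data
```

The guard matters when the same volume is reattached later: running mkfs a second time would wipe the uploaded data.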

To upload data

sftp -i /path/to/key.pem ubuntu@ip-from-aws-console
cd /my/dir
put, etc.

Doing it better
Since the data has to be uploaded first, there are two options I can think of. The first is to upload to S3, then copy it over to the instance or EBS volume. The other, which would save $$$, is to use a low-cost instance with an EBS volume, upload the data to it, then spin up a larger instance and reattach the same volume.
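
The S3 option might look something like this. It is only a sketch: it assumes the AWS CLI is installed and has credentials on both machines (these notes do not confirm that the AMI ships with it), and the bucket name and paths are placeholders.

```shell
#!/bin/sh
# push_to_s3 LOCAL_DIR S3_URI
# Run on the local machine: copies new or changed files up to S3.
push_to_s3() {
    aws s3 sync "$1" "$2"
}

# pull_from_s3 S3_URI TARGET_DIR
# Run on the EC2 instance: copies the files down onto the mounted EBS volume.
pull_from_s3() {
    aws s3 sync "$1" "$2"
}

# Usage (placeholder bucket and paths):
#   push_to_s3 /my/dir s3://my-example-bucket/analysis-data    # local machine
#   pull_from_s3 s3://my-example-bucket/analysis-data /data    # on the instance
```

Keeping the bucket in us-east-1 alongside the instance avoids cross-region transfers, and the slow upload from home can happen before any instance is running.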

Using RStudio
Point the browser to the public IP of the instance. Log in with user/pass ubuntu/bio.

Permissions problems
Within RStudio, I had problems loading dplyr, etc., despite the fact that they were present in the /home/ubuntu/R-libs directory.
I tried two things:
In /etc/rstudio/rsession.conf, set r-libs-user=~/R-libs
And in /home/ubuntu/R-libs:
sudo chown -R ubuntu:ubuntu *
sudo chmod -R 755 *

Not sure which one worked, but RStudio now seems to work okay.
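
To narrow down which of the two fixes mattered, one could check ownership before changing anything. A small sketch; list_foreign_files is a hypothetical helper, and it only covers the ownership half of the problem.

```shell
#!/bin/sh
# list_foreign_files DIR USER
# Prints every file or directory under DIR that is NOT owned by USER.
# Empty output means ownership is already consistent.
list_foreign_files() {
    find "$1" ! -user "$2" -print
}

# Usage on the instance:
#   list_foreign_files /home/ubuntu/R-libs ubuntu
# To see which library paths R itself searches:
#   Rscript -e '.libPaths()'
```

If the first command prints nothing, ownership was probably not the culprit and the r-libs-user setting is the more likely fix.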