The current BioC release is 3.3, for which there is an Amazon AMI (ami-64d43409). I’m not sure why, but the AMI lives in the us-east-1 region (N. Virginia), which isn’t convenient for me, and we do not have permission to copy it. So everything has to run in us-east-1, with the instance and its EBS volume in the same availability zone (AZ). If there’s time, I’ll roll my own AMI to have something closer to home.
I’m not sure whether my analysis is more memory-heavy or compute-heavy, so I’m going with a memory-optimized instance. Hourly costs:
- r3.2xlarge $0.66/hour
- EBS $0.05/hour
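As a rough sanity check on the bill, the two hourly rates above can be added up for a planned number of hours. This is only a sketch: the estimate_cost name is mine, and it assumes the instance and the volume are both billed for the whole period at exactly the rates listed.

```shell
# Estimate the total cost in dollars for a given number of hours.
# Rates are the ones listed above: r3.2xlarge ($0.66/h) plus EBS ($0.05/h).
estimate_cost() {
    awk -v h="$1" 'BEGIN { printf "%.2f\n", (0.66 + 0.05) * h }'
}

# Example: one full day of running
# estimate_cost 24    # prints 17.04
```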
Security group: enable SSH and HTTP (ports 22 and 80)
Connect to instance
ssh -i /path/to/key.pem ubuntu@ip-from-aws-console
Attach an EBS (console)
- go to EC2, select Volumes, create a 250 GB gp2 volume
- make sure it's in the same availability zone as the EC2 instance
- select the volume, select attach (the instance does not have to be stopped; a volume can also be attached while it is running)
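The same console steps could probably be scripted with the AWS CLI. A minimal sketch, assuming the CLI is installed and its default region is set to us-east-1. The function names, the us-east-1a zone and the IDs in the usage comments are placeholders of mine, not values from my setup.

```shell
# Create a 250 GB gp2 volume in the given availability zone and print its ID.
# $1 = availability zone, which must match the zone of the EC2 instance.
create_data_volume() {
    aws ec2 create-volume --size 250 --volume-type gp2 \
        --availability-zone "$1" --query VolumeId --output text
}

# Attach a volume to an instance as /dev/sdf.
# $1 = volume ID, $2 = instance ID (both placeholders in the example below).
attach_data_volume() {
    aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device /dev/sdf
}

# Example usage (hypothetical zone and instance ID):
# vol_id="$(create_data_volume us-east-1a)"
# attach_data_volume "$vol_id" i-0123456789abcdef0
```

On this kind of instance a volume attached as /dev/sdf typically shows up as /dev/xvdf once you log in, which is worth confirming before formatting anything.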
If the EBS volume is new (never used), it needs a filesystem before it can be mounted. You can see the device size and mount point first (for example with lsblk).
# if this prints only "data", there is no filesystem and we need to create one
sudo file -s /dev/xvdf
# create an ext4 filesystem (this wipes anything already on the volume)
sudo mkfs -t ext4 /dev/xvdf
# create mount point
sudo mkdir /data
sudo mount /dev/xvdf /data
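Since mkfs destroys whatever is on the device, the "does it print data?" check could be turned into a guard. A small sketch, assuming the new volume shows up as /dev/xvdf as in the mount command above; needs_filesystem is a helper name I made up. It matches only a bare "data" at the end of the output, because the output for an existing ext4 filesystem also contains the word "data".

```shell
# Succeed (exit 0) only when the output of `file -s` ends in the bare word
# "data", which is what it reports for a device with no recognizable filesystem.
# $1 = the full output line of `file -s <device>`.
needs_filesystem() {
    case "$1" in
        *": data") return 0 ;;
        *)         return 1 ;;
    esac
}

# Example usage on the instance (device name is an assumption):
# if needs_filesystem "$(sudo file -s /dev/xvdf)"; then
#     sudo mkfs -t ext4 /dev/xvdf
# fi
```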
The details are in the AWS documentation on making an EBS volume available for use.
To upload data
sftp -i /path/to/key.pem ubuntu@ip-from-aws-console
Doing it better
Since the data has to be uploaded first, there are two options I can think of. The first is to upload to S3, then copy over to the instance or its EBS volume. The other possibility, to save $$$, is to use a low-cost instance with an EBS volume, upload the data to it, then spin up a larger instance and reattach the same volume.
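I haven't tried it yet, but the S3 route might look something like this with the AWS CLI. It assumes the CLI is configured on both my laptop and the instance (credentials or an instance role with access to the bucket). The function names, the bucket name and the staging/ prefix are made up for illustration.

```shell
# Upload a local directory to a staging prefix in an S3 bucket.
# $1 = local directory, $2 = bucket name (hypothetical).
stage_to_s3() {
    aws s3 sync "$1" "s3://$2/staging/"
}

# Pull the staged data down onto the instance, e.g. into the mounted volume.
# $1 = bucket name (hypothetical), $2 = destination directory.
fetch_from_s3() {
    aws s3 sync "s3://$1/staging/" "$2"
}

# Example usage:
# on the laptop:   stage_to_s3 ./rawdata my-bioc-upload
# on the instance: fetch_from_s3 my-bioc-upload /data
```

The upside would be that the upload happens while no big instance is running, so the expensive hours only start once the data is already sitting in S3.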
Point a browser to the public IP of the instance. Log in with user/pass ubuntu/bio.
Within RStudio, I had problems loading dplyr, etc., despite the fact that they were present in the /home/ubuntu/R-libs directory.
In /etc/rstudio/rsession.conf, set r-libs-user=~/R-libs
And in /home/ubuntu/R-libs:
sudo chown -R ubuntu:ubuntu *
sudo chmod -R 755 *
Not sure which one worked, but RStudio now seems to work okay.
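To narrow down which change mattered next time, a check like the following could be run before touching ownership or permissions. It is only a sketch of the idea: find_lib_problems is my own helper name, and it looks for just two things, entries not owned by the given user and entries missing the owner read bit.

```shell
# List entries under a library directory that are either not owned by the
# given user or not readable by their owner.
# $1 = user name or numeric uid, $2 = library directory.
find_lib_problems() {
    find "$2" \( ! -user "$1" -o ! -perm -400 \) -print
}

# Example usage on the instance:
# find_lib_problems ubuntu /home/ubuntu/R-libs
```

If this prints nothing, ownership and read permissions were probably fine already, which would point at the rsession.conf setting instead.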