Friday 11/06/15

(Wed and Thur in NYC for Dale Frey interview)

9:30 am – 7:00 pm

sequencing data processing

  • bowtie finished running on all the datasets (20)
  • Need to run cufflinks.
  • first need to sort the SAM data
  • using samtools (should have set this up to run overnight; it’s quite slow)
  • also need to copy the data
  • (this RNA-seq pipeline I have sucks: 6 hours to download the data, 4 hours to unzip it, overnight to run bowtie (it probably finished in less than 1 hour, since I ran it multicore with 20 threads), 2 hours to re-sort the data with samtools (longer because I was messing around with trying to multicore this), then XX hours to copy the data to RC and XX hours to run cufflinks.)
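
A rough sketch of how the sort and cufflinks steps could be chained per sample so they can run unattended overnight. The file naming (`<sample>.sam`, `<sample>.sorted.bam`, `cufflinks_<sample>`), the `run` helper, and the `DRY_RUN` switch are assumptions for illustration, not what CufflinksArray1.bash actually does. The `samtools sort -@ ... -o ...` form assumes samtools 1.3 or newer; older releases take an output prefix instead.

```shell
#!/usr/bin/env bash
# Sketch only: file names, thread count, and the dry-run switch are assumptions.
set -eu

THREADS="${THREADS:-20}"   # worker threads for both tools
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

process_sample() {
  local s="$1"
  # Coordinate-sort the aligner's SAM output into BAM, using several threads.
  run samtools sort -@ "$THREADS" -o "${s}.sorted.bam" "${s}.sam"
  # Assemble transcripts and estimate abundances from the sorted alignments.
  run cufflinks -p "$THREADS" -o "cufflinks_${s}" "${s}.sorted.bam"
}

# Process every sample named on the command line, one after another.
for s in "$@"; do
  process_sample "$s"
done
```

Saved as, say, `sort_and_cufflinks.bash` (a hypothetical name), `bash sort_and_cufflinks.bash Ph1 M1` prints the four commands it would run; setting `DRY_RUN=0` executes them.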

Sample organization:


Sample Name  Tube  Tindex  IndexSeq  NEB  Description
Ph1          A     1       ATCACG    1    original Ph KD sample
M1           A     2       CGATGT    2    original WT control sample
2-4P         A     3       GCCAAT    6    KD performed on day 0 and day 2, extracted on day 4
4W           A     4       CAGATC    7    latest WT sample, extracted on day 4
4P           A     5       ACTTGA    8    latest Ph-KD sample, extracted on day 4
10-22#1      B     1       ATCACG    1    (WT) sample extracted on day 4 (need to check ID)
10-22#2      B     2       CGATGT    2    (Ph-KD) sample extracted on day 4 (need to check ID)
2W           B     3       TTAGGC    3    WT sample, extracted on day 2
2P           B     4       TGACCA    4    Ph-KD sample, extracted on day 2
2-4W         B     5       ACAGTG    5    WT for KD performed on day 0 and day 2, extracted on day 4
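
Tubes A and B reuse index sequences (ATCACG and CGATGT appear in both), which is only safe as long as no index repeats within a single tube. A small sanity check along these lines could confirm that; the `check_indices` helper and its three-column input layout (sample, tube, index sequence) are made up for this sketch.

```shell
check_indices() {
  # Input on stdin: whitespace-separated lines "sample tube index_seq".
  # Reports any index sequence used twice within the same tube and
  # exits non-zero if at least one collision is found.
  awk '{
         key = $2 " " $3
         if (key in seen) {
           print "duplicate index " $3 " in tube " $2 ": " seen[key] ", " $1
           bad = 1
         } else {
           seen[key] = $1
         }
       }
       END { exit bad }'
}

# Sample, tube, and index sequence copied from the table above.
check_indices <<'EOF'
Ph1     A ATCACG
M1      A CGATGT
2-4P    A GCCAAT
4W      A CAGATC
4P      A ACTTGA
10-22#1 B ATCACG
10-22#2 B CGATGT
2W      B TTAGGC
2P      B TGACCA
2-4W    B ACAGTG
EOF
```

For the ten samples listed here the check prints nothing, since every index is unique within its tube.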

bash CufflinksArray1.bash

repo issues

  • somehow my bt2 files got put in my repo and were accidentally committed last night
  • BFG (https://rtyley.github.io/bfg-repo-cleaner/) is an excellent tool for quickly and easily scrubbing these from the repo, a job that has previously been a horrible pain.
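
For future reference, a sketch of the cleanup, following the usage described on the BFG page. The jar path, the repo path, and the `run`, `scrub_bt2`, and `ignore_bt2` helpers are placeholders for illustration. BFG is meant to be run on a bare mirror clone (`git clone --mirror`), and by default it does not touch the latest commit, so the bt2 files should be deleted and committed first.

```shell
# Sketch only: jar location, repo path, and the dry-run switch are assumptions.
set -eu

DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

scrub_bt2() {
  local repo="$1"      # bare mirror clone of the repository
  local bfg_jar="$2"   # path to the downloaded BFG jar
  # Strip every *.bt2 file from history.
  run java -jar "$bfg_jar" --delete-files '*.bt2' "$repo"
  # Expire old refs and garbage-collect so the blobs are actually dropped.
  run git -C "$repo" reflog expire --expire=now --all
  run git -C "$repo" gc --prune=now --aggressive
}

ignore_bt2() {
  # Add *.bt2 to the given .gitignore unless it is already listed,
  # so the index files cannot be committed again by accident.
  local gitignore="$1"
  grep -qxF '*.bt2' "$gitignore" 2>/dev/null || echo '*.bt2' >> "$gitignore"
}
```

Calling `scrub_bt2 my-repo.git bfg.jar` with the default `DRY_RUN=1` only prints the three commands; after a real run, the cleaned history still has to be pushed back to the remote.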