Tuesday 11/10/15

10:00 am

Tasks

Manuscript

  • Revise response to reviewer 1
  • Figure updates
    • change blue color
    • fix symbols
  • discussed changes with XZ again

Fastq files

  • run 1: 150902_NS500422_0189_AHHNKFBGXX Illumina HiSeq 2500, 100 bp single read, rapid flowcell
  • run 2: 151030_SN343_0533_AC8CW1ACXX Illumina HiSeq 2000, 50 bp single read, standard cell

Run 1 checksums

5_S5.R1.fastq.gz 35edcf8de86c0121ed0299e3026c3996
3_S3.R1.fastq.gz 225a5a5c2318e07deb044fd2bb5f5d06
2_S2.R1.fastq.gz 07203cda1dde852c01c3eb1014e97574
6_S6.R1.fastq.gz d07b0b67730adf45d5e3abf9926bf2d1
1_S1.R1.fastq.gz 6a61879b157d3710e4e81c07ef81c920
4_S4.R1.fastq.gz 63cdc9bc347d26f43e2d0e72fba469c5

Windows checksum doesn’t give a checksum report; it just runs for a while and then stops.
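A possible fallback for checking the files without the Windows tool, as a minimal sketch. It assumes the values listed above are MD5 digests (they are 32 hex characters, which is consistent with MD5, but the notes do not say so). The function names `md5_of_file` and `verify` are made up for illustration.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so large fastq.gz files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True if the file's MD5 matches the expected digest
    (assumption: the recorded checksums are MD5)."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

Usage would look like `verify("5_S5.R1.fastq.gz", "35edcf8de86c0121ed0299e3026c3996")`, run from the folder holding the run 1 files.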

Identified issues with Fastq files

  • some reads have no read data
  • writing script to remove reads (all 4 lines) which have no sequences (just an index primer).
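A rough sketch of what such a script could look like. This is not the actual script; it assumes strict 4-line FASTQ records in gzipped files and treats a read as empty when its sequence line is blank. `read_records` and `drop_empty_reads` are hypothetical names.

```python
import gzip


def read_records(handle):
    """Yield (header, sequence, plus, quality) from a 4-line FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        seq = handle.readline()
        plus = handle.readline()
        qual = handle.readline()
        yield tuple(s.rstrip("\r\n") for s in (header, seq, plus, qual))


def drop_empty_reads(in_path, out_path):
    """Copy a gzipped FASTQ, skipping whole records (all 4 lines)
    whose sequence line is empty. Returns (kept, dropped)."""
    kept = dropped = 0
    # newline="\n" keeps Unix line endings even when run on Windows
    with gzip.open(in_path, "rt") as src, \
            gzip.open(out_path, "wt", newline="\n") as dst:
        for header, seq, plus, qual in read_records(src):
            if not seq.strip():
                dropped += 1
                continue
            dst.write(f"{header}\n{seq}\n{plus}\n{qual}\n")
            kept += 1
    return kept, dropped
```

Writing each record with explicit `\n` separators and no padding avoids stray spaces creeping into the output lines.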

More failed programs

  • Odyssey claims to have FastQC installed, but it doesn’t run.
    • https://portal.rc.fas.harvard.edu/apps/modules
    • typing exactly module load fastqc/0.11.3-fasrc01 gives an error.
  • tried VirtualBox. Can’t access local files. This is unfortunate.
  • tried bioawk. Installed on Linux VirtualBox. Can’t get it on Odyssey. Can’t access local files to run.
  • tried fastqValidate. Can’t install on Windows. Install fails on Linux VirtualBox.
  • removing blanks by hand with custom matlab script
  • accidentally added spaces when printing new fastq files. Fixed this.
  • fixed files do run in bowtie, but it still complains about a bunch of reads having insufficient length
  • wrote new matlab file to cut reads with lengths less than 20
    • had some errors with this script.
    • fixed issue where the length check was not applied to the read lines (was checking all lines, then only the 1st line, rather than the read line).
  • lots of issues with disk copy speed.
  • was cutting reads shorter than 20 bp. I think this might cut too much, so the result would not resemble the data. Dropped the cutoff to 10 bp.
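For reference, the length cut can be sketched in a few lines. This is an illustration in Python, not the matlab file itself. It assumes 4-line FASTQ records and checks only the sequence line (line 2 of each record), which is the line the earlier bug missed. `filter_by_length` and `min_len` are hypothetical names; the default of 10 mirrors the cutoff settled on above.

```python
from itertools import islice


def filter_by_length(lines, min_len=10):
    """Yield the lines of FASTQ records whose sequence has at least
    min_len bases. Only the 2nd line of each 4-line record is measured."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            break  # end of input; a truncated trailing record is dropped
        sequence = record[1].strip()
        if len(sequence) >= min_len:
            yield from record
```

Because it takes any iterable of lines, the same function should work on an open text file handle or on a list of strings when trying out different cutoffs.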