Saturday 03/23/13

11:00 A

Cell culture

  • Moved cells to 28 °C incubator
  • Observed substantial cell debris in dish (dead cells from heat shock?)

Genomics

Installing OligoArray2.0

  • http://berry.engin.umich.edu/oligoarray2/
  • Need to install legacy BLAST.
  • Installed.
  • Need to install OligoArrayAux: http://mfold.rna.albany.edu/?q=DINAMelt/OligoArrayAux.
  • Error with the BLAST database when calling OligoArray 2.0.1 from MATLAB:

    % Build the OligoArray command line and run it from `folder` via the system shell
    in2 = ['java -Xmx512m -jar OligoArray2.jar -i chr1.fas ',...
        '-d /BlastDb2/yeast_orf.fas -o oligo.txt ',...
        '-n 2 -l 45 -L 47 -D 1500 -t 82 -T 88 -s 65 -x 65 -p 35 -P 50 ',...
        '-m "GGGGG;CCCCC;TTTTT;AAAAA"'];
    system(['cd ',folder,' && ',in2]);
    
    *** OligoArray 2.0.1 ***
    
    OligoArray 2.0.1 will start to process sequences from the file chr1.fas using the following parameters : 
    Blast database: '/BlastDb2/yeast_orf.fas' 
    Oligo data will be saved in: 'oligo.txt' 
    Sequence without oligo will be saved in: 'rejected.fas' 
    The log file will be: 'OligoArray.log' 
    Maximum number of oligo to design per input sequence: '2' 
    Size range: '45' to '47' 
    Maximum distance between the 5' end of the oligo and the 3' end of the input sequence: '1500' 
    Minimum distance between the 5' ends of two adjacent oligos: '69' 
    Tm range: '82' to '88' 
    GC range: '35.0' to '50.0' 
    Threshold to reject secondary structures: '65.0' 
    Threshold to start to consider cross-hybridizations: '65' 
    Sequence to avoid in the oligo: 'GGGGG;CCCCC;TTTTT;AAAAA' 
    Number of sequence to run in parallel: '1' 
    
    data initialized 
    
    Check if /BlastDb2/yeast_orf.fas is a valid Blast database 
    java.io.IOException: Cannot run program "blastall": CreateProcess error=2, The system cannot find the file specified 
    WARNING: /BlastDb2/yeast_orf.fas is not a valid Blast database
    

Darn.
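The “Cannot run program "blastall"” message is consistent with the shell that MATLAB's system() spawns not having the legacy BLAST folder on its PATH. Whether that is the real cause here is an assumption. A minimal sketch for checking and patching it from within the MATLAB session follows; the folder in blastDir is a hypothetical placeholder, not the actual install location.

```matlab
% Hypothetical install folder for legacy BLAST; replace with the real one.
blastDir = 'C:\blast\bin';

% Prepend the folder to PATH for this MATLAB session only, if it is missing.
currentPath = getenv('PATH');
if isempty(strfind(currentPath, blastDir))
    setenv('PATH', [blastDir, pathsep, currentPath]);
end

% Ask the shell used by system() whether it can now locate blastall.
% 'where' is the Windows lookup command, 'which' the Unix one.
if ispc
    finder = 'where';
else
    finder = 'which';
end
[status, location] = system([finder, ' blastall']);

if status == 0
    fprintf('blastall found at: %s\n', strtrim(location));
else
    fprintf('blastall not found; check that blastDir is correct.\n');
end
```

Processes started later with system() in the same session inherit the modified PATH, so a java call to OligoArray launched afterwards would see the same environment.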

  • Installed OligoArray 2.1
  • This also errors when called from MATLAB:

    Can OligoArray read/write specified files?  YES 
    Data initialization: DONE 
    Is BlastDb/yeast_orf.fas a valid Blast database?  java.io.IOException: Cannot run program "blastall": CreateProcess error=2, The system cannot find the file specified 
    NO 
    Is OligoArrayAux installed?  NO 
    
    Design aborted due to a failure in the test above 
    
  • However it runs correctly when called from the GUI.

  • Requires Legacy Blast library for Drosophila genome to run.

  • Requires min distance between 2 oligos > 0 to run.

  • This allows execution without error message. However, I get extremely few oligos out, especially given the generous bounds set. Not sure why this is happening.
  • Wrote to BB for parameters.
  • GUI seems to not be listening to its own GC bounds: “rejected due to high percent of GC: 51.0” (even though max is 75%).
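To check the GC complaint independently of the GUI, the GC percentage of a rejected oligo can be recomputed by hand. The sketch below does this for a made-up sequence (not a real probe) and compares it against two candidate ceilings: the 75 set in the GUI and the 50 passed as -P in the earlier command-line call. That the GUI might be falling back to a lower ceiling is only a guess; a rejection at 51.0 would be consistent with it.

```matlab
% GC percentage of a sequence given as a char array (case-insensitive).
gcPercent = @(s) 100 * sum(upper(s) == 'G' | upper(s) == 'C') / numel(s);

% Made-up 20-mer for illustration only; substitute a rejected oligo.
oligo = 'GGCCGGCCGGCATATATATA';
gc = gcPercent(oligo);

% Candidate ceilings: the earlier -P value and the GUI setting.
ceilings = [50 75];
for c = ceilings
    if gc > c
        fprintf('GC %.1f%% exceeds a ceiling of %d%%\n', gc, c);
    else
        fprintf('GC %.1f%% passes a ceiling of %d%%\n', gc, c);
    end
end
```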

MATLAB implementation of OligoArray

  • Using Jeff’s LoadGenbank
  • Fixed some bugs in LoadGenbank dealing with introns etc.
  • Wrote OligoArrayDmel.m (modified from Jeff’s OligoArray script). Initializes fine. Is currently aborting with the error: java.lang.StringIndexOutOfBoundsException: String index out of range: -2
  • Rewrote Dmel_plus2sense_fasta.m parser for the genome to use simple FASTA headers. This fixed the problem.
  • Still getting small number of probes, even after upping max probes to 10000 (from 30).
  • Maybe -D option (7000) is the problem: “Maximum distance between 5′ of oligo and 3′ of query”. Not exactly sure what this is.
  • OligoArray gives a warning (“It may take a while depending the value entered for the -D option”). I upped -D to 90,000 to see what happens for chr4. Maybe this is why they run on a cluster? Though it seems the FASTA file needs to be parsed into chunks if the parallel processing capability is to be used…
  • 90,000 takes forever and won’t launch, but just increasing -D from 7000 to 15,000 substantially increases the number of oligos (from 108 to 235), so -D seems to be limiting.
  • Writing a new FASTA genome parser, Dmel_plus2sense_sections_fasta:

    1. each inter-feature region and each feature region is a separate file
    2. inter-feature regions or feature regions larger than max_length get split up into subfragments.
  • Functional implementation:

    1. Parse genome into 1 kb sections (sections need to be less than 10 kb to have any kind of efficiency in execution). Dmel_plus2sense_sections_fasta.m handles this.
      • Need to flip transcribed regions into appropriate sense/anti-sense features
      • Need to avoid overlapping regions of features and internally embedded features (like many miRNAs).
    2. Merge into a single FASTA file (can put the sections in a common folder and use MergeTxtFiles.m in Genome Code (matlab-genomics)).
    3. Run OligoArrayDmel.m to call OligoArray on the assembled fasta.
  • Outstanding problems:

    1. Add heterochromatic regions and mitochondrial genome into BLAST library
    • Can’t seem to find chr2LHet etc., or chrY?
    • Neither of these seems to have them (unless they’re included in the chr2L… release 5 was supposed to have the heterochromatin):
    • http://www.ncbi.nlm.nih.gov/bioproject/PRJNA164
    • http://www.ncbi.nlm.nih.gov/assembly/238028/
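The sectioning step in the functional implementation above can be sketched as follows. This is not the code in Dmel_plus2sense_sections_fasta.m; it only illustrates cutting one sequence into pieces of at most maxLength bases and writing them under simple headers, since simple headers are what resolved the StringIndexOutOfBoundsException. The placeholder sequence, the name_start_stop header scheme, and the output file name are all assumptions for illustration. Strand flipping and overlapping or embedded features are not handled.

```matlab
% Placeholder 2500 nt sequence; in practice this comes from the genome file.
seq = repmat('ACGT', 1, 625);
name = 'chr4';          % label used in the headers (assumed naming scheme)
maxLength = 1000;       % 1 kb sections, as in the notes

% Start coordinate of each section (1-based, inclusive).
starts = 1:maxLength:numel(seq);
nSections = numel(starts);
headers = cell(1, nSections);
sections = cell(1, nSections);

for k = 1:nSections
    % The last section may be shorter than maxLength.
    stop = min(starts(k) + maxLength - 1, numel(seq));
    sections{k} = seq(starts(k):stop);
    % Simple header: name_start_stop, with no spaces or special characters.
    headers{k} = sprintf('%s_%d_%d', name, starts(k), stop);
end

% Write all sections to a single FASTA file in the temp folder.
outFile = fullfile(tempdir, 'sections_example.fas');
fid = fopen(outFile, 'w');
for k = 1:nSections
    fprintf(fid, '>%s\n%s\n', headers{k}, sections{k});
end
fclose(fid);
```

Writing every section into one file skips the separate merge step; writing one file per section and merging afterwards, as in the notes, would work equally well.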

Oligopaints: To Do:

  • write up final design of PPT approach
  • write up asymmetric PCR approach
  • write up nicking-based tertiary approach
  • Design adapter primers for initial colors library
  • Order adapter primers for initial colors library