11:00 A
Cell culture
- Moved cells to 28 °C incubator
- Observed substantial cell debris in dish (dead cells from heat shock?)
Genomics
Installing OligoArray2.0
- http://berry.engin.umich.edu/oligoarray2/
- Need to install legacy BLAST.
- Installed.
- Need to install OligoArrayAux: http://mfold.rna.albany.edu/?q=DINAMelt/OligoArrayAux.
- Error with database:
in2 = ['java -Xmx512m -jar OligoArray2.jar -i chr1.fas ',...
    '-d /BlastDb2/yeast_orf.fas -o oligo.txt ',...
    '-n 2 -l 45 -L 47 -D 1500 -t 82 -T 88 -s 65 -x 65 -p 35 -P 50 ',...
    '-m "GGGGG;CCCCC;TTTTT;AAAAA"'];
system(['cd ',folder,' && ',in2]);
*** OligoArray 2.0.1 ***
OligoArray 2.0.1 will start to process sequences from the file chr1.fas using the following parameters :
Blast database: '/BlastDb2/yeast_orf.fas'
Oligo data will be saved in: 'oligo.txt'
Sequence without oligo will be saved in: 'rejected.fas'
The log file will be: 'OligoArray.log'
Maximum number of oligo to design per input sequence: '2'
Size range: '45' to '47'
Maximum distance between the 5' end of the oligo and the 3' end of the input sequence: '1500'
Minimum distance between the 5' ends of two adjacent oligos: '69'
Tm range: '82' to '88'
GC range: '35.0' to '50.0'
Threshold to reject secondary structures: '65.0'
Threshold to start to consider cross-hybridizations: '65'
Sequence to avoid in the oligo: 'GGGGG;CCCCC;TTTTT;AAAAA'
Number of sequence to run in parallel: '1'
data initialized
Check if /BlastDb2/yeast_orf.fas is a valid Blast database
java.io.IOException: Cannot run program "blastall": CreateProcess error=2, The system cannot find the file specified
WARNING: /BlastDb2/yeast_orf.fas is not a valid Blast database
Darn.
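The "CreateProcess error=2" line means the process could not find an executable named blastall, so one thing worth checking is whether the legacy BLAST folder is on the PATH seen by the calling environment. A minimal Python sketch of that check (not the MATLAB call used above; the function name and the extra-directory argument are hypothetical, for illustration only):

```python
import os
import shutil


def find_executable(name, extra_dirs=()):
    """Return the full path to `name`, or None if it cannot be found.

    Searches the current PATH plus any extra directories (for example the
    folder where legacy BLAST was installed; that folder is an assumption
    here, not something stated in the notes).
    """
    parts = [os.environ.get("PATH", "")] + list(extra_dirs)
    search_path = os.pathsep.join(p for p in parts if p)
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    location = find_executable("blastall")
    if location is None:
        print("blastall not found on PATH; OligoArray would fail its database check")
    else:
        print("blastall found at", location)
```

If the check fails only in the calling environment, adding the BLAST install folder to that environment's PATH before launching java is one possible remedy.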
- Installed OligoArray 2.1
- This also errors when called from MATLAB:
Can OligoArray read/write specified files? YES
Data initialization: DONE
Is BlastDb/yeast_orf.fas a valid Blast database? java.io.IOException: Cannot run program "blastall": CreateProcess error=2, The system cannot find the file specified NO
Is OligoArrayAux installed? NO
Design aborted due to a failure in the test above
- However, it runs correctly when called from the GUI.
- Requires Legacy Blast library for Drosophila genome to run.
- Requires min distance between 2 oligos > 0 to run.
- This allows execution without error message. However, I get extremely few oligos out, especially given the generous bounds set. Not sure why this is happening.
- Wrote to BB for parameters.
- GUI seems to not be listening to its own GC bounds: “rejected due to high percent of GC: 51.0” (even though max is 75%).
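To sanity-check the GC rejection message independently of OligoArray, the GC percentage of an oligo can be computed directly. A minimal Python sketch (the function names are hypothetical, and this is a plain base count, not necessarily how OligoArray computes it):

```python
def gc_percent(oligo):
    """Return the percentage of G and C bases in the oligo (case-insensitive)."""
    seq = oligo.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc_count = seq.count("G") + seq.count("C")
    return 100.0 * gc_count / len(seq)


def within_gc_bounds(oligo, min_gc, max_gc):
    """True if the oligo's GC percentage lies inside [min_gc, max_gc]."""
    return min_gc <= gc_percent(oligo) <= max_gc


if __name__ == "__main__":
    # Synthetic 100-mer with exactly 51 G bases, for illustration only.
    test_oligo = "G" * 51 + "A" * 49
    print(gc_percent(test_oligo))
    print(within_gc_bounds(test_oligo, 35, 75))
    print(within_gc_bounds(test_oligo, 35, 50))
```

A 51% oligo passes a 35 to 75 window but fails a 35 to 50 window like the one in the earlier command-line call, so it may be worth checking whether the GUI is applying a different maximum than the one displayed.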
Matlab coding implementation of OligoArray
- Using Jeff’s LoadGenbank
- Fixed some bugs in LoadGenbank dealing with introns etc.
- Wrote OligoArrayDmel.m (modified from Jeff’s OligoArray script). Initializes fine. Is currently aborting with the error:
java.lang.StringIndexOutOfBoundsException: String index out of range: -2
- Rewrote Dmel_plus2sense_fasta.m parser for the genome to use simple FASTA headers. This fixed the problem.
- Still getting small number of probes, even after upping max probes to 10000 (from 30).
- Maybe the -D option (7000) is the problem: “Maximum distance between 5′ of oligo and 3′ of query”. Not exactly sure what this is.
- OligoArray gives the warning “(It may take a while depending the value entered for the -D option)”. I upped this to 90,000 and we’ll see what happens for chr4. Maybe this is why they run on a cluster? Though it seems the FASTA file needs to be parsed into chunks if the parallel-processing capability is to be used…
- 90,000 takes forever and won’t launch, but just increasing this to 15000 substantially increases the number of oligos (from 108 to 235, so this seems to be limiting).
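The header fix noted above (simple FASTA headers making the StringIndexOutOfBoundsException go away) can be illustrated with a small header-cleaning pass. A minimal Python sketch, assuming that "simple" means keeping only the first whitespace-delimited token and replacing unusual characters; the notes do not say which header feature actually triggered the exception, and the function names and example header are made up:

```python
import re


def simplify_header(header):
    """Reduce a FASTA header to '>' plus one plain identifier token."""
    tokens = header.lstrip(">").split()
    token = tokens[0] if tokens else "seq"
    # Keep letters, digits, underscore, dot and hyphen; replace anything else.
    return ">" + re.sub(r"[^A-Za-z0-9_.-]", "_", token)


def simplify_fasta(lines):
    """Yield FASTA lines with headers simplified and sequence lines unchanged."""
    for line in lines:
        line = line.rstrip("\n")
        yield simplify_header(line) if line.startswith(">") else line


if __name__ == "__main__":
    example = [">geneA type=gene; loc=4:1..100\n", "ACGTACGT\n"]
    for out_line in simplify_fasta(example):
        print(out_line)
```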
- Writing new FASTA genome parser, Dmel_plus2sense_sections_fasta:
- each inter-feature region and each feature region is a separate file
- inter-feature regions or feature regions larger than max_length get split up into subfragments.
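The splitting rule can be sketched as follows. This is a minimal Python illustration, not the Dmel_plus2sense_sections_fasta code itself; the function name, the "_partN" naming scheme, and the non-overlapping consecutive chunks are assumptions:

```python
def split_region(name, seq, max_length):
    """Split one region into consecutive pieces of at most max_length bases.

    Returns a list of (piece_name, start_offset, piece_sequence) tuples,
    where start_offset is the 0-based position within the original region.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    pieces = []
    for index, start in enumerate(range(0, len(seq), max_length), 1):
        piece = seq[start:start + max_length]
        pieces.append((f"{name}_part{index}", start, piece))
    return pieces


if __name__ == "__main__":
    for piece_name, offset, piece in split_region("region1", "ACGTACGTAC", 4):
        print(piece_name, offset, piece)
```

Keeping the start offset with each piece makes it possible to map any oligo designed on a subfragment back to its position in the original region.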
- Functional implementation:
- Parse genome into 1 kb sections (needs to be less than 10 kb to have any kind of efficiency in execution). Dmel_plus2sense_sections_fasta.m handles this.
- Need to flip transcribed regions into appropriate sense/anti-sense features.
- Need to avoid overlapping regions of features and internally embedded features (like many miRNAs).
- Merge into a single FASTA file (can put in a common folder and use MergeTxtFiles.m in Genome Code (matlab-genomics)).
- Run OligoArrayDmel.m to call OligoArray on the assembled FASTA.
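The flip and merge steps above can be sketched in a few lines. This is a minimal Python illustration, not the MATLAB scripts named in the list; it assumes that "flipping" a minus-strand region means taking its reverse complement, and the function names and record layout are hypothetical:

```python
# Translation table for complementing DNA, preserving case and N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def oriented_record(name, seq, strand):
    """Return FASTA text for one section, flipped if it lies on the minus strand."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    body = reverse_complement(seq) if strand == "-" else seq
    return f">{name}\n{body}\n"


def merge_records(records):
    """Concatenate (name, seq, strand) records into one FASTA string."""
    return "".join(oriented_record(*record) for record in records)


if __name__ == "__main__":
    sections = [("sec1", "AACG", "+"), ("sec2", "AACG", "-")]
    print(merge_records(sections), end="")
```

Overlap handling between features and embedded features is not covered here; that would need the feature coordinates, which this sketch does not model.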
- Outstanding problems:
- Add heterochromatic regions and mitochondrial genome into BLAST library
- Can’t seem to find chr2LHet etc? or chrY?
- Neither of these seems to have them (unless they are included in the chr2L… release 5 was supposed to have the heterochromatin):
- http://www.ncbi.nlm.nih.gov/bioproject/PRJNA164
- http://www.ncbi.nlm.nih.gov/assembly/238028/
Oligopaints: To Do:
- write up final design of PPT approach
- write up asymmetric PCR approach
- write up nicking-based tertiary approach
- Design adapter primers for initial colors library
- Order adapter primers for initial colors library