Wednesday 07/03/2013

9:30 A – 6:30 P, 8:30 P – 10:00 P

STORM 4

  • issues with recent software update
  • troubleshot with Hazen. Now running.
  • If Dave/Steve freeze on STORM4, need to restart Hal
  • running O/N imaging of PhF Psc.

STORM 2

  • took some BX-C STORM images from the 37C hybe. Too much background, though spots are clear in both the conventional image and STORM. Lots of good signal from the spots (>100K frames with basically no 405, or less than 4%).
  • Rehydrated BX-C probe works. Should try on embryos again.
  • running O/N imaging of PhM Psc.

Hox locus

  • Meeting with Aysu, plan looks good for limb buds: start with E13.

Chromatin region design

  • Running on HoxD locus
  • freezes/crashes at several points and runs slowly (mouse genome too big, I guess).
  • should run these big genomes on Odyssey; total genome size seems to substantially limit run time.

Genome access problems

  • UCSC batch region download not working:
    I select
    group: “Mapping and Sequencing Tracks”
    track: “Assembly”
    table: “gold”
    region: “defined regions”
    where I paste a list of my regions, formatted as indicated, e.g.
    chr2:71,361,210-71,387,504
    chr2:73,101,130-73,127,871
    output format: “sequence”

I expect to get a fasta file with a separate entry for each region, each followed by its sequence. If I do this for Drosophila I get the whole chromosome, not the specified regions (i.e. if I have 50 small regions of chr3R, I get a fasta file with 50 entries, each of which is the entirety of chr3R).

If I do this for Mouse for several small regions between 71Mb and 75Mb, I get instead the region: chr2:59,120,642-175,499,301, repeated for each of my entries.

I know I can easily get this data from Flybase and Ensembl and other places, but it’s convenient to do it through UCSC where I have all my custom tracks available.

If I select group: “Custom tracks” and track: “My custom track” and keep the same list of “defined regions”, the resulting BED file outputs do correctly match the positions that I input. It is just when I want the raw sequence for the chosen list of intervals that I get these much larger regions.

My Work-arounds

  • Ensembl doesn’t have an obvious batch downloader for multiple regions of interest.
  • Nor does the mouse tools database: MGI.
  • just took the spanning region chr2:59,120,642-175,499,301 output by UCSC and wrote a MATLAB script to parse out the target regions of interest and save them as separate fasta entries.
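
For reference, the parsing work-around can be sketched roughly as follows. This is a Python sketch, not the original MATLAB script. It assumes the spanning sequence is already loaded as a single string whose first base sits at the 1-based start coordinate of the spanning region; the function names are made up for illustration.

```python
def parse_region(region):
    """Parse a UCSC-style position such as 'chr2:71,361,210-71,387,504'.

    Returns (chrom, start, end) with 1-based, inclusive coordinates.
    """
    chrom, coords = region.strip().split(":")
    start, end = coords.replace(",", "").split("-")
    return chrom, int(start), int(end)


def extract_regions(span_region, span_seq, targets):
    """Cut each target region out of the sequence of a larger spanning region.

    Assumes span_seq[0] is the base at the start coordinate of span_region.
    Returns a list of (name, sequence) pairs in the order of `targets`.
    """
    span_chrom, span_start, span_end = parse_region(span_region)
    entries = []
    for target in targets:
        chrom, start, end = parse_region(target)
        if chrom != span_chrom or start < span_start or end > span_end:
            raise ValueError(f"{target} is not inside {span_region}")
        offset = start - span_start          # 0-based index into span_seq
        length = end - start + 1             # inclusive coordinates
        entries.append((target, span_seq[offset:offset + length]))
    return entries


def to_fasta(entries, width=60):
    """Format (name, sequence) pairs as FASTA text, wrapped at `width` bases."""
    lines = []
    for name, seq in entries:
        lines.append(">" + name)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

Usage would be something like to_fasta(extract_regions("chr2:59,120,642-175,499,301", span_seq, my_regions)), with span_seq read from the UCSC output (header line dropped, sequence lines joined).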

UCSC Solution:

When you paste a list of regions into the “defined regions” page, the Table Browser finds the regions in the track you selected that overlap with any defined region and returns the entire regions to you. When you retrieve sequence from the gold table, you are getting sequence for entire contigs at a time. In the case of Drosophila, the contigs are often nearly the entire chromosome. Here’s a session that shows the gold table displayed for all of chr2L:
http://genome.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=Rhead&hgS_otherUserSessionName=RM%2311225

To get sequence that corresponds exactly to your list of regions, create a custom track instead. Then select that custom track in the Table Browser and choose “output format: sequence.” Be sure to choose “region: genome” as well. This method is referenced here:
http://genome.ucsc.edu/FAQ/FAQdownloads.html#download32
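
To build that custom track from the same region list, the positions need converting to BED lines. A minimal sketch, assuming the regions are in the 1-based, inclusive chr:start-end form shown above (BED uses a 0-based start and an exclusive end, hence the start - 1). The function names and the track name "myRegions" are placeholders of my own.

```python
def region_to_bed(region):
    """Convert 'chr2:71,361,210-71,387,504' to a tab-separated BED line.

    UCSC position strings are 1-based and inclusive; BED starts are 0-based
    and ends are exclusive, so only the start shifts by one.
    """
    chrom, coords = region.strip().split(":")
    start, end = (int(x) for x in coords.replace(",", "").split("-"))
    name = f"{chrom}:{start}-{end}"  # keep the original position as the item name
    return f"{chrom}\t{start - 1}\t{end}\t{name}"


def regions_to_custom_track(regions, track_name="myRegions"):
    """Return custom-track text: a track line followed by one BED line per region."""
    lines = [f'track name="{track_name}"']
    lines.extend(region_to_bed(r) for r in regions if r.strip())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = ["chr2:71,361,210-71,387,504", "chr2:73,101,130-73,127,871"]
    print(regions_to_custom_track(example), end="")
```

The printed text can be pasted into the custom track upload box, after which the Table Browser steps above apply.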
