Lander seminar

Background
* PhD in Math from Oxford, undergrad at Princeton, started as a professor in business school at 24
* Met David Botstein and began integrating genetics and mathematics
* Sydney Brenner on “omics”: it has created the idea that if you get a lot of data it will all work out. ‘Low input, high throughput, no output’.

Introduction

  • Map: will focus on lncRNAs, plus some general introduction
  • Guttman and Engreitz (colleagues)

Human genome project

  • genetic map, physical map, sequence, gene list. Incremental increase. Freely available without restriction.
  • no centromeres, telomeres etc.

Back to business as usual? No, too many maps?

  • Genetic maps
  • physical map
  • 3D folding maps
  • changed the way we thought about biology — completeness matters
  • e.g. can uniquely ID proteins on mass spec since you know all possible options.

what have we learned?

View from 2001

  • 35-120K protein genes
  • lots of transposons parasites junk
  • few regulatory sequences
  • non-coding RNA few examples
  • all not true.

Conservation

  • 50 vertebrate genomes.
  • draft genome couldn’t see more than 30,000 genes; phone debate of maybe 40,000 (100,000 had been estimated from 3 billion bases at 30 kb each).
  • protein coding genes very clear conservation signature (codons).
  • nucleotides ~6% conserved, 1.2% is protein coding.
  • 29 mammals: 3 million 10 bp conserved elements (4.7%), occurring in gene-poor regions. Where these regions do contain a gene, it is a developmentally important gene.
  • sequences conserved to placental mammals not to marsupials. Little protein evolution, more substantial non-coding regulation.
  • genome shuffling of regulatory elements by transposons? (symbiotic not parasitic, help reuse the other elements) L1 LINE element, Kanga2 transposons.
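
The arithmetic behind the numbers above can be checked in a few lines. This is only a back-of-envelope sketch using the figures from these notes; the variable names are mine, and it assumes (my assumption, not stated in the talk) that all protein-coding sequence counts as conserved.

```python
# Back-of-envelope numbers taken from the notes above.
GENOME_BP = 3_000_000_000      # ~3 billion bases
ASSUMED_BP_PER_GENE = 30_000   # the old "30 kb each" assumption

# Old gene-count estimate: genome size divided by assumed gene size.
old_gene_estimate = GENOME_BP // ASSUMED_BP_PER_GENE

CONSERVED_FRACTION = 0.06   # ~6% of nucleotides conserved
CODING_FRACTION = 0.012     # 1.2% protein coding

# Assumption: all coding sequence sits inside the conserved fraction.
noncoding_conserved = CONSERVED_FRACTION - CODING_FRACTION
noncoding_share = noncoding_conserved / CONSERVED_FRACTION

print(old_gene_estimate)
print(f"{noncoding_conserved:.1%} of the genome is conserved but non-coding")
print(f"{noncoding_share:.0%} of conserved sequence is non-coding")
```

Under that assumption, most of the conserved sequence is non-coding, which is in line with the ~4.7% figure for conserved elements in the 29-mammals bullet.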

Mapping interactions

  • Mikkelsen, Bernstein, Meissner, Guttman, Rinn, Lieberman-Aiden
  • ‘3D structure of the human genome’
  • Q (can we barcode sequence regions and do larger scale whole genome interaction?)
  • Dekker 3C 2002.
  • “Turns out the genome is Scottish” plaid (chr 14). — 2 compartments ‘open and closed chromatin’
  • equilibrium globule: random fold, contact frequency falls off with linear separation as a power law with exponent -1.5.
  • fractal globule exponent ~ -1 (-1.08)
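
To make the exponent comparison concrete: if contact probability P falls off with linear separation s as a power law, the exponent is the slope of log P against log s. The sketch below fits that slope by ordinary least squares. The function name and the synthetic, noise-free curves are my own illustration, not data or code from the talk.

```python
import math

def fit_power_law_exponent(separations, contact_probs):
    """Least-squares slope of log(P) against log(s), i.e. the exponent in P ~ s**exponent."""
    xs = [math.log(s) for s in separations]
    ys = [math.log(p) for p in contact_probs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical separations in bp, log-spaced; values chosen only for illustration.
separations = [5e5 * 1.5 ** i for i in range(8)]

# Synthetic curves for the two polymer models mentioned above.
equilibrium = [s ** -1.5 for s in separations]
fractal = [s ** -1.0 for s in separations]

print(fit_power_law_exponent(separations, equilibrium))
print(fit_power_law_exponent(separations, fractal))
```

On real contact data the fit would be noisy and only meaningful over a limited range of separations; the observed value of about -1.08 is what favours the fractal globule over the equilibrium globule.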

Chromatin state

  • methyl states

what have we learned?

  • lincRNAs (with Guttman, Engreitz and Rinn)
  • low level transcription in lots of places (everywhere? – just noise?)
  • not worth evolution’s trouble to stop it? [maybe some genes in some tissues need to be better silenced]
  • K4me3, K36me3 regions not currently annotated as genes. Transcripts? Conserved? Promoters? Splice junctions? Potential protein coding?
  • no codon conservation. Does preserve nucleotides, more patchy.
  • Ribosome profiling ‘confusion’: ribosomes occupy non-coding RNAs, which upset people. Occupancy looks more like that of protein-coding regions than of 3′ UTRs. It looks more like 5′ UTRs, though, which are also on ribosomes (scanning?). Presence of a ribosome is not indicative of making a protein.
  • 3′ UTR drop, indicative of ribosome release, may be a better score of protein coding.
  • who are lincRNAs coexpressed with?
  • 200 lincRNAs in mESCs: knockdown with shRNA. 90% have an effect on gene expression, ~26 needed for pluripotency maintenance, ~30 needed to repress different lineages.
  • bind different chromatin proteins, involved in gene regulation. Proposal: organize proteins into complexes.
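
The “3′ UTR drop” idea from the ribosome profiling bullets can be sketched as a ratio of footprint densities: reads per nucleotide inside a putative ORF divided by reads per nucleotide just downstream of its stop codon. This is a simplified illustration under my own assumptions. The function names, the pseudocount, and the read counts are all hypothetical, and a fuller score would presumably also normalize by RNA abundance in each region, which is omitted here.

```python
def footprint_density(read_count, length_nt):
    """Ribosome footprint reads per nucleotide for one region."""
    if length_nt <= 0:
        raise ValueError("region length must be positive")
    return read_count / length_nt

def release_ratio(orf_reads, orf_len, downstream_reads, downstream_len, pseudocount=1.0):
    """Footprint density in the putative ORF divided by density downstream of its stop.

    A large ratio means footprints drop after the stop codon, as expected when
    ribosomes are released after translating. The pseudocount (an arbitrary
    choice here) avoids division by zero for empty regions.
    """
    orf = footprint_density(orf_reads + pseudocount, orf_len)
    downstream = footprint_density(downstream_reads + pseudocount, downstream_len)
    return orf / downstream

# Hypothetical counts: a coding-like transcript with a sharp drop after the stop,
# and a lincRNA-like transcript with similar occupancy on both sides.
coding_like = release_ratio(900, 900, 9, 900)
lincrna_like = release_ratio(300, 900, 290, 900)

print(coding_like, lincrna_like)
```

The point of the design is that raw occupancy alone cannot separate the two cases; only the change across the stop codon does.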

where are lincRNAs bound in the genome?

  • model system: Xist (student Engreitz). 120mer antisense probes against the RNA, tiled to paint the entire message (120mers can be washed more stringently than shorter oligos). Oligos have biotin tags. 70% of pulldown is X chromosome.
  • Xist binds very broadly, not in focal sites. Some variation though — correlates with K27me3.
  • escape genes have lower coverage (like immediately upstream of Xist).
  • how does it spread?
  • Do 0 – 6 hr sequencing, watch Xist spread from its transcription focus.
  • peaked spreading
  • is it jumping or just spreading on the 3D sequence vs. the 1D sequence?
  • Yes – these peaks seem to be close in space.
  • Xist takes longer to coat active genes than inactive genes.
  • if you mutate its ability to silence, it never spreads to the active gene regions.
  • Model: Pc then gets recruited and packs the gene into a biological black hole (illustrated with compaction cartoons).

lincRNA functions

  • Modular scaffold gives patchy conservation (only conserve interacting regions).
  • catalysis (rRNA)
  • template mediated catalysis

the road ahead

  • complete catalogs of interactions?
  • Grammar of regulatory interactions — tell by looking at them what they do.
  • exploit synthetic biology to build thousands of regulator reporters

Questions

  • lincRNA vs lncRNA: stress the “in” (‘intergenic’) to separate them from those that overlap promoters of known genes
  • xist silences autosomes if sequence is embedded
  • Bill Gelbart Q: oskar-like (lncRNA and protein)?