Friday, September 27, 2024

Howard Chang (Stanford, HHMI) 1: Epigenomic Technologies

Science Communication Lab, Jan 28, 2020
https://www.ibiology.org/genetics-and...

In this talk, Dr. Howard Chang describes epigenomic approaches pioneered by his lab and the role of long noncoding RNAs (lncRNAs) in regulating gene expression.

In Part 1 of this series, Chang introduces epigenomics, the study of DNA regulatory mechanisms that determine which genes are turned on or off in cells at specific times. The epigenome integrates signals from the environment to modify expression of the DNA blueprint inherited from an individual’s parents. Chang’s lab has pioneered techniques to map the landscape of chromatin, the complex of DNA, RNA and protein that organizes the genome and regulates gene expression. One example is ATAC, the Assay for Transposase-Accessible Chromatin, which uses a bacterial transposase to mark open chromatin and identify genes that are likely turned “on”.

In Part 2, Chang introduces long noncoding RNAs, or lncRNAs. As their name suggests, lncRNAs are not translated into proteins, and initially their functions were poorly understood. Chang’s group has developed technologies to better understand the function of lncRNAs. For example, his lab characterized the protein partners that interact with Xist, a canonical lncRNA that mediates X chromosome inactivation. They found that the protein Spen is necessary for X chromosome silencing. Interestingly, Spen has likely been co-opted by mammalian cells to inactivate the X chromosome via viral mimicry.

In Part 3, Chang reminds us that every lncRNA gene has its own set of DNA regulatory elements, such as enhancers and promoters. These regulatory elements can confer functionality to lncRNA genes. Chang shares the research story of a mysterious lncRNA known as PVT1, which is frequently co-amplified with the proto-oncogene MYC in human cancers. His group found that PVT1 promoter activity is inversely correlated with MYC expression: when one is up, the other is down. Finally, Chang shows that the PVT1 and MYC promoters compete for four enhancers located within the PVT1 gene locus.

Speaker Biography: Howard Chang is the Virginia and D. K. Ludwig Professor of Cancer Genomics and a professor of dermatology and genetics at Stanford University. He is an Investigator of the Howard Hughes Medical Institute. He studied biochemistry at Harvard University and completed a doctorate in biology at the Massachusetts Institute of Technology and a medical degree at Harvard Medical School. The Chang lab pioneers new technologies for probing the function of the non-coding genome. https://med.stanford.edu/changlab.html

Transcript

Intro 0:06 I'm Howard Chang, a Professor in the... Stanford University in California 0:12 and an Investigator of the Howard Hughes Medical Institute. Today, I'll be talking to you in a three-part talk 0:18 about epigenomics and long non-coding RNAs. Epigenetics is a very hot topic today. Epigenetics: Personalized health 0:26 The word literally means "above the genes." And you can remember the catchphrase 0:32 that your DNA is not your destiny. And a very good example of this is that 0:38 nearly every cell in your body has the same DNA, yet your skin cell is not the same as your muscle cell 0:44 or your brain cell. And that is because these cells have choices, choices about which genes to turn on and off. 0:52 And this comprehensive study of these gene regulatory events is the modern study of epigenomics. 0:59 Literally, we can think about epigenomics as studying the living genome.
1:05 This field has evolved so that we can now directly measure these activities, but it has important implications because it has the dynamics 1:13 of the interaction between nature and nurture, what you're born with and the impact of your environment. 1:19 As such, this may have important implications for your personalized health, for example in clinical applications 1:26 and also monitoring health states. Here's another potentially useful analogy Genome, epigenome, and rejuvenation 1:32 to think about the relationship between your DNA -- your genome -- your epigenome, 1:38 and the involvement in potential disease states. We can imagine that your genes 1:43 are like this template information, like this image, and the epigenome as the lens through which the information 1:50 is projected to show this beautiful image. With aging and/or with disease, 1:56 those templates get degraded and the lens might become cloudy, 2:01 so this image is now blurred. The promise of epigenetics is that 2:07 perhaps we can actually fix the situation. Even if the genome information is still somewhat degraded, 2:12 the lens -- the epigenome, through which information projects -- can be corrected. 2:18 And in such a way, like the glasses I'm wearing, we could actually restore this image and basically restore the phenotype 2:26 we associate with a healthful state. That is the conceptual promise. We are at an inflection point in epigenomics 2:31 Another reason for the excitement for epigenomics is because the technology is really at an inflection point. 2:38 In every field, technologies go through different phases of discovery, detection, and systematic decoding. 2:47 In the field of genomics, the hardware -- the DNA in our cells... 2:52 we discovered the structure of DNA in the 1950s. The first technology to detect or sequence DNA 3:00 occurred in the '70s. 
But it was only in the last decade or so that we have, really, next-generation sequencing technology 3:07 to make routine genome sequencing a possibility. Epigenomics is also going through a related kind of revolution. 3:17 And we can think about the epigenomics, now, as the counterpart, the software programming of our cells. 3:24 So, the first chemical marks associated with epigenetic memory were discovered in the '50s. 3:30 Some of the first methods to detect these marks in a laboratory setting were developed in the '80s. 3:37 And I'll be telling you about some new technology that developed... were developed in the last decade 3:42 that really sped up the capacity to systematically decode this epigenomic information. Understanding Gene Expression 3:49 Let's zoom in to the specific features in the genome that we're talking about. When we think about genes, specifically disease-associated genes, 3:59 we have to remember that each of these genes is associated with switches: DNA regulatory elements that decide when and where 4:07 this gene turns on and off. These DNA elements are the binding sites 4:12 for trans-acting protein transcription factors or regulatory RNAs. 4:18 The picture in the human genome actually looks more like the bottom, where only just 2% of information is protein-coding, 4:26 and the vast amount of the real estate -- 98% -- is actually part of this regulatory DNA. 4:33 We also know that human variants associated with disease reside in this non-coding space. The chromatin landscape 4:41 So, systematic work over the last several decades by many investigators 4:46 has found that... all this DNA is packed into chromatin, and I'll refer you to the iBiology talk by David Allis, 4:52 which goes into great depth about these different chemical marks.
But the end conclusion is really that 5:00 each part of a gene has characteristic chemical and physical features on chromatin, and that these features reflect the current activity 5:07 and the future trajectory of the genes. And we can... and the epigenomic technology I'm talking about 5:15 basically is the systematic mapping of these chromatin features across the genome. 5:20 So, this cartoon shows the fact that if there's a protein-coding gene here that there would be promoters -- that's where the gene starts. 5:29 There are DNA elements like enhancers that activate this gene in specific cell types. There are additional DNA elements 5:36 that might prevent this gene from being activated in a different situation. And there are also, for example, insulators, things that basically break up genomes into neighborhoods of control. 5:46 And this interaction would have to occur, for example, through long-range DNA looping, chromosome looping. 5:53 A very fundamental feature of these activities is that the DNA has to be accessed. 5:58 It has to be physically touching the regulatory machinery for this regulation to happen. 6:03 And that is a fundamental feature that we can exploit. 6:08 Now, in every human cell, 2 meters of DNA is packed into a 10-micron nucleus. Assay of Transposase Accessible Chromatin 6:14 Therefore, most of the DNA is highly compacted -- all wound up -- and not accessible. 6:20 Except for the active DNA elements that your cell is actually using and reading. 6:25 And so simply finding out where these accessible elements are located can give us a lot of information 6:34 about the software program that your cell is running. A few years ago, my colleague, Will Greenleaf, and I, at Stanford, 6:41 invented a technology called Assay of Transposase Accessible Chromatin, or ATAC-seq for short. 6:48 It uses an enzyme called Tn5 transposase, which copies and pastes DNA. 6:54 It's derived from a bacteriophage.
We already load up this enzyme with sequences 7:01 that can go onto our sequencing machine. When this enzyme tries to copy and paste into eukaryotic chromatin, 7:08 it can only paste into the open chromatin sites. And so, therefore, in a single step, 7:14 you selectively and covalently tag the genome at the accessible sites. 7:20 That then allows us to amplify and sequence these elements. So, this very elegant, simple sort of strategy 7:27 led to a million-fold improvement in the sensitivity 7:32 and a hundred-fold improvement in the speed of mapping the regulatory DNA -- the epigenome -- in human cells. Active regulatory elements in single cells 7:39 Here's an example of what the data would look like. On the x axis, these would be locations of genes. 7:45 And the height of these peaks indicates the level of accessibility -- the taller the peak, the more accessible it is. 7:51 The first track, shown in blue, was the standard... prior gold standard technology, 7:57 called DNase hypersensitivity, and they used 10 million cells. 8:02 The second row, in green, is the first version of ATAC-seq technology, which only used 50,000 cells. 8:09 And the third row was the ultimate resolution that we achieved, which was actually single-cell ATAC-seq. 8:15 This is several hundred single cells summed together. You can see that the patterns look very similar across these applications. 8:23 However, now with the single-cell information, you can zoom in. 8:28 And now every row is a single cell. There are 254 single cells going down this way. 8:35 And at every position, here, at every... you basically see either 0, 1, or 2 reads, 8:42 because human cells are diploid. And therefore, this kind of analog signal can turn into digital information. ATAC-seq reveals DNA-protein interactions 8:52 When we see these individual peaks, we of course want to know, what are the factors that are acting 8:58 on those individual gene switches?
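The point above about single cells giving 0, 1, or 2 reads per position, and several hundred cells being summed into one track, can be sketched in a few lines of Python. This is only a toy illustration: the function names `pseudo_bulk` and `fraction_open` and the small count matrix are assumptions made here, not anything shown in the talk.

```python
def pseudo_bulk(cells):
    """Sum per-cell read counts position by position.

    `cells` is a list of rows, one row per cell; each row holds the
    read count (0, 1, or 2 in this toy diploid example) at each position.
    """
    return [sum(column) for column in zip(*cells)]


def fraction_open(cells):
    """Fraction of cells with at least one read at each position."""
    n_cells = len(cells)
    return [sum(1 for c in column if c > 0) / n_cells for column in zip(*cells)]


# Hypothetical data: 3 cells (rows) x 4 genomic positions (columns).
toy_cells = [
    [0, 1, 2, 0],
    [0, 2, 1, 0],
    [1, 1, 2, 0],
]

summed_track = pseudo_bulk(toy_cells)
open_fraction = fraction_open(toy_cells)
```

The summed track plays the role of the "several hundred single cells summed together" row, while the per-cell rows keep the digital information that averaging would hide.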
And there's another very interesting feature of the ATAC-seq signal that we can exploit. 9:04 We learned that, many times, at the summit of every peak 9:09 there's approximately 8-10 bases of a dip. And this is called a footprint, okay? 9:15 So, this is an example of an ATAC-seq signal. And this is exactly the binding site 9:20 of a DNA-binding factor on... to DNA. So, the idea is that we're essentially spray painting the genome 9:28 with our ATAC-seq enzyme. And if a... if the... if a protein is sitting on the DNA, you can spray paint to the left of it, or to the right of it, 9:36 but not on top of it. And so, if I were putting my hand in front of a wall and I spray painted, 9:41 when I move my hand away, you'll see a shadow. It shows that an object was there. 9:47 That is the kind of similar principle. And so, we can see that if we directly retrieve this particular factor, called CTCF... 9:54 this is the location where the CTCF is sitting, and the footprint of the CTCF on ATAC-seq data 10:00 looks very similar. Okay. So, because we now actually know the binding preferences 10:07 of hundreds of factors across the genome, we can actually look across a genome and ask, where do we see this kind of footprint?, 10:15 and infer the binding locations. So, for example, in this map here, every col... 10:21 every row is an instance of the CTCF binding site in the genome. 10:26 This is the center of the sequence that it's being bound to. And we can see that only these sites up here 10:32 have this kind of footprint pattern, and these ones at the bottom do not. If we directly retrieve CTCF by a different technology, 10:41 we can see the same answer, that these top ones are bound and the bottom ones are not bound. Okay? 10:47 And so, because, again, we know the binding sequence, or the preference, of hundreds of factors that bind to DNA, 10:54 we can actually learn the binding locations of these factors all at once, in an allele-specific fashion.
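The spray-paint idea can be made concrete with a small Python sketch: compare the insertion signal inside a roughly 10-base window at a motif center with the signal in the flanks, and call a low ratio a footprint. The function name `footprint_ratio`, the window sizes, and the toy insertion counts are all assumptions for illustration; real footprinting tools also correct for enzyme sequence bias, which this sketch ignores.

```python
def footprint_ratio(insertions, center, half_width=5, flank=10):
    """Mean insertions inside the motif window divided by mean in the flanks.

    A ratio well below 1 suggests a protected site (a footprint);
    a ratio near 1 suggests nothing is sitting on the DNA.
    Returns None if the flanks contain no insertions at all.
    """
    lo, hi = center - half_width, center + half_width
    if lo - flank < 0 or hi + flank > len(insertions):
        raise ValueError("window extends past the ends of the signal")
    core = insertions[lo:hi]
    flanks = insertions[lo - flank:lo] + insertions[hi:hi + flank]
    flank_mean = sum(flanks) / len(flanks)
    if flank_mean == 0:
        return None
    return (sum(core) / len(core)) / flank_mean


# Hypothetical per-base insertion counts around two motif instances.
bound_site = [8] * 10 + [1] * 10 + [8] * 10    # dip in the middle
unbound_site = [8] * 30                        # no dip

bound_ratio = footprint_ratio(bound_site, center=15)
unbound_ratio = footprint_ratio(unbound_site, center=15)
```

Sorting motif instances by such a ratio is one simple way to get the picture described above, with footprinted sites at the top and unprotected sites at the bottom.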
Layering insights from regulome map 11:02 Now that we have this powerful technology, we can think about what we can learn from this map, 11:08 this epigenomic map of individual cells. And by analogy, we were really quite inspired by the kind of maps 11:15 -- digital maps -- that we all use in our daily lives for navigation. Now, these digital maps represent the real world in different layers, 11:23 including the lay of the land, the different businesses, different streets, 11:28 where your friends are. And each of these layers of the map makes this map more useful. 11:34 So, we also imagine that, by analogy, if we built up a personal snapshot of gene regulation 11:41 -- the epigenome, or the regulome -- we want to learn from this map different cell types, different cell states, 11:49 the tissue microenvironment, the cell lineages, the effects of perturbations or drugs, 11:54 connect them all together through computation and to kind of, then, make maximal use of this personal regulome information. 12:03 And I'll show you some examples of how we can extract this kind of information from the epigenomic map. Enhancer fingerprints of cell identity 12:10 An important concept is that the epigenome encodes information on cell type identity. 12:15 Here on this map, I'm showing you six tracks, six different cell types from the blood, 12:22 starting from the hematopoietic stem cell -- the HSC -- to cells that make different lineages 12:29 -- myeloid cells, white cells, red... MEP, which makes red cells, 12:34 and specific kinds of immune cells: CD8 T cells and NK cells. On the right, you can see that for this particular gene, 12:42 TET2, the messenger RNA level varies by less than two-fold across these different cell types. 12:48 So, you might think that TET2 is not a very good marker for different cell type identities. But if you look at the chromatin landscape, 12:55 now you see a completely different picture, which is shown on the left.
So, you can see that the TET2 promoter 13:01 is accessible in all these different cell types, but then you see that progenitor cells 13:06 have one set of accessible elements, and further elements distinguish, let's say, 13:12 lymphoid cells and specific kinds of cells, such as CD8 cells and NK cells. 13:18 So, the message here is that each of these cell types... they're making the same... 13:24 turning on the same gene, making the same RNA. But they're doing it with different gene switches. And these switches then tell us 13:31 the identity of cells that are involved. This particular concept can be particularly powerful Cancer cells are individually different 13:38 when we think about the problem of cancer. Cancer cells are individually different. 13:43 And this has been long known. On the left is a plate from a paper by Virchow, 13:50 back in 1847. He was drawing images that he saw under the microscope, 13:55 and you can see that individual cells are not all identical. On the righthand side is a more modern image from a review, 14:03 which raises the concept that tumor cells can go through these kind of epigenomic or chromatin changes, 14:12 and that changes their behavior. We used our technology and teamed up with colleagues at Stanford University Regulome landscape of cancer evolution 14:19 to study human leukemias, acute myeloid leukemia. In this particular disease, 14:25 which is a cancer of the blood cells, we know that the hematopoietic stem cell -- the HSC, that gives rise to all the other cells in the blood -- 14:33 suffers a series of mutations. The first mutation can create something called 14:39 a pHSC, and further mutations causes a cell -- shown in yellow, here, the leukemic stem cell -- 14:46 that can now propagate the disease. There's still a minority of cells in the blood. 14:51 The vast majority of cells are these blast cells, which is colored red here. 
We're able to isolate all the different cell types from leukemia patients 15:00 and show that, in fact, there are also... in parallel to the genetic changes, there are corresponding systemic changes to the epigenome, 15:07 as these cancer cells progress through these different stem cell fates. Regulome provides mechanistic and prognostic insight 15:13 We can also answer some important questions. We know that in certain cancers and leukemias, 15:18 that the leukemic cells will show features, or markers, of different kinds of normal cell parts... cell types. 15:27 And so, this is a confusing situation. People are not sure whether it's because 15:33 there are two kinds of cells running around in leukemia, or is it that really there's one cell running two programs? And so, we used our single cell technology, 15:42 single-cell ATAC-seq, to examine a leukemic patient... 15:48 in this case, a patient's leukemic stem cells. So, in the graph on the right, here, 15:54 each dot shows either individual cells or a particular cell type. And this two-dimensional plot, here, 15:59 indicates cell relationship by distance: the more related the cells are, the closer they are together. 16:06 And if they're far apart, that means they're quite different. And what we see is that these purple cells, 16:11 the individual cells from the cancer patients... they do not map to any of the known cell types; they map in between them. 16:18 And that really indicates that it's a single cell running two different programs, a concept called lineage infidelity. 16:26 It also further turns out that the more that these cancer stem cells 16:31 are running the program of the normal hematopoietic stem cell, the HSC, the more they're able to copy themselves and renew themselves. 16:38 And that, in that case of cancer, is a bad situation. We found that, in fact, 16:44 the leukemic stem cell with a high sort of pHSC potential... they're much more likely to cause death, unfortunately for the patients. 
16:54 Whereas those with a low pHSC content have a much better outcome. 17:00 And so, therefore, we can see that even this epigenomic information has potential prognostic information. Decoding The Cancer Genome Atlas 17:08 We were able to extend these concepts into, also, solid cancers. The Cancer Genome Atlas has been a major effort for the cancer community 17:16 over the last decade. And many investigators have systematically collected nearly 10,000 tumor samples 17:23 and sequenced their genomes, sequenced their RNA. But until very recently, we didn't have, really, 17:30 any information on the epigenome landscape. We teamed up with the TCGA group, 17:36 and we were able to use ATAC-seq technology to map the chromatin landscape in 23 human cancer types, 17:43 which are shown here on the right, and these span some of the most common and deadly human cancers, 17:49 including glioblastoma, lung cancer, breast cancer, colon cancer, and so on and so forth. 17:56 We studied 410 tumors, and we discovered over half a million DNA elements 18:02 that are active in these diverse cancer types. What is very intriguing is that 18:08 we found that nearly half of these elements are not active in our surveys of normal tissues. 18:14 They're only activated in the context, in the pathology, of cancer. 18:21 We can learn some... really intriguing results. Genetic risks for cancer: Faulty gene switches 18:27 Geneticists have long studied different families, looking at different risks of cancer. 18:32 And the vast majority of these sort of risks associated with cancer 18:38 are actually falling into the non-coding elements. And so, it's a... it's a mystery as to how they might work. 18:44 So, on the right is an image coming from the epigenome mapping. Again, genes are on the x axis, 18:51 and the height on the y axis indicates accessibility. At the top, in orange, 18:57 I'm showing you five examples of colon cancer. On the bottom, five examples of kidney cancer. 
19:03 This gene being shown here is called MYC. It's a very important and powerful oncogene. 19:09 And nearly all the cancers would turn on MYC. But the point I want to make is that the colon cancers turn on MYC using different elements, shown... 19:18 more to the left side, the 5' end of the locus. And the kidney cancers turn on MYC with a different set of elements, 19:25 more to the 3' end of the locus. So again, different switches, even for a common oncogene, across different cancers. 19:31 The second important point is that one of these switches that's turned on in colon cancer 19:38 is precisely the location for colon cancer predisposition. It's only active in colon cancer. 19:44 And conversely, an element that's associated with kidney cancer predisposition is actually only... again, only turned on in kidney cancer. 19:53 Okay? So, this epigenome mapping, then, provided us an... at least, I think, a biochemical hypothesis... 20:00 an explanation, for these risk elements associated with cancer predisposition. Pinpoint mutated gene switches in cancer 20:06 We also learned that, beyond inherited risk, we can also explain somatic mutations -- 20:13 those that are acquired in the body in the course of cancer. This is a map looking at a particular locus 20:21 in different kinds of bladder cancers and kidney cancers. And we see that all these cancers have the same risk, 20:28 except this one, that all of a sudden gains this very strong accessible element... 20:33 activity in this locus. And what we've discovered there is that 20:38 if you look in the ATAC-seq data, well, this accessibility comes from a mutated element. 20:45 And so, the normal sequence is shown at the bottom of this graph here. 20:50 Okay? And the mutated sequence has changed a single base: this letter in T... from C to T. 20:58 And what we realized here is that the cancer is essentially hacking the password of the genome. 
21:05 This sequence, shown at the top here, is the perfect binding site for a particular transcription factor 21:11 called NKX. And when the cancer cell changes that C to a T, it now has the perfect binding site, 21:18 again, for NKX, and therefore it gains this accessibility because the machine starts reading that part of the genome 21:24 and turning on the gene. We further found that the gene linked to this element 21:30 is called FGD4. When the FGD4 level is quite high, this is actually associated with a very strong risk, again, 21:38 of death. And therefore, this is the kind of information 21:43 that would be quite important to know. We can therefore use the epigenomic information to understand both the inherited and acquired risk of cancer. Massively parallel single cell ATAC-seq in nanoliter droplets 21:54 This technology has continued to undergo evolution, and a very important recent advance 22:01 is the increase in the scale of mapping single-cell chromatin accessibility. 22:06 This is using a microfluidic technology that can parse individual nuclei from cells 22:13 into nanoliter sized drops. Into these droplets, then, we combine them with barcodes... 22:21 so, these are basically little beads that contain DNA sequences. Each bead contains a different sequence, 22:27 and that's the barcode. And so, when an individual nucleus meets an individual barcode, 22:33 we can transfer the information from the barcode onto the nucleus. And that says that all the molecules in that little drop 22:41 came from the same cell. Once we have tagged all these individual drops, 22:47 we can then break the drops and then sequence all the molecules together. But then we should retain the information 22:53 that they came originally from different cells. So, this technology allowed us to scale up the throughput 22:59 of single-cell epigenomics from, let's say, several hundred cells per assay 23:04 to, now, tens of thousands of cells, or perhaps even more, in a single experiment. 
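The barcoding logic described above, where every molecule in a droplet inherits the same bead barcode and the pooled reads are later sorted back into cells, can be sketched in Python as a simple grouping step. The function name `group_by_barcode`, the four-letter barcodes, and the fragment tuples are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_by_barcode(reads):
    """Group pooled sequencing reads back into cells by their barcode.

    `reads` is an iterable of (barcode, fragment) pairs, where each
    fragment is a (chromosome, start, end) tuple.
    """
    cells = defaultdict(list)
    for barcode, fragment in reads:
        cells[barcode].append(fragment)
    return dict(cells)


# Hypothetical pooled reads after the droplets have been broken.
pooled_reads = [
    ("AAAC", ("chr1", 100, 250)),
    ("GGTT", ("chr2", 500, 640)),
    ("AAAC", ("chr8", 900, 1010)),
]

per_cell = group_by_barcode(pooled_reads)
```

Because the cell of origin travels with each read as a barcode, the molecules can be sequenced together without losing track of which nucleus they came from.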
Chromatin landscapes of human cancer immunotherapy 23:12 We were able to recently team up with colleagues at Stanford University to use this technology to look at a very important aspect 23:20 of cancer treatment called cancer immunotherapy. The poster child of cancer immunotherapy 23:27 is an antibody against PD-1. It's called "checkpoint blockade" 23:34 because it releases the brakes that are on the immune system for fighting against cancer. And so, in this kind of work, 23:40 people are really interested in what kind of immune cells are coming in to fight cancer 23:45 and how do they change in the progress of cancer treatment. And the challenges are that, again, 23:52 we're talking about clinical material... biopsies from patients are very tiny, and you have one shot to get it right, and... 23:59 because you can't just go back and keep asking the person to do surgery. So, in the context of a clinical trial 24:06 for a kind of cancer called basal cell carcinoma, we were able to serially biopsy the same tumor 24:13 before, during, and after treatment, and then subject them to this very powerful 24:18 single-cell epigenome analysis. Okay. So, in this map -- it's called a UMAP -- 24:24 again, a two-dimensional plot that represents this cell information, 24:30 again, related cells are more clustered together, distant cells are separated apart, 24:37 and there are nearly 30,000 single tumor-infiltrating T cells in this map that we have analyzed. They've been color-coded based on different classes of cells. 24:45 And the only point I want to make here is that this tumor microenvironment is really a world unto itself. 24:50 It's really diverse, and there are all different kinds of cells that you would have missed if you just averaged everything together 24:57 into a gemisch. Okay. What we can further learn is that these cells are related.
Two types of TILs expand with PD1 response 25:03 On the left, I'm showing you, basically, trajectories that we've mapped out based on this single-cell ATAC-seq data 25:09 of the cells as they develop. So, from naive CD8 cells into effector T cells, memory cells, or exhausted cells, 25:15 and also naive CD4 cells into these CD4 Tfh cells. But what we learned, on the right, 25:22 is that we can compare the same patients before and after checkpoint blockade, and ask, what populations change? 25:29 And then what really emerges is that there's two populations: exhausted T cells -- CD8 cells -- 25:36 and the CD4+ Tfh cells. These two arms going down. And this is what expands, 25:42 and we think are very important for cancer immunotherapy. We've been talking about individual DNA elements 25:50 and how we can use that to learn about the epigenome. An equally important challenge is linking these DNA elements 25:56 to their target genes. And this cartoon kind of illustrates part of the problem. We know that the gene regulatory landscape is interweaved. 26:04 A DNA element can actually control a gene that's actually quite far away from itself. 26:10 There might be several genes in between. And therefore, simply finding out an element is active 26:15 is not enough to say which nearby gene is actually being controlled. And so, this is a question you can phrase 26:23 as the last mile of human genetics. What is my target gene? If we go through, let's say, these large-scale studies, 26:28 find DNA variants that are associated with disease, now we want to know, 26:34 what are the genes under control that might be changed? And so, this really needs a different aspect of epigenome technology, 26:40 looking into DNA folding and how those DNA elements touch their target genes. 26:47 And so, a technology was developed that we think is quite useful. Enhancer connectome in primary T cells 26:53 It's a method that we call the enhancer connectome. 
The idea is that we can take cells 26:59 and cross-link them in their native nucleus to preserve the three-dimensional contacts. We can then retrieve the active enhancers 27:07 based on one of these chemical marks that I talked about in the beginning, in this case a histone modification called 27:14 histone H3 lysine 27 acetylation. And then, when you sequence the... 27:21 these contacts, what you should get is a map like this, where we can see individual DNA elements, 27:26 in this case, for example, a causal variant for a disease, and then its target genes, in this case 27:32 gene D and gene A, but not the nearest gene, which is gene B. I should mention that, by default, 27:39 in the genetics literature people oftentimes report these disease gene associations just based on the nearest gene on the linear genome. 27:47 And this information may or may not be correct. So, it's really a shame that we've done all this work, 27:53 but maybe we haven't gotten the very precise information that we need. So, this enhancer connectome method 28:00 actually proved to be quite powerful. It was a 10,000-fold improvement in the sensitivity. 28:05 We needed only 50,000 cells instead of millions or tens of millions of cells. 28:11 And there was also a 10-fold improvement in the sequencing depth that one needs to get precise information. 28:17 Here's an example of looking at a kind of rather rare blood cell, 28:22 Th17 cells, from human blood, from an individual standard blood draw. 28:29 We can see these kind of... sort of checkered maps relate long range contacts in DNA. 28:36 It's the same genome on the x and the y axis, and therefore anything that's off-diagonal, 28:42 such as shown here, is reflective of long-range contacts. And this map just shows that we can actually 28:49 see these kinds of contacts from 500-kilobase resolution all the way down to a kilobase resolution. This kind of mapping from primary human cells 28:57 is important and needed this technology. 
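The contrast between the default nearest-gene assignment and a contact-based assignment can be shown with a short Python sketch that mirrors the cartoon with genes A, B, and D. The coordinates, read counts, the `min_reads` threshold, and both function names are assumptions invented for this example.

```python
def nearest_gene(element_pos, gene_starts):
    """Return the gene whose start is closest on the linear genome."""
    return min(gene_starts, key=lambda gene: abs(gene_starts[gene] - element_pos))


def contacted_genes(element, contacts, min_reads=5):
    """Return genes whose promoters contact the element in 3D.

    `contacts` maps (element, gene) pairs to the number of supporting
    reads; pairs below `min_reads` are treated as noise.
    """
    return sorted(
        gene for (elem, gene), reads in contacts.items()
        if elem == element and reads >= min_reads
    )


# Hypothetical locus: positions in base pairs, contact counts in reads.
gene_starts = {"A": 100_000, "B": 480_000, "C": 650_000, "D": 900_000}
contacts = {("variant", "A"): 12, ("variant", "B"): 1, ("variant", "D"): 9}

by_distance = nearest_gene(500_000, gene_starts)
by_contact = contacted_genes("variant", contacts)
```

In this toy locus the nearest gene is B, while the contact data point to A and D, which is the kind of disagreement the talk warns about.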
Some of the rare cells we analyzed, we calculated, using the prior technology, 29:05 would need about four liters of blood. Just so everybody is on the same page, an adult human has five liters of blood. 29:13 So, taking out four liters is not something that I would recommend. Okay? 29:18 So, it literally would not be doable without this kind of technology. Okay. So, let's just show... Functional enhancer-promoter contacts 29:25 first check that the information is accurate. And so, we're looking now, again, at this very powerful MYC oncogene. 29:33 And this is something called a virtual 4C view. We have an anchor point, 29:39 which is usually shown by a dotted line. And that's the point in the genome that we're looking from. 29:44 Each of these peaks, then, would be an active enhancer that is touching this viewpoint. 29:50 And the taller peak, that means there's a stronger interaction or it's a stronger enhancer, 29:55 or a combination of both. And so, this viewpoint told us that, in this particular cell that we're studying, 30:01 this MYC gene is being contacted and turned on by these peaks, these five peaks that are shown here. 30:08 So, how do we know that is correct? It turns out that a recent study by Fulco et al and colleagues... 30:14 they actually went in and systematically tried to block every piece of DNA in this entire interval, 30:21 okay?, whether it's known to be active or not. And they found five elements, shown on the bottom here, okay?, 30:27 in red hatch marks, and they exactly line up with the locations that were identified by this enhancer connectome study, 30:34 showing that this information is actually accurate. Now that we know that the information is perhaps useful, Target genes of disease-associated DNA elements 30:41 we can think about applying it for solving questions in human genetics. 
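A virtual 4C view can be thought of as one row of the contact map: fix the anchor bin and read off how strongly every other bin touches it. The Python sketch below assumes a small symmetric contact matrix stored as nested lists; the matrix values and the function names `virtual_4c` and `top_contacts` are made up for illustration and are not the lab's actual pipeline.

```python
def virtual_4c(contact_matrix, anchor_bin):
    """Return the contact profile seen from `anchor_bin`.

    The anchor's contact with itself (the diagonal) is set to 0 so that
    only contacts with other bins remain.
    """
    row = contact_matrix[anchor_bin]
    return [0 if i == anchor_bin else value for i, value in enumerate(row)]


def top_contacts(profile, n):
    """Return the indices of the `n` strongest contacts, in genomic order."""
    ranked = sorted(range(len(profile)), key=lambda i: profile[i], reverse=True)
    return sorted(ranked[:n])


# Hypothetical 5-bin symmetric contact matrix; bin 2 holds the anchor gene.
toy_matrix = [
    [9, 1, 6, 0, 0],
    [1, 9, 2, 0, 1],
    [6, 2, 9, 1, 7],
    [0, 0, 1, 9, 0],
    [0, 1, 7, 0, 9],
]

profile = virtual_4c(toy_matrix, anchor_bin=2)
strongest = top_contacts(profile, n=2)
```

The tall entries in the profile correspond to the peaks in the virtual 4C plot, which can then be compared against elements found by an independent perturbation study.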
For example, in this map of immune cells -- T cells -- 30:49 we know that there are DNA elements that have been associated, by genome-wide association studies, 30:54 with diseases like type 1 diabetes or rheumatoid arthritis. So, what is the target gene? The nearest gene is this gene at the bottom, shown in the green, 31:03 called SMIM20. It's not a gene that has really any known relationship to immunology. 31:09 But in this enhancer connectome map, we discovered that if you start from the viewpoint of this... 31:15 these disease-associated DNA elements, that the true target genes are actually this gene, RBPJ, 31:21 which is very important for T cell development, and a second gene, called STIM2, which is a calcium channel that's involved in T cell activation. 31:29 And that makes much more sense. We can also verify that the... that these controls are really happening. 31:36 This is using a version of the CRISPR technology, in which we use a dead Cas9 to bring in a silencing protein. 31:43 And this shows that if we target the RBPJ promoter, we can silence this ex... or lower its expression. 31:50 And similarly, if we target that disease-associated element that was predicted to contact RBPJ, 31:55 we also have an equivalently powerful effect in lowering its expression. So, it shows that, in fact, 32:03 this element, this disease-associated element, is controlling that target gene. 32:08 We can expand that concept and ask, systematically, for all these disease-associated elements 32:15 in, let's say, autoimmune disease, what are the true target genes? Is it really the nearest gene, that's been reported in the literature? 32:21 Or could it be something else? And in fact, we found that across either all autoimmune diseases, 32:27 or specific well-known diseases like Crohn's disease, multiple sclerosis, lupus, or type 1 diabetes, 32:34 there's nearly a four-fold expansion of the protein targets, or the genes encoding protein targets... 32:41 by four-fold. Okay?
So, a really substantial expansion 32:48 of our understanding of these diverse disease types.

Perturb-ATAC: Single-cell CRISPR screens for epigenomic phenotypes

And then, finally, I want to talk to you about ways of systematically, now, 32:54 testing these sort of nominated gene... epigenome connections and regulation. 32:59 And that involves combining epigenome reading with epigenome writing. 33:04 And this is a method that we've called Perturb-ATAC. It's a single-cell CRISPR screen for epigenomic phenotypes. 33:13 The current method for doing sort of large-scale CRISPR screens involves perturbing a large population of cells, 33:21 each, for example, getting a different CRISPR guide to silence or knock out a different gene. 33:27 We then impose some sort of selection, for example cell growth or some sort of reporter gene. 33:32 And we basically pull out a very small subset of the cells that have met our criteria. 33:38 We then sequence the CRISPR guides and see which ones are enriched, 33:43 which ones have been lost. And we essentially know what's been enriched. So, this is something... and so... 33:48 so, we have these hits. But everything else that got perturbed, that didn't sort of pass our selection, gets lost. 33:55 Okay? There are many phenotypes that don't manifest themselves as cell growth or reporter gene readout. 34:01 And so, the concept of Perturb-ATAC is that we want to, again, perturb cells 34:06 -- lots and lots of different combinations -- but for every single cell, we're gonna capture that cell, we're going to sequence the guide RNA 34:14 and also read out its epigenome landscape by ATAC-seq. Okay? 34:20 And this really means that we're doing multi-omics. We're recording two kinds... two modes of information 34:27 -- the chromatin and RNA -- to make this possible.
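The conventional pooled screen described above reduces each perturbation to a single number: how much its guide was enriched or depleted after selection. A minimal sketch of that readout, assuming simple guide read counts before and after selection; the guide names, counts, pseudocount, and the function `log2_fold_changes` are hypothetical and chosen only for illustration:

```python
import math


def log2_fold_changes(before, after, pseudocount=1.0):
    """Log2 ratio of each guide's share of reads after versus before selection."""
    n_guides = len(before)
    total_before = sum(before.values()) + pseudocount * n_guides
    total_after = sum(after.get(g, 0) for g in before) + pseudocount * n_guides
    lfc = {}
    for guide, count in before.items():
        share_before = (count + pseudocount) / total_before
        share_after = (after.get(guide, 0) + pseudocount) / total_after
        lfc[guide] = math.log2(share_after / share_before)
    return lfc


# Made-up read counts for three hypothetical guides.
before = {"sgGeneX": 100, "sgGeneY": 100, "sgControl": 100}
after = {"sgGeneX": 190, "sgGeneY": 10, "sgControl": 100}

lfc = log2_fold_changes(before, after)
for guide, value in sorted(lfc.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{guide}: {value:+.2f}")
```

The only thing this readout keeps per perturbation is one enrichment score tied to the chosen selection, which is the limitation Perturb-ATAC is meant to address by recording an epigenomic profile for every perturbed cell.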
High-throughput single-cell CRISPR screens with epigenomic read-out

34:32 And so, this was accomplished using a microfluidic platform, where we can capture the single cells 34:38 and then, in different chambers, first perform ATAC-seq, then capture the RNA, 34:44 barcode the molecules from the same well, so we can then map this single-cell ATAC-seq 34:50 to the single-cell RNA information. And the graphs on the bottom show that this technology is actually working. 34:57 If we introduce a guide RNA, for example, to this gene, SP1, we see a loss of accessibility at SP1. 35:03 And if you look genome-wide, the targets of SP1 are also being impacted. 35:10 We used this technology to perturb... actually, make lots of different perturbations, either singly or in combination.

High-throughput unbiased screen for epigenomic phenotypes

35:18 So, here, every row is a different perturbation, and this is a recording of what kind of perturbations have been made. 35:27 Then we can see that, in fact, there are DNA regulatory elements that get changed, that are different, 35:32 with each perturbation. And we can then show, on the third column, what kind of factors are most enriched 35:39 at the sites that have been perturbed. And the results, we think, make a lot of sense. If you could perturb this factor, called EZH2, 35:47 silence it... this is an enzyme that writes... it's a histone mark called K27... 35:54 H3K27 trimethylation. It's associated with gene silencing. So, if you get rid of EZH2, 35:59 the sites that previously had K27 trimethylation are most affected, and they're all upregulated. 36:06 If you remove the silencer, the targets get activated. If we target a transcription factor called SP1, 36:14 okay?... again, this is a factor that's involved in activating genes, so the most affected elements are those that contain SP1 sites, 36:22 and you lose the activator, so the target genes go down. So, they're on the left side of this graph.
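The pairing of guide identity with chromatin accessibility in each cell can be sketched as a per-cell record followed by a simple comparison against control cells. This is a toy example under stated assumptions: the cell records, peak names, accessibility counts, and the function `accessibility_shift` are invented, and a real analysis would use far more cells plus proper normalization and statistics.

```python
from statistics import mean

# Each record is one captured cell: the guide it received and toy ATAC counts per peak.
# All values are made up for illustration.
cells = [
    {"guide": "sgSP1", "accessibility": {"peak_with_SP1_motif": 1, "peak_without_SP1_motif": 5}},
    {"guide": "sgSP1", "accessibility": {"peak_with_SP1_motif": 2, "peak_without_SP1_motif": 6}},
    {"guide": "sgControl", "accessibility": {"peak_with_SP1_motif": 8, "peak_without_SP1_motif": 5}},
    {"guide": "sgControl", "accessibility": {"peak_with_SP1_motif": 6, "peak_without_SP1_motif": 6}},
]


def accessibility_shift(cells, guide, control="sgControl"):
    """Per peak: mean accessibility in perturbed cells minus mean in control cells."""
    perturbed = [c["accessibility"] for c in cells if c["guide"] == guide]
    controls = [c["accessibility"] for c in cells if c["guide"] == control]
    peaks = list(controls[0])
    return {
        peak: mean(c[peak] for c in perturbed) - mean(c[peak] for c in controls)
        for peak in peaks
    }


shift = accessibility_shift(cells, "sgSP1")
print(shift)
```

A negative shift at peaks carrying the factor's motif, with little change elsewhere, corresponds to the "you lose the activator, so the targets go down" pattern described in the talk.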
36:29 And finally, at the bottom, this is targeting a long noncoding RNA called EBER2. 36:36 And prior work has shown that it interacts with a factor called PAX5. And indeed, PAX5 is one of the most affected 36:42 classes of elements in this particular screen.

Perturb-ATAC provides insights into disease

We wanted to use this kind of technology to look again 36:51 at the disease-associated risk in the genome. And we know that there are elements that are affected. 36:58 I have shown you how we can find them, find their target genes. But what we want to know now is, what do we have to do to affect these switches, 37:04 to turn them all on or off at the same time? Okay? 37:09 And so, the technology... the strategy we used, then, is to basically identify SNPs in autoimmune diseases, 37:14 alter these regulators in trans, and ask which of these combinations of regulators 37:20 can most affect these disease-associated elements and their nearby contacts, which we identify using enhancer connectome. 37:27 And so, this is such a map. It's a very busy slide. So, every column is a different disease, 37:34 and we're looking at the DNA elements that are associated with that disease. Every row is a different perturbation, 37:39 either... we're basically silencing different transcription factors, either singly or in combination. 37:45 And we want to ask which of these factors have the most impact, selectively, on the disease-associated elements. 37:52 And then the color code indicates the level of impact. And just as an example, 37:59 for this particular disease, called lupus, what we identified is that, among these factors that we examined, 38:05 this particular factor, NFKB1, encoding NF-kappa-B... the p50 subunit, 38:10 has a strong impact. And if we go on to do combinatorial studies, a second factor shows up, called RELA. 38:17 RELA turns out to encode the p65 subunit of NF-kappa-B, and these two subunits actually work together.
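One cell of the disease-by-perturbation map described above can be thought of as a selectivity score: how much more a perturbation alters the disease-associated elements than everything else. The sketch below is one possible way to express that idea, not the scoring used in the study; the element IDs, the perturbation labels (`factor_1`, `factor_2`), and the function `selective_impact` are all hypothetical.

```python
def selective_impact(altered, disease_elements, all_elements):
    """Fraction of disease-associated elements altered minus the fraction of other elements altered."""
    disease = set(disease_elements)
    background = set(all_elements) - disease
    frac_disease = len(altered & disease) / len(disease)
    frac_background = len(altered & background) / len(background)
    return frac_disease - frac_background


# Toy universe of ten regulatory elements; four are "disease-associated" (all invented).
all_elements = {f"e{i}" for i in range(1, 11)}
disease_elements = {"e1", "e2", "e3", "e4"}

# Made-up sets of elements whose accessibility changed under each perturbation.
altered_by = {
    "control": set(),
    "factor_1": {"e1", "e2", "e7"},
    "factor_1+factor_2": {"e1", "e2", "e3", "e4", "e7"},
}

scores = {
    name: selective_impact(altered, disease_elements, all_elements)
    for name, altered in altered_by.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

In this toy data the combined perturbation scores higher than the single one, mirroring the way a second factor only shows up in the combinatorial part of the screen.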
38:22 Okay, so this unbiased screen told us that this heterodimeric complex was perhaps very important in this particular disease 38:30 as a trans-acting regulator affecting these thousands of switches across the genome, and that fits with a lot of known biology.

Personal GPS for navigating regulome

38:38 So, in summary, I told you about sort of progress in the... understanding the epigenome. 38:44 This is an exciting time, when we really have a personal GPS for navigating the gene regulation landscape. 38:51 The concept is that we have technologies, now, to go quickly from individual patients or their... even rare clinical specimens... 38:59 through technology, to define the DNA switches that control when and where these genes turn on and off. 39:05 And that might put us in position to develop custom therapeutic strategies.