Humans have only 21,000 genes, the same as a worm, and they are identical in all of the different types of cells. It is not the inherited code of the genes that determines the different cellular functions. Rather, it is the way that genes are utilized differently in each type of cell that determines which proteins will produce unique structures.
Increasingly, these “epigenetic” mechanisms (that is, mechanisms outside of the simple procedure of assigning an amino acid directly from a code in a gene) are being found to be vastly more complex than ever imagined. This post will describe recent discoveries of dynamic three-dimensional structures in the cell’s nucleus that, along with unique localization and packaging of the DNA, are vital for every aspect of gene function. The vast complexity of chromatin 3D shapes is another way that DNA is regulated.
DNA is controlled at many different levels. A previous post described the way cells edit RNA transcripts to make multiple different messenger RNAs out of the same DNA code fragment. (See post on alternative RNA editing.) In fact, recent research (ENCODE) shows that pieces of DNA can be taken from multiple different “genes” to make one protein, making the entire definition of a gene suspect.
DNA strands are wound on spools of histone protein molecules and then molded into a variety of structures. Many different kinds of markings on the DNA and on the histones regulate genetic function. Histones form into different structures and these are remodeled in multiple ways.
The markings on DNA and the histones are much more complex than just the two widely known acetyl and methyl groups. In fact, there are more than 30 different kinds of markings including 7 types of methylation, 6 types of ATPases, as well as sumoylation, ubiquitination and many others. For each type of process, specific large protein molecules aid the formation of 3D structures. All of these variations alter the triggering of genetic networks in different ways for different cells and specific responses. The chromatin code has become vast.
There is even more complexity. It is now known that many long and short RNAs that don’t make proteins are also involved in regulating every aspect of gene function. At first it was thought that 98% of the DNA (the parts that are not “genes”) was evolutionary “junk,” meaning it didn’t make useful RNA with its code. Now, all kinds of small and large RNAs have been identified that silence or trigger sections of DNA and regulate many aspects of genetic function. In fact, a large percentage of the DNA makes important RNAs, although the exact amount is not clear. If less than 2% of the DNA is so-called “genes” and 40% of the DNA makes useful RNA, then there is 20 times more regulatory RNA than RNA that codes for proteins. This number might be much larger.
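The arithmetic behind that ratio can be checked in a few lines. This is only a minimal sketch in Python: the two percentages are the rough figures quoted above, the variable names are illustrative, and the whole 40% is treated as regulatory, as the estimate in the text does.

```python
# Rough figures from the paragraph above (both are approximations).
coding_percent = 2        # share of DNA made up of protein-coding "genes"
useful_rna_percent = 40   # share of DNA assumed to make useful RNA

# How many times larger the useful-RNA share is than the coding share.
ratio = useful_rna_percent / coding_percent
print(f"Regulatory RNA is about {ratio:.0f} times the protein-coding share")
```

Because the coding share is "less than 2%," the true ratio under these assumptions would be at least this large.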
This post will discuss how the 3D shape of chromosomes in the 3D compartments of the nucleus impacts DNA’s function—making all of this even more complex.
Squeezing DNA Into A Small Space
To fit 2 yards of DNA into a tiny nucleus is a monumental engineering feat. DNA is highly compacted, yet it has to be instantly available so that neurons can rapidly make proteins with a momentary change of thought. This regulation is different in each type of cell. A previous post noted that there are thousands of different types of neurons and each has a different expression of the gene networks. It has been known for some time that the shape of proteins determines their function and that the folding is very complex, involving four levels of folding (see post Protein Shape Determines Function). Now it appears that the shape of the chromatin also determines function, with new secondary and tertiary structures discovered.
The shape of the chromosome is correlated with the activity of the genes inside; however, the exact cause-and-effect relationship is still being studied. The links exist at multiple levels. Loops of chromatin make space for enhancers to land. On a larger scale, specific places in the nucleus affect particular active DNA sites. Special protein structures alter polymerase activity. Chromosomes with fewer genes are placed at the edge of the nucleus while the active ones are near the center. Less active sections are placed near the nuclear lamina, which is close to the nuclear membrane. This location can suppress the genetic machinery. If the chromatin is opened but not used, the entire section is moved to a different location. The exact location of a gene in the space of the nucleus influences its activity.
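To get a feel for the scale of this packing problem, here is a rough back-of-the-envelope sketch in Python. The 2 yards comes from the text; the nucleus diameter of about 6 micrometers is an assumption added for illustration, since real nuclei vary in size.

```python
YARD_IN_METERS = 0.9144   # exact definition of the yard

dna_length_m = 2 * YARD_IN_METERS   # "2 yards" of DNA, from the text
nucleus_diameter_m = 6e-6           # ASSUMPTION: about 6 micrometers, not from the post

# How many nucleus-widths long the DNA would be if stretched out straight.
length_ratio = dna_length_m / nucleus_diameter_m
print(f"DNA length: {dna_length_m:.2f} m")
print(f"About {length_ratio:,.0f} times the width of the nucleus")
```

Under that assumed diameter, the stretched-out DNA is on the order of hundreds of thousands of times longer than the compartment that holds it.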
Basic Histone Structure
Histones are proteins that form a protective structure for the long DNA strand. Eight histones form a nucleosome that looks like a yo-yo with two sides and a string wound around it. Nucleosomes are connected by a linker piece of DNA and many of these are built into other larger structures.
There are five families of histones. H1 and H5 are the linker types. H2A, H2B, H3 and H4 are the core types that make the yo-yo sides. Eight histones (two of each core type) form a nucleosome. In each nucleosome, 147 base pairs of DNA are wrapped around the histone core. About 50 base pairs of linker DNA connect one nucleosome with the next, with linker histones binding where the DNA enters and exits. Linked together, the string of nucleosomes forms the larger structures.
Nucleosomes form zigzag fibers and larger looping structures, which compact the total DNA. These complex structures also enable accurate triggering of genetic networks.
H3 and H4 histones have long tails that are places for epigenetic markings at several different sites. These markings include acetylation, methylation, phosphorylation, ubiquitination, SUMOylation, citrullination and ribosylation. Even the core subunits can be modified in different ways. Each modification or marking alters the function of the genes directly near it. All of the markings form a vast “histone code.” Modifications are critical for DNA repair mechanisms, regulation of gene networks and the shapes of chromosomes. Many of these markings are also inherited by the next generation. Very recent research shows that methylation during development forms the female brain.
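These numbers allow a rough estimate of how many nucleosomes a cell needs. The sketch below is in Python and uses the 147 and 50 base pair figures from the text; the genome size of about 6.4 billion base pairs for a diploid human cell is an assumption added here, and real linker lengths vary.

```python
WRAPPED_BP = 147            # base pairs wrapped around one histone core (from the text)
LINKER_BP = 50              # base pairs of linker DNA between cores (from the text)
GENOME_BP = 6_400_000_000   # ASSUMPTION: approximate diploid human genome size

# One "repeat" is a wrapped core plus the linker leading to the next core.
repeat_bp = WRAPPED_BP + LINKER_BP

# Approximate number of nucleosomes needed to package the whole genome.
nucleosome_count = GENOME_BP // repeat_bp

# Fraction of the DNA that sits on a histone core rather than in a linker.
wrapped_fraction = WRAPPED_BP / repeat_bp

print(f"Base pairs per repeat: {repeat_bp}")
print(f"Nucleosomes needed: about {nucleosome_count / 1e6:.0f} million")
print(f"DNA wrapped on histones: about {wrapped_fraction:.0%}")
```

With these inputs, roughly three quarters of the DNA is wrapped on histone cores at any time, and the cell needs tens of millions of nucleosomes.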
Chromatin is the word used for the large structure formed by the many nucleosomes. When chromatin was originally viewed under a microscope, two different kinds were identified and called heterochromatin and euchromatin. But in fact there are many subtypes.
Heterochromatin is highly compacted and largely inactive. It is localized at the edges of the nucleus. Despite early descriptions, it actually exists in at least five different states with different markings. Constitutive heterochromatin is repetitive and forms structures such as centromeres and telomeres. Facultative heterochromatin consists of genes that are suppressed and silenced by markings and RNA interference. It is not repetitive and can become active at some time.
Euchromatin is the active region and has a high density of genes, along with RNAs and proteins. It is usually being actively transcribed on the way to making proteins and is closer to the center of the nucleus.
Nucleus Structure and Chromatin
The nucleus has a complex structure that is just now being discovered, and the different types of chromatin fit in different compartments. The large nucleolus near the center has no membrane of its own, and its primary function is to synthesize and assemble ribosomes.
Complex structures near the edge of the nucleus form the nuclear lamina. The lamina is made of intermediate filaments (proteins called lamins) and other proteins that are near or attached to the membrane. Lamins are a large family of proteins that form many very complex structures that are just being discovered. The lamina forms compartments that organize the chromatin and influence replication and cell division. It binds specific chromatin through rod-like structures at specific regions called matrix attachment regions. The lamina also binds to specific histones. The nuclear pores are complex structures in the nuclear membrane that determine what can come into the nucleus and what is sent out. The lamina is critical to the pores’ functions.
Topologically Associated Domains (TADs)
As more is learned about the vast complexity of how structures of DNA are utilized in genetic processes, DNA folding has been compared to the folding of proteins. One of the higher-level structures of DNA is the topologically associated domain, or TAD.
Proteins first fold into secondary structures, alpha helices or beta sheets, then fold again based on the chemical bonds of the specific amino acids used. These then fold into another more complex structure, which is joined with other subunits to form the protein.
DNA first folds around the histone spools, forming a zigzag filament. Then, this fiber of nucleosomes folds into regions with TADs and then some larger TAD structures. These can form into loops. The chromosome is built of TAD structures and connected to nuclear regions. Unlike most proteins, DNA folding can take different directions in one cell. (Some flexible proteins have recently been described with variable shapes as well.)
Loops of Chromatin
The first chromatin found to have an unusual looping 3D structure was a particular group of five genes making the beta subunit of hemoglobin. Genes sit on a loop right near the regulatory elements needed to start and stop their production (promoters, enhancers and repressors). Loops can be flexible and the contact of the sites can be intermittent. This loop region makes it much easier to use the DNA. Often these loops create the environment for the activity, but a further stimulus is also needed.
Stem cells have been found to have less specific chromosome structures. As the cell differentiates into a specific cell type, 3D structures appear, limiting the cell’s function to the DNA regions that are forming loops. The new structure in the differentiated cell limits which genes are available, in essence defining the type of cell. Having preformed structures makes the necessary proteins rapidly available by using these setups for close interaction of enhancers and promoters.
The bunching of nucleosomes has been termed “clutches,” by comparison with the group of eggs left in a bird’s nest (called a clutch). The nucleosomes of stem cells are much less densely packed, with smaller clutches. The larger and denser the clutches, the more differentiated the cell. The more capacity the cell has as a stem cell, the fewer nucleosomes are in each clutch.
Protein Complexes Help Form 3D Shape of Chromatin
Loops help regulate transcription from DNA to messenger RNA. A vast library of protein complexes, making a large number of different shapes, aids the formation of loops. These proteins also help to fix the chromatin in particular parts of the nucleus. Three large complexes especially help make 3D chromatin shapes: CTCF, cohesin and Mediator.
The insulator protein CTCF holds together long-range interactions of different sections of the chromatin near the TADs. In fact, these CTCF loops help create the basic structure of the TADs. Some CTCFs are insulators, that is, they keep long-range sections of chromatin apart. Others are not.
CTCF has many functions. CTCF forms insulators, that is, regions of DNA that block, or “insulate,” the interaction of enhancers and promoters. CTCF also binds the loops to the nuclear lamina, determining 3D localization in the nucleus. CTCF forms a boundary between active and inactive types of chromatin. All of these allow CTCF to influence the types of genetic networks that are used in that type of cell.
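The idea of insulation can be pictured with a toy model. The Python sketch below is a deliberate oversimplification and not how any real analysis tool works: it treats a chromosome as a line, treats each CTCF insulator as a boundary at one position, and allows an enhancer to reach a promoter only when no boundary lies between them. The function name and all positions are hypothetical.

```python
from bisect import bisect_right

def same_domain(pos_a, pos_b, boundaries):
    """Return True if no boundary lies between the two positions."""
    ordered = sorted(boundaries)
    # Two positions share a domain when the same number of boundaries
    # lie at or to the left of each of them.
    return bisect_right(ordered, pos_a) == bisect_right(ordered, pos_b)

# Hypothetical positions along a chromosome, in kilobases.
ctcf_boundaries = [100, 400, 900]
enhancer = 150
promoter_near = 350   # between the boundaries at 100 and 400, like the enhancer
promoter_far = 450    # on the other side of the boundary at 400

print(same_domain(enhancer, promoter_near, ctcf_boundaries))  # True
print(same_domain(enhancer, promoter_far, ctcf_boundaries))   # False
```

In this picture the enhancer can act on the nearby promoter but is "insulated" from the one across the boundary, even though the two promoters are close to each other along the DNA.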
Another protein complex that alters shape is the cohesin complex. Cohesins create loops related to enhancers, which differ in each type of cell. Others are constitutive. Cohesins form a ring holding together the chromatids after they are copied. Cohesins connect chromatin during cell division. They attach spindles to chromosomes and help recombination of regions of DNA during division (a key to evolution) and repair of broken DNA. They also regulate transcription.
A third critical protein complex is Mediator (the Nobel Prize in 2006 was given for this critical cofactor of all transcription). Mediator is a very large multiprotein complex, with more than 30 subunits, that activates gene networks and is critical for many gene promoters. It is vital for the action of vitamin D. Its size allows many different protein interactions. It is closely tied to the general function of RNA polymerase II in most transcription. Specific Mediator complexes occur in different kinds of cells and form chromatin loops that connect either two promoters or an enhancer and a promoter.
All three complexes can work together, along with transcription factors, to form the 3D chromatin shapes.
TAD Structures Influence Genetic Function
The structures of TADs correlate with many kinds of activity in different regions of chromosomes. This correlation includes modification of histones, the use of specific genes and the copying of DNA.
Using TADs, chromatin forms two discrete areas: one of active DNA and the other of mostly inactive DNA. In facultative chromatin, the two regions shift during fetal development. In constitutive chromatin, the size of the two regions stays the same. Instead, they have active loops combining enhancers and promoters.
An important example of how TADs influence gene function is found with the Hox genes, which are used in sequence during fetal development according to positions along the body from front to back. Trimethylation of lysine 4 on histone H3 marks the active regions and a different methylation marks the inactive places. As the genes are used during fetal development, the marks change. The 3D chromatin changes shape, with the active TAD region enlarging and the inactive region shrinking. In this way the code and the 3D structure work together, and it is not clear which is directing and which is following.
Each type of chromatin structure works to form regions with TADs. Some factors keep the different types apart, making them more localized. Very different structures exist for genes that are needed for cellular housekeeping chores throughout the life of the cell versus those genes that are only part of the fetal developmental sequences. The latter are only used early and then are placed in an inactive region for adult life.
Transcription is one factor that creates three-dimensional TAD structures. Also, RNA polymerase is known to change chromatin shapes as it elongates the RNA transcript along the DNA. Active regions have enhancers and promoters interacting, and promoters that are active have contact with other promoters. Start and stop sites also form loops.
Another structural mechanism involves Polycomb sites, which create clusters that have specific interactions emphasized in the structure inside of a TAD. Physical forces operating on some specific sequences in the chromatin contribute to folding. Some sequences contribute to the 3D structures, but not all.
There are many diseases that are based on the loss of some specific DNA sequence with resulting 3D changes in structure contributing to the symptoms.
During cellular division, there is very detailed regulation of all 3D structures. TAD structures exist in interphase but not in mitosis. These structures have been seen in some species for 40 million years. This implies that large 3D chromatin structures can be important for evolution.
TADs influence large chromosome structures. Some TADs have long-range relationships to other regions of chromosomes. Variants of these structures exist within a group of similar cells in the same organism. The chromosome moves a great deal, but only within a restricted area in the nucleus. Specific chromosomes are tied to the nucleoli and some to the peripheral nuclear lamina. But coordinated long-range movements also occur for multiple chromosomes.
Long-range interactions are stable for some specific functions, such as repression or activation of transcription factors. An example is the three specific genes related to the cytokine TNFα. These three sites have long-range relationships through large chromosome structures.
Many other unexplained associations arise from the 3D structures. One is bonding between points on the DNA loops. A second is the especially close relations of genes that are co-regulated. Another is the spatial separation of active and repressed regions.
TAD structures ensure that genetic network communication won’t be disrupted. Also, they allow the regulating factors to bind more strongly, creating a more rapid production of proteins.
Vast Complexity of Chromatin 3D Shapes
Complexity of genetic regulation seems to be increasing at an exponential rate. As well as alternative RNA splicing, RNA silencing, and long and short non-coding RNAs regulating genetic activity, there are at least 30 different kinds of markings on the many different kinds of histone structures.
But now, higher-level 3D structures of chromatin and chromosomes are known to be critical to the specific functioning of gene networks in particular cells. Chromatin is extremely organized yet constantly changing shape in three dimensions. The critical connections of genes and the regulatory sections are maintained by the complex structure. The shape changes allow different networks to perform. This 3D structure fits into the very complex 3D structure of the nucleus with many different compartments. Just as we cannot predict the structures of proteins from the amino acid sequence, we cannot predict the complex 3D structures of chromatin. A vast array of protein complexes helps form these shapes.
The same questions arise as in other posts. This chromatin organization and regulation is not random. Where is the direction for this? The regulation of DNA and the triggering of gene networks occur simultaneously across many orders of magnitude. The direction for all of this is obviously not in the DNA alone. How does thought trigger these genetic networks in specific brain cells?