3.5 How many genes are expressed?

Key terms defined in this section
Abundance of an mRNA is the average number of molecules per cell.

Reassociation analysis can be used to measure the complexity of an RNA population. One method is to hybridize nonrepetitive DNA with an excess of RNA; the proportion of the DNA that is bound at saturation identifies the complexity of the RNA population. Another method is to follow the kinetics of hybridization between an excess of an RNA population and a DNA copy prepared from it. This is exactly analogous to reassociation analysis of genomic DNA. The reaction is described in terms of the R0t½ (where R0 is the starting concentration of RNA and t is the time of reaction).


Saturation analysis typically identifies ~1% of the DNA as providing a template for mRNA. From this we can calculate the number of genes so long as we know the average length of an mRNA. For a lower eukaryote such as yeast, the total number of expressed genes is ~4000. For somatic tissues of higher eukaryotes, the number usually is 10,000–15,000. The value is similar for plants and for vertebrates. (The only consistent exception to this type of value is presented by mammalian brain, where much larger numbers of genes appear to be expressed, although the exact quantitation is not certain.)


Kinetic analysis typically identifies three components in a eukaryotic cell. Just as with a DNA reassociation curve, a single component hybridizes over about two decades of R0t values, and a reaction extending over a greater range must be resolved by computer curve-fitting into individual components. Again, this represents what is really a continuous spectrum of sequences.
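The idea of resolving a broad curve into components can be sketched in a few lines. This is a minimal illustration, not the fitting procedure used in the original experiments. It assumes that hybridization in RNA excess follows pseudo-first-order kinetics, so that each component contributes its share of the reaction multiplied by 1 - exp(-ln 2 · x / x½), where x is the product of starting RNA concentration and time. The function name is hypothetical, the three fractions follow the oviduct example in this section, and the half-reaction values are placeholders chosen only to spread the components apart.

```python
import math

def fraction_hybridized(r0t, components):
    """Fraction of cDNA hybridized at a given R0t value.

    components: list of (fraction_of_reaction, r0t_half) pairs.
    Assumes pseudo-first-order kinetics for each component (RNA in excess).
    """
    return sum(
        frac * (1.0 - math.exp(-math.log(2) * r0t / r0t_half))
        for frac, r0t_half in components
    )

# Fractions follow the oviduct example; the half-reaction values are hypothetical.
components = [(0.50, 0.0015), (0.15, 0.04), (0.35, 30.0)]

# Tabulate the summed curve across eight decades of R0t.
for exponent in range(-4, 4):
    r0t = 10.0 ** exponent
    value = fraction_hybridized(r0t, components)
    print(f"R0t = 1e{exponent:+d}  fraction hybridized = {value:.3f}")
```

Curve-fitting runs this logic in reverse: it adjusts the fractions and half-reaction values until the summed curve matches the measured points.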




Figure 3.9 Hybridization between excess mRNA and cDNA identifies several components in chick oviduct cells, each characterized by the R0t½ of reaction.

An example of a reaction between excess mRNA and cDNA that generates three components is given in Figure 3.9:



  • The first component has the same characteristics as a control reaction of ovalbumin mRNA with its DNA copy. This suggests that the first component is in fact just ovalbumin mRNA (which indeed occupies about half of the messenger mass in oviduct tissue).
  • The next component provides 15% of the reaction, with a total complexity of 15 kb. This corresponds to 7–8 mRNA species of average length 2000 bases.
  • The last component provides 35% of the reaction, which corresponds to a complexity of 26 Mb. This corresponds to ~13,000 mRNA species of average length 2000 bases.
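The conversion from complexity to number of mRNA species is a single division, shown here as a small sketch. The function name is hypothetical; the complexities and the 2000-base average length are the values quoted for the second and third components.

```python
def species_count(complexity_bases, mean_length_bases=2000):
    """Number of distinct mRNA species implied by a sequence complexity."""
    return complexity_bases / mean_length_bases

print(species_count(15_000))       # second component: 7.5, i.e. 7-8 species
print(species_count(26_000_000))   # last component: 13000.0 species
```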

From this analysis, we can see that about half of the mass of mRNA in the cell represents a single mRNA, ~15% of the mass is provided by a mere 7–8 mRNAs, and ~35% of the mass is divided into the large number of 13,000 mRNA species. It is therefore obvious that the mRNAs comprising each component must be present in very different amounts.


The average number of molecules of each mRNA per cell is called its abundance. It can be calculated quite simply if the total mass of RNA in the cell is known. In the example shown in Figure 3.9, the total mRNA can be accounted for as 100,000 copies of the first component (ovalbumin mRNA), 4000 copies of each of the 7–8 mRNAs in the second component, but only ~5 copies of each of the 13,000 mRNAs that constitute the last component.
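These abundances can be checked with a short calculation. The sketch below rests on two simplifying assumptions that are not stated in the text: all mRNAs have the same average length, so that a fraction of the mass equals a fraction of the molecules, and the cell contains about 200,000 mRNA molecules in total, a figure inferred here from 100,000 ovalbumin copies making up about half of the mass. The function and variable names are hypothetical.

```python
def copies_per_species(mass_fraction, n_species, total_molecules):
    """Average copies per cell of each mRNA species in one component."""
    return mass_fraction * total_molecules / n_species

# Assumption: inferred from 100,000 ovalbumin copies being ~50% of the mRNA.
TOTAL_MRNA_MOLECULES = 200_000

# (fraction of mRNA mass, number of species) for each component
oviduct_components = {
    "ovalbumin": (0.50, 1),
    "second component": (0.15, 7.5),
    "last component": (0.35, 13_000),
}

for name, (fraction, n_species) in oviduct_components.items():
    copies = copies_per_species(fraction, n_species, TOTAL_MRNA_MOLECULES)
    print(f"{name}: about {copies:.0f} copies of each mRNA per cell")
```

Under these assumptions the three components come out at roughly 100,000, 4000, and 5 copies per species, in line with the figures quoted above.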


We can divide the mRNA population into two general classes, according to their abundance:



  • The oviduct is an extreme case, with so much of the mRNA represented in only one species, but most cells do contain a small number of RNAs present in many copies each. This abundant component typically consists of <100 different mRNAs present in 1000–10,000 copies per cell. It often corresponds to a major part of the mass, approaching 50% of the total mRNA.
  • About half of the mass of the mRNA consists of a large number of sequences, of the order of 10,000, each represented by only a small number of copies in the mRNA (say, <10). This is the scarce mRNA or complex mRNA class. (It is this class that drives a saturation reaction; Hastie and Bishop, 1976.)

Many somatic tissues of higher eukaryotes have an expressed gene number in the range of 10,000–20,000. How much overlap is there between the genes expressed in different tissues? For example, the expressed gene number of chick liver is ~11,000–17,000, compared with the value for oviduct of ~13,000–15,000. How many of these two sets of genes are identical? How many are specific for each tissue?


We see immediately that there are likely to be substantial differences among the genes expressed in the abundant class. Ovalbumin, for example, is synthesized only in the oviduct, not at all in the liver. This means that 50% of the mass of mRNA in the oviduct is specific to that tissue.


But the abundant mRNAs represent only a small proportion of the number of expressed genes. In terms of the total number of genes of the organism, and of the number of changes in transcription that must be made between different cell types, we need to know the extent of overlap between the genes represented in the scarce mRNA classes of different cell phenotypes.


Comparisons between different tissues show that, for example, ~75% of the sequences expressed in liver and oviduct are the same. In other words, ~12,000 genes are expressed in both liver and oviduct, ~5000 additional genes are expressed only in liver, and ~3000 additional genes are expressed only in oviduct.
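The relationship between the ~75% figure and the three gene counts can be made explicit with a short calculation. The function name is hypothetical, and averaging the two per-tissue fractions is only one plausible reading of how the rounded 75% arises; the text does not say how the value was derived.

```python
def overlap_summary(shared, only_a, only_b):
    """Totals and shared fractions for two tissues' sets of expressed genes."""
    total_a = shared + only_a
    total_b = shared + only_b
    return {
        "total_a": total_a,
        "total_b": total_b,
        "shared_fraction_a": shared / total_a,
        "shared_fraction_b": shared / total_b,
    }

# Tissue a is liver, tissue b is oviduct; counts are those quoted in the text.
summary = overlap_summary(shared=12_000, only_a=5_000, only_b=3_000)
mean_shared = (summary["shared_fraction_a"] + summary["shared_fraction_b"]) / 2

print(f"liver total: {summary['total_a']}, oviduct total: {summary['total_b']}")
print(f"shared fraction of liver genes:   {summary['shared_fraction_a']:.0%}")
print(f"shared fraction of oviduct genes: {summary['shared_fraction_b']:.0%}")
print(f"mean of the two fractions:        {mean_shared:.0%}")
```

The totals of 17,000 for liver and 15,000 for oviduct sit at the upper ends of the ranges quoted earlier for these tissues.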


The scarce mRNAs overlap extensively. Between mouse liver and kidney, ~90% of the scarce mRNAs are identical, leaving a difference between the tissues of only 1000–2000 in terms of the number of expressed genes. The general result obtained in several comparisons of this sort is that only ~10% of the mRNA sequences of a cell are unique to it. The majority of sequences are common to many, perhaps even all, cell types.


This suggests that the common set of expressed gene functions, numbering perhaps ~10,000 in a mammal, comprises functions that are needed in all cell types. Sometimes this type of function is referred to as a housekeeping or constitutive activity. It contrasts with the activities represented by specialized functions (such as ovalbumin or globin) needed only for particular cell phenotypes. These are sometimes called luxury genes.


Recent technology allows more systematic and accurate estimates of the number of expressed genes. One approach (SAGE, serial analysis of gene expression) allows a unique sequence tag to be used to identify each mRNA. The technology then allows the abundance of each tag to be measured. This approach identifies 4,665 expressed genes in S. cerevisiae growing under normal conditions, with abundances varying from 0.3 to >200 transcripts/cell. This means that ~75% of the total gene number (~6000) is expressed under these conditions (Velculescu et al., 1997).
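The step from tag counts to abundances can be sketched as follows. This is a toy illustration of the counting logic only, not the published SAGE pipeline. The tag strings are made up, and the total number of mRNA molecules per cell (`mrna_per_cell`) is a hypothetical parameter that would have to come from an independent estimate.

```python
from collections import Counter

def tag_abundance(tags, mrna_per_cell):
    """Estimate transcripts per cell for each tag from its share of all tags."""
    counts = Counter(tags)
    total_tags = sum(counts.values())
    return {tag: n / total_tags * mrna_per_cell for tag, n in counts.items()}

# Made-up tags standing in for sequenced SAGE tags.
observed_tags = ["AAAC", "AAAC", "GGTT", "AAAC", "CCTA"]

# Hypothetical total number of mRNA molecules per cell.
abundances = tag_abundance(observed_tags, mrna_per_cell=15_000)

for tag, copies in sorted(abundances.items()):
    print(f"{tag}: ~{copies:.0f} transcripts/cell")
```

With real data the number of tags sequenced sets the detection limit, because a transcript seen only once among N tags is assigned an abundance of `mrna_per_cell / N`.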


The most powerful new technology uses chips that contain high-density oligonucleotide arrays (HDAs). Their construction is made possible by knowledge of the sequence of the entire genome. In the case of S. cerevisiae, each of 6181 ORFs is represented on the HDA by 20 25-mer oligonucleotides that perfectly match the sequence of the message and 20 mismatch oligonucleotides that differ at one base position. The expression level of any gene is calculated by subtracting the average signal of a mismatch from its perfect match partner. The entire yeast genome can be represented on 4 chips. This technology is sensitive enough to detect transcripts of 5460 genes (~90% of the genome), and shows that 80% of genes are expressed at low levels, with abundances of 0.1–2 transcripts/cell. An abundance of <1 transcript/cell means that not all cells have a copy of the transcript at any given moment.
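The perfect-match/mismatch subtraction can be illustrated with a minimal sketch. It takes the expression level as the mean of the pairwise differences, which is one straightforward reading of the description above; real array software adds background correction and scaling that are not covered here. The function name and the signal values are hypothetical, and only four probe pairs are shown instead of the 20 used per ORF.

```python
from statistics import mean

def expression_level(perfect_match, mismatch):
    """Mean of (perfect match - mismatch) over the probe pairs of one gene."""
    if len(perfect_match) != len(mismatch):
        raise ValueError("each perfect-match probe needs a mismatch partner")
    return mean(pm - mm for pm, mm in zip(perfect_match, mismatch))

# Hypothetical fluorescence signals for four probe pairs of one ORF.
pm_signals = [820.0, 760.0, 905.0, 690.0]
mm_signals = [120.0, 160.0, 205.0, 90.0]

print(expression_level(pm_signals, mm_signals))
```

Subtracting the mismatch signal removes the part of the signal that comes from nonspecific hybridization to each probe.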




Figure 3.10 HDA analysis allows change in expression of each gene to be measured. Each square represents one gene (top left is first gene on chromosome I, bottom right is last gene on chromosome XVI). Change in expression relative to wild type is indicated by red (reduction), white (no change) or blue (increase). Photograph kindly provided by Rick Young.

The technology allows not only measurement of levels of gene expression, but also detection of differences in expression in mutant yeast strains, under different conditions of growth, and so on. The results of comparing two states are expressed in the form of a grid, in which each square represents a particular gene, and the relative change in expression is indicated by color. The upper part of Figure 3.10 shows the effect of a mutation in RNA polymerase II, the enzyme that produces mRNA, which as might be expected causes the expression of most genes to be heavily reduced. By contrast, the lower part shows that a mutation in an ancillary component of the transcription apparatus (SRB10) has much more restricted effects, causing increases in expression of some genes (Holstege et al., 1998).
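The color coding of such a grid can be sketched as a simple classification of each gene's expression ratio. The colors follow the legend of Figure 3.10; the function name, the example values, and the two-fold threshold are assumptions made for illustration, since the text does not state the cutoff that was used.

```python
import math

def change_colour(mutant_level, wild_type_level, threshold_log2=1.0):
    """Classify a change in expression as red, white, or blue.

    Both levels must be positive. The default threshold of 1.0 on a log2
    scale (a two-fold change) is an assumption, not a value from the text.
    """
    log_ratio = math.log2(mutant_level / wild_type_level)
    if log_ratio <= -threshold_log2:
        return "red"    # reduced relative to wild type
    if log_ratio >= threshold_log2:
        return "blue"   # increased relative to wild type
    return "white"      # no substantial change

# Hypothetical (mutant, wild type) expression levels for three genes.
for mutant, wild_type in [(0.2, 2.0), (1.1, 1.0), (6.0, 1.5)]:
    print(mutant, wild_type, change_colour(mutant, wild_type))
```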


The extension of this technology to animal cells will allow the general descriptions based on RNA hybridization analysis to be replaced by exact descriptions of the genes that are expressed, and the abundances of their products, in any given cell type (Miklos and Rubin, 1996).



Research
Hastie, N. D. and Bishop, J. O. (1976). The expression of three abundance classes of mRNA in mouse tissues. Cell 9, 761-774.
Holstege, F. C. P. et al. (1998). Dissecting the regulatory circuitry of a eukaryotic genome. Cell 95, 717-728.
Miklos, G. L. G. and Rubin, G. M. (1996). The role of the genome project in determining gene function: insights from model organisms. Cell 86, 521-529.
Velculescu, V. E. et al. (1997). Characterization of the yeast transcriptome. Cell 88, 243-251.



Genes VII
ISBN: B000R0CSVM
Year: 2005
Pages: 382