Evolution 101

Saturday, March 25, 2006

Molecular Evidence 1: Protein Functional Redundancy

All right, this is the first podcast in a series of six that I’ve planned on the molecular evidence for evolution. I’ll be using Dr. Douglas Theobald’s resource on TalkOrigins.org pretty heavily, so you can follow along with me there if you like.

The first piece of evidence is protein functional redundancy.

Proteins are, as a group, completely essential for life’s function, but there are some proteins that are more essential than others. These proteins perform very basic but essential tasks that all organisms require for life. We can call these proteins, “Ubiquitous Proteins.” These ubiquitous proteins are completely independent of an organism’s specific function or ecological niche- all organisms from bacteria to humans have these proteins, and they do the same thing no matter where they’re found.

Now, if you remember the previous podcast, the Molecular Biology Primer, you remember me talking about the relationship between protein structure and function. I didn’t get into much detail last time, but I’ll expand on it a bit more here, because it’s a pretty crucial concept for this piece of evidence. The function of a protein is determined by its structure. Imagine that we have an enzyme, which is a chemically active protein, that has the function of cutting other proteins in half. To create a conceptual model in your mind, imagine that the protein is basically like a pair of scissors. The function of a pair of scissors, to cut things, is determined by its structure, which is essentially two blades and a fulcrum, or pivot point. A pair of scissors has a pretty basic structure AND function, and so it’s not too hard to make different variations on the basic structure without changing the function too much. For example, you can make the scissors out of steel, iron, brass, or even plastic. You can make the handles longer, or shorter. You can make the blade longer or shorter. You can even have left-handed, versus right-handed scissors. So it’s pretty safe to say, if you want to cut something, you have a pretty wide variety of choices if you need a pair of scissors.

In the same way that you can vary the way you make a pair of scissors without giving up its basic function, you can vary the way you make a protein without giving up its basic function. Remember, a protein is made by constructing a long chain of amino acids, and each amino acid is distinguished from the others because of its unique side chain. That makes each amino acid slightly different from all the others both chemically and physically. Some amino acids are large, some are small, some are electrically charged, some are not, some attract water, and some repel water. Depending on specific interactions between different amino acids in the chain, the protein will twist around itself and fold up in a very specific structure. Now comes the tricky part- you can get two very similar structures from two very different chains of amino acids. To help you follow along with me, try out another conceptual model- imagine that a protein, instead of being constructed from amino acids, is constructed from Legos. (I hope I’m not violating any copyright here.) Maybe I should say “small plastic construction blocks that are similar to Legos.” Whatever. Anyway, let’s say that you have a huge box of Legos, but the whole box only contains 20 different kinds of pieces. If I ask you to build me a pair of scissors out of Legos, how many ways do you think you could put the pieces together to get a decent Lego model of scissors? I haven’t actually tried this, but you could probably get quite a few, right? Probably a whole bunch. OK, well, in the same way that you can use many different combinations of Legos to give the same end product, you can use many different combinations of amino acids to give the same basic protein function. A more technical way of saying this is that for any given protein, there are many different amino acid sequences that are functionally redundant.

OK, this is all well and good, but what does it mean in terms of evidence for evolution? Well, you remember that I started by talking about Ubiquitous Proteins. These are proteins that are so essential to the basic functions of life that they can be found in every living organism. That is to say, their function is absolutely necessary, and what did we just learn about function? It can be produced from many different combinations of amino acids. So ubiquitous proteins are also functionally redundant in terms of amino acid sequence.

Now, before we look at the evidence, it behooves us to come up with hypotheses. This is part of the scientific method, and very essential. Without a hypothesis, we can’t draw meaningful conclusions- we’re just making observations. Now, we need to have two hypotheses- an evolutionary hypothesis and a null hypothesis. If the data support the evolutionary hypothesis, then we can conclude that evolution is the best explanation for the data. However, if the data support the null hypothesis, then we can conclude that evolution is not the best explanation for the data.

The null hypothesis posits that the evidence will show that amino acid sequences of ubiquitous proteins will not be highly similar between any two given organisms. We know that the null hypothesis is possible because the same protein function can be produced by many, many different amino acid sequences- that is, for any given protein, there are many amino acid sequences that are functionally redundant. Thus, since there are so many possible amino acid sequences for any given ubiquitous protein, there is no reason why each organism could not have a completely different amino acid sequence for any given ubiquitous protein. But, let’s say that the null hypothesis isn’t true- what other phenomenon could the evidence show? Well, if the evolutionary hypothesis is true, then different organisms are related to each other by heredity. Since, as I’ve mentioned before, the only mechanism which has been shown to result in similar sequences between organisms is heredity, the evolutionary hypothesis posits that the evidence will show that amino acid sequences of ubiquitous proteins will be highly similar between different organisms.

So, let me just go over those two hypotheses one more time before we look at the evidence. If evolution is not true, then we would expect to see that the amino acid sequence of a ubiquitous protein would be completely different in different organisms. If evolution is true, however, then we would expect to see that the amino acid sequence of a ubiquitous protein would be more similar between organisms that are closely related. And the more similar the sequence, the closer the hereditary relationship. OK, let’s look at the data.

Cytochrome C is a ubiquitous protein that is found in all organisms, including animals, plants, and bacteria. It’s essential for cellular metabolism, and helps to provide energy for all life processes. Cytochrome C fulfills the prediction of ubiquitous proteins- that is, it is extremely functionally redundant. Many different amino acid sequences have been shown to fold up into the basic structure required for Cytochrome C function, and in fact among bacterial strains, completely different amino acid sequences are redundantly functional. Experiments in yeast show that if you remove the yeast’s own Cytochrome C protein, you can replace it with Cytochrome C from humans, rats, pigeons, or even fruit flies, and it works fine. A published study estimates that there are, in fact, over 10^93 different possible functional amino acid sequences for Cytochrome C. That’s more possible sequences than there are atoms in the Universe. So, Cytochrome C is very functionally redundant, and it would be possible for every single different organism to have a completely different amino acid sequence, if evolution is not true.
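To get a feel for the size of these numbers, here is a rough back-of-the-envelope sketch in Python. The chain length of 104 amino acids (the length of human Cytochrome C) and the figure of roughly 10^80 atoms in the observable Universe are my own assumptions for illustration, not numbers taken from the study mentioned above. The sketch only compares orders of magnitude.

```python
import math

AMINO_ACID_TYPES = 20        # the twenty amino acids used in proteins
CYT_C_LENGTH = 104           # assumed chain length (human Cytochrome C)
FUNCTIONAL_LOG10 = 93        # "over 10^93" functional sequences, from the text
ATOMS_IN_UNIVERSE_LOG10 = 80 # assumed rough estimate for the observable Universe


def log10_sequence_space(length, alphabet=AMINO_ACID_TYPES):
    """Return log10 of the number of possible chains of the given length."""
    return length * math.log10(alphabet)


total_log10 = log10_sequence_space(CYT_C_LENGTH)
print(f"All possible chains of this length: about 10^{total_log10:.0f}")
print(f"Functional sequences (from the text): over 10^{FUNCTIONAL_LOG10}")
print(f"Atoms in the Universe (assumed): about 10^{ATOMS_IN_UNIVERSE_LOG10}")
```

Working in logarithms keeps the numbers manageable. Under these assumptions, the functional sequences are a tiny fraction of all possible chains, yet still vastly outnumber the atoms in the Universe.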

So what do the sequence comparisons show? Let’s compare humans and chimpanzees. If evolution is true, then chimpanzees are our closest relative, but if evolution is not true, we’re no more related to chimps than we are to crickets. But if you compare the Cytochrome C amino acid sequences of humans and chimpanzees, you see that they are exactly the same. Exactly the same. And when you compare human Cytochrome C to that of other mammals, you find that there are only about 10 amino acid differences between them. The chance of this happening without shared heredity is about 1 in 10^29. If you compare human Cytochrome C with that of the organism least related to us, outside of bacteria, you find that there are only about 51 amino acid differences between us. The chance of this happening without shared heredity is about 1 in 10^25.
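If you want to see what “counting amino acid differences” looks like in practice, here is a minimal sketch in Python. The three sequences are made-up toy strings of one-letter amino acid codes, not real Cytochrome C data, and the function names are my own. It also assumes the sequences are already lined up and the same length, which real comparisons have to work for.

```python
def count_differences(seq_a, seq_b):
    """Count positions where two pre-aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def percent_identity(seq_a, seq_b):
    """Percentage of positions that carry the same amino acid."""
    same = len(seq_a) - count_differences(seq_a, seq_b)
    return 100.0 * same / len(seq_a)


# Hypothetical toy fragments, invented purely for illustration.
species_a = "ACDEFGHIKL"
species_b = "ACDEFGHIKL"   # identical to species_a
species_c = "ACDQFGHIRL"   # differs from species_a at two positions

print(count_differences(species_a, species_b))
print(count_differences(species_a, species_c))
print(percent_identity(species_a, species_c))
```

The fewer the differences, the closer the inferred hereditary relationship, which is the logic used in the comparisons above.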

To review, protein functional redundancy is the phenomenon by which many different amino acid sequences can give the same function in any particular protein. This phenomenon means that closely similar amino acid sequences between organisms imply shared heredity. Examination of the amino acid sequence of a ubiquitous protein shows that different organisms have a greater sequence similarity than would be expected by chance, and thus supports the evolutionary hypothesis.

Saturday, March 18, 2006

Molecular Biology Primer

I’m going to embark on a six-part series of podcasts to present, as simply as possible, the molecular evidence for evolution. I’m extremely grateful for Dr. Douglas Theobald’s work in compiling not only these evidences, but two dozen additional evidences which can be found at www.talkorigins.org. Usually when people think of the evidence for evolution, they think of fossils. And certainly, fossil evidence is very substantial, making the case almost by itself, but we should be interested to know that evidence can be found in places other than under the ground- it can also be found inside all of us. That is- in the molecules that make up our bodies. Now, since the nature of this evidence is pretty technical, I want to preface it with a brief primer, so that I can flesh out the relationships between the relevant molecules that I’ll be discussing. So, hang with me as best as you can, because the evidences that will be piling up in the next few weeks are really astounding, in my opinion.

Molecular biology is a fascinating component of the biological sciences. It was born in the early part of the twentieth century out of a desire to find some way to unite the related fields of biological chemistry, microbiology (the study of microorganisms such as bacteria), genetics, and virology. The goal of molecular biology is to study biological systems by analyzing their macromolecular components. I’ll assume that most of you know that a molecule is nothing more than the smallest amount of any substance that still retains its properties, but what are macromolecules? They’re called “macro” molecules because unlike a molecule of, say, water, which is made up of only three atoms, macromolecules are composed of anywhere from dozens to thousands to millions of atoms, depending on the molecule. There are essentially four classes of macromolecules in biology- proteins, carbohydrates (sugars), nucleic acids (such as DNA), and lipids (fats). They are also somewhat unique in their ability to form polymers, or long chains of repeating segments. The longer the chain is, the larger the molecule.

Proteins perform most of the basic biological tasks in organisms- they form the internal structural support of cells, link cells together, cut up and assemble other proteins or nucleic acids, provide communication pathways between the inside and outside of a cell, immobilize and target invading microbes for destruction, and convert energy currencies to run the whole show. Carbohydrates and lipids are used primarily for energy storage, although they do a number of other things as well- I don’t want to slight those people who are interested in lipid biology- I come from a lipid background myself, and I know how essential they are, but I’d like to jump ahead to the final macromolecule, nucleic acids, and its connection with protein expression.

Nucleic acids form the central aspect of the replication of life. DNA is a nucleic acid, and is the beginning of the process that ends in production of a particular protein. DNA is a polymer, which means that it’s a long chain of subunits. These subunits, or nucleotides, come in four types, called adenine, cytosine, guanine, and thymine. These are usually abbreviated to the first letters of their names: A, C, G, or T. A DNA molecule is made up of only these four nucleotides, and they can be placed in any order. DNA molecules are millions of nucleotides long, which basically makes them very long string-like molecules. Unless they’re being copied, DNA molecules are usually wound up tightly around themselves- sort of like a telephone cord that’s been stretched too far and too many times. These wound up DNA molecules are called chromosomes- and humans have 23 pairs of them, or 46 total. The sequence of nucleotides that makes up a chromosome is copied every time a cell divides- in the process called mitosis. Mitosis occurs whenever new cells are being made- and this is happening in your body all the time. Skin cells, hair follicles, liver cells, muscle cells, bone marrow cells- all these cells are undergoing mitosis as you listen to this.

Mutations are mistakes in DNA replication. The molecular machinery that copies DNA during mitosis is not perfect, and it is susceptible to a number of factors, including radiation, certain chemicals, or viruses. Radiation, especially ultraviolet radiation, tends to affect adjacent thymine bases, so it’s not completely random, but it’s very close. But there is also a base rate of mutation that occurs randomly but at a measurable average rate, that results in one base being switched with another during copying. In humans, this rate is on the order of 1 to 2.5 mistakes per 100 million base pairs every generation. Taking the higher figure across the roughly 7 billion base pairs in both copies of the genome, this works out to about 175 total mutations per individual. If one of these mutations occurs in one of the cells that is transferred to the next generation- we call these “germ cells” and they would be either sperm in the male or eggs in the female- then the mutation is incorporated into the genome of the next generation.
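The arithmetic behind a per-individual mutation count is just the mutation rate multiplied by the size of the genome. Here is a rough sketch in Python. The genome size of about 7 billion base pairs (counting both copies of every chromosome) and the higher rate of 2.5 per 100 million are my assumptions for illustration. Under those assumptions the total comes to about 175, while a rate of 1 per 100 million gives about 70.

```python
# Assumed size of the human genome counting both chromosome copies.
DIPLOID_GENOME_BP = 7e9


def expected_mutations(rate_per_bp, genome_bp=DIPLOID_GENOME_BP):
    """Expected new mutations per generation = rate per base pair x base pairs."""
    return rate_per_bp * genome_bp


# Two illustrative rates: 1 and 2.5 mistakes per 100 million base pairs.
for mistakes_per_100_million in (1.0, 2.5):
    rate = mistakes_per_100_million / 100_000_000
    total = expected_mutations(rate)
    print(f"{mistakes_per_100_million} per 100 million bp -> about {total:.0f} mutations")
```

The exact totals depend on which rate and genome size you plug in, so treat these as ballpark figures rather than measurements.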

This is an important concept- since we observe time and time again that inheritance is the mechanism for transfer of mutation from one generation to the next, we can infer genetic relationships between organisms based on shared mutations. For example, let’s say that your grandfather was the first person to have a unique and dominant mutation, call it “Mutation X”, which was passed on to all of his children, including your father, and then on to you. You happen to meet someone who claims to be a long-lost cousin, but how do you know? If you were to compare your DNA sequence to this supposed cousin and find that they had Mutation X as well, that would be genetic proof that you share the same grandfather. Thus, shared DNA sequence implies shared ancestry.

OK, so that’s how DNA works, but how do you get protein from DNA? Well, as I’ve mentioned before, the DNA sequence of most organisms is divided up into transcribed and non-transcribed parts. The transcribed parts are called “genes.” Gene transcription is the process by which an RNA copy is made of a DNA sequence. RNA is similar in structure to DNA, but it isn’t used as the genetic storage molecule. Instead, it’s used as an intermediate to ferry copies of the DNA sequence out of the nucleus of the cell and into the main part of the cell, where protein is made. RNA is kind of like a librarian who goes into the basement of the library, makes a photocopy of a book, and then brings the photocopy to a person who requested it. It’s basically an exact copy of the original gene, but constructed out of RNA nucleotides, instead of DNA nucleotides. These copies are called transcripts, because we talk about RNA being transcribed from DNA.

Proteins are also polymers, or long chains of subunit molecules. Instead of being made out of nucleotides, however, proteins are made out of amino acids. Now, whereas there are only four different nucleotides that are incorporated into DNA, there are twenty different amino acids that are incorporated into proteins. That means that it would be impossible to have a 1:1 relationship between a nucleotide and an amino acid sequence- there just are too many amino acids. So what’s the solution? The solution is that there is a kind of code in the nucleotide sequence that requires it to be subdivided into three nucleotide groups. This way, a sequence such as AGTCTCGAATCC would be read, AGT, CTC, GAA, TCC. These groups of three nucleotides are called codons, because they are the individual units of the genetic code. Since there are 64 possible codons, that makes plenty of possible amino acid counterparts- too many, in fact. Since there are 64 possible codons but only 20 possible amino acids, that means that there are multiple codons that correspond to the same amino acid. The RNA sequence is used directly to make the amino acid sequence, in a process called translation.
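Here is a small sketch in Python of the codon idea described above, using the same example sequence. For simplicity it reads the codons straight from the DNA letters rather than from an RNA copy, and the lookup table is only a four-entry excerpt of the standard genetic code, just enough to cover the example. The function and variable names are my own.

```python
from itertools import product

NUCLEOTIDES = "ACGT"

# Every possible group of three nucleotides: 4 x 4 x 4 = 64 codons.
ALL_CODONS = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

# Partial excerpt of the standard genetic code (DNA letters), for this example only.
PARTIAL_CODE = {"AGT": "Ser", "CTC": "Leu", "GAA": "Glu", "TCC": "Ser"}


def split_into_codons(dna):
    """Break a DNA sequence into consecutive three-nucleotide groups."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Look up each codon in the partial table (raises KeyError if missing)."""
    return [PARTIAL_CODE[codon] for codon in split_into_codons(dna)]


print(len(ALL_CODONS))
print(split_into_codons("AGTCTCGAATCC"))
print(translate("AGTCTCGAATCC"))
```

Notice that AGT and TCC both come out as serine. That is the many-codons-to-one-amino-acid situation in miniature.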

Amino acids are themselves somewhat similar in structure to a nucleotide- there is a base structure that is composed of an amino group and a carboxylic acid group- hence the name, amino acid. But each amino acid also has room for another group, called a side chain- and it’s the various structures of the side chain that make one amino acid different from the other. Some amino acids are electrically charged, and some have no charge. Some amino acids associate well with water, others are repelled by water. Some amino acids are very large, and others are very small. All of these factors come into play in the final product, the protein molecule. Ultimately, a protein is just a long chain of amino acids, just like DNA is a long chain of nucleotides. But instead of staying a long, floppy string of amino acids, proteins fold up into specific conformations, depending on the specific amino acids that are used to make them. Chemical bonds between different amino acids cause parts of the chain to stick together, and specific orders of amino acids can cause the chain to fold back and forth or spiral around itself, much like DNA does. Because of all this folding, each protein has a different appearance, or what we call a structure. And it’s this structure that makes a protein able to do the specific things that it can do- all the things that I mentioned at the beginning of this episode.

All right, that’s a lot of information to soak up. Let me just go over the basics again. DNA is made up of a chain of four different nucleotides. The nucleotide sequence is transcribed into RNA, which is then translated into an amino acid sequence. The translation is carried out by virtue of the genetic code, in which 64 different 3-nucleotide codons are translated into 20 different amino acids. The specific order of amino acids confers physical and chemical properties to the final protein, influencing the way it is folded up into its final structure. And the structure of the protein is directly related to its function.

Monday, March 13, 2006

What is Evo-Devo?

Evo-Devo is a combination of two disciplines within the field of biology: evolutionary biology and developmental biology. The realm of evolutionary concepts should be fairly familiar to you by now, but what is developmental biology? Developmental biology is the study of how organisms develop from a single cell through all the intermediate embryological stages, all the way to birth. Evolutionary developmental biology, then, or Evo-Devo for short, is a way to look at the way that the mechanisms of development have been influenced by evolutionary forces.

And this, of course, is a logical collaboration between two different biological camps. And it’s a fairly typical overlap, as well- remember that in biology, you almost never find things that are black or white, instead you find things that are various shades of grey. Likewise, different biological disciplines find themselves overlapping with others all the time. My field is molecular biology, which is pretty general really, because it focuses on the molecular pathways, the individual genes and gene products that contribute to physiological function and pathological condition. But our entire bodies are made up of molecules, so depending on the project of interest, any given molecular biologist could be overlapping in cardiology, in neurology, in gastroenterology, in immunology, or any number of other “ologies.” The same could be said about a cell biologist, or a biological chemist. Or even an ecologist, I suppose, although my own personal paradigm in biomedical research is preventing me from thinking of a good example for some kind of collaboration between a molecular biologist and an ecologist, I’m sure there are plenty out there. My point is that a collaboration between evolutionary biology and developmental biology is not odd or unique, and the only reason it’s become recently popular is because of some pretty powerful discoveries, which I’ll get to a little bit later.

First, I’d like to defuse or debunk one of the criticisms of Evo-Devo from the creationist camp. It seems like there’s really no aspect of evolutionary biology that creationism hasn’t taken pot shots at over the years, and most of these criticisms are more pervasive in the popular consciousness than the actual science, for a number of reasons. This particular criticism, like most of the others put forth by creationism, has sort of been left behind by scientific progress, but since so many are unaware of current scientific thought, it can be somewhat successful. You may remember that I mentioned the Jack Chick tract “Big Daddy,” when I was talking about “What is NOT Evolution,” several weeks ago. Well, you can find this creationist criticism here also (by the way, not to get off on a tangent, but it does seem to me that it’s very rare to find an actual Argument put forth by creationism- much more often, it’s a criticism of one aspect of evolutionary theory). Midway through the comic, the Evil Evolutionary Professor tells his class that, “Here is proof of evolution. Human embryos have gill slits proving man evolved through the fish stage millions of years ago.” All this while thinking, “I hate him,” about the Saintly Creationist Student who is challenging him. And the Student replies, “Sir, Ernst Haeckel made up those drawings in 1869 and they were proven to be wrong in 1874. Those folds of skin are not gills. They grow into bones of the ear and glands in the throat.” And another student comments, “Wow, 125 years wrong and still in our book!”

So what is the truth behind this criticism? Well, Ernst Haeckel was a German scientist who accepted evolutionary theory fairly strongly, although he was somewhat torn between Darwin’s theory of natural selection and Lamarck’s theory of evolutionary ontology. He won some popularity with his analysis of the embryological stages of different animals, and in fact published a theory which is now referred to as “recapitulation theory.” According to Haeckel, if you line up embryos from different vertebrates at similar stages of development, there are obvious anatomical similarities between all organisms. He published drawings that he had made of these embryos in 1874, to back up his claim. Now, the “Big Daddy” comic says that Haeckel “made up” these drawings. This is not true. At worst, Haeckel deliberately overemphasized the similarities between the different organisms in the way he drew them, and at best, he didn’t realize that he had drawn his conclusions into the figure subconsciously. They’re clearly not made up out of whole cloth, as the comic implies. It would be more accurate to think of Haeckel treating his drawings the way magazines airbrush pictures of models to remove blemishes and overemphasize certain characteristics. A magazine photographer might airbrush a model to have larger breasts and thinner thighs, and Haeckel airbrushed a drawing of a human embryo to have larger gill slits and a longer tail. That doesn’t excuse the inaccuracy, of course. A magazine might only be selling you perfume or lingerie, but a scientific paper is supposed to be a pretty clear representation of the truth.

So, the guy fudged his drawings to support his theory. But what is recapitulation theory anyway? Haeckel thought that the evolutionary development of an organism was carried out again, or recapitulated, during its embryological development. A short, catchy way to say this is, “Ontogeny recapitulates phylogeny.” Ontogeny means the development of an organism from embryo to adult, and phylogeny means the evolutionary development of an organism from ancestral to modern species. Essentially, this means that as an organism develops from embryo to adult, it passes through a series of intermediate forms which approximate ancestral species. For example, a human embryo would pass through a fish stage, then an amphibian stage, then a reptile stage, then a bird stage, then a general mammal stage, and finally the human stage. It’s pretty clear that this theory is bunk- evolutionary science has rejected this theory almost completely… almost. While it’s pretty clear that human embryos don’t actually become fish, they do share a number of characteristics with fish embryos. For example, the notorious gill slits. In the “Big Daddy” comic the Student criticizes the Professor for mentioning gill slits, because they’re not gills. Excuse my glibness, but DUH! It’s well known in developmental biology that gill slits aren’t gills. I’m not sure why this is supposed to be such a shocking revelation by the Student, other than the fact that the comic was likely written by someone who has no knowledge of developmental biology or willingness to look it up. Gill slits are not gills- they’re often called “pharyngeal pouches” because they occur in the throat, which is technically known as the pharynx. They look somewhat like gills, hence the name. The Student is correct in saying that they develop into ear bones and throat glands, but he leaves out a little bit. 
Ear bones develop in mammals only- in reptiles, these bones are part of the jaw, and the intermediate stages of this transition are an excellent example of evolution in the fossil record. Otherwise, the first two slits become the jawbone, and the other slits become different anatomical structures in different organisms. But importantly, in fish, surprise, surprise, the gill slits become… GILLS!

And this really gets to the essence of what we can retrieve from Haeckel’s theory. Clearly, ontogeny does NOT recapitulate phylogeny. However, ontogeny does organize according to phylogeny. By which I mean, we can look at the developmental forms of different organisms and infer evolutionary relationships between them. So, to compare a fish embryo, a chimpanzee embryo, and a human embryo, is to show pretty clearly that a chimpanzee and a human develop much more like each other than either one develops like a fish. All three start off with the same number of pharyngeal slits, but only chimpanzees and humans form ear bones, and only the fish forms functional gills. And we don’t have to rely on Haeckel’s drawings, either- most biology textbooks use photographs of embryos, which may be a bit harder to interpret, but which are at least more accurate than Haeckel’s drawings.

But evo-devo isn’t limited to examining embryos. Nowadays the most exciting research in this field is directly or indirectly related to the domain of molecular biology and genetics. That’s right- just like virtually everything else in biology, it comes down to genes. Hox genes, specifically. Hox is short for homeobox, and refers to a region of DNA within a particular gene that allows that gene to turn on or off other genes once it’s been translated into a protein. Hox genes function to promote embryonic development and to structure the developing body plan. Different Hox genes are expressed at different locations along the body, from head to tail, and these signals allow for the expression of other genes which activate anatomical characteristics that are specific to one region of the body. The early work in understanding Hox genes was done (as was most genetics research) in fruit flies. By changing the order in which Hox genes were turned on or off, researchers could cause legs to grow where antennae should be, or cause a second pair of wings to form. Not surprisingly, Hox genes are remarkably highly conserved among vertebrates, and, to a lesser extent, even among invertebrates. Even more interestingly, the Hox genes are first activated at the stage in embryologic development just prior to observable differences between different organisms. These findings show that the regulation of gene transcription is a remarkably potent force in evolutionary development, and may have a higher impact than direct mutations on specific genes.

So, let’s review. Evolutionary biology and developmental biology join forces to study how evolution affects embryologic development. The creationist criticism of Ernst Haeckel’s embryo analysis is over a century too late, since evolutionary theory rejects Haeckel’s theory that ontogeny recapitulates phylogeny. However, work on evo-devo shows that not only are evolutionary relationships evident when comparing the development of different organisms, but there exists a genetic mechanism for these relationships in the modulation of the Hox genes.

Saturday, March 04, 2006

What is Junk DNA?

Chris Morris asks “What are currently the best explanations for the origin and function (if any) for so called ‘junk DNA’?”

What does it mean to talk about “junk” DNA? Well, first of all, it’s not a scientific concept, and so it’s extremely vulnerable to confusion, especially by laypeople. Briefly, “junk” DNA refers to the content of a genome that does not contain functional genes. A more accurate term to use is “noncoding” DNA, because “junk” is a pretty subjective adjective. One man’s junk can be another man’s treasure, as anyone who’s ever shopped at a yard sale knows well. The same thing could be said, more or less, about “junk” DNA.

An organism’s genome is made up of the sum total of all the genetic information it contains. In most organisms, this is divided up into distinct units called chromosomes. Each chromosome, in turn, is a long chain of nucleotide bases, millions and millions of bases long. The analogy is often used of a genome being compared to a library of books, with each separate bookshelf compared to a separate chromosome. Each book represents a section of the chromosome, and contains different stories, which represent individual genes.

The problem with this analogy is that in the books, the stories are separated by pages and pages of garbled text, that aren’t meaningful as stories at all. In addition, the stories themselves are cut up into many different parts, each separated by pages or sections of pages of nonsense text. I’ve never seen a book like this, but I have seen plenty of magazines. To understand this better, it helps to think of noncoding DNA as advertisements in a magazine, and the genes as individual articles. Usually there are pages and pages of advertisements that separate each article from each other, and the articles themselves are often split up. A three-page article might start on page 50, be interrupted by ads on page 51, resume on page 52, be interrupted again on page 53, and then finish on page 54. Although the article itself only took up three pages in the magazine, it was a full five pages from start to finish, if you include the advertisements.

Genes are split up like this in the genome. If you examine the genomic sequence that results in the expression of a particular protein, you’ll find that there are segments of the sequence that don’t actually translate into protein sequence, but which separate regions of the sequence that do. In molecular biology, the regions of the genomic sequence that are translated into protein are called exons, and the regions that are not are called introns. So, exons are analogous to the article itself in a magazine, and the introns are analogous to the ads.

Now, the obvious question is, why have introns? Just like you could go through a magazine, cut out the advertisements, and lose none of the article, it’s also possible to cut out the introns from a genomic sequence and get normal expression of a gene. (An intron-free copy of a gene like this is called cDNA; briefly, it’s made by reverse-transcribing mRNA.) It’s this question that gets most creationists fidgety- having introns just seems more than a tad inefficient, since each cell has to expend some energy in cutting them out during gene expression. The criticism has been made that a perfectly efficient Creator wouldn’t design a gene expression mechanism and then clutter it up like ads clutter up the average magazine. Well, it turns out that having gene expression work this way actually makes great evolutionary sense. You see, if an organism is able to radically modify an existing gene, then it might be able to use it for a different purpose. A good comparison for this is something like a cordless electric screwdriver. It would cost too much to buy a dozen different screwdrivers, each with a different bit, and so these screwdrivers come with interchangeable bits that all interlock with the motorized axle. This lets you use the same basic function for different applications. Many genes are like this also. It turns out that the existence of introns allows the gene expression machinery to decide which exons to include in the final gene product. This process is called “alternative splicing,” and it effectively increases the variety of gene products a genome can make without being dependent on individual mutations. Instead, any given gene can produce alternatively spliced versions of itself that may be advantageous in different situations.
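If it helps to see alternative splicing written out, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the gene, its exon and intron sequences, and the splice function are all made up, and real splicing is carried out by cellular machinery that recognizes signals in the RNA, not by name lookup. The only point is that one gene layout can yield more than one final product depending on which exons are kept.

```python
# Hypothetical toy gene: exons in upper case, introns in lower case.
# These sequences are invented for illustration and do not come from a real gene.
GENE = [
    ("exon1", "ATGGCC"),
    ("intron1", "gtaagtttcag"),
    ("exon2", "GATTAC"),
    ("intron2", "gtgagtccag"),
    ("exon3", "TTGTAA"),
]


def splice(gene, keep_exons):
    """Join the chosen exons in their original order, dropping everything else."""
    return "".join(seq for name, seq in gene if name in keep_exons)


# Keeping every exon gives one product; skipping exon2 gives a shorter variant.
full_product = splice(GENE, {"exon1", "exon2", "exon3"})
skipped_product = splice(GENE, {"exon1", "exon3"})

print(full_product)
print(skipped_product)
```

Both products come from the same stretch of DNA, which is the screwdriver-with-interchangeable-bits idea in miniature.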

So, even though intronic sequences are noncoding, it’s pretty clear that they’re not useless.

Well, okay, that’s all well and good for the noncoding DNA that exists within a gene itself, but what about the long stretches of noncoding DNA that separate each gene from the next?

Well, that’s not completely worthless either. Around each gene in the genome, within boundaries that aren’t sharply defined, there is a region of noncoding DNA that still plays an important role in gene expression. These regions are called regulatory sequences. Imagine that the gene expression machinery is a road crew truck filled with safety barrels. The road crew only wants to put the barrels down where there’s going to be work done, and so it looks for a sign along the road to guide it. Let’s say the work is going to be done between mile markers 13 and 14 on a particular highway. Well, the road crew is going to watch for the mile marker 13 sign on the side of the road, and then they’re going to start putting barrels down. Regulatory sequences work in a similar way. The gene expression machinery is looking for a gene sequence in the genome, so it can start making protein. But in order to start transcribing the DNA, it needs to know where to start. The regulatory sequence is a physical marker for this, in that it physically interacts with the machinery. Once the machinery binds to the regulatory sequence, it can start transcribing the sequence downstream, even though the regulatory sequence itself doesn’t get transcribed. Because this sequence promotes the transcription of the gene it’s next to, it’s called a promoter sequence, and it’s very important.
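The mile-marker idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not a model of real transcription: the motif "TATAAA" stands in for a promoter (it resembles the well-known TATA box), while the gap and length values, the example sequence, and the function name are hypothetical choices made only for illustration.

```python
# Assumption: a single short motif acts as the "mile marker 13" sign.
PROMOTER_MOTIF = "TATAAA"


def transcribe_from_promoter(dna, motif=PROMOTER_MOTIF, gap=3, length=12):
    """Find the motif, skip `gap` bases, and return the next `length` bases as RNA.

    Returns None when the motif is absent, because without the marker the
    machinery has no signal telling it where to start. The motif itself is
    never part of the returned transcript.
    """
    pos = dna.find(motif)
    if pos == -1:
        return None
    start = pos + len(motif) + gap
    # Simplification: read the coding strand directly and swap T for U.
    return dna[start:start + length].replace("T", "U")


# Made-up sequence: filler, promoter motif, a short gap, then the "gene".
example_dna = "GGCC" + "TATAAA" + "GCG" + "ATGGCCGATTAC" + "TTT"
transcript = transcribe_from_promoter(example_dna)
no_transcript = transcribe_from_promoter("GGCCGCGATGGCC")

print(transcript)
print(no_transcript)
```

Notice that the second sequence contains gene-like bases but no marker, so nothing gets transcribed from it in this sketch.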

So, here are two clear-cut examples of how noncoding DNA is actually essential to the proper expression of genes in the genome, despite the fact that it never becomes a protein. It’s at this point that creationists often crow about the “supposed junk DNA” that isn’t really junk at all. Clearly, there’s an important purpose to this junk DNA, so it can’t be used as a criticism against special creationism, right?

Well, not exactly. You’ll notice that most of the concepts I discuss here are not clearly black or white, and this is no exception. While it is true that there are some sequences of noncoding DNA that are important, there is far more noncoding DNA that has no recognizable purpose. For example, there are short repeating segments of DNA called microsatellite regions. These differ wildly between individuals, and are most commonly used to screen for genetic parentage. There are regions of DNA that do nothing but shuffle around inside the genome itself, called transposons and retrotransposons, depending on their mechanism of mobility. One family of these, called Alu sequences, makes up close to 10% of the human genome. There are genes which have become broken and do not work anymore, called pseudogenes, that still inhabit the genome despite being completely nonfunctional.
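To make the microsatellite example a little more concrete, here is a rough sketch of how the number of back-to-back repeats at one spot could be counted and compared between two people. The sequences, the repeat unit "CA", and the function name are all hypothetical; real parentage testing looks at many such regions with laboratory methods, not a single string search.

```python
import re


def longest_repeat_run(dna, unit):
    """Return the largest number of back-to-back copies of `unit` found in `dna`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", dna)
    return max((len(run) // len(unit) for run in runs), default=0)


# Made-up sequences: same flanking bases, different numbers of "CA" repeats.
person_a = "GGT" + "CA" * 5 + "TTG"
person_b = "GGT" + "CA" * 9 + "TTG"

count_a = longest_repeat_run(person_a, "CA")
count_b = longest_repeat_run(person_b, "CA")

print(count_a, count_b)
```

The flanking sequence is identical in both toy individuals, and only the repeat count differs, which is the kind of variation that makes these regions handy for telling individuals apart.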

Now, despite the fact that most of the noncoding genome could be considered truly junk, creationists often raise the objection that there could be some unknown purpose for the rest of the noncoding genome that science has not yet discovered. This may be true, but it is not a good argument. We should no more assume that all noncoding DNA isn’t junk because we haven’t yet found a use for it than we would assume that all rocks are gems just because we haven’t yet found anyone who wants to wear a limestone necklace.

So, just to review, the majority of a genome (at least, the human genome) is noncoding sequence. Some of this noncoding sequence is truly junk, as is the case with transposons, pseudogenes, etc., and some of it is important for proper genetic expression. It’s not all junk, but it’s not all gold, either.