You've seen how information about DNA itself can be used to reconstruct phylogeny. The traditional way of approaching the problem involves identifying homologous morphologic features in different groups to find similarities. During the 1800s, fossil collections in museums accumulated at a rapid pace as paleontology grew into a discipline. Knowledge about living organisms expanded rapidly too, as Darwin, Wallace, and other field biologists explored the Earth and collected, tabulated, and measured. The pace of discovery continued through the last century, so that by now we have well-supported phylogenies for all major groups of Life, mostly built by comparison of morphological features -- DNA studies are new and welcome additions to the traditional morphology-based approach.
Studies of evolution using morphology as a guide have ranged from rough surveys based on expert opinion to modern, rigorous computer-based analyses. In many areas of science there is a gradual march toward methods that better satisfy the need for rigor and testing. For morphology-based, and later DNA-based, phylogeny studies, this refinement of methods came about especially with the work of Willi Hennig, a specialist on insects. If any group shows great diversity in morphology, it is the insects, the largest of all groups. Hennig studied flies and their classification, using methods that he developed to carefully compare species and the characteristics they share. The pattern of sharing of morphological features, just like the pattern of sharing of SINES, is the key to reconstructing phylogeny. Hennig's methods came to be known as cladistics, or phylogenetic systematics. You've heard these words in one form or another, but here's a review of their meaning in this context:
For cladistics, the root word is clade, referring to the part of a phylogenetic tree that descends from a single node. A clade is a monophyletic group, which strictly means that it starts at that node and contains all descendant parts of the tree extending from that node. It is really simple. As Darwin requested, our classifications of Life should be based on genealogy, just as the tree of your own family is drawn according to heredity. In the following trees, only the first one has a clade (monophyletic) marked:
A clade starts at a node and contains everything extending from that node. A clade is monophyletic, and is said to refer to "an ancestor and all of its descendants."
A clade is complete (not partial) -- nothing is left out of the part of the tree extending from the node to which the clade refers. Consider the group name Reptilia, which you probably learned as including turtles, lizards, snakes, crocodiles and alligators, and dinosaurs. Fossil discoveries show, however, that birds are also members of this group (birds were left out of Reptilia, which made Reptilia invalid). Putting birds inside Reptilia, actually inside Dinosauria, makes Reptilia valid, because it then refers to an ancestor, the reptilian ancestor at a node on the family tree, and all of its descendants (including birds).
A clade does not contain a non-contiguous set of branches. A group assembled from widely scattered branches is called polyphyletic, and a group that leaves out some descendants of its common ancestor, as Reptilia did without birds, is called paraphyletic (para, for "beside" or "near"). Neither is a natural, genealogical group, yet a name in careless use can turn out to be defined this way. That's a good point to ponder -- Why would anyone define such a group in the first place? Well, they wouldn't -- on purpose. Such groupings are mistakes that are discovered when a new study is done, or when more information becomes available and changes the evidence on which the reconstruction is based. The heavy lines would have been clustered together in an original designation of the group, but the "spreading out" of the members of the original grouping shows it to be invalid (it doesn't contain an ancestor and all of its descendants, but only widely scattered, more distant relatives). A monophyletic group will plot as a contiguous cluster of branches on a tree.
For phylogenetic systematics, the root word is phylogeny, or evolutionary history. A phylogeny is a genealogy, essentially what we informally call a family tree. Groups named on a phylogeny must be monophyletic, as outlined above. Named groups are clades, natural groups that comprise an ancestor and all of its descendants.
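The monophyly test described above can be sketched in a few lines of Python. This is only an illustration: the nested-tuple tree, the names `TREE`, `leaves`, and `is_monophyletic`, and the placement of turtles are assumptions made for the example, not a published phylogeny. The check asks whether some node on the tree has exactly the named group as its set of tips.

```python
# Simplified, illustrative tree written as nested tuples (an assumption for
# this sketch; frogs serve only as an outside group, and the position of
# turtles is not meant as a claim about their true relationships).
TREE = ("frogs", ("turtles", (("lizards", "snakes"), ("crocodiles", "birds"))))


def leaves(tree):
    """Return the set of taxon names at the tips of a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def is_monophyletic(tree, group):
    """True if some node of the tree has exactly `group` as its set of tips."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


OLD_REPTILIA = {"turtles", "lizards", "snakes", "crocodiles"}
REPTILIA_WITH_BIRDS = OLD_REPTILIA | {"birds"}

print(is_monophyletic(TREE, OLD_REPTILIA))         # birds left out
print(is_monophyletic(TREE, REPTILIA_WITH_BIRDS))  # birds included
```

On this toy tree the first call reports that the group is not a clade and the second reports that it is, mirroring the Reptilia example: the name becomes valid only once birds are included.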
As you will see below, cladistics, or phylogenetic systematics, is a careful effort to reconstruct evolutionary history for a group of species. The care is taken so that "testability," "repeatability," and "openness," all important qualities for scientific studies, are foremost. When you do something systematically, like fixing your car, you don't go stabbing in the dark for answers; you carefully examine your car, find various symptoms of the problem at hand, and try out (test) various possible causes. A good mechanic has the experience to do the job in the simplest way, which is also the most cost-effective. Simple is better. More systematic is better.
Regarding "openness," a little background on philosophy is needed. For science to work well, all available evidence pertaining to a problem should be studied and described. A scientist should not pick and choose evidence, whether on purpose or not, that fits with one explanation. A scientist should strive to keep an open mind, and to share openly with other investigators any and all bits of information. This reduces the risk of science becoming too "expert opinion-based," because that makes it hard for a fellow investigator to tell how the work was done. The ability to repeat and test a study is important.
A taxon is a named group. It is a clade. It can be a single species (because a species is really a clade, made by a group of populations). Traditionally, taxa (singular taxon; plural taxa) have been designated by formal rank, as with Kingdom, Phylum, Class, Order, Family, Genus, Species -- the formal ranks used in the Linnaean classification system you've probably learned before. These ranks aren't absolutely necessary. Many of the traditionally defined taxa are valid, but the idea of ranking them one against another is not that meaningful, as it supposes some sort of equivalence. For example, is a family within the horse lineage, with perhaps just a few species in it, somehow equivalent to a family within the beetle lineage, with perhaps several thousand species? No. So, let's not worry about these ranked categories here, although you certainly will encounter them in the older literature and in publications that continue to use them.
Take any three organisms or groups (taxa), let's call them A, B, and C. There are only three ways that three taxa can be related. There are only three ways that evolution could have happened to result in the three groups. Here are the three ways:
Three trees, each with the same simple branching pattern, are possible for three taxa. They differ only in which taxon sits on which branch, and there are three ways to place the taxa. One of these is the way it actually happened. Our task is to identify it, as best we can given available evidence. If we were using SINES in DNA, we could perhaps identify it with high confidence. But with morphology and other types of DNA evidence, we need to find out which solution best fits the evidence. The best fit is the simplest explanation.
Why the simplest? Because more complicated explanations are more complicated. This principle of reasoning is called Ockham's Razor. It is defined in different ways in different contexts, but keep one thing central in your mind: If we let our investigations be free-for-all grab bags of information and ideas, where individual scientists argue this way or that without having to explicitly show evidence, we lose the ability to be careful, or to know that we are at least trying to be careful. Using simplicity as a criterion for selecting among several alternative explanations offers important structure to thinking. Without it, we could be stabbing in the dark, counting on luck or intuition to give us answers. Cladistics, the effort to reconstruct the pattern of evolution, offers a special opportunity to work in a structured way, because it involves choice between discretely different patterns. Look back at the three-taxa problem above. That's about as fundamental as anything gets. There are three ways that A, B, and C could be related -- three ways that evolution could have happened. Our task is to identify the one we think is the way it actually happened.
Instead of using the generic A, B, C taxa, let's use real taxa. Cats, dogs, and frogs work well. These informal references will work fine (there are formal names like Felidae, Canidae, and Anura, but we'll skip those here). So, there are three ways that these taxa could be related:
Each of these three trees explains a possible way evolution could have happened. Each must be tested. The cat and the frog could be more closely related (Hypothesis 1). The cat and the dog could be more closely related (Hypothesis 2). Or, the dog and the frog could be more closely related (Hypothesis 3). You can pick the simplest explanation, because you know that the cat and the dog are mammals and the frog is an amphibian, but follow along to learn the fundamentals of cladistics.
There are SINES that could be used as characters, and plenty of information from gene sequencing of these groups could be used, but here we are focusing on morphology. Here are a few characters from the study of morphology for these groups (Note the use of the terms character, instead of characteristic, and character state):
Character: Body Covering
cats: hair
dogs: hair
frogs: bare skin
Character: Parental Care
cats: mammary glands
dogs: mammary glands
frogs: none (no parental care)
Character: Metabolism
cats: warm-blooded metabolism
dogs: warm-blooded metabolism
frogs: cold-blooded metabolism
We could list a huge number of characters that could be compared, and they wouldn't all show the same pattern as the three used here, but even if all known characters were considered here, the cat-dog pairing (Mammalia) would be the simplest solution:
The middle branching pattern is shown to be the simplest explanation (Hence the Eureka! label). It is the most parsimonious explanation, if you like bigger words. The other two possibilities would involve loss of various "mammal" features by the frog, and would represent more complicated evolutionary histories (possible, but not as probable, using the criterion of simplicity). We have left out a huge number of other animals (fishes, other amphibians, other mammals, reptiles) that would plot between these taxa. Complete analyses that include them show even more clearly that Mammalia is a strongly supported clade. Taking advantage of these broader studies, and as you probably know already, these three taxa represent the two major Vertebrate animal clades Tetrapoda and Mammalia:
The cat and the dog are mammals, and within this three-taxon tree the node they share stands for the common ancestor of all mammals. The three taxa, cats, dogs, and frogs, all with four legs, share the common ancestor of all tetrapods (lungfishes are the closest living fish relatives of tetrapods, and other fishes are more distant). When a grouping of taxa is well supported, as Mammalia is, the character states that were new in its common ancestor are recognized as synapomorphies of the group: syn for "shared," and apomorphic for "new features." For Mammalia, synapomorphies include hair and mammary glands, both listed above, and three middle ear bones. These features were not present in the more distant common ancestors shared with frogs and other animals.
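To make "simplest" concrete, here is a minimal sketch that counts the fewest character-state changes each of the three hypotheses needs for the characters discussed in this section (body covering, parental care, metabolism). Two things are assumptions made for the example and are not stated in the text: the ancestral state at the root of each tree is taken to be the frog's state (as comparison with fishes would suggest), and every change, gain or loss, costs one step. The names `CHARACTERS`, `ANCESTRAL`, `HYPOTHESES`, `min_changes`, and `tree_length` are hypothetical.

```python
# Character states at the tips, following the characters discussed above.
CHARACTERS = {
    "body covering": {"cat": "hair", "dog": "hair", "frog": "bare skin"},
    "parental care": {"cat": "mammary glands", "dog": "mammary glands", "frog": "none"},
    "metabolism": {"cat": "warm-blooded", "dog": "warm-blooded", "frog": "cold-blooded"},
}

# ASSUMPTION for this sketch: the root of each tree carries the frog's state.
ANCESTRAL = {
    "body covering": "bare skin",
    "parental care": "none",
    "metabolism": "cold-blooded",
}

# The three possible rooted trees, written as nested tuples.
HYPOTHESES = {
    "Hypothesis 1 (cat + frog)": (("cat", "frog"), "dog"),
    "Hypothesis 2 (cat + dog)": (("cat", "dog"), "frog"),
    "Hypothesis 3 (dog + frog)": (("dog", "frog"), "cat"),
}


def min_changes(tree, tip_states, states):
    """Map each state to the fewest changes needed below this node
    if the node itself has that state."""
    if isinstance(tree, str):
        # A tip can only have its observed state.
        return {s: 0 if s == tip_states[tree] else float("inf") for s in states}
    cost = {s: 0 for s in states}
    for child in tree:
        child_cost = min_changes(child, tip_states, states)
        for s in states:
            # One extra step whenever the child's state differs from s.
            cost[s] += min(child_cost[t] + (s != t) for t in states)
    return cost


def tree_length(tree):
    """Total number of changes over all characters, root fixed as ancestral."""
    total = 0
    for name, tip_states in CHARACTERS.items():
        states = set(tip_states.values())
        total += min_changes(tree, tip_states, states)[ANCESTRAL[name]]
    return total


for label, tree in HYPOTHESES.items():
    print(label, tree_length(tree))
```

Under these assumptions the cat-dog tree needs three changes (one gain per character), while each of the other two needs six, so the cat-dog pairing is the most parsimonious of the three. Real analyses use many more characters and taxa, and they establish the direction of change from outgroups rather than assuming it.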
As we add other taxa to a question of relationship, an important phenomenon appears: the number of possible trees increases rapidly. The increase comes, in part, from the greater number of ways to arrange the taxa on the trees. But there are also more unique branching patterns, or topologies, when more taxa are added to the problem. For four taxa, there are 15 trees possible, with two topologies:
The simply bifurcating branching pattern (topology) is the easiest to think about. That accounts for 12 of the 15 possible trees. The other three are on the other topology possible for four taxa, a "double-branching" pattern.
There are three branching patterns possible for five taxa, and 105 trees possible (Imagine 105 trees, varying with these branching patterns).
If you ask how any five organisms are related -- if you ask what is their evolutionary history -- there are 105 ways you can answer the question. There are 105 different ways that evolution could have occurred to produce the lineages in which these five taxa belong. Recall that for three taxa, there were only three ways evolution could have happened.
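The counts quoted above can be checked by brute force. The sketch below builds every rooted, bifurcating tree by adding one taxon at a time to each branch of every smaller tree, then groups the results by branching pattern. The representation (nested `frozenset` pairs) and the names `insert`, `all_trees`, and `shape` are choices made for this illustration, not the method of any particular program.

```python
from collections import Counter


def insert(tree, taxon):
    """Yield every tree made by attaching `taxon` to one branch of `tree`."""
    # Attach on the branch leading to this node (for the whole tree, this
    # places the new taxon as sister to everything else).
    yield frozenset([tree, taxon])
    if isinstance(tree, frozenset):
        left, right = tuple(tree)
        for new_left in insert(left, taxon):
            yield frozenset([new_left, right])
        for new_right in insert(right, taxon):
            yield frozenset([left, new_right])


def all_trees(taxa):
    """Return all rooted, bifurcating trees for the given taxon names."""
    trees = [taxa[0]]
    for taxon in taxa[1:]:
        trees = [bigger for tree in trees for bigger in insert(tree, taxon)]
    return trees


def shape(tree):
    """Describe the branching pattern only, ignoring which taxon is where."""
    if isinstance(tree, str):
        return "*"
    return "(" + "".join(sorted(shape(child) for child in tree)) + ")"


for taxa in (["A", "B", "C"], ["A", "B", "C", "D"], ["A", "B", "C", "D", "E"]):
    trees = all_trees(taxa)
    patterns = Counter(shape(tree) for tree in trees)
    print(len(taxa), "taxa:", len(trees), "trees,", len(patterns), "branching patterns")
```

For three, four, and five taxa this gives 3, 15, and 105 trees, with one, two, and three branching patterns respectively; for four taxa the split between the two patterns is 12 and 3. Enumeration like this quickly becomes impractical as taxa are added, which is the point of the next paragraph.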
As the number of taxa increases for a study, the number of trees that must be evaluated skyrockets, because there are so many ways to rearrange taxa on so many possible branching patterns. Here's a table of the numbers of trees for up to 30 taxa:
Number of Taxa Number of Trees
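The counts given in the text (3, 15, and 105 trees for three, four, and five taxa) match a standard result for rooted, bifurcating trees: the number of trees for n taxa is the product of the odd numbers 1 × 3 × 5 × ... × (2n − 3). The following sketch prints such a table; whether the original table was computed this way is an assumption, and the function names `rooted_tree_count` and `tree_table` are hypothetical.

```python
def rooted_tree_count(n_taxa):
    """Number of rooted, bifurcating trees for n_taxa labeled taxa,
    computed as 1 * 3 * 5 * ... * (2 * n_taxa - 3)."""
    count = 1
    for odd in range(3, 2 * n_taxa - 2, 2):
        count *= odd
    return count


def tree_table(max_taxa):
    """Return (number of taxa, number of trees) rows from 2 up to max_taxa."""
    return [(n, rooted_tree_count(n)) for n in range(2, max_taxa + 1)]


for n_taxa, n_trees in tree_table(30):
    print(f"{n_taxa:>2}  {n_trees:,}")
```

Python integers have no fixed size limit, so the very large counts for 20 or 30 taxa are computed exactly rather than approximated.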
So, you see the need for a computer! Luckily, the last few decades have given us fast computers and amazing software, and smart people, e.g. John Huelsenbeck, to help design the software. We can pick one popular software program to mention for learning about the sophistication involved: PAUP, which stands for Phylogenetic Analysis Using Parsimony. Catchy title, eh? When scientists use PAUP, they type in data like the animal characteristics shown above, or sequence data, or gene data, or SINES data, and they set some parameters about how to search for a solution, and then they kick back and wait for a computer to whir away on the problem. You can imagine the great amount of computation to be done, with all the possible trees and all the possible rearrangements of taxa on those trees. But, amazingly, the software works, and investigators are able to discover the simplest solutions to their problems. The results are not always perfectly clear-cut, but they make the best of the data at hand.
We could keep going on details of methods, but by now you appreciate how modern-day studies of evolutionary relationships work. Investigators do the following:
Collect information about genetic and/or morphologic characters in various species,
Type that data into computer programs like PAUP,
Then discover evolutionary patterns that best fit the information.
The overall "family tree of Life" that is currently supported is like the one presented in the appendix of a Historical Geology textbook, Evolution of the Earth, Seventh Edition, by Donald R. Prothero and Robert H. Dott, Jr.:
You are a member of Chordata, by the way. Among the groups shown, the closest relatives of the chordates, ourselves included, are the echinoderms. A main piece of evidence for this relationship is the "deuterostome" condition shared by the echinoderms and chordates, in which an early opening in the "body" (at the gastrula stage of development) eventually becomes the anus. In protostomes, this opening becomes the mouth.
We can represent this tree as an indented list:
- Eukaryota (or Eucarya)
- Porifera (sponges)
- Cnidaria (corals, jellyfish, sea anemones)
- Arthropoda (insects, crabs, shrimps, barnacles, trilobites, eurypterids)
- Annelida (earthworms, etc.)
- Brachiopoda (lampshells)
- Bryozoa ("moss" animals)
- Mollusca (clams, snails, octopus, squid, nautiloids, ammonoids)
- Echinodermata (starfishes, sea urchins, sand dollars, crinoids)
- Chordata (fishes, amphibians, reptiles including birds, and mammals)
A great web resource for seeing the "big picture" is the Tree of Life website.
We already have a view of how humans fit within Primates. Let's step out to Mammalia and find Internet resources about mammals and their relationships, especially involving morphology and fossils. You'll find plenty of information about genetics too, and you'll see phylogenetic analyses that combine genetic and morphologic data, but try to focus on those studies using morphology.
A good place to start is the Animal Diversity Web at the University of Michigan Museum of Zoology (Click on Mammals there).
When learning, writing a review paper in your own words is a rewarding exercise. Search around the Internet to see what you can find about different mammal groups, for example, and pick one to write a report about its evolutionary history. Make sure that what you find describes the relationships using modern methods (you should see a cladogram and terms like synapomorphy, monophyly, parsimony, etc.). Here is a list of some placental mammalian clades to investigate:
Rodentia - rodents
Lagomorpha - rabbits, hares, and pikas
Primates - tarsiers, lemurs, monkeys, apes
Chiroptera - bats
Carnivora - dogs, bears, cats, weasels, seals, sea lions
Proboscidea - elephants and relatives
Perissodactyla - horses, tapirs, rhinoceroses
Artiodactyla - even-toed ungulates
Cetacea - whales
Search for Internet information to help you describe:
M. D. Hendy; C. H. C. Little; David Penny, Comparing Trees with Pendant Vertices Labelled, SIAM Journal on Applied Mathematics, Vol. 44, No. 5. (Oct., 1984), pp. 1054-1065.
Technically, the trees shown on this page are "rooted." Search on "rooted unrooted" to find discussions of the difference.