But we've been making one mistake, and that's to call A the ancestor. Imagine the common ancestor actually had the same STR counts as 3. That common ancestor was ancestor to A, who was then the common ancestor to 2 and 1. That scenario would have exactly the same number of changes. In fact, it would really be the same tree. Using maximum parsimony alone, we can't tell the two scenarios apart. In fact, the 'root' -- the common ancestor -- could be any of 1, 2, 3 or A, or even half-way between 1 and A.
This is a limitation of maximum parsimony within a single cluster of individuals; you can find the best tree, but you can't locate the root within the tree (maximum likelihood says it's A, but that's only a probability).
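To make the point concrete, here is a minimal sketch in Python. The STR counts are made up for illustration, and it assumes a "change" is one repeat gained or lost at one marker, so the distance between two haplotypes is the sum of absolute differences. It counts the changes walking outward from each candidate root and gets the same total every time, because the total is just a sum over the branches of the tree.

```python
# Hypothetical STR counts at four markers (illustrative values only).
haplotypes = {
    "1": (13, 24, 15, 11),
    "2": (13, 25, 14, 10),
    "3": (12, 24, 14, 11),
    "A": (13, 24, 14, 11),
}

# The three-individual tree: node A joined to individuals 1, 2 and 3.
edges = [("A", "1"), ("A", "2"), ("A", "3")]


def distance(a, b):
    """Number of single-repeat changes between two haplotypes (assumed model)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cost_from_root(root):
    """Total changes when the tree is read as descending from `root`."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)
    total = 0
    stack = [(root, None)]
    while stack:
        point, parent = stack.pop()
        for nxt in neighbours[point]:
            if nxt != parent:  # only walk away from the root
                total += distance(haplotypes[point], haplotypes[nxt])
                stack.append((nxt, point))
    return total


costs = {root: cost_from_root(root) for root in haplotypes}
print(costs)
```

Whichever of 1, 2, 3 or A is treated as the root, the count is identical, which is why parsimony alone gives no way to pick one.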
When we go to four individuals, we can construct all sorts of trees, and so we need to set some rules. These are the rules I'm going to set.
- All 'ancestors' (which, as we've seen, may not really be ancestors, so we'll call them nodes; they're the blue, lettered circles) are connected to exactly three other points, each of which is either another node or an individual.
- All individuals, which we've seen might also be ancestors, are connected to only one other point, which must be a node. We'll call them vertices.
- There are no cycles or rings, where (say) A is connected to B, which is connected to C, which is connected back to A. This doesn't work, even in Appalachia.
You might object that one ancestor might have four or more descendants. And that's true. But we can take care of it by simply saying there are two (or more) connected nodes, A and B, with zero distance between them.
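A quick sketch of why the zero-distance trick costs nothing, using made-up counts at a single marker and assuming a change is one repeat gained or lost. The ancestor with four descendants is written once as a single point and once as two nodes, A and B, that carry the same count.

```python
# Hypothetical STR count at one marker for an ancestor and four descendants.
ancestor = 14
descendants = [14, 15, 13, 14]

# One point joined directly to all four descendants (breaks the three-link rule).
star_cost = sum(abs(ancestor - d) for d in descendants)

# Same history under the rules: nodes A and B share the ancestor's count,
# so the A-B branch has zero length and each node takes two descendants.
node_a = node_b = ancestor
split_cost = (
    abs(node_a - node_b)
    + abs(node_a - descendants[0])
    + abs(node_a - descendants[1])
    + abs(node_b - descendants[2])
    + abs(node_b - descendants[3])
)

print(star_cost, split_cost)
```

Both counts come out the same, so nothing is lost by insisting on three connections per node.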
The beautiful thing is we end up with a rather small number of possible trees, and a systematic procedure to find the maximum parsimony tree. In fact, here's the only possible 4-tree that obeys our rules. And below it is the only possible 5-tree.
So if we have, say, five individuals and want to find the most parsimonious tree, we have a three-step procedure:
- Arrange the individuals in the empty vertex slots on the 5-tree, in every possible unique configuration
- Use a systematic or random procedure (I use the latter, an algorithm called Monte-Carlo/Metropolis) to change the STR counts of the nodes until the maximum parsimony solution is encountered
- Select the vertex arrangement which together with maximum parsimony gives the lowest overall number of changes.
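The three steps can be sketched in Python as below. This is an illustration under stated assumptions, not the procedure used for the trees in this post: the STR counts are made up, a change is taken to be one repeat gained or lost at one marker, and the node counts are found by a brute-force search over each marker rather than by Monte-Carlo/Metropolis, which keeps the example small and repeatable. The 5-tree is taken to be three nodes in a chain (A, B, C), with two vertex slots on A, one on B and two on C.

```python
from itertools import permutations, product

# Hypothetical STR counts at three markers for five individuals.
individuals = {
    "1": (13, 24, 14),
    "2": (13, 25, 14),
    "3": (12, 24, 15),
    "4": (12, 24, 16),
    "5": (13, 24, 15),
}

# Assumed 5-tree layout: slots 0 and 1 hang off node A, slot 2 off B,
# slots 3 and 4 off C; the nodes are joined A-B and B-C.
NODE_OF_SLOT = ("A", "A", "B", "C", "C")
NODE_EDGES = (("A", "B"), ("B", "C"))


def unique_arrangements(names):
    """Step 1: every placement of individuals that gives a distinct tree."""
    seen = set()
    for perm in permutations(names):
        # Swapping the two members of an end pair, or swapping the two end
        # pairs with each other, gives the same unrooted tree.
        key = (frozenset({frozenset(perm[:2]), frozenset(perm[3:])}), perm[2])
        if key not in seen:
            seen.add(key)
            yield perm


def best_marker_cost(leaf_values):
    """Step 2 for one marker: try every node count in the observed range."""
    lo, hi = min(leaf_values), max(leaf_values)
    best = None
    for a, b, c in product(range(lo, hi + 1), repeat=3):
        node = {"A": a, "B": b, "C": c}
        cost = sum(abs(node[u] - node[v]) for u, v in NODE_EDGES)
        cost += sum(abs(node[NODE_OF_SLOT[i]] - leaf_values[i]) for i in range(5))
        if best is None or cost < best:
            best = cost
    return best


def tree_cost(arrangement):
    """Fewest changes for one arrangement, summed over markers."""
    haplotypes = [individuals[name] for name in arrangement]
    return sum(best_marker_cost(values) for values in zip(*haplotypes))


# Step 3: keep the arrangement with the lowest overall number of changes.
arrangements = list(unique_arrangements(sorted(individuals)))
best_arrangement = min(arrangements, key=tree_cost)
print(len(arrangements), best_arrangement, tree_cost(best_arrangement))
```

For five individuals there are only 15 distinct arrangements, so trying them all is cheap. The brute-force node search works here because the markers are independent under this cost; with many markers and wider ranges a random search such as Metropolis scales better.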
Our final post will consider how to root the tree.