MIvsH


--Testing the effect of various evolutionary models on the relationship between mean MI and mean entropy

To test our method for identifying co-evolving residues, we generated MSAs (multiple sequence alignments) using the simplest possible evolutionary model. Compared to more sophisticated models, this model gives higher background levels of Mutual Information (MI), and thus makes the identification of co-evolving residues more difficult. The figure below illustrates this effect by quantifying how background MI varies with the evolutionary model. To generate the MSAs for this analysis, we used the program "evolver" from the PAML package.

Our results show that, for a given mean MSA entropy, mean MI under a variety of other evolutionary models is slightly lower than under our simple model. However, the overall behaviour of the mean MI vs mean entropy curves is the same in all cases. A more realistic evolutionary model may be employed in future work.
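As a rough illustration of the two quantities being compared, the sketch below computes the mean column entropy and the mean pairwise MI of an MSA. It is not the code used for this analysis: the function names are hypothetical, the natural logarithm is an assumed choice of units, and averaging MI over all column pairs is an assumed definition of "mean MI".

```python
import math
from collections import Counter
from itertools import combinations

def column_entropy(column):
    """Shannon entropy (in nats, an assumed unit) of one alignment column."""
    n = len(column)
    return -sum((c / n) * math.log(c / n) for c in Counter(column).values())

def mutual_information(col_i, col_j):
    """MI(i, j) = H(i) + H(j) - H(i, j) for two alignment columns."""
    joint = list(zip(col_i, col_j))  # paired residues, one pair per sequence
    return column_entropy(col_i) + column_entropy(col_j) - column_entropy(joint)

def mean_entropy_and_mi(msa):
    """Return (mean column entropy, mean pairwise MI).

    msa is a list of equal-length sequences with at least two columns.
    """
    columns = list(zip(*msa))  # transpose: one tuple per alignment column
    entropies = [column_entropy(c) for c in columns]
    mis = [mutual_information(a, b) for a, b in combinations(columns, 2)]
    return sum(entropies) / len(entropies), sum(mis) / len(mis)
```

For the toy alignment ["AA", "AC", "CA", "CC"] the two columns vary independently, so the mean MI is zero (up to rounding), while for ["AA", "CC"] the columns are perfectly coupled and the MI equals the column entropy.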

Figure 1. Mean Mutual Information vs mean entropy for MSAs composed of sequences evolved with the "evolver" program. Each MSA consists of 200 sequences, each 200 residues in length. Stars show the simplest possible evolutionary model: equal mutation rates at each site and equal probabilities for each amino acid substitution. Triangles represent mutation rates that are gamma distributed across sites, with the following values of the parameter alpha: left-pointing triangles = 0.2, up-pointing triangles = 0.5, right-pointing triangles = 0.8. Circles represent the "proportional" mutation model in evolver, with mutation rates that give the amino acid frequencies observed in our GAD/NDK alignments: cyan circles are for equal mutation rates at all sites; green circles are for alpha = 0.5. Red symbols are for "empirical" mutation rates, using the default mutation matrix provided with evolver (a dataset from 12 mitochondrial proteins from 20 species of mammals "and close outgroups"). In this scenario, both equal rates across sites and gamma-distributed rates with alpha = 0.5 were used, but the results were indistinguishable.
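To give a feel for what the alpha values in the caption mean, the sketch below draws per-site rates from a gamma distribution with shape alpha. It assumes the common convention of scaling the distribution so that the mean rate is 1 (scale = 1/alpha), and it samples continuous rates; whether evolver was run with continuous or discretised gamma rates is not stated here. The function names are hypothetical.

```python
import random

def sample_site_rates(n_sites, alpha, seed=0):
    """Draw one relative rate per site from Gamma(shape=alpha, scale=1/alpha).

    The scale 1/alpha is an assumed convention that fixes the mean rate at 1.
    """
    rng = random.Random(seed)
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]

def fraction_below(rates, threshold):
    """Fraction of sites whose rate is below the given threshold."""
    return sum(r < threshold for r in rates) / len(rates)

if __name__ == "__main__":
    for alpha in (0.2, 0.5, 0.8):  # the alpha values used in Figure 1
        rates = sample_site_rates(200, alpha)
        print(alpha, sum(rates) / len(rates), fraction_below(rates, 0.1))
```

Smaller alpha means stronger rate heterogeneity: most sites are nearly invariant and a few evolve quickly, so alpha = 0.2 yields a larger fraction of very slow sites than alpha = 0.8.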