Contributed by: Jianzhi Zhang
Most molecular biologists would agree that a gene tends to be more similar in function to its orthologs than to its paralogs. This fundamental tenet, recently termed the ortholog conjecture, is a cornerstone of phylogenomics and is used by both computational and experimental biologists in predicting, interpreting, and understanding gene functions. But is this conjecture wishful thinking or empirically founded?
Orthologous genes arise via speciation, whereas paralogous genes are generated by gene duplication.
Orthologs: A1 and A2; B1 and B2.
Within-species paralogs: A1 and B1; A2 and B2.
Between-species paralogs: A1 and B2; A2 and B1.
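The pair labels above can be reproduced with a minimal Python sketch. It assumes, as the caption implies, that gene families A and B descend from a single duplication that predates the speciation separating species 1 and 2. The record layout and the names `GENES`, `classify`, and `classify_all` are hypothetical and serve only as illustration.

```python
from itertools import combinations

# Hypothetical gene records: (name, family, species).
# Assumption: families A and B arose from one duplication that
# happened before species 1 and species 2 diverged.
GENES = [("A1", "A", 1), ("A2", "A", 2), ("B1", "B", 1), ("B2", "B", 2)]


def classify(gene_x, gene_y):
    """Label a pair of homologous genes under the assumption above."""
    _, family_x, species_x = gene_x
    _, family_y, species_y = gene_y
    if family_x == family_y:
        if species_x == species_y:
            raise ValueError("same family and same species: not a distinct pair")
        # Same family, different species: the copies were split by speciation.
        return "ortholog"
    if species_x == species_y:
        # Different families in one genome: the copies were split by duplication.
        return "within-species paralog"
    # Different families, different species: still separated by the duplication.
    return "between-species paralog"


def classify_all(genes):
    """Classify every unordered pair of genes."""
    return {(x[0], y[0]): classify(x, y) for x, y in combinations(genes, 2)}


if __name__ == "__main__":
    for pair, label in classify_all(GENES).items():
        print(pair, label)
```

The rule holds only for this simple scenario. With duplications after speciation, or with gene loss, classifying a pair requires reconciling the gene tree with the species tree.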
In a pioneering study, Nehrt et al. (3) attempted to test the ortholog conjecture using Gene Ontology (GO) annotations that were based on experimental data. Contrary to everyone’s expectation, they found that the functional similarity between orthologs is lower than that between paralogs when the level of sequence divergence is controlled. Based on this and other findings, the authors proposed that protein function evolution is primarily determined by “the cellular context in which proteins act”. This would explain why within-species paralogs, which are always in the same organism, were found to be functionally more similar than orthologs, which by definition reside in different organisms.
Nehrt et al.’s (3) finding stirred considerable controversy online when it was published in the summer of 2011, as evidenced by numerous blog discussions. The last 10 months have seen three papers that challenge Nehrt et al.’s conclusion from different angles, although the three papers do not completely agree with one another either.
First, Thomas and colleagues, representing the group that annotated GO, claimed that GO annotation differences between homologous genes “do not reflect differences in biological function, but rather complementarity in experimental approaches” (4). That is, gene function data are so sparse at present that GO annotations reflect ascertainment biases in experiments rather than true functional differences.
Second, Altenhoff et al. (1) identified a number of biases in GO. After correcting these biases, they found weak but significant evidence for the ortholog conjecture.
Most recently, Chen and Zhang (2) reanalyzed GO annotations and confirmed some of the biases identified by Altenhoff and colleagues. Most disturbing, however, was the finding of many errors in GO annotations. Even in so-called experiment-based annotations, across-species functional inferences were frequently made. For example, an experiment was conducted on a monkey gene, but the function was annotated in GO for its human ortholog, based ironically on the ortholog conjecture.
In one part of their study, Chen and Zhang (2) focused on pairs of orthologs or paralogs that have identical protein sequences and were studied in the same papers. Surprisingly, while all nine such paralogous pairs have 100% GO-based functional similarity, only nine of 31 such orthologous pairs have 100% functional similarity. At the other extreme, eight of the 31 orthologous pairs show 0% functional similarity, yet none of the papers that studied them explicitly mentioned any functional dissimilarity. Apparently, these differences reflect ascertainment biases rather than true functional differences. The authors also noted an upward trend in the functional similarity of orthologs, relative to that of paralogs, when analyzing the time series data of GO over the last five years.
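To make concrete how two genes with identical sequences can score anywhere from 100% to 0% in annotation-based similarity, here is a minimal Python sketch. It uses the Jaccard index over sets of annotation terms, which is an assumption chosen for simplicity and not necessarily the measure used in the studies cited here. The term labels are hypothetical placeholders, not real GO identifiers.

```python
def jaccard_similarity(terms_x, terms_y):
    """Shared annotation terms divided by all terms seen for either gene.

    Assumption: a plain Jaccard index stands in for GO-based functional
    similarity; the cited studies may use a different measure.
    """
    terms_x, terms_y = set(terms_x), set(terms_y)
    union = terms_x | terms_y
    if not union:
        raise ValueError("at least one gene needs an annotation")
    return len(terms_x & terms_y) / len(union)


# Hypothetical placeholder terms, not real GO identifiers.
# Two genes characterized with the same assays receive the same terms.
same_assays = jaccard_similarity(
    {"term_binding", "term_kinase"},
    {"term_binding", "term_kinase"},
)

# Two genes characterized with different assays receive disjoint terms,
# even if their true functions are identical.
different_assays = jaccard_similarity(
    {"term_binding"},
    {"term_localization"},
)

if __name__ == "__main__":
    print(f"same assays: {same_assays:.0%}")
    print(f"different assays: {different_assays:.0%}")
```

In this toy setting the score depends only on which experiments were recorded for each gene, which is the sense in which a 0% score can arise from ascertainment bias alone.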
These and other findings led Chen and Zhang (2) to conclude that the current GO is unsuitable for testing the ortholog conjecture. They thus turned to RNA-Seq gene expression data, which should be relatively immune to ascertainment bias and annotation error. They reported that, in gene expression, a gene is more similar to its orthologs than to its paralogs. But regarding gene function, the jury is still out. The sheer difficulty of proving or rejecting the ortholog conjecture, one of the most widely assumed principles of molecular evolution, was completely unexpected, and it still amazes me to this day.
1. Altenhoff AM, Studer RA, Robinson-Rechavi M, Dessimoz C (2012) Resolving the Ortholog Conjecture: Orthologs Tend to Be Weakly, but Significantly, More Similar in Function than Paralogs. PLoS Comput Biol 8(5): e1002514.
2. Chen X, Zhang J (2012) The Ortholog Conjecture Is Untestable by the Current Gene Ontology but Is Supported by RNA Sequencing Data. PLoS Comput Biol 8(11): e1002784.
3. Nehrt NL, Clark WT, Radivojac P, Hahn MW (2011) Testing the Ortholog Conjecture with Comparative Functional Genomic Data from Mammals. PLoS Comput Biol 7(6): e1002073.
4. Thomas PD, Wood V, Mungall CJ, Lewis SE, Blake JA, et al. (2012) On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report. PLoS Comput Biol 8(2): e1002386.