Title: Functional divergence of gene duplicates through ectopic recombination
Abstract: Scientific Report, 16 October 2012, Open Access. EMBO Reports (2012) 13:1145-1151, https://doi.org/10.1038/embor.2012.157
Joaquin F Christiaens1,2,‡, Sebastiaan E Van Mulders3,‡, Jorge Duitama1,2,‡, Chris A Brown1,2,4,5, Maarten G Ghequire1, Luc De Meester6, Jan Michiels1, Tom Wenseleers6, Karin Voordeckers1,2 and Kevin J Verstrepen1,2,*
1 Department of Microbial and Molecular Systems, Centre of Microbial and Plant Genetics (CMPG), KU Leuven, Kasteelpark Arenberg 22, B-3001 Leuven (Heverlee), Belgium
2 VIB Laboratory of Systems Biology, KU Leuven, Kasteelpark Arenberg 22, B-3001 Leuven (Heverlee), Belgium
3 Department of Microbial and Molecular Systems, Centre for Malting and Brewing Science, Faculty of Bioscience Engineering, KU Leuven, Kasteelpark Arenberg 22, B-3001 Leuven (Heverlee), Belgium
4 Faculty of Arts and Sciences Center for Systems Biology, Harvard University, Cambridge, Massachusetts, 02138 USA
5 Fathom Information Design, Boston, Massachusetts, 02114 USA
6 Department of Biology, Animal Ecology and Systematics Section, B-3000 Leuven, Belgium
‡These authors contributed equally to this paper. *Corresponding author. Tel: +32 (0) 16 75 13 90; Fax: +32 (0) 16 75 13 91; E-mail: [email protected]
Gene duplication stimulates evolutionary innovation as the resulting paralogs acquire mutations that lead to sub- or neofunctionalization. A comprehensive in silico analysis of paralogs in Saccharomyces cerevisiae reveals that duplicates of cell-surface and subtelomeric genes also undergo ectopic recombination, which leads to new chimaeric alleles.
Mimicking such intergenic recombination events in the FLO (flocculation) family of cell-surface genes shows that chimaeric FLO alleles confer adhesion phenotypes that differ from those of the parental genes. Our results indicate that intergenic recombination between paralogs can generate a large set of new alleles, thereby providing the raw material for evolutionary adaptation and innovation.

INTRODUCTION

How organisms evolve and adapt to new environments remains a central question in biology. Gene duplication events have a crucial role in evolutionary processes, especially in the rapid development of new functions [1, 2]. Duplication might yield adaptive benefits through increased dosage of the parental gene. In addition, gene duplication can stimulate evolutionary innovation, as mutations in one or both gene duplicates can lead to subfunctionalization or neofunctionalization [3, 4, 5, 6, 7, 8].

Gene duplications are not spread evenly over the genome, occurring much more frequently in the subtelomeres, the regions directly adjacent to the telomeres [9, 10, 11]. As a result, subtelomeric gene families are often large, with some families carrying as many as several hundred paralogs [9, 11]. In Trypanosoma and Plasmodium species, variable expression of subtelomeric variants of a cell-surface antigen allows these pathogens to elude the host immune system. Subtelomeric gene families in the more compact genome of Saccharomyces cerevisiae are generally smaller and are enriched for cell-surface genes, as well as genes involved in nutrient transport and metabolism [9, 12, 13, 14, 15, 16, 17, 18]. Recombination between paralogs could generate more sequence diversity in duplicated genes [19].
However, apart from anecdotal examples in specific gene families such as the VAR and VSG cell-surface genes of Plasmodia and Trypanosomes and the major histocompatibility complex (MHC) class genes in vertebrates, the occurrence and biological relevance of ectopic recombination between gene duplicates have not been systematically investigated [16, 17, 20]. Here, we present the results of a comprehensive in silico analysis of ectopic recombination in all paralog gene families in the model eukaryote S. cerevisiae. Our results show that intergenic recombinations occur predominantly in gene families that are located at the subtelomeres and/or encode cell-surface proteins. To verify whether these intergenic recombination events could lead to altered phenotypes, we mimicked the intergenic recombination events that shaped the FLO (flocculation) adhesin gene family. Phenotypic analyses of these artificial FLO chimaeras revealed that they were functional and conferred phenotypes that differed from those of their parental adhesins.

RESULTS AND DISCUSSION

Ectopic recombination in gene families

To investigate the occurrence of intergenic recombination events in S. cerevisiae, we first identified paralogous genes by BLASTing the proteome of the reference strain S288c against itself. Next, we used the MCL clustering algorithm and manual curation to define a list of 210 gene families, each comprising at least two paralogs (see methods and supplementary information online for details) [21]. For all genes in these families we collected the known GO categories as well as chromosomal locations. We attributed these characteristics to gene families if at least two genes in a family shared the annotation (the rationale being that it takes at least two genes to create a chimaeric allele). Next, we BLAST-searched each family against a database containing 24 recently published high-quality S. cerevisiae genomes (supplementary Table S1 online).
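The family-level annotation rule described above (a characteristic is attributed to a family only if at least two of its genes share it) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name, the data layout and the toy gene names and labels are all hypothetical.

```python
from collections import Counter


def family_annotations(gene_annotations, min_genes=2):
    """Return the labels shared by at least `min_genes` genes of one family.

    `gene_annotations` maps a gene name to the set of labels attached to it
    (for example GO categories or "subtelomeric"). Hypothetical layout.
    """
    counts = Counter()
    for labels in gene_annotations.values():
        counts.update(set(labels))  # count each label once per gene
    return {label for label, n in counts.items() if n >= min_genes}


# Toy family with made-up gene names and labels (not real annotations).
toy_family = {
    "geneA": {"subtelomeric", "cell-surface"},
    "geneB": {"subtelomeric"},
    "geneC": {"transport"},
}
print(sorted(family_annotations(toy_family)))
```

In this toy family only "subtelomeric" is carried by two genes, so it is the only label passed on to the family; "cell-surface" and "transport" each occur once and are dropped.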
For each family, we performed several in silico tests to check for evidence of intergenic recombination in the family. First, we used the SplitsTree4 programme [22] to produce reticulate phylogenetic trees of each family. In the absence of recombination, this procedure results in a classic, unrooted phylogenetic tree. Alternatively, when a recombination event took place, this is represented in the reticulate tree as a closed rectangle. To further analyse these families and provide statistical evidence for the occurrence of intergenic recombination, we used the Pairwise Homoplasy Index (PHI) test [23]. The null hypothesis for this test is that the observed sequence differences are due to convergent mutations, which implies that all PHI values are similar for all pairs (i.e., the PHI values do not vary with the physical distance between residues). Alternatively, in the presence of recombination, distant sites show low PHI values. These calculations then result in a single PHI value for the paralog gene family. For further details about these procedures, please refer to the supplementary information online.

The distribution of PHI values indicates the presence of certain highly recombinogenic families (Fig 1). Interestingly, almost all chimaeric sequences are part of families that contain either subtelomeric or cell-surface genes, or both (Table 1). Statistical analyses of these distributions with a Kolmogorov–Smirnov test revealed a statistically significant enrichment for subtelomeric gene families (P-value 3.48 × 10⁻⁶) and for cell-surface gene families (P-value 5.5 × 10⁻³). The distribution of a combination of all subtelomeric and cell-surface gene families was also significantly enriched for chimaeric alleles. Moreover, when all cell-surface gene families are removed from the analysis, we still find a statistically higher occurrence of chimaeric subtelomeric genes and vice versa.
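To make the Kolmogorov–Smirnov comparison concrete, the following standard-library Python sketch computes the two-sample statistic (the largest gap between two empirical distribution functions) and an asymptotic two-sided P-value. It assumes, as an illustration only, that the two samples are per-family PHI values for two groups of families; the function names and the toy numbers are hypothetical and are not the paper's data.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between ECDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both ECDFs past every observation equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_asymptotic_p(d, n, m, terms=100):
    """Asymptotic two-sided P-value from the Kolmogorov distribution."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam == 0:
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2 * (k * lam) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2 * total))


# Made-up values standing in for two groups of gene families.
group_1 = [0.1, 0.2, 0.3, 0.4]
group_2 = [0.6, 0.7, 0.8, 0.9]
d = ks_statistic(group_1, group_2)
print(d, ks_asymptotic_p(d, len(group_1), len(group_2)))
```

The asymptotic formula is only an approximation for samples this small; it is used here to keep the sketch free of external dependencies.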
We were unable to find other subgroups that are enriched or depleted for ectopic recombination (e.g., the P-value for the comparison between families containing intragenic tandem repeats and those without is 0.38; supplementary Fig S1 online).

Figure 1. Analyses of paralog gene families indicate intergenic recombination. Distribution of the PHI values for all paralog gene families in the S. cerevisiae genome. Most gene families cluster towards the left hand side of the graph, with high PHI values that are indicative of the absence of intergenic recombination. However, several gene families on the far right hand side (Table 1) show inter-paralog recombination. Note that this group consists almost exclusively of subtelomeric and/or cell-surface gene families. For more information on the calculation of PHI values, see Bruen et al [23] and the methods (supplementary information online). PHI, pairwise homoplasy index.

Table 1. S. cerevisiae gene families showing ectopic recombination

Family description | Sequences in S288c | Total number of sequences retrieved | Subtelomeric | Cell-surface
Genes coding for (putative) helicase-like proteins | 31 | 323 | Yes | No
HXT family | 16 | 213 | Yes | Yes
COS family | 9 | 116 | Yes | Yes
Genes coding for (iso)maltases | 7 | 86 | Yes | No
AAD gene family | 7 | 74 | Yes | No
FLO gene family | 6 | 19 | Yes | Yes
Type I transmembrane sorting receptor for vacuolar hydrolases and similar sequences | 6 | 69 | Yes | No
PHO gene family | 5 | 79 | Yes | Yes
Pheromone-regulated protein with a motif involved in COPII binding/putative integral membrane protein | 4 | 27 | No | Yes
Genes coding for MAL activators | 4 | 75 | Yes | No
ENA gene family | 3 | 19 | No | Yes
TDH gene family | 3 | 52 | No | No
FRE gene family | 3 | 55 | No | Yes
Transporters of thiamine/nicotinamide riboside | 3 | 44 | No | Yes
TPO gene family | 2 | 24 | No | Yes

FLO, flocculation. Examples of some representative S. cerevisiae gene families that undergo ectopic recombination (for a full list, see supplementary Dataset S1 online).
The table contains 15 gene families that contain at least one chimaeric sequence (PHI value < 10⁻¹⁶). The first column lists the function of characterized members of the identified gene family, 'sequences in S288c' lists the number of members of each gene family present in the S288c reference genome, and 'sequences retrieved' lists the total number of sequences found in 24 genomes (supplementary Table S1 online). Gene families are classified as subtelomeric or cell-surface if at least two sequences in the family share the characteristic.

Several recombinations shaped the adhesin gene family

To investigate the functional implications of ectopic recombination events, we focused on the subtelomeric FLO gene family, which encodes lectin-like cell-surface adhesins that confer adhesion to abiotic surfaces and/or other yeast cells [24, 25, 26, 27, 28, 29]. These two phenotypes are biologically important for yeast cells and are relatively easy to measure and quantify [24, 29, 30, 31, 32, 33]. The amino terminus of these rod-shaped proteins is a lectin-like globular domain that contains a pentapeptide involved in adhesion to specific carbohydrate residues present at the surface of other yeast cells or host tissues [34, 35]. The central adhesin domain is formed by a repetitive pattern of a heavily glycosylated serine/threonine-rich peptide, which is thought to act as a variable rod-like spacer that helps to display the N-terminal domain to the environment [36].

To perform an in-depth analysis of recombination events between FLO paralogs, we gathered 58 more FLO sequences (including a few partial sequences or pseudogenes) from both NCBI and ENA (supplementary Table S2 online). Phylogenetic analysis revealed three subclades that cluster with the FLO1, FLO10 and FLO11 genes found in the reference S288c strain. In line with previous findings, we found extensive variation in tandem repeat length, even among adhesins from the same class [36, 37].
We first verified whether the full set of FLO genes shows signs of intergenic recombination across the repeat region. Unrooted phylogenetic trees of the N- (Fig 2A) and carboxy-terminal (Fig 2B) domains revealed that several alleles might originate from recombination across the repeat region (Figs 2A–C). The PHI test further confirmed the occurrence of several intergenic recombination events in the complete FLO open reading frames (P-value < 10⁻¹⁶). Whereas this shows that ectopic recombination between intragenic tandem repeats does occur, analyses using all S. cerevisiae families do not show a significant enrichment of chimaeras among genes that contain internal tandem repeats (see above).

Figure 2. Phylogenetic analyses of FLO genes reveal extensive intergenic recombination. A–C show the principle of reticulate analysis. Shown in blue are sequences used as representative haplotypes in supplementary Fig S2 online. (A) Phylogenetic tree of nucleotide sequences coding for N-terminal domains of all complete FLO1-like sequences. (B) Phylogenetic tree of nucleotide sequences coding for C-terminal domains of the same subset of FLO1-like sequences. (C) Trees of (A) and (B) are combined in an unrooted reticulate network (i.e., the overlay of trees shown in A and B). Such a phylogenetic reticulate makes it possible to represent recombination events between either current or ancestral sequences and provides a more precise visualization of the evolutionary history of the FLO genes. Sequence (27) appears in different places in (A) and (B), thus placing it at an extension of the corner of a closed square in the reticulate tree. Such closed squares are generated by a predicted recombination event. (D) Reticulate displaying recombination events in the N-terminal domain of all FLO1-like sequences. Note that all these analyses were performed using all available FLO sequences but, for the sake of clarity, only a subset of sequences was used to generate these figures. FLO, flocculation.
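The logic behind a homoplasy-based recombination test such as PHI can be illustrated with a deliberately simplified Python sketch. It is not the statistic of Bruen et al [23], which relies on a refined incompatibility score for multi-state sites; here the simpler four-gamete test is applied to two-state alignment columns only, and significance is estimated by shuffling the column order. All function names, parameter defaults and the toy alignment are assumptions made for illustration.

```python
import random
from itertools import combinations


def biallelic_columns(alignment):
    """Return the alignment columns that show exactly two states."""
    columns = ["".join(col) for col in zip(*alignment)]
    return [col for col in columns if len(set(col)) == 2]


def incompatible(col_i, col_j):
    """Four-gamete test: two two-state sites showing all four state
    combinations cannot fit a single tree without homoplasy."""
    return len(set(zip(col_i, col_j))) == 4


def mean_local_incompatibility(columns, window):
    """Mean incompatibility over site pairs at most `window` sites apart."""
    scores = [
        incompatible(columns[i], columns[j])
        for i, j in combinations(range(len(columns)), 2)
        if j - i <= window
    ]
    return sum(scores) / len(scores) if scores else 0.0


def permutation_p_value(alignment, window=2, n_perm=999, seed=0):
    """Compare the observed statistic with values from shuffled site orders."""
    rng = random.Random(seed)
    columns = biallelic_columns(alignment)
    observed = mean_local_incompatibility(columns, window)
    hits = 0
    for _ in range(n_perm):
        shuffled = columns[:]
        rng.shuffle(shuffled)
        if mean_local_incompatibility(shuffled, window) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy mosaic alignment: the first four sites group s1+s2 against s3+s4,
# the last four group s1+s3 against s2+s4, as one crossover would produce.
toy_alignment = ["AAAAAAAA", "AAAACCCC", "CCCCAAAA", "CCCCCCCC"]
observed, p_value = permutation_p_value(toy_alignment)
print(observed, p_value)
```

With recombination, neighbouring sites tend to share a genealogy and are mutually compatible, whereas distant sites are not, so the observed statistic for nearby sites is lower than in most shuffled orders. Without recombination, incompatibilities arise only from convergent mutations and do not depend on distance, which is the null hypothesis stated in the text.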
As repeat regions can cause artifacts in alignments, we repeated the analyses to specifically search for recombination events in the N-terminal domains of the FLO genes. These domains contain several sites important for the recognition and binding of specific carbohydrates [34, 35], which is important for pathogenicity in Candida strains [38, 39], and for flocculation characteristics in brewing strains [28, 40]. The altered carbohydrate-binding properties due to intergenic recombination could therefore have significant phenotypic consequences. These analyses revealed extensive recombination between the N-terminal domains (P-value < 10⁻¹⁶), especially in the FLO1-like N-terminal domains. The reticulate with the N-terminal domains of the FLO1-like subgroup (Fig 2D) confirms several recombination events between different N-terminal FLO1 gene domains (P-value < 10⁻¹⁶).

The high frequency of recombination and lack of ancestral sequences or non-recombined outgroups prevent identification of the ancestral FLO sequences that recombined into the present-day alleles. Nevertheless, whereas it is impossible to discriminate between recombined and ancestral FLO genes, it is possible to identify groups of genes comprising the parents and the product of a recombination event, and to examine the recombination breakpoints (supplementary Fig S2 online). These analyses revealed several chimaeric adhesin sequences, including the previously described LgFLO1 gene. Further, two types of recombination events appear to occur between adhesins, both with the potential to have a strong phenotypic effect. In the first type, recombination events occur outside the repeat regions, across small regions of microhomology in the N- or C-terminal domain (supplementary Fig S3 online). Detailed analysis of these genes revealed that many recombination events in the N-terminal domain occurred in between regions important for substrate binding [35].
Such events could subtly alter the strength and preference of substrate binding in the N-terminal domain and therefore influence flocculation. In the second type, recombination occurred across the central repeat domain. These recombination events lead to variation in the length and sequence of the repeats, which in turn can also result in new combinations of functional FLO domains.

Chimaeric adhesins also occur in pathogens

Previous studies in yeasts such as Candida albicans and Candida glabrata have speculated that the rapid phenotypic variation produced by chimaeric adhesins could contribute to pathogenicity in these species by avoiding the host immune response [39, 41]. To investigate this, we performed a similar in silico analysis on adhesins of pathogenic yeasts by collecting sequences both for the EPA (epithelial adhesins) genes of C. glabrata (11 sequences) and the ALS (agglutinin-like sequences; 14 sequences) genes of C. albicans. Both cases revealed significant evidence for intergenic recombination (P-value < 10⁻¹⁶), suggesting that recombination between adhesins is a common occurrence across yeast strains.

Engineered chimaeric adhesins confer distinct phenotypes

To assess the functional implications of recombination events between paralogs, we mimicked recombination by constructing several chimaeric FLO genes, each combining the 5′ end of one FLO gene with the 3′ end of another (supplementary Table S3 online). We used three genes (FLO1, FLO10 and FLO11) from S288c that represent FLO gene classes with distinct flocculation characteristics as parents for the chimaeric genes. The protein products of the engineered chimaeras retain the traditional three-domain structure of fungal adhesins while differing in N-terminal domains, total length, number and positions of tandem repeats and glycosylation sites. To assess the functionality of these engineered chimaeras, we expressed each construct separately in the non-flocculent and non-adherent S288c strain.
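The construction of a chimaeric gene from a 5′ segment of one parent and a 3′ segment of another comes down to simple coordinate arithmetic, sketched below in Python. The sketch assumes 1-based, inclusive nucleotide positions and reads the first chimaera listed in Table 2 as FLO1 positions 1–807 fused to FLO10 positions 847–3510, an interpretation supported by the reported length of 3,471. The function names and the toy sequences are hypothetical.

```python
def segment_length(start, end):
    """Length of a 1-based, inclusive nucleotide interval."""
    return end - start + 1


def chimaera_length(five_prime, three_prime):
    """Total length of a 5' segment fused to a 3' segment."""
    return segment_length(*five_prime) + segment_length(*three_prime)


def build_chimaera(seq_5p, five_prime, seq_3p, three_prime):
    """Join slices of two parental sequences (1-based, inclusive coordinates)."""
    (s1, e1), (s2, e2) = five_prime, three_prime
    return seq_5p[s1 - 1:e1] + seq_3p[s2 - 1:e2]


# FLO1 positions 1-807 fused to FLO10 positions 847-3510 (coordinates as
# read from Table 2): 807 + 2,664 nucleotides.
length = chimaera_length((1, 807), (847, 3510))
print(length, length % 3 == 0)

# Toy sequences, only to show the slicing convention.
print(build_chimaera("AAACCC", (1, 3), "GGGTTT", (4, 6)))
```

The same arithmetic reproduces the other chimaera lengths listed in Table 2, for example 864 + 3,816 = 4,680 for FLO10 positions 1–864 fused to FLO1 positions 799–4614.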
Expression levels (determined by quantitative PCR) were comparable for each of the constructs, with an average expression level around 75% of the ACT1 transcription level, which is comparable to the natural FLO gene expression in flocculent strains derived from the feral yeast EM93 [33]. We then determined flocculation strength, adhesion to diverse surfaces and cell wall hydrophobicity conferred by each of the chimaeric Flo proteins. Our results demonstrate that different domains contribute to different phenotypes and that the chimaeric adhesins confer phenotypes that differ from those of their parental adhesins (Fig 3; Table 2; supplementary Table S4 online for details). This demonstrates that recombination events between paralogs can generate new alleles that can display new combinations of phenotypes and/or variation in the degree of adhesion.

Figure 3. Chimaeric adhesins confer new cell–cell adhesion phenotypes. To investigate the flocculation (cell–cell adhesion) and agar-adhesion properties of chimaeric adhesins, three natural adhesin genes (FLO1, FLO10 and FLO11) as well as several chimaeric adhesins consisting of the N-terminal part of one natural adhesin and the C-terminal part of another were overexpressed in a strain that does not express any other adhesin gene (see Table 2; supplementary information online for details). (A) Ectopic recombination between FLO genes generates new chimaeric alleles that display a wide array of cell–cell adhesion (flocculation) phenotypes. Strains expressing the natural FLO1 adhesin or any chimaeric protein carrying the N-terminal part of FLO1 all show strong flocculation. Strains displaying adhesins with a FLO10 N-terminal domain show a broad range of flocculation that depends on the nature of the central and C-terminal domains, whereas adhesins with a FLO11 N-terminal domain confer weak or no flocculation (for more information about the chimaeric adhesins see supplementary Table S3 online).
(B) The cell-surface adhesion of natural and chimaeric adhesin genes was measured by expressing these genes in cells that do not express any other adhesin gene. The resulting transformants were grown for 6 days and subsequently washed under a gentle stream of water to estimate their propensity to stick to the agar surface. Strains displaying adhesins containing the central and C-terminal part of FLO11 show strong adhesion and therefore resist washing with water (see methods for details), whereas strains expressing other adhesins show a wide array of intermediate or weak cell-surface adhesion as measured with the plate-washing assay. FLO, flocculation; WT, wild-type.

Table 2. Overview of adhesin phenotypes

FLOx | FLOy | Length (nt) | Floc. (%) | Hydrop. (%) | Agar A. | Polys A.
WT | | | 11±0 | 9±5 | + | 1.0±0.2
FLO1 | - | 4,614 | 98±1 | 72±6 | + | 3.0±0.3
FLO10 | - | 3,510 | 55±2 | 50±10 | +++ | 2.1±0.3
FLO11 | - | 4,104 | 12±5 | 80±8 | +++ | 1.5±0.1
FLO1 (1–807) | FLO10 (847–3510) | 3,471 | 89±6 | 61±29 | ++ | 4.0±0.3
FLO1 (1–807) | FLO10 (877–3510) | 3,441 | 59±7 | 63±9 | ++ | 4.3±0.3
FLO1 (1–819) | FLO10 (877–3510) | 3,453 | 59±6 | 80±11 | ++ | 3.9±0.2
FLO1 (1–831) | FLO10 (877–3510) | 3,465 | 78±3 | 87±2 | ++ | 4.0±0.3
FLO1 (1–807) | FLO11 (574–4104) | 4,338 | 88±4 | 69±23 | ++++ | 4.5±0.2
FLO1 (1–831) | FLO11 (574–4104) | 4,362 | 87±7 | 80±3 | ++++ | 4.0±0.4
FLO10 (1–864) | FLO1 (799–4614) | 4,680 | 75±5 | 64±2 | + | 3.7±0.2
FLO10 (1–864) | FLO11 (574–4104) | 4,395 | 50±4 | 65±13 | ++++ | 1.2±0.1
FLO11 (1–579) | FLO1 (799–4614) | 4,395 | 14±6 | 34±6 | + | 0.8±0.1
FLO11 (1–633) | FLO1 (799–4614) | 4,449 | 18±1 | 72±8 | + | 0.7±0.1
FLO11 (1–435) | FLO10 (877–3510) | 3,069 | 17±2 | 54±13 | ++ | 0.7±0.1
FLO11 (1–633) | FLO10 (847–3510) | 3,297 | 12±1 | 83±6 | ++ | 0.9±0.2

FLO, flocculation; WT, wild type. FLOx and FLOy represent the origin of the N-terminal and C-terminal part of the adhesin gene, respectively (with the numbers in parentheses denoting the nucleotide positions in the open reading frame). Length is the length of the final adhesin gene. Floc. is the flocculation (%) of the strains expressing the adhesins. Hydrop. is the hydrophobicity level of a strain expressing the respective adhesin (%). Agar A. represents the observed cell-surface adhesion as estimated by a plate-washing assay (+, weak adhesive growth; ++++, strong adhesive growth). Polys A. represents the adherence of the strains to polystyrene relative to WT.

CONCLUSION

Our results reveal an elegant mechanism by which gene duplicates might be used as a molecular toolbox for the generation of a large array of different new alleles and phenotypes. These findings also demonstrate that paralogs do not necessarily evolve completely independently through mutation. Instead, ectopic recombination events also contribute to the generation of sequence divergence in paralogs, which in turn propels evolutionary innovation [1, 2, 3, 4, 5, 6, 7, 8]. Intergenic recombination events are observed predominantly in subtelomeric gene families, which are known to harbour lifestyle-specific genes, or in cell-surface gene families, which are involved in interactions with the cell's environment. Hence, this mechanism could provide yeasts with an ever-changing reservoir of genes to quickly adapt and tune the way they interact with specific conditions and opportunities. Our results also confirm speculation that C. albicans and C. glabrata might have chimaeric adhesins [39, 42], suggesting that this allows pathogens to adapt to the host. This mechanism is also similar to those of some pathogenic protozoans, such as Plasmodium and Trypanosoma spp., where modular cell coat proteins continuously recombine to form new variants and avoid host immune system recognition [16, 17, 43]. Hence, recombination of cell-surface and subtelomeric genes seems to be a common theme in (eukaryotic) microorganisms, yielding a substantial source of phenotypic variability with only a few genes. Moreover, as recombinations are clearly not limited to yeasts but have also been observed in the MHC class genes of vertebrates, it seems likely that similar mechanisms also exist in higher eukaryotes, including humans [20].
Our results show that in yeast, two distinct categories of gene families show enrichment for chimaeric alleles: families containing genes located in the subtelomeres, and gene families encoding cell-surface proteins. Importantly, this enrichment remains significant for both categories, even if all families belonging to the other category are removed from the analysis, indicating that the two enrichments are independent of each other. One main unresolved question is whether intergenic recombination