To identify flax rust resistance genes and to develop molecular markers to increase the effectiveness of the breeding process
Flax rust is a serious disease that reached epidemic levels in North America two or three times in the previous century. Since the 1970s, however, immunity to flax rust in all Canadian cultivars has depended on the L6 allele for rust resistance. This reliance on a single version (allele) of this gene for rust resistance could place the Canadian flax industry in peril in the event of a pathogen population shift that would defeat L6, or following introduction and establishment of a foreign race into Canada.
Genomic, molecular biology, and bioinformatics methods were used to achieve the following goals:
- Obtain better knowledge of the five known rust resistance genes: K, L, M, N, and P;
- Examine the allelic diversity of the rust resistance genes; and
- Develop molecular markers to facilitate the introduction of new alleles into Canadian cultivars.
The location of the K1 gene was unknown at the beginning of this project, so we used the QTL-seq software pipeline with next-generation sequencing (NGS) data to identify a quantitative trait locus (QTL) associated with resistance in a Raja × Bison population. Using this information, we developed molecular markers that mapped the K1 gene from Raja to a 300-kbp region at one end of chromosome 5 (LG5). Analysis of the DNA sequence from this region revealed six putative disease resistance genes and several sequencing ambiguities. Bacterial artificial chromosomes (BACs) from this region were sequenced using PacBio long-read sequencing, a technique that can resolve such ambiguities. Interestingly, the L gene for rust resistance is located very close to the K1 gene. Two candidate genes were identified using both the BAC sequences obtained in this project and previous knowledge of resistance gene analogs (RGAs). PCR products were amplified from Raja, Bison, Sorrel and Glas and sequenced; however, the sequences were identical in all four lines, suggesting that neither candidate is the K1 gene. Additional molecular markers were designed within 5 cM of the L gene to further refine the location of the K1 gene. Unfortunately, none of these markers was polymorphic.
A more stringent analysis of the resequencing data from the Raja × Bison F2 individuals used for QTL-seq indicated that many of the SNPs identified in the first analysis were less significant than initially believed and should have been removed from the results before the molecular assays were designed. Several high-confidence SNPs between Raja and Bison have now been identified. Using this more stringent analysis, we identified SNPs and indels in three putative RGAs in the region around the L gene. Variants of these three RGAs in 35 other genotypes were obtained using sequencing data from a different project.
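The bulked-segregant idea behind QTL-seq can be sketched as follows. This is a minimal illustration and not the pipeline used in the project: the record layout, the `min_depth` threshold, and the window sizes are assumptions chosen for the example. For each SNP, the SNP-index is the fraction of reads carrying the non-reference allele in a bulk, and the difference between the resistant and susceptible bulks is averaged in windows along the chromosome. Low-depth SNPs are discarded, which is one simple form of the stricter filtering described above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SnpCounts:
    """Read counts for one SNP in the two bulks (hypothetical record layout)."""
    chrom: str
    pos: int
    res_alt: int    # non-reference reads in the resistant bulk
    res_depth: int  # total reads in the resistant bulk
    sus_alt: int    # non-reference reads in the susceptible bulk
    sus_depth: int  # total reads in the susceptible bulk


def delta_snp_index(s: SnpCounts, min_depth: int = 8) -> Optional[float]:
    """Return SNP-index(resistant) - SNP-index(susceptible), or None if depth is too low."""
    threshold = max(min_depth, 1)  # also guards against division by zero
    if s.res_depth < threshold or s.sus_depth < threshold:
        return None
    return s.res_alt / s.res_depth - s.sus_alt / s.sus_depth


def window_means(snps: List[SnpCounts], window_bp: int, step_bp: int,
                 min_depth: int = 8) -> List[Tuple[int, float]]:
    """Average the delta SNP-index in sliding windows along a single chromosome."""
    scored = []
    for s in snps:
        d = delta_snp_index(s, min_depth)
        if d is not None:
            scored.append((s.pos, d))
    if not scored:
        return []
    last = max(p for p, _ in scored)
    out = []
    start = 0
    while start <= last:
        vals = [d for p, d in scored if start <= p < start + window_bp]
        if vals:  # skip windows without any usable SNP
            out.append((start, sum(vals) / len(vals)))
        start += step_bp
    return out
```

Windows whose mean delta SNP-index departs strongly from zero are candidate QTL intervals. The published QTL-seq method additionally derives confidence intervals by simulation, which this sketch omits.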
The sequence and location of the L gene and 13 of its alleles were already known. To confirm them, we sequenced the subset of the differential lines carrying the L alleles. We then developed molecular markers that distinguish all 13 alleles and estimated the frequency of each allele in the core collection of ~400 lines using NGS data and an in-house bioinformatics approach. Only 50% of the ~400 lines were confirmed to carry one of the 13 known alleles. Additional sequencing of the L genes of the remaining lines showed that some carry new alleles. At least 29 L gene sequences were sufficiently different from the known 13 to be called "new". Structural analyses of these new alleles suggest that recombination, rather than mutation, is the predominant mechanism behind this astounding diversity.
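The frequency estimate described above can be illustrated with a small sketch. It is not the in-house approach, whose details are not given here; it assumes, for illustration only, that each known allele has a signature of bases at a few diagnostic positions and that a line is assigned to an allele when its observed bases match exactly one signature. The allele names, signatures, and line names in the usage example are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

Signature = Tuple[str, ...]


def assign_allele(observed: Signature,
                  signatures: Dict[str, Signature]) -> Optional[str]:
    """Return the allele whose signature matches, or None if zero or several match."""
    matches = [name for name, sig in signatures.items() if sig == observed]
    return matches[0] if len(matches) == 1 else None


def allele_frequencies(lines: Dict[str, Signature],
                       signatures: Dict[str, Signature]) -> Dict[str, float]:
    """Fraction of lines assigned to each allele; unmatched lines count as 'unassigned'."""
    if not lines:
        return {}
    counts = Counter(
        assign_allele(obs, signatures) or "unassigned" for obs in lines.values()
    )
    return {allele: n / len(lines) for allele, n in counts.items()}


# Hypothetical data: two known alleles scored at two diagnostic positions.
signatures = {"allele_A": ("A", "T"), "allele_B": ("G", "T")}
lines = {
    "line1": ("A", "T"),
    "line2": ("G", "T"),
    "line3": ("C", "C"),  # matches no known signature
    "line4": ("A", "T"),
}
freqs = allele_frequencies(lines, signatures)
```

Lines that end up in the `unassigned` group are the ones that would be followed up by additional sequencing to look for new alleles.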
Multiple copies of the M gene were known to be present before the start of this project. We localized the M gene region to LG13 using the flax reference sequence. This region contained several gaps and ambiguities, likely caused by the presence of multiple repeated genes. PacBio sequencing of both BACs and genomic DNA was used to resolve the sequence of this 680-kbp region. Sequence analysis identified 10 to 14 complete or partial copies of the M gene, and further analysis was performed to differentiate active M genes from pseudogenes. The M genes in CDC Bethune were compared with one another to determine their similarities and differences. A cluster of seven genes with a high level of similarity to the published M gene alleles was located at the end of the consensus sequence. Six of these genes are arranged as tandem repeats separated by ~4.7 kbp. The seventh gene lies ~60 kbp away from the others and appears to be present as a single copy. The regions containing these seven M gene homologs were compared among 35 genotypes from Canada and around the world.
Alleles of the N and P genes were to be identified in a manner similar to that used for the L gene. Assembly of NGS reads against the published sequences of these genes was more challenging than for the L gene because there are four repeats of the N gene and two of the P gene. Admixture in some of the differential lines over the years (1981 until now) made this task more difficult, and much effort was expended to reconstitute the differential set. While we cannot claim to have pure lines of all 30 differentials, substantial progress has been made. This enabled us to develop markers that can distinguish four of the five P alleles.
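Whether a marker set separates a group of alleles can be checked by comparing marker profiles. The sketch below is only an illustration under assumed inputs: each allele is scored as a tuple of marker results (for example, presence or absence of an indel), and alleles with identical profiles cannot be told apart. The allele names and scores are hypothetical and do not represent the P alleles.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_profile(marker_scores: Dict[str, Tuple[int, ...]]) -> List[List[str]]:
    """Group alleles that share an identical marker profile."""
    groups = defaultdict(list)
    for allele, profile in marker_scores.items():
        groups[tuple(profile)].append(allele)
    return [sorted(members) for members in groups.values()]


def uniquely_identified(marker_scores: Dict[str, Tuple[int, ...]]) -> List[str]:
    """Alleles whose marker profile is shared with no other allele."""
    return sorted(g[0] for g in group_by_profile(marker_scores) if len(g) == 1)


# Hypothetical scores for three indel markers (1 = insertion present, 0 = absent).
scores = {
    "a1": (1, 0, 0),
    "a2": (0, 1, 0),
    "a3": (0, 0, 1),
    "a4": (1, 1, 0),
    "a5": (1, 1, 0),  # same profile as a4, so the two cannot be separated
}
unique = uniquely_identified(scores)
```

Any group with more than one member points to alleles that need an additional marker before they can be separated.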
- The K1 and L genes are located close to one another on LG5. Identification of the K1 gene will facilitate the introduction of this resistance gene into cultivars that also carry the L6 allele. Three candidate K1 genes have been identified, and alleles of these genes have been identified in 35 different genotypes from Canada and around the world. This information will enable the efficient stacking of multiple rust resistance genes in future flax cultivars.
- Molecular markers able to distinguish different alleles (versions) of the L gene were developed. These will be useful for breeding programs to quickly broaden the genetic diversity of flax cultivars but still maintain rust resistance.
- Sequencing of the L alleles of the core collection of 400 lines revealed a far greater diversity than anticipated and at least 29 new alleles were identified, providing an important source of resistance for breeders in the event of a shift in the pathogen population or the introduction of new races in Canada.
- The M gene region is complex and contains multiple rust resistance genes. A cluster of six M gene homologs plus an additional nearby homolog was identified. Genetic differences in these homologs were identified in 35 genotypes.
- Molecular markers for the N and P regions have been developed. Indel markers can distinguish four of the five P alleles, but markers for N could not be validated. These markers are still useful, but complete validation is needed before they can be deployed comprehensively in breeding programs.