Genome-wide identification and comprehensive study of anti-fungal genes in chickpea

Chickpea is an important crop that delivers nutritious food to a growing global population, and it is expected to become increasingly popular as a result of climate change. Our objective was to use comprehensive data analysis to locate and identify candidate genes for fungal disease resistance. We used a bioinformatics pipeline comprising sequence alignment, phylogenetic analysis, assessment of protein chemical and physical properties, and domain structure classification. To study gene evolution and genetic diversity, we compared these genes with known anti-fungal genes in different plant species. A total of 19,721 protein sequences belonging to 187 plant species were downloaded from public databases, together with the entire chickpea genome. Using sequence alignment and gene annotation, we identified 23 potential anti-fungal genes on 10 different chromosomes and genomic scaffolds. Ca2 and Ca6 have the highest number of genes, followed by Ca3 and Ca4. The anti-fungal chickpea proteins were classified as cysteine-rich (10), thaumatin (6), pathogenesis-related (4) and plasmodesmata (3) proteins. Correlation analysis of the chemical and physical properties revealed high correlations between several properties of the anti-fungal proteins. Five different motifs were detected in the identified anti-fungal chickpea proteins, including domain families associated with fungal resistance. The maximum likelihood phylogenetic analysis successfully distinguished between the anti-fungal chickpea proteins, in agreement with their protein motifs/domains.


Introduction
Chickpea (Cicer arietinum) is an important crop that delivers nutritious food to a growing global population, and it is expected to become increasingly popular as a result of climate change. Chickpea production ranks third following beans, with an average annual yield of over 11.5 million tons, of which India has the highest share. The land allocated to chickpea has increased dramatically and is now reported at 14.56 million hectares. More than 2.3 million tons of chickpea reach world markets every year to meet the requirements of countries unable to satisfy domestic consumption (1).
Chickpea yield is significantly affected by susceptibility to both abiotic and biotic stress (2,3). Among the diseases triggered by a wide range of pathogens, fungal infections have been shown to have particularly destructive effects on chickpea production. Of the numerous fungal diseases, the most prevalent foliar and root infections are those caused by Ascochyta rabiei (Ascochyta blight) and Fusarium oxysporum (Fusarium wilt), respectively, which cause serious yield losses (4).
The chickpea genome (2n = 2x = 16) is estimated to be 738.09 Mb in length. Whole-genome sequences of different cultivars have been released for public use, with more than 73% of the genomic content successfully sequenced. The chickpea gene pool is estimated at about 29,000 genes, and about half (49.41%) of the genome consists of transposable elements and unclassified repeats. In addition, 4,468 chickpea genes occur in species-specific groups; these groups may arise through structural rearrangements, as happens among the nucleotide-binding site leucine-rich repeat (NBS-LRR) disease resistance genes. In this regard, the chickpea genome has been reported to contain 187 disease resistance gene homologs (RGHs) (5,6).
Continuing breakthroughs in genome sequencing and genome-wide association studies have made it possible to scan the chickpea genome for genes that control resistance to multiple fungal diseases. Several predicted genes have been reported to be statistically associated with chickpea fungal resistance, including NBS-LRR receptor-like kinase, wall-associated kinase, zinc finger protein, and serine/threonine protein kinase (7). It has also been reported that chickpea resistance to some fungal diseases, such as Ascochyta blight, may be linked to a number of motif families, such as AT-hook motif containing nuclear localized (AHL) (8). Moreover, bioinformatics methods have advanced rapidly over the last few years, and several genomic and molecular databases have been established (9). Such repositories can serve as a cornerstone in the search for anti-fungal resistance in different plant species and in the analysis of their specific molecular structure (10,11). Bioinformatics tools of this kind have been used to study several gene families in chickpea that are important in the plant defense system and as essential membrane proteins (12,13).
Genome-wide characterization of anti-fungal genes in chickpea may enable researchers and breeders to overcome different fungal infections and develop new cultivars with high tolerance and better yield. Our objective was to use comprehensive data analysis to locate and identify candidate genes for fungal disease resistance. We used a bioinformatics pipeline comprising sequence alignment, phylogenetic analysis, assessment of protein chemical and physical properties, and domain structure classification. To study gene evolution and genetic diversity, we compared these genes with known anti-fungal genes in different plant species.

Materials and Methods
Genomic sequences related to anti-fungal resistance were downloaded from the NCBI database (14). A total of 19,721 protein sequences belonging to 187 plant species were downloaded from NCBI. The chickpea genome sequence was downloaded from the http://www.cicer.info database (5). The local BLAST+ (15) toolkit was used to create a sequence database from the chickpea genome and to align all anti-fungal amino acid sequences against it with TBLASTN. The NCBI TBLASTN online tool was used to annotate the sequences recovered in the previous step. The MEME suite (16) was used to explore amino acid patterns in the chickpea anti-fungal genes. The MegaX program was used to perform a phylogenetic analysis using a maximum likelihood algorithm (17). To assess the chemical and physical characteristics of the amino acid sequences, the Pepstat program (18) was run through in-house Perl scripts. The properties assessed were: A280 Molar Extinction Coefficients cystine bridges (A280-MECcb), A280 Molar Extinction Coefficients reduced (A280-MECr), Acidic (Ac), Aliphatic (Aph), Aromatic (Ar), Average Residue Weight (ARW), Basic (Bs), Charge (Chr), Charged (Chrd), Improbability of expression in inclusion bodies (IEEB), Isoelectric Point (IP), Molecular weight (MW), Non-polar (NP), Polar (Po), Residues (Re), Small (S) and Tiny (T). The iTOL online tool was used to visualize phylogenetic trees combined with information on the amino acid sequences (19). The statistical correlation analysis (p-value < 0.01) was conducted using R packages (20). The Circos package was used to display the genomic location of the genes (21).
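The database-building and alignment step described above can be sketched as follows. This is a minimal illustration, not the scripts used in the study: the file names, the database name and the E-value cutoff of 1e-5 are assumptions, since the text does not state them. The sketch only assembles the two BLAST+ command lines (makeblastdb for a nucleotide database, then tblastn with tabular output) so that they can be inspected before being run.

```python
import shlex


def build_blast_commands(genome_fasta, query_fasta, db_name="chickpea_db",
                         out_tsv="hits.tsv", evalue=1e-5):
    """Return the makeblastdb and tblastn command lines as argument lists.

    All file names and the E-value cutoff are illustrative assumptions.
    """
    # Build a nucleotide database from the genome assembly.
    make_db = ["makeblastdb", "-in", genome_fasta,
               "-dbtype", "nucl", "-out", db_name]
    # Align protein queries against the translated nucleotide database,
    # writing 12-column tabular output (-outfmt 6).
    search = ["tblastn", "-query", query_fasta, "-db", db_name,
              "-evalue", str(evalue), "-outfmt", "6", "-out", out_tsv]
    return make_db, search


make_db, search = build_blast_commands("chickpea_genome.fa",
                                       "antifungal_proteins.fa")
for cmd in (make_db, search):
    print(shlex.join(cmd))
    # To execute (requires BLAST+ on PATH):
    # subprocess.run(cmd, check=True)
```

Keeping the commands as argument lists avoids shell quoting problems if they are later passed to subprocess.run.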

Identification of chickpea anti-fungal genes
Identifying anti-fungal genes in chickpea could provide a useful resource for plant breeding programs by narrowing the pool of targeted genes. We identified 23 potential anti-fungal genes on 10 different chromosomes and genomic scaffolds (File S1). The encoded proteins comprise 7077 amino acids in total, ranging from 147 to 866 residues with an average of 307.7 amino acids. Chromosomes Ca2 and Ca6 have the highest number of genes (4 genes each), followed by Ca3 and Ca4 (3 genes each) (Figure 1 and Table 1). In this regard, whole-genome re-sequencing of chickpea has been used to identify 12 chromosomal regions associated with resistance to Ascochyta blight, all of which are located on Ca4 (7). In addition, 8 quantitative trait loci (QTLs) for resistance to the same disease were identified on chromosomes Ca2, Ca3, Ca4, Ca5 and Ca6 (22).
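One way to go from many alignment hits to a short list of candidate loci per chromosome is to merge overlapping hits. The sketch below assumes standard 12-column BLAST tabular output (subject ID in column 2, subject start/end in columns 9 and 10, E-value in column 11). The E-value cutoff and the merging rule are illustrative assumptions; the text does not describe how hits were collapsed into the 23 reported genes, and the hit lines are invented toy data.

```python
from collections import defaultdict


def candidate_loci(lines, max_evalue=1e-5):
    """Merge overlapping hits on each subject sequence into candidate loci."""
    intervals = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        if float(fields[10]) > max_evalue:
            continue  # discard weak alignments
        # Reverse-strand hits have start > end, so order the coordinates.
        start, end = sorted((int(fields[8]), int(fields[9])))
        intervals[fields[1]].append((start, end))

    loci = {}
    for chrom, spans in intervals.items():
        merged = []
        for start, end in sorted(spans):
            if merged and start <= merged[-1][1]:
                # Overlaps the previous locus: extend it.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        loci[chrom] = merged
    return loci


# Toy hits (hypothetical values, not data from the study).
toy_hits = [
    "protA\tCa2\t85.0\t100\t15\t0\t1\t100\t1000\t1300\t1e-30\t200",
    "protB\tCa2\t80.0\t100\t20\t0\t1\t100\t1200\t1500\t1e-20\t150",
    "protC\tCa2\t90.0\t100\t10\t0\t1\t100\t9000\t8700\t1e-40\t250",
    "protD\tCa6\t70.0\t100\t30\t0\t1\t100\t500\t800\t1e-10\t90",
    "protE\tCa6\t40.0\t100\t60\t0\t1\t100\t5000\t5300\t0.5\t30",
]
loci = candidate_loci(toy_hits)
for chrom, spans in sorted(loci.items()):
    print(chrom, len(spans), spans)
```

With these toy hits, the two overlapping Ca2 alignments collapse into one locus and the weak Ca6 alignment is dropped, leaving two loci on Ca2 and one on Ca6.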

The chemical and physical properties of chickpea anti-fungal proteins
The chickpea anti-fungal proteins were assessed across 17 different chemical and physical properties. The total MW was 765.4 kDa, ranging from 16.0 kDa (Ca_AF17) to 97.7 kDa (Ca_AF9) with an average of 33.3 kDa (Figure 2 and Table 2). By comparison, a study of anti-fungal proteins in wheat reported a total MW of 1913 kDa with an average of 20 kDa (10). The protein charge ranges from -25 (Ca_AF13) to 14 (Ca_AF9) (Figure 2 and Table 2). The extinction coefficient is a measure of how strongly a chemical species attenuates light at a given wavelength. The amino acid composition must be known in order to determine a protein's molar extinction coefficient (23).
The A280 Molar Extinction Coefficients reduced (A280-MECr) and A280 Molar Extinction Coefficients cystine bridges (A280-MECcb) are two separate extinction coefficient measures, where salt bridges are essential motifs of the tertiary protein structure and are mostly associated with the molecular forces that maintain the protein's stability (24). The total A280-MECr and A280-MECcb values are 836030 and 864530 M⁻¹ cm⁻¹, ranging from 10430 and 11555 M⁻¹ cm⁻¹ (Ca_AF11) to 87560 and 90560 M⁻¹ cm⁻¹ (Ca_AF9), respectively (Figure 2 and Table 2). In some anti-fungal wheat proteins, the minimum A280-MECcb and A280-MECr scores were recorded as 1740 and 1490, with the highest scores being 104570 and 103820, respectively (10). Improbability of expression in inclusion bodies (IEEB) is a type of solubility measurement. In Escherichia coli, for example, a recombinant protein can be produced either in insoluble form in inclusion bodies or in soluble form in the cytosol (25). The total IEEB of the chickpea anti-fungal proteins was 18.33, ranging from 0 (Ca_AF11) to 0.972 (Ca_AF16) with a mean of 0.797 (Figure 2 and Table 2). In wheat anti-fungal proteins, the IEEB averaged 0.794, ranging from 0.504 to 0.977 (10). The average residue weight (ARW) expresses the molecular weight of each amino acid sequence relative to its length.
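The two A280 measures can be illustrated with the widely used composition-based estimate, in which each tryptophan contributes 5500, each tyrosine 1490 and each cystine (a pair of cysteines) 125 M⁻¹ cm⁻¹. The sketch below assumes these coefficients and assumes that every cysteine is paired in the "cystine bridges" case; whether the tool used in the study applies exactly these values is an assumption. The toy sequence is hypothetical and is not an actual chickpea protein.

```python
def a280_extinction(seq):
    """Estimate molar extinction coefficients at 280 nm (M^-1 cm^-1).

    Returns (reduced, with_cystine_bridges). Coefficients per residue are
    assumed: Trp 5500, Tyr 1490, cystine 125.
    """
    seq = seq.upper()
    n_trp = seq.count("W")
    n_tyr = seq.count("Y")
    n_cys = seq.count("C")
    # Reduced form: only Trp and Tyr absorb appreciably at 280 nm.
    reduced = 5500 * n_trp + 1490 * n_tyr
    # Oxidized form: assume all cysteines pair up into cystines.
    with_cystine = reduced + 125 * (n_cys // 2)
    return reduced, with_cystine


# Hypothetical composition: 7 Tyr, 0 Trp, 18 Cys, padded with Ala.
toy = "Y" * 7 + "C" * 18 + "A" * 75
print(a280_extinction(toy))  # (10430, 11555)
```

The difference between the two values depends only on the number of cysteine pairs, which is why cysteine-rich proteins show the largest gap between A280-MECr and A280-MECcb.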

Table 1: The chromosomal location and gene definition of the identified anti-fungal genes in the chickpea genome.

The total ARW was 2478.4 Da, with an average of 107.7 Da; Ca_AF13 and Ca_AF14 have the minimum and maximum values of 101.8 Da and 116.7 Da, respectively. The isoelectric point (IP) is the pH at which the net charge of the protein is zero, and it is correlated with amino acid composition and protein structure (26). The IP of the chickpea anti-fungal proteins ranges from 3.9 (Ca_AF13) to 8.9 (Ca_AF19) with a mean of 6 (Figure 2 and Table 2). Similarly, wheat anti-fungal proteins showed an average IP of 6.402, ranging from 4 to 10.4 (10).
In addition, the folded structure of a protein is thermodynamically less favorable in that it decreases the protein's disorder, or entropy; non-polar side chains tend to pack inside the protein while polar side chains are pushed to the outside of the molecule (27). The non-polar (NP) values range from 49.2 (Ca_AF20) to 62.1 (Ca_AF16) with an average of 56 (Figure 2 and Table 2). In wheat anti-fungal proteins, the non-polar and polar amino acid scores ranged from 48.81 and 30.081 to 69.919 and 51.19, respectively (10). Basic amino acids carry a basic group in their side chain, whereas acidic amino acids carry an acidic group. Basic amino acids have a high pKa, while acidic amino acids have a low pKa. The content of basic and acidic amino acids ranges from 9.011 and 9.359 (Ca_AF13 and Ca_AF16) to 13.613 and 16.915 (Ca_AF19 and Ca_AF11) (Figure 2 and Table 2). In addition to the chickpea genes retrieved in this study, we studied the chemical and physical properties of 1216 anti-fungal proteins identified in different plant species (Figure 3 and Table S2). The protein MW ranges from 21 kDa (Silene latifolia) to 97.7 kDa (Cicer arietinum), while the protein charge ranges from -25 (Cicer arietinum) to 21 (Rosa chinensis) (Figure 3 and Table S2).
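The residue-class percentages discussed above can be reproduced in outline by counting class members in a sequence. The class definitions in this sketch (non-polar: ACFGILMPVWY; polar: DEHKNQRST; basic: HKR; acidic: DE) are an assumption about the convention used by the property tool, chosen so that non-polar and polar together cover all 20 standard residues. The example sequence is hypothetical.

```python
# Assumed residue classes; ambiguity codes (B, Z, X) are ignored here.
CLASSES = {
    "non_polar": set("ACFGILMPVWY"),
    "polar": set("DEHKNQRST"),
    "basic": set("HKR"),
    "acidic": set("DE"),
}


def mole_percent(seq):
    """Return the mole percentage of each residue class in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    total = len(seq)
    return {
        name: 100.0 * sum(seq.count(aa) for aa in members) / total
        for name, members in CLASSES.items()
    }


# Hypothetical 10-residue peptide.
print(mole_percent("ACDEFGHIKL"))
```

For a sequence made only of standard residues, the non-polar and polar percentages sum to 100 under these definitions, which matches the paired wheat values quoted above (48.81 with 51.19, and 69.919 with 30.081).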
A protein sequence motif is a short pattern that has been conserved through evolution. In proteins, a motif may correspond to the active site of an enzyme or to a functional unit required for proper protein folding. Hence, sequence motifs are among the basic functional components of molecular evolution (16). Five specific amino acid motifs were discovered among the sequences of the identified chickpea anti-fungal proteins (Figure 5). These motifs are motif1 (ELME000385), motif2 (ELME000094), motif3 (ELME000321), motif4 (ELME000003), and motif5 (ELME000287) (Figure 5). Motif1 is very similar to ELME000385 (p-value of 1.60e-03), which functions as the Mtr4-Air2 interaction site. This domain distinguishes the TRAMP complex, which participates in the nucleus in exosome-mediated degradation of aberrant RNAs. Oligo-adenylated tails are added to aberrant RNA substrates by Air2 and Mtr4, thereby marking them for degradation (28). Motif2 shows highly significant similarity to integrin binding sites (ELME000094), with a p-value of 1.16e-05. Integrins are cell surface receptors that are responsible for cell migration, cell adhesion to the extracellular matrix, and cell-to-cell adhesion (29). Motif3 is similar to the caspase cleavage motif (ELME000321), with a p-value of 5.46e-03. The proteases caspase-3 and caspase-7 play a major role in programmed cell death (apoptosis), and non-apoptotic roles of caspases include involvement in the immune response (30). Similarly, motif5 is similar to the IAP-binding motif (IBM) (ELME000287), which distinguishes the inhibitor of apoptosis proteins (IAPs); these are involved in several immune functions, mitosis regulation, TNF-receptor signal transduction, and many other processes (31). Finally, motif4 was highly similar to the WW domain ligand motif (ELME000003); WW domains are small but widespread domains found in various regulatory contexts (16).
The maximum likelihood phylogenetic analysis successfully distinguished between the anti-fungal chickpea proteins, in agreement with their protein motifs/domains, and grouped the chickpea genes into 4 clusters (Figure 6).
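Short linear motifs of the kind discussed above are commonly written as regular expressions, so locating one in a protein sequence reduces to a pattern scan. The sketch below is only an illustration: it uses the tripeptide RGD, the classic integrin-binding pattern, as a stand-in, and whether ELME000094 is defined by exactly this pattern is an assumption. It does not reproduce the de novo motif discovery performed by the MEME suite, and the sequence is invented.

```python
import re


def find_motif(seq, pattern):
    """Return (1-based start, matched text) for every match of a motif.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(seq.upper())]


# Hypothetical sequence containing two RGD tripeptides.
toy = "MKRGDSLLRGDA"
print(find_motif(toy, "RGD"))  # [(3, 'RGD'), (9, 'RGD')]
```

Positions are reported 1-based to match the usual convention for protein sequence coordinates.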

Conclusion
Predicting anti-fungal resistance genes using publicly available repositories proved very helpful and indicated that several chickpea genes could be used to narrow genetic research down to the genes that hold the key to fungal resistance in chickpea. We identified 23 potential anti-fungal genes on 10 different chromosomes and genomic scaffolds. A high number of anti-fungal chickpea proteins are cysteine-rich (20), thaumatin (9), and pathogenesis-related (8), which could indicate the importance of these gene classes in chickpea resistance to fungal diseases. In addition, the chemical and physical analysis shed light on the uniqueness and consistency of these proteins, and several of these parameters could be used in future research to identify anti-fungal genes in different plant species. Moreover, the domain identification analysis identified several potential anti-fungal protein domains, such as the TRAMP complex and caspase cleavage motifs.

Figure 1: The genomic location and suggested definition of predicted anti-fungal genes, where the width of the internal links indicates the percentage similarity of the gene sequences.

Figure 2: The chemical and physical properties of identified anti-fungal proteins in chickpea.

Figure 3: The phylogenetic tree developed using multiple sequence alignment of anti-fungal proteins from chickpea and other species, with their chemical and physical properties plotted.

Figure 4: Statistical correlation between different chemical and physical properties of anti-fungal proteins; (A) the heatmap of the inter-correlation matrix and (B) the correlation networks, where p-value < 0.01 and R² > 0.3.

Figure 5: The domains/motifs found by the MEME tool in the predicted chickpea anti-fungal proteins.

Figure 6: The phylogenetic analysis of chickpea anti-fungal proteins, where the protein motif structures are shown as detected using the MEME tool.

Table 2: Chemical properties of identified anti-fungal proteins in chickpea.