Characterization of Bacillus anthracis proteases through protein-protein interaction: an in silico study of anthrax pathogenicity
- DOI : 10.5667/tang.2013.0031
- Author: Banerjee Amrita, Pal Shilpee, Paul Tanmay, Mondal Keshab Chandra, Pati Bikash Ranjan, Sen Arnab, Mohapatra Pradeep Kumar Das
- Publish: TANG Volume 4, Issue 1, p61~612, 28 Feb 2014
-
ABSTRACT
Anthrax is a deadly disease of human beings caused by Bacillus anthracis. Recent research on the mode of infection of the organism has revealed that different proteases are involved in different steps of pathogenesis. The present study reports the in silico characterization and detection of pathogenic proteases involved in anthrax infection through protein-protein interaction. A total of 13 acid, 9 neutral, and 1 alkaline protease of Bacillus anthracis were selected for analysis of physicochemical parameters, protein superfamily and family search, multiple sequence alignment, phylogenetic tree construction, protein-protein interactions and motif finding. Among the 13 acid proteases, 10 were found to be extracellular enzymes that interact with immune inhibitor A (InhA) and help the organism to cross the blood-brain barrier during infection. Multiple sequence alignment of these acid proteases revealed that positions 368, 489, and 498 contain 100% conserved amino acids, which could be used to deactivate the protease. Among the groups analyzed, only acid proteases were found to interact with InhA, which indicated that the metalloproteases of the acid protease group have the capability to develop pathogenesis during B. anthracis infection. Deactivation of the conserved amino acid positions of the germination protease can stop the sporulation and germination of the B. anthracis cell. A detailed interaction study of the neutral and alkaline proteases could also be helpful in designing the interaction network for a better understanding of anthrax disease.
-
KEYWORD
Anthrax, Bacillus anthracis, protease, superfamily and family, phylogenetic tree, motif, protein-protein interaction
-
Proteases, also known as proteolytic enzymes or proteinases, are a group of enzymes that hydrolyze (break down) proteins into small peptides and amino acids. Proteolytic enzymes are important for various therapeutic purposes such as oncology, inflammatory conditions, blood rheology control, and immune regulation. Undigested proteins, cellular debris and blood toxins can also be digested by proteases. The current classification of enzymes recognizes six broad groups of proteases: serine proteases, threonine proteases, cysteine proteases, aspartate proteases, glutamic proteases and metalloproteases. Alternatively, on the basis of the isoelectric point (pI) of the catalytic protein, they can be classified into acid, alkaline and neutral proteases. Acid and neutral proteases are involved in type I hypersensitivity by activating complement systems and kinins (Mitchell et al., 2007), whereas the function of the alkaline or basic proteases of Bacillus anthracis is still unknown.

Bacillus anthracis, the etiologic agent of anthrax, is a gram-positive, endospore-forming, rod-shaped bacterium 1 - 1.2 μm in width. According to a previous study (Russell et al., 2007), B. anthracis infection can occur in three forms: cutaneous (skin), gastrointestinal (digestive), and inhalation (lungs). Cutaneous anthrax is rarely fatal if treated, gastrointestinal anthrax shows 25 - 60% mortality, and inhalation anthrax is more deadly than the others. B. anthracis possesses acid, neutral as well as alkaline proteases. Some strains, such as B. anthracis str. CDC 684, contain both acid and alkaline protease genes in their genome. The main fatal complication caused by this bacterium is hemorrhagic meningitis, but the pathogenesis is still unknown. Mukherjee et al. (2011) showed that Bacillus anthracis increases the permeability of human brain microvasculature endothelial cells (HBMECs), which constitute the blood-brain barrier (BBB), by secreting the metalloprotease InhA, which acts on the monolayer integrity of HBMECs. According to Chertow (2011) and Tonry et al. (2012), the immune inhibitor A metalloprotease (InhA) of Bacillus anthracis helps in adhesion to and invasion of human brain endothelial cells by modifying cell surface properties through direct proteolysis of adhesin protein. Anthracis protease cleaves the anthrax lethal protein factor so that it can be internalised by host cell endocytosis (John, 2010). B. anthracis has also been used as an effective vector for the production of recombinant proteins after deletion of six proteases (Pomerantsev et al., 2011). These experimental results suggest that the protease enzymes of Bacillus anthracis carry an important active site responsible for their lethal effect. Analysis of the protein sequences of B. anthracis proteases may disclose the underlying secrets of their functions and evolutionary relatedness.

Previous in silico studies on different enzymes, such as tannase of bacterial and fungal origin (Banerjee et al., 2012), alkaline proteases from different species of Aspergillus (Morya et al., 2012), xylanase from Thermomyces lanuginosus (Shrivastava et al., 2007) and pectate lyase from different sources (Dubey et al., 2010), have been reported, but an in silico study of B. anthracis protease protein sequences has not yet been reported.

The present work was designed to understand the nature of the different types of B. anthracis proteases through an in silico comparative study. Analysis and characterization of 13 acid proteases, 9 neutral proteases and one alkaline protease were performed. The protein sequences were subjected to physicochemical parameter analysis, superfamily search, multiple sequence alignment for homology search, construction of a phylogenetic tree, common and conserved motif finding, and protein-protein interaction analysis. Physicochemical parameter analysis of the individual protein sequences helps to understand the physicochemical conditions under which each protein remains stable and also indicates the optimum culture conditions of the respective organism. Superfamily and phylogenetic tree analysis help to classify the proteins and establish their evolutionary relatedness, while protein-protein interaction analysis detects the enzymes responsible for pathogenesis. Finally, consensus sequences from the multiple sequence alignment and conserved motifs help to design specific primers for each different species.

A total of 84 acid, 283 neutral and 31 alkaline proteases of B. anthracis origin were downloaded from the NCBI (http://www.ncbi.nlm.nih.gov/) database. Among them, 13 acid protease, 9 neutral protease and one alkaline protease sequences were selected for in silico analysis.
> Physicochemical parameters analysis
Physicochemical data were generated with the ProtParam tool on the ExPASy server (the proteomic server of the Swiss Institute of Bioinformatics). Sequences in FASTA format were used for subsequent analysis.
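Two of the sequence-derived quantities used in this study, GRAVY and the aliphatic index, can be recomputed directly from an amino acid sequence. The sketch below is a minimal Python illustration and is not the ProtParam code itself; it assumes the Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = seq.upper()
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    toy = "MKVLAAGIVE"  # hypothetical peptide, not one of the studied proteases
    print(round(gravy(toy), 3), round(aliphatic_index(toy), 2))
```

A negative GRAVY value points to a hydrophilic protein and a positive one to a hydrophobic protein; a larger aliphatic index is usually read as higher thermostability.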
> Protein superfamily and family search
The Superfamily tool on the ExPASy server was used for the protein superfamily and family search.
> Multiple sequence alignment (MSA)
The program ClustalW2 (Larkin et al., 2007) was used for multiple sequence alignment, and the MSA was visualized with CLC-Bio sequence viewer.
> Phylogenetic tree construction
Phylip-3.69 (Tuimala, 1989) was used for phylogram construction by the neighbor-joining (NJ) method with 100 bootstrap replicates. The tree was edited with Dendroscope (Huson et al., 2007).
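Bootstrap support in a neighbor-joining analysis comes from rebuilding the tree on resampled alignments. As a rough sketch of the resampling step only (the study itself used Phylip; the function names, the toy alignment and the seed below are illustrative assumptions), alignment columns can be drawn with replacement as follows:

```python
import random


def bootstrap_alignment(aligned, rng):
    """Return one pseudo-replicate: columns sampled with replacement."""
    n_cols = len(aligned[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    # The same column picks are applied to every sequence so that
    # each resampled column stays a genuine alignment column.
    return ["".join(seq[i] for i in picks) for seq in aligned]


def bootstrap_replicates(aligned, n=100, seed=0):
    """Generate n pseudo-replicate alignments (100 mirrors the study's setting)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(aligned, rng) for _ in range(n)]


if __name__ == "__main__":
    toy = ["MKHAE", "MRHGE", "MKH-E"]  # hypothetical aligned sequences
    replicates = bootstrap_replicates(toy, n=100, seed=1)
    print(len(replicates), replicates[0])
```

Each replicate would then be passed to the distance and tree-building steps, and the fraction of replicates that recover a given cluster is reported as its bootstrap support.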
The selected protein sequences were analyzed for protein-protein interactions using the STRING database to detect their probable functions.
Acid and neutral proteases were separately submitted to Pfam to find conserved domains. The separated domains were submitted to Block Maker for conserved block identification, and the resulting blocks were used for motif finding with the MEME Suite. The conserved motifs of the alkaline protease were deduced from the multiple sequence alignment.
Among all the B. anthracis protease sequences deposited in the NCBI database, 13 acid, 9 neutral and one alkaline protease sequences were found to be unique, i.e. they differ from one another at the sequence level and in amino acid composition. The accession numbers of the protease protein sequences, along with the source organism and the type of protease, are listed in Table 1.
> Physicochemical parameter analysis of proteases
The physicochemical features of the protease sequences are presented in Table 2. The number of amino acids in the acid proteases ranged from 345 to 742, with variable molecular weight. The pI value varied from 4.94 to 6.29, except for sequences 5 and 11 (Acc. No. YP_018763.1 and AFH83399.1), which have pI values of 8.65 and 7.24 respectively. These two sequences (5 and 11) are membrane-bound proteases with aliphatic index values of 84.35 and 77.07 respectively. Sequence 3 (Acc. No. YP_030468.1), on the other hand, has the highest aliphatic index value, 96.36.
For the neutral protease group, the values varied over different ranges in the different analyses (Table 2). The germination protease (sequence 2) showed values similar to those of the germination protease of the acid protease group (sequence 3), with a pI value of 4.94. Some serine proteases with various pI values were also found in this group. Sequence 7 (Accession number EJT19501.1), a membrane-bound protease, showed the highest pI value, 9.33, and an aliphatic index value of 104.64.
> Protein superfamily and family search
When the complete sequences of the acid, neutral and alkaline proteases were submitted to the Superfamily tool on the ExPASy server, different superfamilies and families were found (Table 3). Among the acid proteases, 10 sequences were assigned to the Metalloproteases ("zincins"), catalytic domain superfamily and the Thermolysin-like family (Table 3). Sequence 3 (Acc. No. YP_030468.1) was assigned to the HybD-like superfamily and the Germination protease family. Sequences 5 and 11 (Acc. No. AFH83399.1 and YP_018763.1) were assigned to the Cysteine proteinases superfamily and the Transglutaminase core family. Short segments of these sequences were found to have similarity with thermostable phytase (Table 3). The neutral proteases, in contrast, showed the most variable domains in the superfamily and family analysis (Table 3). Among them, sequences 7 and 9 were related to the ten acid proteases, sequence 2 showed similarity with germination protease sequence 3 (acid protease), and sequence 8 showed similarity with alkaline protease 1. Sequence-level dissimilarity was therefore observed among the 9 neutral proteases, and this is also reflected in the multiple sequence alignment.
Multiple sequence alignment of the 13 acid proteases, the 9 neutral proteases and the one alkaline protease reflected the superfamily results of each group. Fig. 1A and B show the consensus regions of the acid proteases. The presence of consensus regions throughout the whole alignment indicated a high level of sequence similarity among them. Three 100% conserved positions, 368, 489 and 498, were found in the aligned region and are marked with pink bars. Blue bars mark specific changes found only in sequence 3 (germination protease). Red bars mark nearly conserved regions, and green bars indicate the changes within them. In most cases, the changes were found in the two membrane-bound protease sequences 5 and 11 (Accession No. YP_018763.1 and AFH83399.1) and in germination protease sequence 3 (Accession No. YP_030468.1).
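Fully conserved alignment positions such as those reported above can be located by scanning the alignment column by column. The following Python sketch assumes the aligned sequences are available as equal-length strings with '-' marking gaps; the function name and the toy alignment are hypothetical and do not reproduce the actual protease alignment.

```python
def conserved_columns(aligned):
    """Return (position, residue) for every 100% conserved, gap-free column.

    Positions are 1-based, matching the usual numbering of alignment columns.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for position, column in enumerate(zip(*aligned), start=1):
        residues = set(column)
        # A column counts as conserved only if every sequence carries the
        # same residue and that residue is not a gap character.
        if len(residues) == 1 and "-" not in residues:
            hits.append((position, column[0]))
    return hits


if __name__ == "__main__":
    toy = ["MKH-E", "MRHAE", "MKHGE"]  # hypothetical aligned sequences
    print(conserved_columns(toy))  # [(1, 'M'), (3, 'H'), (5, 'E')]
```

Applied to an exported alignment, the returned positions are alignment-column numbers; mapping them back to residue numbers in an individual protein requires skipping that protein's gap characters.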
Only a few consensus regions, with low levels of sequence similarity, were found in the multiple sequence alignment of the neutral protease group and the one alkaline protease. Red bars indicate the areas of similarity. As the alkaline protease sequence (Accession No. YP_002814818.1) showed the highest similarity with neutral protease sequence 8 (AFH85759.1) in the preceding analyses, the similarity between them is shown in green (Fig. 1C and D). Ten short conserved motifs of the alkaline protease, in comparison with neutral protease 8, are also shown in violet boxes.
> Phylogenetic tree construction
Phylogenetic tree construction for the 13 acid proteases, the 9 neutral proteases and the one alkaline protease showed an interesting result. Ten acid proteases clustered together at the top of the tree (Fig. 2). The two membrane-bound protease sequences 5 and 11 (YP_018763.1 and AFH83399.1) were found together at the bottom of the tree. Germination protease sequence 3 grouped with the other germination protease, sequence 2 of the neutral protease group. The one alkaline serine protease grouped with neutral serine protease sequence 8.
Protein-protein interactions are the core of interactome studies, which also represent the secretome of an organism. In this study, the interactions of the acid, neutral and alkaline proteases of B. anthracis were studied. Among all the categories, only 9 acid proteases (Accession No. YP_002869281.1, YP_002816538.1, YP_031148.1, EJY90308.1, EJY93035.1, EJY94285.1, AFH86425.1, AFH83987.1 and ZP_05183912.1) showed interaction with immune inhibitor A metalloproteases (inhA or others). YP_002869281.1 (NprB), EJY94285.1 (extracellular protease) and ZP_05183912.1 (Npr599) specifically interacted with inhA1, and AFH83987.1 interacted with inhA2. The two germination proteases interacted with proteins affecting the sporulation and germination of B. anthracis, such as putative stage II sporulation protein P, spore cortex-lytic enzyme prepeptide, germination protein YpeB, stage IV sporulation protein A and small acid-soluble spore protein/B.

A total of 6 motifs were found in the acid and neutral proteases (Table 4). Motifs A2 and A3 showed similarity with peptidase M4 function according to the BLAST and PFAM results. According to PFAM and GENE3D, motifs B1, B2 and B3 all have peptidase activity. The function of B3 deduced by BLAST was endopeptidase spore protease Gpr. Ten short conserved motifs of the alkaline protease were identified from the multiple sequence alignment (Fig. 1C and D).
The present study indicates that the B. anthracis acid metalloproteases have a definite role in pathogenesis. Their extracellular nature and protein-protein interaction pattern point to the involvement of some acid proteases in anthrax infection.

The physicochemical nature of a protein can be readily calculated through in silico analysis of its amino acid sequence. The solubility of a protein can be assessed from the grand average of hydropathicity (GRAVY): a positive GRAVY indicates hydrophobicity and a negative GRAVY indicates hydrophilicity (Kyte, 1982). The thermostability of a protein, on the other hand, is directly proportional to its aliphatic index value.

All the metalloproteases of the acid group were found to prefer the extracellular medium according to the GRAVY results (Table 2), and they were moderately thermostable. By the same parameters, sequences 5 and 11 were highly thermostable, membrane-bound proteases of B. anthracis. The presence of a thermostable phytase (3-phytase) domain in sequences 5 and 11 in the superfamily and family analysis also supported their thermostability (Table 3). The germination protease (sequence 3) showed the highest aliphatic index value, 96.36, indicating a high level of thermal stability. The pI value of 4.94 of the two germination proteases indicated that the germination of B. anthracis occurs in an acidic medium, and the high value of hydrophobicity (GRAVY -0.235) argued against their extracellular existence (Table 2). In the neutral protease group, sequence 7 was found to prefer the intracellular medium, with the highest thermal stability (aliphatic index 104.64). Sequences 3 and 6 were hydrophilic, but all the others were hydrophobic or preferred the intercellular space. The in vivo half-life of a protein can be estimated from its instability index (Guruprasad et al., 1990). According to the literature, proteins with an instability index above 40 have a half-life of less than 5 h, and proteins with an instability index below 40 have a half-life of more than 16 h (Rogers et al., 1986). From this point of view, the studied proteases have a half-life of more than 16 h.

The phylogenetic tree (Fig. 2) clearly reflected the superfamily results (Table 3). Although all the proteins differed in sequence, 10 acid proteases showed a close evolutionary relationship. They all showed the same domains in the superfamily search, indicating a similar function. Functional similarity was also found between the two germination proteases and between the two membrane-bound proteases. The studied neutral proteases were found together in the tree. Among them, neutral serine protease 8 showed the highest similarity with alkaline serine protease 1 (Fig. 2), indicating their functional similarity. These results show that similar sequences carry the same functional domains and, on that basis, show evolutionary relationships.
According to Baillie (2001) and Russell et al. (2007), the anthrax pathogen initiates its germination as well as infection by interacting with host macrophages. The resulting vegetative cells spread through blood and other tissues, cause meningitis and ultimately cause death. The literature shows that, among the secreted metalloproteases of B. anthracis, immune inhibitor A1 (InhA1) is the single pathogenic member; during infection it helps to cleave mammalian cell matter, in addition to modulating the B. anthracis secreted proteins (Pflughoeft et al., 2014), and increases the permeability of the blood-brain barrier, resulting in cerebral hemorrhages (Dhritiman et al., 2011). A previous investigation by Pflughoeft (2010) suggested that the protease cascade regulates the organism's response to altering environments, such as a changing signal or the presence of different types of tissue. Mukherjee et al. (2011) also showed that NprB and Npr599 are extracellular enzymes which interact with inhA1 and are able to degrade the plasma and matrix proteins of the host. The secretome analysis through protein-protein interaction showed that most acid proteases (except 2, 3, 5, 6, and 11) interact with immune inhibitor A (inhA or others), as shown in Table 5. In accordance with the above literature, it can therefore be concluded that the proteases with accession numbers YP_002869281.1, YP_002816538.1, YP_031148.1, EJY90308.1, EJY93035.1, EJY94285.1, AFH86425.1, AFH83987.1, ZP_05184332.1 and ZP_05183912.1 (Table 1) are extracellular in nature, which this study also supports through the physicochemical parameter analysis. Among them, the zinc metalloproteases (Table 3) YP_002869281.1 (NprB), EJY94285.1 (extracellular protease) and ZP_05183912.1 (Npr599) are part of the regulatory cascade of B. anthracis, which helps the cell to react to a changed external environment and supports its stabilization under altered conditions, and they may be directly related to the permeability of endothelial cells and the degradation of plasma or matrix proteins at the time of infection (Mukherjee et al., 2011). Thus, the above evidence links the physicochemical parameters and the PPI outputs, by which extracellular and intracellular acid proteases could be identified.
As the neutral proteases are diverse in sequence, they also showed different interactions (Table 5). Further PPI investigation of the neutral and alkaline proteases is needed for a better understanding of their biological functions.

As 100% sequence similarity was found among the 31 retrieved alkaline proteases, it can be concluded that the B. anthracis alkaline proteases are more conserved than the others. A total of 10 aligned regions were found for alkaline protease 1 in the multiple sequence alignment in comparison with neutral protease sequence 8 (Fig. 1C and D). The identified short segments could be used as primers or probes to identify B. anthracis alkaline protease. In addition, specific regions in the multiple sequence alignment of the 13 acid proteases (Fig. 1A and B) could be used for further investigation. The pink bars in the acid protease group mark the highly conserved regions, which could be used as target sites for the inactivation of those proteases. Acid and neutral protease specific motifs are presented in Table 4, where peptidase activity was found for A2, A3, and B1. Similarity was found between B3 and B. cereus endopeptidase spore protease Gpr, which could be related to the germination of the B. anthracis spore. The identified motifs could be used to prepare acid protease specific primers and probes, or to inactivate the protease responsible for spore germination. The metabolic network could be investigated in detail in further work.

The in silico characterization of B. anthracis proteases revealed sequence similarity based on pH range. The multiple sequence alignment and motif finding results can be used to design degenerate primers or probes for specific sequences, in order to clone the putative genes by PCR amplification for further analysis. Conserved amino acid positions could be used as target sites to deactivate enzyme function. Among all the groups, only acid proteases were found to interact with InhA, which indicated that the metalloproteases of the acid protease group have the capability to develop pathogenesis during B. anthracis infection. Deactivation of the conserved amino acid positions of the germination protease can stop the sporulation and germination of the B. anthracis cell. A detailed interaction study of the neutral and alkaline proteases could also help to design the interaction network for a better understanding of anthrax disease. Further study of structure prediction and of protein-protein or protein-ligand interactions of B. anthracis proteases could reveal new drugs to inactivate the disease-causing proteins.
-
[Table 1.] List of acid, neutral and alkaline proteases taken for analysis, with their type and accession numbers
-
[Table 2.] Physicochemical parameter analysis
-
[Table 3.] Distribution of superfamily and family among acid, neutral and alkaline proteases of Bacillus anthracis
-
[Fig. 1A.] Multiple sequence alignment of 13 acid proteases shown in fig. 1A and B.
-
[Fig. 1B.] Multiple sequence alignment of 13 acid proteases shown in fig. 1B.
-
[Fig. 1C.] Fig. 1C represents the multiple sequence alignment of the 9 neutral proteases along with the one alkaline protease.
-
[Fig. 1D.] Fig. 1D represents the multiple sequence alignment of the 9 neutral proteases along with the one alkaline protease. Pink bars - highly conserved regions. Blue bars - conserved except in sequence 3 (Acc. No. YP_030468.1). Red bars - nearly conserved regions, with changes indicated by green bars. Red bar - highest similarity area. Horizontal green bar - similarity between the alkaline protease (Acc. No. YP_002814818.1) and neutral protease sequence 8 (Acc. No. AFH85759.1). Ten violet boxes - short conserved regions of the alkaline protease with respect to sequence 8.
-
[Fig. 2.] Phylogram of 13 acid proteases, 9 neutral proteases and 1 alkaline protease. Acid and neutral proteases separated into two distinct groups, except the acid germination protease (Accession no. YP_030468.1), which showed similarity with the neutral germination protease. The two membrane-bound acid proteases (Accession no. YP_018763.1 & AFH83399.1) were found together at the bottom of the tree. The alkaline protease was found with neutral protease sequence 8 (Accession no. AFH85759.1), in line with its serine protease property. The phylogram also indicates the evolutionary changes.
-
[Table 4.] Identified motifs for acid and neutral proteases with their function deduced by protein BLAST and INTERPROSCAN. Accession number, query coverage, e-value and maximum identity of highly similar sequence are represented here
-
[Table 5.] List of protein-protein interactions. Accession numbers, names and functions of the interacting proteins are listed here