Secondary Structure Prediction
A single consensus secondary structure was predicted for the PH4 domain of my assigned protein, using the sequence spanning residues 841-961 (about 120 residues in total). The same analysis was applied to a control protein of known structure from the organism with the closest alignment, the Homo sapiens protein 4X1V, whose sequence is 56 amino acids long. Sample images of the program results for the query sequence are displayed below.
Figure 1. PH4 domain and its predicted secondary structure, obtained from five programs using SYMPRED.
Figure 2. PH domain and the predicted secondary structure of the control protein, obtained from six programs using SYMPRED.
Key to the symbols used to represent the data in MS Excel
Coils - C
Figure 3. Excel table showing the secondary structure prediction for the query from five programs.
Figure 4. Enlarged view of the previous table, showing the categories and symbols used in the secondary structure prediction for the query from five programs.
Figure 5. Excel table showing the secondary structure prediction for the control from five programs.
Figure 6. Structure of the assigned protein.
Figure 7. Known structure of the control protein domain 4X1V.
The sequence range was retrieved from the NCBI protein sequence database [Chen et al, 2013]. The sequence was then run through SYMPRED, which retrieved information from the programs PROF, SSPRO, YASPIN, JNET and PSIPRED [McGuffin et al, 2003]. The secondary structure prediction for the sequence was then displayed in MS Excel [Figure 3], and the confidence values and letter symbols for helices, strands and coils were entered manually. The control protein sequence was retrieved from NCBI and processed with the same steps. The data were then analyzed and compared with the known structure of the control protein, which led to the theory stated in the discussion.
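The idea of combining several per-residue predictions into one consensus can be sketched as a simple majority vote. This is only an illustration of the concept and is not claimed to be the method SYMPRED itself uses; the function name `consensus`, the tie-breaking rule (ties fall back to coil) and the toy strings are all assumptions made for this example, not data from this report.

```python
from collections import Counter


def consensus(predictions):
    """Majority-vote consensus over equal-length strings of H, E and C.

    `predictions` maps a program name to its per-residue prediction string.
    Ties are resolved as coil (C), which is an arbitrary choice for this sketch.
    """
    lengths = {len(p) for p in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all predictions must have the same length")

    result = []
    for column in zip(*predictions.values()):
        counts = Counter(column)
        top = max(counts.values())
        winners = [state for state, n in counts.items() if n == top]
        result.append(winners[0] if len(winners) == 1 else "C")
    return "".join(result)


# Toy strings invented for illustration only (not the report's results).
toy = {
    "PROF":    "CCHHHHCCEEEC",
    "SSPRO":   "CCHHHCCCEEEC",
    "YASPIN":  "CHHHHHCCEECC",
    "JNET":    "CCHHHHCCCEEC",
    "PSIPRED": "CCHHHHCCEEEC",
}
print(consensus(toy))
```

A vote of this kind treats every program equally; weighting each vote by the confidence values recorded in the Excel table would be a natural refinement.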
The programs PROF, SSPRO, PREDATOR, YASPIN and PSIPRED, run through SYMPRED, were used to analyze the sequence of the PH4 domain in Arf-GAP with Rho-GAP domain, ANK repeat and PH domain-containing protein 1 (ARAP1). Research on this assigned protein showed that it contains nine domains, one of them being PH4. This domain was selected for analysis and prediction of its secondary structure. The actual PH4 domain sequence spans residues 862-952, but around 20 additional residues were added to the sequence for accuracy.
For the sequence from 841-961, the data indicated helices at 12-25, 34-42, 63-72 and 91-113. These were reported by all programs, except that PROF and SSPRO showed no helix at 63-72 [Harada et al, 2014]. When this information was compared with the known sequence of 4X1V, these regions were shown to be coils rather than helices, except in the PREDATOR prediction. The results initially showed a consensus of approximately ten possible strands.
The sequence of the 4X1V domain of the control protein was also analyzed with the same secondary structure prediction programs. This control domain has been studied and its structure determined, so it was used as a control to better assess the validity of the data acquired for the assigned ARAP1 protein. Analysis of the predictions and their reliability scores showed that this domain has seven strands: six of the predictions have high reliability scores and the remaining one has a moderate score. Comparing the prediction data with the known data from the control structure revealed inconsistencies between the two. It was evident that some strands were missing while others were only partially displayed. PROF and SSPRO provided the data with the most discrepancy relative to the other programs and the experimental data.
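One common way to quantify how far each program's prediction departs from a known structure is the fraction of residues assigned the same state, often called Q3. The sketch below is an assumption about how such a comparison could be scripted; it was not part of the analysis described here, and the strings are invented placeholders rather than the 4X1V data.

```python
def q3(predicted, observed):
    """Fraction of residues whose predicted state (H, E or C) matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Placeholder strings for illustration only (not taken from the control protein).
observed = "CCEEEECCHHHC"
toy_predictions = {
    "PROF":    "CCEEECCCHHHC",
    "SSPRO":   "CHHHHCCCHHCC",
    "PSIPRED": "CCEEEECCHHHC",
}

# Rank the programs from best to worst agreement with the known structure.
ranked = sorted(toy_predictions, key=lambda name: q3(toy_predictions[name], observed), reverse=True)
for name in ranked:
    print(f"{name}: {q3(toy_predictions[name], observed):.2f}")
```

A per-program score like this makes it easier to state which program shows the most discrepancy, instead of judging by eye from the table.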
These two programs indicated helices at 12-25, 34-42, 63-72 and 91-113 in the sequence; as mentioned before, this was also noted in the assigned protein structure. PREDATOR also displayed a much longer coil, at residues 81-118, than the rest of the combined data. Overall, the results show that all the programs agree on the helix structures except PROF and SSPRO, which were off by a few residues. The validity for the region between amino acids 51 and 87 is weak at best. This illustrates the variation among programs assigned to the same task and how they may differ in their ability to read certain sequences. Overall, the program that showed the most discrepancy was SSPRO. The results of the analysis are not surprising, because the prediction accuracy of these programs is below eighty percent. However, it was hard to determine why the SSPRO result was so far off compared with the other four. This process made it possible to validate the range of the assigned PH4 domain sequence. The original range for this domain was amino acids 841-961; after this analysis, the range was changed to amino acids 860-962.
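Ranges such as those quoted above can be read off a prediction string automatically instead of by hand. The following is a minimal sketch under stated assumptions: the function name `segments` and its `offset` parameter are hypothetical, and the report does not say whether its helix ranges are numbered from the start of the submitted window or in full-length protein numbering, so the offset is left for the reader to choose.

```python
from itertools import groupby


def segments(ss, state, offset=1):
    """Return inclusive (start, end) ranges of consecutive `state` symbols.

    `offset` is the residue number assigned to the first character of `ss`:
    1 gives numbering local to the submitted window, while the window's first
    residue number (for example 841) gives full-length protein numbering.
    """
    found = []
    position = offset
    for symbol, run in groupby(ss):
        length = len(list(run))
        if symbol == state:
            found.append((position, position + length - 1))
        position += length
    return found


# Toy string for illustration only (not the report's prediction).
toy = "CCHHHHCCEEEC"
print(segments(toy, "H"))              # helices in local numbering
print(segments(toy, "H", offset=841))  # the same helices in protein numbering
print(segments(toy, "E"))              # strands in local numbering
```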
• BLAST: Basic Local Alignment Search Tool. 2017 Jun 7. National Center for Biotechnology Information (NP_001035207.1). [accessed 2020 Jul 5]. https://blast.ncbi.nlm.nih.gov/Blast.cgi
• Luo R, Chen PW, Kuo JC, Jenkins L, Jian X, Waterman CM and Randazzo PA. 2007 Mar 2. arf-GAP with Rho-GAP domain, ANK repeat and PH domain-containing prote - Protein - NCBI. National Center for Biotechnology Information. [accessed 2020 Jul 5]. https://www.ncbi.nlm.nih.gov/protein/NP_001035207.1?report=genbank&log$=protalign&blast_rank=1&RID=FZ1Y8Z8W016
• Li H, Sato M, Koshiba S, Watanabe S, Harada T, Kigawa T and Yokoyama S. 2013 Sep 3. SYMPRED Consensus Secondary Structure Prediction. SYMPRED results. [accessed 2020 Jul 5]. http://zeus.few.vu.nl/jobs/9709b06a417d23580bc1183a8777d62b/
• Tomizawa T, Koshiba S, Inoue M, Kigawa T and Yokoyama S. 2003 Apr 9. Chain A, Centaurin-delta 1 - Protein - NCBI. National Center for Biotechnology Information. [accessed 2020 Jul 5]. https://www.ncbi.nlm.nih.gov/protein/Q96P48
• Li H, Sato M, Koshiba S, Watanabe S, Harada T, Kigawa T and Yokoyama S. 2013 Sep 3. SYMPRED Consensus Secondary Structure Prediction. SYMPRED results. [accessed 2020 Jul 5]. http://zeus.few.vu.nl/jobs/39a5a24de63b973a6e217149658b4b18/