Finding sequence similarity between protein sequences causing Hypophosphatemia
Multiple sequence alignment is useful for predicting protein structure, for predicting protein function, and for phylogenetic analysis. The sequences selected for a multiple sequence alignment should be 30 to 70 percent identical [1].
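To make the 30 to 70 percent guideline concrete, here is a minimal sketch of how percent identity between two already aligned sequences can be computed. The function names and the toy sequences are hypothetical, and the choice to ignore gapped columns is an assumption; other tools may define identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns where neither sequence has a gap.

    Assumption: gapped columns are skipped. This is only one of several
    conventions for computing identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def in_useful_range(identity: float, low: float = 30.0, high: float = 70.0) -> bool:
    """True if the identity falls inside the 30 to 70 percent window."""
    return low <= identity <= high


# Toy example (made-up sequences, not real proteins)
identity = percent_identity("ACDE-G", "ACDQWG")
print(identity, in_useful_range(identity))
```

A pair scoring outside the window is either too similar to add information or too divergent to align reliably, which is why such a check can help when picking sequences.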
The following case study shows how to examine the sequence, structural, evolutionary, and functional similarity between closely related sequences of a particular protein.
- Go to the UniProt website http://www.uniprot.org
- Select BLAST.
- Open the options on the BLAST page and select 1000 hits, so that the search returns as many matching sequences as possible.
- Enter the accession number P78562, the hypophosphatemia protein used in this case study, in the query field. The field also accepts a protein or nucleotide sequence in place of an identifier.
- Click on BLAST. This gathers the sequences that are similar to the query.
- You will get around 250 sequences.
- Select 8 of them at random. This case study uses P78562, Q3TYM9, P70669, A2ICR0, O35812, B4DNS0, B4E334 and A2AC80.
- Click on each of these 8 sequences in turn in the results column and get its FASTA format by clicking on the FASTA button. FASTA is a plain-text format in which a header line starting with ">" is followed by the sequence.
- Save the FASTA records for all 8 sequences in one text file named fasta_seq.txt (see attached file).
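The manual download steps above can also be sketched in Python. This is only an illustration: the URL pattern in fasta_url is an assumption about UniProt's REST service and should be checked against the current UniProt documentation, and the helper names (fasta_url, parse_fasta, download_all) are hypothetical. The download function needs network access and is not called automatically.

```python
from urllib.request import urlopen

# The 8 accessions chosen in this case study
ACCESSIONS = ["P78562", "Q3TYM9", "P70669", "A2ICR0",
              "O35812", "B4DNS0", "B4E334", "A2AC80"]


def fasta_url(accession: str) -> str:
    # Assumed URL pattern for a single FASTA record; verify before use.
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def parse_fasta(text: str) -> dict:
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def download_all(accessions, out_path="fasta_seq.txt"):
    """Fetch each record and append it to one file (requires network)."""
    with open(out_path, "w") as out:
        for acc in accessions:
            with urlopen(fasta_url(acc)) as resp:
                out.write(resp.read().decode("utf-8").rstrip() + "\n")


# Offline check with a made-up two-record FASTA text
sample = ">seq1 toy record\nMKV\nLLA\n\n>seq2 toy record\nMRT\n"
print(parse_fasta(sample))
```

Calling download_all(ACCESSIONS) would write all 8 records into fasta_seq.txt, and parse_fasta can then confirm that the file holds 8 sequences before it is submitted for alignment.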




- To use the multiple sequence alignment tool ClustalW, go to www.ebi.ac.uk/clustalw/index.html
- Choose Fast for Alignment and Input for the Output order pull-down menu. Fast gives a quick but approximate alignment; Full gives a slower alignment based on dynamic programming. With Output order set to Input, the sequences are output in their original order; with Aligned, they appear in the order in which they were aligned. Leave the rest of the parameters at their defaults.
- Click on Choose file and give the path of the fasta_seq.txt file, or paste the sequences, then click on the Run button.
- The result is displayed in three sections: Scores table, Alignment, and Guide tree. Your multiple sequence alignment appears under the Alignment heading.
- View the multiple sequence alignment, or click on View alignment file and save it.
- Click on Show colors to display the alignment columns in color.
- Analyze the amino acids in each column with the help of the following symbols: “*” indicates a fully conserved column; “:” indicates a column in which all the residues have roughly the same size and hydropathy; “.” indicates a column in which size or hydropathy has been preserved in the course of evolution.
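The meaning of the three symbols can be illustrated with a short sketch that derives a symbol line from aligned sequences. This is not ClustalW's own code. The residue groups below are assumed to match the "strong" and "weak" groups listed in the Clustal documentation and should be verified there; leaving gapped columns unmarked is also an assumption, and the input sequences are made up.

```python
# Assumed "strong" groups (symbol ":"); verify against the Clustal documentation.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
# Assumed "weak" groups (symbol "."); verify against the Clustal documentation.
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column: str) -> str:
    """Return '*', ':', '.' or ' ' for one alignment column."""
    residues = set(column.upper())
    if "-" in residues:
        return " "  # assumption: columns with gaps get no symbol
    if len(residues) == 1:
        return "*"  # every sequence has the same residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall within one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall within one weak group
    return " "


def consensus_line(aligned: list) -> str:
    """Build the symbol line for a list of equal-length aligned sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    return "".join(
        column_symbol("".join(seq[i] for seq in aligned))
        for i in range(length)
    )


# Toy alignment (made-up sequences)
toy = ["MKSCW", "MRTSK"]
for seq in toy:
    print(seq)
print(consensus_line(toy))
```

Reading the symbol line under the alignment saved from ClustalW in the same way shows at a glance which positions are identical in all 8 sequences and which have only changed to similar residues.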




References
[1] Claverie JM, Notredame C. Bioinformatics For Dummies, 2nd Edition. December 2006.
Contributed by: Shahina Hayat