Solution Preview
2. To obtain the distance matrix for a set of nucleotide sequences, one needs a pairwise alignment score for every pair of sequences. The problem is that relating sequences through independent pairwise alignments can be very unreliable, since the sequences are likely to have diverged differently from the ancestral gene (for example, each may have retained different fragments of the original gene). Thus, to produce a reasonable distance matrix, one first needs to build a multiple sequence alignment (MSA), likely based on pairwise alignments. That is, the pairwise alignments must relate to the consensus sequence produced by the MSA, so that each sequence is considered against the ancestor sequence.
Each sequence pair can then be aligned again, with each new pairwise global alignment score taken as the distance metric from which the distance matrix is built.
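As a rough illustration of how pairwise scores fill a distance matrix, the sketch below loops over all pairs and stores each distance symmetrically. The names `distance_matrix` and `mismatch_count` are hypothetical, and `mismatch_count` is only a placeholder metric that assumes the sequences are rows taken from an existing MSA (equal length, gaps written as `-`); any other pairwise distance function could be passed in instead.

```python
from itertools import combinations


def mismatch_count(a, b):
    """Placeholder distance (an assumption, not the solution's metric):
    number of alignment columns in which two equal-length rows differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def distance_matrix(seqs, dist=mismatch_count):
    """Build a symmetric n x n matrix of pairwise distances."""
    n = len(seqs)
    matrix = [[0.0] * n for _ in range(n)]
    # Each unordered pair is scored once and mirrored across the diagonal.
    for i, j in combinations(range(n), 2):
        d = float(dist(seqs[i], seqs[j]))
        matrix[i][j] = d
        matrix[j][i] = d
    return matrix


if __name__ == "__main__":
    # Toy rows of an MSA, invented purely for illustration.
    rows = ["ACGT-A", "ACGTTA", "AC-TTA"]
    for line in distance_matrix(rows):
        print(line)
```

The diagonal stays at zero because a sequence has no distance to itself, and only n(n-1)/2 comparisons are needed since the matrix is symmetric.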
The scoring used is edit distance, commonly computed by dynamic programming over a global alignment, with typical penalty scores and (potentially) given substitution matrices (e.g. PAM, BLOSUM)....
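A minimal sketch of edit distance computed by global-alignment dynamic programming is given below. It assumes unit costs by default (mismatch 1, gap 1); the function name `global_edit_distance` and its parameters are hypothetical, and a substitution matrix would replace the simple match/mismatch test with a per-pair lookup.

```python
def global_edit_distance(a, b, mismatch=1, gap=1):
    """Minimum total cost of globally aligning a and b.

    Costs are assumptions for illustration: 0 for a match, `mismatch`
    for a substitution, `gap` for each inserted or deleted character.
    """
    # prev[j] = cost of aligning the first i-1 characters of a
    # with the first j characters of b; row 0 is all gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j, cb in enumerate(b, start=1):
            substitute = prev[j - 1] + (0 if ca == cb else mismatch)
            delete = prev[j] + gap       # ca aligned to a gap
            insert = curr[j - 1] + gap   # cb aligned to a gap
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_edit_distance("GATTACA", "GATACA"))
```

Only two rows of the table are kept at a time, which is enough when just the final score is needed; recovering the alignment itself would require storing the full table or traceback pointers.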