Question

Transcribed Text

1. Given the distance matrix below, perform clustering using UPGMA.

        A   B   C   D   E
    A   0   2  10  10  10
    B   2   0  10  10  10
    C  10  10   0   2   6
    D  10  10   2   0   6
    E  10  10   6   6   0

2. Given a set of nucleotide sequences, how would one obtain the distance matrix, used to do cluster analysis?

3. The distance (similarity) function used for sequence analysis must be a metric. What are the properties of a metric? Why is this important?

4. Dynamic programming can be used to determine the genetic distance between two species. Chromosomal rearrangements can also be used to compare similarity between different organisms. When would this be likely to be useful?

5. Align the two sequences AGTC and AGC using the dynamic programming algorithm with a match score of 1 and a mismatch score of 2.

Solution Preview


2. To obtain the distance matrix for a set of nucleotide sequences, one needs a pairwise alignment score for every pair of sequences.
The problem is that individual pairwise alignments can be very unreliable, since the sequences are likely to have diverged differently from the ancestral gene (for example, each may have retained different fragments of the original gene). Thus, to produce a reasonable distance matrix, one first needs to build a multiple sequence alignment (MSA), likely based on pairwise alignments. The pairwise alignments must then relate to the consensus sequence produced by the MSA, so that each sequence is considered against the ancestral sequence.
Each sequence pair can then be aligned again, with each new pairwise global alignment score taken as the distance on which the distance matrix is built.
The scoring used is edit distance, commonly computed by dynamic programming over a global alignment, with typical penalty scores and (potentially) given substitution matrices (e.g. PAM, BLOSUM)....
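
As a rough illustration of the last step, the sketch below computes an edit distance by global-alignment dynamic programming and fills a symmetric distance matrix from all pairs. It is a minimal sketch under stated assumptions, not the method of the full solution: the unit mismatch and gap costs, the function names `edit_distance` and `distance_matrix`, and the third example sequence `ACTC` are all hypothetical choices made here, and the MSA step described above is skipped.

```python
from itertools import combinations


def edit_distance(a: str, b: str, mismatch: int = 1, gap: int = 1) -> int:
    """Global-alignment distance between a and b by dynamic programming.

    Assumed costs (illustrative only): match 0, mismatch 1, gap 1.
    """
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b; row 0 is all gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # first i characters of a against an empty prefix of b
        for j in range(1, len(b) + 1):
            substitute = prev[j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch)
            delete = prev[j] + gap      # gap in b
            insert = curr[j - 1] + gap  # gap in a
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


def distance_matrix(seqs: list[str]) -> list[list[int]]:
    """Symmetric matrix of pairwise edit distances, zero on the diagonal."""
    n = len(seqs)
    d = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = edit_distance(seqs[i], seqs[j])
    return d


if __name__ == "__main__":
    # AGTC and AGC come from question 5; ACTC is a made-up third sequence.
    for row in distance_matrix(["AGTC", "AGC", "ACTC"]):
        print(row)
```

With these assumed unit costs, `edit_distance("AGTC", "AGC")` is 1 (a single gap opposite the T). A matrix built this way could then be passed to a clustering method such as UPGMA.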
