Question

Write a program that opens a BLASTN (nucleotide-to-nucleotide search) output file, parses out specific information, and writes formatted output to STDOUT (i.e. standard output; the terminal window / command line).
Your program should start by opening the input file (you may hardcode the filename in this case), parsing and storing both the query sequence ID (from near the top of the file; look for the string following “Query=”) and the query length (found just below the query ID line), and displaying them both to STDOUT. Add some additional characters and formatting to your output such that these two fields appear exactly like this in STDOUT:
Query ID: IREALLYLIKEPYTHON
Query Length: 15
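
A minimal sketch of this first step, assuming the BLAST+ text layout in which a `Query= <id>` line is followed shortly by a `Length=<n>` line. The function name `parse_query_header` and the inline `SAMPLE` text are hypothetical stand-ins; a real run would read the report from the input file instead.

```python
import re

# Hypothetical sample standing in for the top of a BLASTN report.
SAMPLE = """BLASTN 2.2.28+

Query= IREALLYLIKEPYTHON

Length=15
"""

def parse_query_header(text):
    """Return (query_id, query_length) from the top of a BLASTN report."""
    # Capture the rest of the "Query=" line as the ID.
    id_match = re.search(r"^Query=\s*(\S.*?)\s*$", text, re.MULTILINE)
    # Assumption: the first "Length=" line in the file belongs to the query.
    len_match = re.search(r"^Length\s*=\s*(\d+)", text, re.MULTILINE)
    if id_match is None or len_match is None:
        raise ValueError("query header not found")
    return id_match.group(1), int(len_match.group(1))

query_id, query_length = parse_query_header(SAMPLE)
print(f"Query ID: {query_id}")
print(f"Query Length: {query_length}")
```

Using `re.search` rather than `re.findall` here returns only the first match, which is the one that describes the query.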

Then, it is time to parse information about the significant alignments for this query. Each alignment begins with the “>” symbol. For just the first ten hits, parse out only the accession (located between the first set of pipe symbols, | |), length, and score. For each of these hits, these three fields should then be written to STDOUT in exactly this format, including capitalization, spacing, and punctuation (as shown here using the real values for the first hit; study the file to understand exactly where these values came from):
Alignment #1: Accession = ref|XM_005094338.1| (Length = 2377, Score = 1098)
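
One way to sketch the alignment step, assuming each hit looks like a `>` definition line followed by `Length=` and then a ` Score = ... bits` line, as in BLAST+ text output. The names `HIT_RE` and `format_hits` are hypothetical, the descriptions in `SAMPLE` are placeholders, and the second hit is invented purely for illustration. The sketch keeps the database tag and both pipes in the accession because the sample line above prints `ref|XM_005094338.1|`.

```python
import re

# Hypothetical two-hit excerpt; the second hit's values are made up.
SAMPLE = """> ref|XM_005094338.1| PREDICTED: example description
Length=2377

 Score = 1098 bits (594),  Expect = 0.0

> ref|XM_000000001.1| hypothetical second hit
Length=1200

 Score = 87.9 bits (47),  Expect = 2e-15
"""

# One pattern per alignment: accession token, then Length, then the first Score.
HIT_RE = re.compile(
    r"^>\s*([^|\s]+\|[^|\s]+\|)"   # group 1: e.g. ref|XM_005094338.1|
    r".*?^Length\s*=\s*(\d+)"       # group 2: subject length
    r".*?Score\s*=\s*([\d.]+)",     # group 3: bit score as printed
    re.MULTILINE | re.DOTALL,
)

def format_hits(text, limit=10):
    """Return formatted lines for at most `limit` alignments."""
    lines = []
    for number, m in enumerate(HIT_RE.finditer(text), start=1):
        if number > limit:
            break
        lines.append(
            f"Alignment #{number}: Accession = {m.group(1)} "
            f"(Length = {m.group(2)}, Score = {m.group(3)})"
        )
    return lines

for line in format_hits(SAMPLE):
    print(line)
```

Because the pattern stops at the first `Score` after each `>` line, a hit with several scored segments still yields a single output line.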

You must use regular expressions to pull out precisely the parts of the file that you want, which is the definition of parsing. Hint: you will very likely need to use parentheses to put some parts of those expressions into temporary memory (m.group(1), etc.) for later use. Do not have your regular expression search for hardcoded values; your program should be able to read another BLASTN output file and run successfully, not just this specific one. Pay careful attention to the exact appearance of the sample output, above. Although it is a good start to be able to, at a minimum, report the requested values, your program must also strive to match the formats specified.
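
To illustrate the hint about parentheses, here is a small sketch showing how a capturing group separates the value from the label around it, and how one pattern works on different files because it describes the shape of the line rather than its contents. `QUERY_RE`, `report_a`, and `report_b` are hypothetical names, and the second query ID is invented.

```python
import re

# The pattern describes the shape of the line, not a specific ID.
QUERY_RE = re.compile(r"^Query=\s*(\S+)", re.MULTILINE)

# Two hypothetical report headers with different query IDs.
report_a = "BLASTN 2.2.28+\n\nQuery= IREALLYLIKEPYTHON\n\nLength=15\n"
report_b = "BLASTN 2.2.28+\n\nQuery= another_query_01\n\nLength=240\n"

for report in (report_a, report_b):
    m = QUERY_RE.search(report)
    if m:  # search() returns None when nothing matches
        print(m.group(0))  # the whole match, label included
        print(m.group(1))  # only the parenthesized part
```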
Example blastn layout below.

Solution Preview


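# import Python's regular expression module
import re

# open the input file and read its whole contents into one string
with open("input.txt") as file:
    txt = file.read()

# parse the query sequence ID (the text following "Query= ")
query_id = re.findall(r"\bQuery= (.*)", txt)
# parse every "Length=" value (typically the query first, then one per alignment)
query_length = re.findall(r"\bLength=(.*)", txt)
# parse accessions: the first two pipe-delimited fields after ">", without the closing pipe
accessions = re.findall(r"^>\s*([^|\s]+\|[^|\s]+)\|", txt, re.MULTILINE)
...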
