Your task is to write a Python program that opens and reads a very large text file.
The program prompts the user to enter the file name.
The program then computes the following language statistics based on the contents of the file:
1. The longest word used in the file. If there is more than one, just print one of the longest. There is no need to find all the longest words.
2. The five most common words in the file with the number of times they appear in the file.
3. The word count of all the words in the file, sorted alphabetically – this last output has to be written to a file in the current working directory with the name ‘out.txt’. Open this file for writing in ‘w’ mode.
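The three outputs listed above can be sketched as follows, assuming the word counts have already been collected into a dictionary that maps each word to its count. The function names and the `word: count` line format written to the output file are illustrative assumptions; the assignment does not specify them.

```python
from collections import Counter


def longest_word(counts):
    """Return one of the longest words among the keys of counts."""
    return max(counts, key=len)


def most_common_words(counts, n=5):
    """Return the n most frequent (word, count) pairs, most frequent first."""
    return Counter(counts).most_common(n)


def write_counts(counts, out_name="out.txt"):
    """Write one 'word: count' line per word, sorted alphabetically."""
    # 'w' mode creates the file or overwrites an existing one
    with open(out_name, "w", encoding="utf-8") as out_file:
        for word in sorted(counts):
            out_file.write(f"{word}: {counts[word]}\n")
```

When several words share the maximum length, `max` returns the first one it encounters, which satisfies the "just print one of the longest" requirement.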
Your program must work with any input text file that uses 'UTF-8' encoding. The program must also be efficient, computing statistics for a large file (pride.txt) in seconds.
Make sure that you include docstrings and comments where necessary, and follow the general Python style guidelines: no long lines in your code, pythonic variable and function names, etc.
Use functions to structure your code.
You don’t have to worry about validating the input at this point. Assume that the file name provided by the user exists.
Assume that the file is really large, so you don’t want to read it all at once. However, you should open the file only once and read each line only once.
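One way to meet the "open once, read each line once" requirement is to iterate over the file object directly, which yields one line at a time instead of loading the whole file into memory. The sketch below is only an illustration: the function name `count_tokens_by_line` is hypothetical, and it counts raw whitespace-separated tokens without any of the cleaning the assignment asks for.

```python
from collections import Counter


def count_tokens_by_line(filename):
    """Count whitespace-separated tokens, reading one line at a time."""
    counts = Counter()
    # the file is opened exactly once, with the required UTF-8 encoding
    with open(filename, encoding="utf-8") as text_file:
        for line in text_file:  # the file object yields lines lazily
            counts.update(line.split())
    return counts
```

Because each line is split and counted as soon as it is read, memory use grows with the number of distinct tokens, not with the size of the file.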
Make sure that you ignore capitalization when you process the file. So ‘This’ and ‘this’ should be counted as the same word.
Also make sure that you strip leading and trailing punctuation characters and numbers from your words. ‘here’ and ‘here.’ should be counted as the same word, as should ‘Hi’ and ‘Hi!’.
Punctuation characters inside words should be kept so that hyphenated words such as 'arm-in-arm' and contractions such as "don't" are left intact.
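The normalization rules above (ignore case, strip punctuation and digits from both ends, keep punctuation inside the word) map naturally onto `str.lower` and `str.strip`. This is a minimal sketch under stated assumptions: the names `STRIP_CHARS` and `clean_word` are hypothetical, and `string.punctuation` covers ASCII punctuation only, so typographic quotes or dashes would have to be added to the strip set for texts that use them.

```python
import string

# characters removed from both ends of a token (ASCII only; an assumption)
STRIP_CHARS = string.punctuation + string.digits


def clean_word(token):
    """Lower-case token and strip leading/trailing punctuation and digits."""
    # str.strip only removes characters at the ends, so hyphens and
    # apostrophes inside the word are left intact
    return token.lower().strip(STRIP_CHARS)
```

A token that consists only of punctuation or digits cleans to an empty string, so a caller would probably want to skip empty results before counting.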

Solution Preview

    input_count (dict): represents words and their corresponding counts.
    min_length (int): optional - defaults to 0.
         minimum length of the words that will appear
         in the cloud representation.
    Only the 20 most common words (that satisfy the minimum length criteria)
    are included in the generated cloud.
    """
    root = tkinter.Tk()
    root.title("Word Cloud Fun")
    # filter the dictionary by word length
    filter_count = {
        word: input_count[word] for word in input_count
        if len(word) >= min_length}
    # scale font sizes so the most common word is drawn at size 100
    max_count = max(filter_count.values())
    ratio = 100 / max_count
    frame = tkinter.Frame(root)
    # keep the 20 most common words, most frequent first
    top_words = sorted(filter_count, key=filter_count.get, reverse=True)[:20]
    # lay the labels out in a grid of 5 rows, filling column by column
    for index, word in enumerate(top_words):
        # random three-digit hex color between '#100' and '#fff'
        color = '#' + hex(random.randint(256, 4095))[2:]
        word_font = tkinter.font.Font(size=int(filter_count[word] * ratio))
        label = tkinter.Label(frame, text=word, font=word_font, fg=color)
        label.grid(row=index % 5, column=index // 5)

def count_words(filename):
    """
    Method to construct dictionary of word counts in a file

    filename (string): name of file to open and read
    word_dict (dictionary): dictionary of words and counts
    """
    # build and return the dictionary for the given filename
    word_dict = {}
    f = open(filename...
