QuestionQuestion

Part I
Programming tasks

Task 1. Declare a class WordNode to store a word and its frequency. You need store the words in a binary tree, so this class will have two pointers to the left and right WordNode. Then, implement a constructor for class WordNode with four parameters for its fields. The default value for the frequency is 1 and those of the left and right pointers are NULL. The prototype is: WordNode(int word, int freq = 1, WordNode *left = NULL, WordNode *right = NULL)
Then, declare a class WordTree to represent binary search trees for words. This class should have a WordNode* pointer for the root of the tree.

Task 2. Implement a member function of class WordTree to insert a word into the corresponding binary search tree sorted by words. Its prototype is: void insert(int word)

Task 3. Implement a member function class WordTree to insert a word into the corresponding binary sorted (descendingly) by frequencies and then (increasingly) by words. Its prototype is: void insert(int word, int freq)

Task 4. Implement a member function of class WordTree to traverse this tree to build another binary search tree (sorted by frequencies). Its prototype is: void copyTo(WordTree &tree)
This function copies words and their frequencies from the current tree to the binary search tree referred by 'tree'. It traverses all nodes of the current tree in the pre-order mode and insert its word and frequency into 'tree' using its 'insert' function implemented in Task 3.

Task 5. Implement a member function of class WordTree to read the words in an input file one-by-one and insert them into the tree. Its prototype is: void load(string filename)

Task 6. Implement a member function of class WordTree to write the words and frequencies to an output file in desending order. The prototype is: void save(string filename)
To write the frequencies by descending order, you need to traverse the tree using the in-order mode.

Task 7. Implement your main function with the following steps:
Initialize two WordTree objects named 'treeWord' and named 'treeFreq'.
Call function 'load' in Task 5 to read input words and store them in 'treeWord'.
After reading, call function 'copyTo' in Task 4 to copy words from 'treeWord' to 'treeFreq'.
Call the 'save' function from Task 6 to write the words and frequencies stored in 'treeFreq' to the output file.

Task 8. Test your program with a large input file with at least 10 million words (you could also use the input file given in Assignment 1). Compute the running time and compare the result to the best result you got in Assignment 1 with the same input file. Report the size and height of 'treeWord' and 'treeFreq'.

Part II
Problem description
Assume that you are working for Google and you want to write a program to analyze the trends of users' queries, e.g. what are the most frequent search terms. This program accepts input as a text file named input.txt containing search terms, each term is on a single line. Its output is a text file named output.txt which lists all distinct terms in the input file with their corresponding frequencies sorted decreasingly. Each term and its frequency are written on one line. To make the processing more efficient, in the input and output files, a search term is encoded as an unique, integral ID rather than its character sequence.

Example:
input.txt
----------
0
1
2
0
3
2
0
output.txt
------------
0 3
2 2
1 1
3 1

Since Google receives billions of queries each day, your program should run as fast and use as little memory as possible. In this assignment, your program is expected to be able to process an input file with a million (1,000,000) lines and ten thousand (10,000) distinct search terms within 10 minutes.
The following solution is potentially able to achieve that requirement. First, you will store all the distinct search terms and their frequencies in an array sorted by those terms. When you read a search term from the input file, you will use the binary search algorithm to look for that term in that array. If you find the term, just increase its frequency. If it is not there, you will insert it into the array at its proper location to keep the array sorted. After reading the input file, you sort the array, in descending order, by frequency and write its elements to the output file.

Programming tasks
Now, design and implement the aforementioned ideas with the following instructions.

Task 1. Declare a class TermTable to store all distinct search terms and their frequencies in two arrays of 10,000 elements: one for the terms (i.e., term array) and one for their corresponding frequencies (i.e., frequency array). This class also has a field (e.g., a data member) to store the current number of terms stored in those arrays. The default constructor assigns 0 to that field and to all elements of the two arrays.

Task 2. Implement a “binarySearch” method in class TermTable. This method uses the binary search algorithm to look for a search term in the term array.

Task 3. Implement an “insert” method in class TermTable to insert a new term to its two arrays (suggest to change it to the following: its term array and insert its count (e.g., 1) to the corresponding frequency array). In other words, this method increases the current number of terms, assigns the given term to the corresponding location in the term array, assigns 1 to the corresponding location in the frequency array, and shifts the elements after the inserted locations to their proper locations (e.g., the next locations) to keep all terms sorted.

Task 4. Implement a “sort” method in class TermTable to sort the term and frequency arrays descendingly by the frequencies.

Task 5. Implement the main function to read all terms in the input file, process them (i.e. search each term in the TermTable, update its frequency if found or insert it if not found), and write the distinct terms and their frequencies sorted in the descending order to the output.

To read and write files, you could use the following code snippets:
Reading file:
ifstream fin("input.txt");
while (fin >> term) {
// do something with term here
}
fin.close();
Writing file:
ofstream fout("output.txt");
// you could use the operator << to write a value to fout, for example: fout << x
fout.close();

Solution PreviewSolution Preview

This material may consist of step-by-step explanations on how to solve a problem or examples of proper writing, including the use of citations, references, bibliographies, and formatting. This material is made available for the sole purpose of studying and learning - misuse is strictly forbidden.

#include <cstdlib>
#include <ctime>
#include <string>
#include <fstream>
#include <iostream>
using namespace std;

class WordNode {
public:   
    WordNode(int word, int freq = 1, WordNode *left = NULL, WordNode *right = NULL);   
    virtual ~WordNode();
    int getWord();
    int getFreq();
    WordNode * getLeft();
    WordNode * getRight();
    void increaseFeq();
    void setLeft(WordNode * n);
    void setRight(WordNode * n);
   
private:
    int word;
    int freq;
    WordNode * left;
    WordNode * right;
};

/**
* constructor
* @param word
* @param freq
* @param left
* @param right
*/
WordNode::WordNode(int word, int freq , WordNode *left , WordNode *right ) {
    this->word = word;
    this->freq = freq;
    this->left = left;
    this->right = right;
}...
$58.00 for this solution

PayPal, G Pay, ApplePay, Amazon Pay, and all major credit cards accepted.

Find A Tutor

View available C-Family Programming Tutors

Get College Homework Help.

Are you sure you don't want to upload any files?

Fast tutor response requires as much info as possible.

Decision:
Upload a file
Continue without uploading

SUBMIT YOUR HOMEWORK
We couldn't find that subject.
Please select the best match from the list below.

We'll send you an email right away. If it's not in your inbox, check your spam folder.

  • 1
  • 2
  • 3
Live Chats