Question

There is a Gold Standard dataset that represents the "correct" annotations. To evaluate whether the Machine can perform as well as humans, we measured the similarity between the Machine output and the Gold Standard output using the Jaccard and Resnik metrics. We did the same for the Human output against the Gold Standard output. These semantic similarity scores are listed in SemanticSimilarityScores.tsv.

Your goal is to conduct hypothesis testing to compare Machine performance to Human performance for the two semantic similarity metrics. To do this, you will be comparing Column 2 (SimJ Score MACHINE) to Column 4 (SimJ Score HUMAN), and similarly, Column 3 (NIC Score MACHINE) to Column 5 (NIC Score HUMAN).

1. Formulate a null hypothesis
2. Pick an appropriate statistical test
3. Conduct the statistical test
4. Report the statistic and a p-value
5. Report your conclusion based on the result of the test.
Judging from the notes, I think a t-test is expected, but I am not sure. The notes cover other tests as well, so you may also consider which one fits best.
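
Which t-test fits depends on whether the Machine and Human scores are matched row by row. Here is a minimal sketch of both statistics in plain Java, assuming each row of the file scores the same item for Machine and Human (the question does not state this, so treat it as an assumption). The class and method names are hypothetical, and the sample values in `main` are made up for illustration. The sketch only computes the statistic. For the p-value, Apache Commons Math (already imported by the solution preview) offers `TestUtils.pairedTTest(double[], double[])` and `TestUtils.tTest(double[], double[])`, which return two-sided p-values.

```java
import java.util.Arrays;

final class TStatisticSketch {

    // Sample mean of the values.
    static double mean(double[] x) {
        return Arrays.stream(x).average().orElse(Double.NaN);
    }

    // Unbiased sample variance (divides by n - 1).
    static double variance(double[] x) {
        double m = mean(x);
        double sumSquares = 0.0;
        for (double v : x) {
            sumSquares += (v - m) * (v - m);
        }
        return sumSquares / (x.length - 1);
    }

    // Paired t statistic: mean of the row-wise differences divided by
    // its standard error. Degrees of freedom are n - 1.
    static double pairedT(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("paired samples must have equal length");
        }
        double[] diff = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            diff[i] = a[i] - b[i];
        }
        return mean(diff) / Math.sqrt(variance(diff) / diff.length);
    }

    // Welch's unpaired t statistic: difference of means divided by the
    // combined standard error, without assuming equal variances.
    static double welchT(double[] a, double[] b) {
        double se = Math.sqrt(variance(a) / a.length + variance(b) / b.length);
        return (mean(a) - mean(b)) / se;
    }

    public static void main(String[] args) {
        // Made-up scores for illustration; NOT taken from SemanticSimilarityScores.tsv.
        double[] machine = {0.50, 0.60, 0.70, 0.80};
        double[] human = {0.60, 0.65, 0.85, 0.90};
        System.out.printf("paired t = %.4f (df = %d)%n",
                pairedT(machine, human), machine.length - 1);
        System.out.printf("Welch t  = %.4f%n", welchT(machine, human));
    }
}
```

If the rows are matched, the paired statistic is usually the more sensitive choice because it removes item-to-item variation. If they are not matched, the unpaired form applies.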
The program should also read its input from the file and produce two outputs, which are written to a text file.
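
The file handling could look roughly like the sketch below. It assumes the file is tab-separated, that the first line is a header, and that the four score columns are numeric. None of this is confirmed by the question beyond the .tsv extension and the column numbers. The class and method names are hypothetical, the output name `ttest.txt` is borrowed from the `OUTPUTFILE` constant in the solution preview, and the report lines are placeholders where the statistic and p-value for each metric would go.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ScoreFileSketch {

    // Extracts one 1-based column from tab-separated lines.
    // Assumes line 0 is a header; blank lines are skipped.
    static double[] parseColumn(List<String> lines, int column) {
        List<Double> values = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.trim().isEmpty()) {
                continue;
            }
            String[] fields = line.split("\t");
            values.add(Double.parseDouble(fields[column - 1].trim()));
        }
        return values.stream().mapToDouble(Double::doubleValue).toArray();
    }

    // Reads the whole file and extracts the requested column.
    static double[] readColumn(Path file, int column) throws IOException {
        return parseColumn(Files.readAllLines(file), column);
    }

    public static void main(String[] args) throws IOException {
        Path input = Paths.get(args.length > 0 ? args[0] : "SemanticSimilarityScores.tsv");

        // Column pairs named in the question: 2 vs 4 (SimJ) and 3 vs 5 (NIC).
        double[] simJMachine = readColumn(input, 2);
        double[] simJHuman = readColumn(input, 4);
        double[] nicMachine = readColumn(input, 3);
        double[] nicHuman = readColumn(input, 5);

        // Placeholder report: one line per metric. A real run would put
        // the test statistic and p-value here instead of the sample sizes.
        List<String> report = Arrays.asList(
                "SimJ: machine n=" + simJMachine.length + ", human n=" + simJHuman.length,
                "NIC: machine n=" + nicMachine.length + ", human n=" + nicHuman.length);
        Files.write(Paths.get("ttest.txt"), report);
    }
}
```

Keeping the parsing in `parseColumn` separate from the file access makes it easy to check the column logic on a few in-memory lines before running it on the real data.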

Solution Preview

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.math3.stat.inference.TestUtils;

/**
* This program reads data from the specified file
* and conducts a two-tailed unpaired t-test to check whether the mean scores
* of the specified samples are the same.
*/
public class TTest {
   
    // Declaring member variables.
    private List<Double> simJMachine;
    private List<Double> simJHuman;
    private List<Double> nicMachine;
    private List<Double> nicHuman;
   
    private final String OUTPUTFILE = "ttest.txt";
   
    private FileWriter fileWriter;
               
    /**
    * Constructor of the TTest class
    */
    public TTest(){
       simJMachine = new ArrayList<>();
       simJHuman = new ArrayList<>();
       nicMachine = new ArrayList<>();
       nicHuman = new ArrayList<>();
    }
   
    /**
    * This method conducts the t-test under the following hypotheses.
    *
    * Null Hypothesis, H0: The SimJ scores of Machine and Human are drawn
    * from populations with the same mean.
    *
    * Alternative Hypothesis, H1: The SimJ scores of Machine and Human are
    * drawn from populations with different means.
    */
    public void conductTTestSimJ(){
      
       try{
       double[] simJMAry = new double[simJMachine.size()];
       double[] simJHAry = new double[simJHuman.size()];...
