Question

Problem 1
Explore realdirect.com, thinking about how buyers and sellers would navigate the site and how it is organized. Use the datasets provided for the Bronx, Brooklyn, Manhattan, Queens, and Staten Island. Do the following:

-Load in and clean the data.
-Conduct exploratory data analysis in order to find out where there are outliers or missing values, decide how you will treat them, make sure the dates are formatted correctly, make sure values you think are numerical are being treated as such, etc.
-Conduct exploratory data analysis to visualize and make comparisons for residential building category classes across boroughs and across time (select the following: 1-, 2-, and 3-family homes, coops, and condos). Use histograms, boxplots, scatterplots or other visual graphs. Provide summary statistics along with your conclusions.
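The cleaning steps above can be sketched in base R roughly as follows. This is a minimal illustration, not the intended solution: the column names (sale_price, sale_date, building_class_category), the category labels, and the toy rows are all assumptions made up for the example, and the real borough files may be named and formatted differently.

```r
# Toy stand-in for one borough file. Column names, category labels and
# values are assumptions for illustration only.
raw <- data.frame(
  sale_price = c("$450,000", "0", "$1,200,000", "-", "$610,500"),
  sale_date  = c("2012-08-01", "2012-09-15", "2013-01-20",
                 "2013-02-02", "2013-03-30"),
  building_class_category = c("01 ONE FAMILY HOMES",
                              "02 TWO FAMILY HOMES",
                              "13 CONDOS - ELEVATOR APARTMENTS",
                              "10 COOPS - ELEVATOR APARTMENTS",
                              "21 OFFICE BUILDINGS"),
  stringsAsFactors = FALSE
)

clean_sales <- function(df) {
  # Strip currency symbols and thousands separators, then convert to numeric.
  df$sale_price <- as.numeric(gsub("[^0-9.]", "", df$sale_price))
  # Parse dates explicitly so a wrong format shows up as NA rather than text.
  df$sale_date <- as.Date(df$sale_date, format = "%Y-%m-%d")
  # One possible treatment: regard non-positive prices as missing.
  df$sale_price[!is.na(df$sale_price) & df$sale_price <= 0] <- NA
  # Keep only the residential categories named in the assignment.
  keep <- grepl("ONE FAMILY|TWO FAMILY|THREE FAMILY|COOPS|CONDOS",
                df$building_class_category)
  df[keep & !is.na(df$sale_price), ]
}

cleaned <- clean_sales(raw)
summary(cleaned$sale_price)
```

Once every borough has been cleaned this way, a call such as boxplot(sale_price ~ building_class_category, data = cleaned) gives a first comparison across categories. Whether zero-price sales should be dropped or studied separately is a judgment call to justify in the write-up.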

Problem 2
The provided datasets nyt1.csv, nyt2.csv, and nyt3.csv represent three (simulated) days of ads shown and clicks recorded on the New York Times homepage. Each row represents a single user. There are 5 columns: age, gender (0=female, 1=male), number of impressions, number of clicks, and logged-in. Use R to handle this data. Perform some exploratory data analysis:

Create a new variable, age_group, that categorizes users as "<20", "20-29", "30-39", "40-49", "50-59", "60-69", and "70+".
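One way to build age_group is with base R's cut(). The sketch below uses a made-up vector of ages rather than the real data. It sets right = FALSE so that each bin includes its lower bound, which places a 20-year-old in "20-29" rather than "<20".

```r
# Illustrative ages only; in practice this would be the age column
# of the loaded data frame.
age <- c(0, 15, 19, 20, 29, 30, 45, 59, 60, 69, 70, 95)

age_group <- cut(
  age,
  breaks = c(-Inf, 20, 30, 40, 50, 60, 70, Inf),
  labels = c("<20", "20-29", "30-39", "40-49", "50-59", "60-69", "70+"),
  right  = FALSE  # intervals are [lower, upper)
)

# Count of users per age group
table(age_group)
```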

For each day:
-Plot the distribution of the number of impressions and the click-through rate (CTR = #clicks / #impressions) for these age categories.
-Define a new variable to segment or categorize users based on their click behavior.
-Explore the data and make visual and quantitative comparisons across user segments/demographics (<20-year-old males versus <20-year-old females or logged-in versus not, for example).
-Extend your analysis across days. Visualize some metrics and distributions over time.
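The CTR definition and the click-behavior segment can be sketched as below. The column names (Impressions, Clicks), the segment labels, and the toy rows are assumptions for illustration. CTR is left as NA when a user saw no impressions, since the ratio is undefined there.

```r
# Toy data for two days; column names and values are assumed.
day1 <- data.frame(Impressions = c(5, 0, 4, 10), Clicks = c(1, 0, 0, 3))
day2 <- data.frame(Impressions = c(2, 4),        Clicks = c(1, 1))

add_ctr <- function(df) {
  # CTR = clicks / impressions, undefined (NA) when there are no impressions.
  df$ctr <- ifelse(df$Impressions > 0, df$Clicks / df$Impressions, NA)
  # A hypothetical three-way segmentation by click behavior.
  df$click_segment <- ifelse(df$Impressions == 0, "no_impressions",
                      ifelse(df$Clicks == 0, "no_clicks", "clicker"))
  df
}

# Stack the days with a label so metrics can be compared over time.
days <- list(day1 = day1, day2 = day2)
all_days <- do.call(rbind, lapply(names(days), function(d) {
  cbind(day = d, add_ctr(days[[d]]))
}))

# Mean per-user CTR by day; rows with NA CTR are dropped by aggregate().
daily_ctr <- aggregate(ctr ~ day, data = all_days, FUN = mean)
daily_ctr
```

Averaging per-user CTR is only one choice of daily metric. The ratio of total clicks to total impressions weights heavy viewers more and may tell a different story, so it is worth reporting both.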
