Questions to answer about the Kaggle dataset:

1. How many earthquake events are there in the training data set?
2. What is the average value of the acoustic sensor around earthquake events in the training data? You may consider averaging about 100 numbers (could be fewer or more; use your judgement) around each event.
3. What is the time difference between consecutive earthquake events in the training data? Show the output in the form "time between event 1 and 2 = xy seconds, between events 2 and 3 = zw seconds", etc.
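
The three questions can be approached along the following lines. This is only a minimal sketch under stated assumptions, not the expected solution: it assumes the training data has the columns `acoustic_data` and `time_to_failure` (names not given in the assignment text), and that an earthquake event is the point where the `time_to_failure` countdown jumps back up. The function names and the toy values are hypothetical.

```python
import numpy as np
import pandas as pd


def find_event_rows(df):
    """Row positions of the last sample before each countdown reset."""
    ttf = df["time_to_failure"].to_numpy()
    # Assumption: an event happens where time_to_failure jumps back up.
    return np.flatnonzero(np.diff(ttf) > 0).tolist()


def mean_around_events(df, event_rows, half_window=50):
    """Mean of acoustic_data over up to 2 * half_window + 1 samples per event."""
    signal = df["acoustic_data"].to_numpy()
    means = []
    for row in event_rows:
        lo = max(row - half_window, 0)
        hi = min(row + half_window + 1, len(signal))
        means.append(float(signal[lo:hi].mean()))
    return means


def gaps_between_events(df, event_rows):
    """Seconds between consecutive events, read off the countdown itself."""
    ttf = df["time_to_failure"].to_numpy()
    # Right after event k the countdown restarts at the time left until event k + 1.
    return [float(ttf[row + 1]) for row in event_rows[:-1]]


# Tiny synthetic stand-in for the real training file (values are made up).
toy = pd.DataFrame({
    "acoustic_data": [1, 2, 3, 100, 5, 6, 7, 8, 9, 200, 11, 12],
    "time_to_failure": [0.3, 0.2, 0.1, 0.0, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0, 0.2, 0.1],
})
rows = find_event_rows(toy)
print("number of events:", len(rows))
for k, mean in enumerate(mean_around_events(toy, rows, half_window=1), start=1):
    print(f"mean acoustic value around event {k} = {mean:.2f}")
for k, gap in enumerate(gaps_between_events(toy, rows), start=1):
    print(f"time between event {k} and {k + 1} = {gap} seconds")
```

On the real file, the same functions would be applied to a DataFrame loaded with `pd.read_csv` from wherever the data is stored. A `half_window` of 50 gives roughly the 100-sample average suggested in question 2.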

Each team must create one Jupyter notebook to answer the three questions listed under "Questions to answer about the Kaggle dataset". This notebook must have:
(1) Cells that contain executable code that the instructor can execute to get those answers (i.e. your instructor can execute them on his own machine, assuming that the location of the data file is adjusted properly).
(2) Additionally, it should have well-written markdown (text) cells that have brief answers to the questions (i.e. the instructor can read them to know what answers you came up with without executing the code cells).

Once each team is done with (1) and (2) above, they will work on the following objective in preparation for class discussion and presentation:

Let's say you know (from previous work) that there are N earthquake events in the training data. Write code that tries to detect all other events based on the average around one event. The inputs to the code are:
(1) The "window size" used for the rolling window around a specific time. In previous work you may have set it to a value like 100 for averaging, but this must now be an input to the program. Additionally, you must use the "rolling" function of the Pandas DataFrame. See the summary with code examples (link). Doing it in regular Python code will result in very high compute times and your program is likely to crash, so use Pandas for efficiency! Note that if you do not choose the window type (win_type parameter), then all points in the window are equally weighted, so once you get the sum by calling the sum() function it is easy to compute the average by dividing by the window size.
(2) The name of the destination file where the rolling average will be saved for the time series. Saving to a file is important because you will not have to repeat the time-consuming rolling average computation for a specific window size if you need the averages in the future.
(3) The input "i", where the i-th earthquake event will be used as the query (i.e. the window average around that event).
(4) The tolerance "e", so that all window averages within plus or minus e of the i-th event average are "retrieved" by the program.
(5) The output of the program should report how many true positives, false positives, and false negatives were found. Here a positive is an earthquake event.
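
The inputs and output listed above could be wired together roughly as follows. This is a sketch under assumptions rather than a reference solution: the column name `acoustic_data`, the function names, the CSV layout, and the toy data are all hypothetical; the event row positions are assumed to be known from the earlier work; "i" is taken to be 1-based; and scoring is done per row, with the query event itself excluded.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def rolling_average(df, window, dest_file):
    """Centered rolling mean of acoustic_data, saved to dest_file as CSV."""
    # No win_type is given, so every point in the window has equal weight
    # and the mean is simply the rolling sum divided by the window size.
    avg = df["acoustic_data"].rolling(window=window, center=True).sum() / window
    avg.rename("rolling_avg").to_csv(dest_file, index_label="row")
    return avg


def detect_events(avg, event_rows, i, e):
    """Score rows whose rolling average is within e of the i-th event's (1-based)."""
    query_row = event_rows[i - 1]
    query = avg.iloc[query_row]
    # NaN averages at the edges never compare as close, so they are not retrieved.
    close = ((avg - query).abs() <= e).to_numpy()
    retrieved = set(np.flatnonzero(close).tolist()) - {query_row}
    others = set(event_rows) - {query_row}
    tp = len(retrieved & others)   # other events that were retrieved
    fp = len(retrieved - others)   # retrieved rows that are not events
    fn = len(others - retrieved)   # other events that were missed
    return tp, fp, fn


# Made-up signal: flat baseline with spikes at the two "known" event rows.
signal = np.zeros(20)
signal[[5, 12]] = 30.0
toy = pd.DataFrame({"acoustic_data": signal})
known_events = [5, 12]

dest = os.path.join(tempfile.mkdtemp(), "rolling_avg_w3.csv")
avg = rolling_average(toy, window=3, dest_file=dest)
tp, fp, fn = detect_events(avg, known_events, i=1, e=0.5)
print(f"true positives = {tp}, false positives = {fp}, false negatives = {fn}")
```

In this toy run the rows next to each spike share the same window average as the spike itself, so they are counted as false positives. Grouping adjacent retrieved rows into a single detection is one possible refinement, and whether it is appropriate is a judgement call for the team.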
