Discuss what kind of data mining task you would use to enhance the performance of such databases (classification, association, or clustering, and which algorithm). Alternatively, you may take an existing approach and discuss it.
Write a report of several pages that includes the following:
1- Which technique did you use (classification, ... etc.) and why?
2- Does it work? (Advantages and disadvantages of your approach.)
3- What results does your approach achieve?
4- What are your evaluations and ideas for extensions and improvements?
Feel free to use any sources that you want.
Introduction
Applied to the provided scenario, different data mining tasks can reveal different patterns in the travel agency database; the results depend on the type of mining task employed in each case. For instance, if predictions are the goal, then predictive tasks apply, which perform inference on the stored data. By comparison, if specific properties of the existing data must be discovered, then descriptive mining tasks are appropriate. The discovered patterns usually have distinct levels of abstraction and require different types of knowledge. This report does not attempt to cover data mining techniques exhaustively; it highlights only one direction that improves on the results obtained without applying data mining tasks.
The classification technique is crucial for labeling existing data with categorical classes. There are several reasons for selecting this technique in the current context. The first is that classification is the “most commonly applied technique” for using “pre-classified examples” to generate models capable of classifying “the population of records at large” (Ramageri, 2010, p. 302). It can be assumed that the travel agency database already has its objects (records) associated with common class labels. Using a classification algorithm, a model is built from the existing data (treated as the training set), and new objects are then classified more accurately. Another reason is that classification algorithms make the analysis of large data sets more effective, since predictions are more relevant when the existing data is well categorized....
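The idea of building a model from pre-classified records and then labeling new ones can be illustrated with a minimal sketch. The report excerpt does not name a specific algorithm at this point, so the categorical Naive Bayes classifier below is only one possible choice, picked for brevity. The attribute names (season, budget, party), the class labels (beach, ski, city) and all records are hypothetical and are not taken from the travel agency database.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(records, labels):
    """Build a categorical Naive Bayes model from pre-classified records."""
    class_counts = Counter(labels)
    # value_counts[label][attribute][value] -> number of training records
    value_counts = defaultdict(lambda: defaultdict(Counter))
    # attribute_values[attribute] -> set of distinct values seen in training
    attribute_values = defaultdict(set)
    for record, label in zip(records, labels):
        for attribute, value in record.items():
            value_counts[label][attribute][value] += 1
            attribute_values[attribute].add(value)
    return class_counts, value_counts, attribute_values


def classify(model, record):
    """Return the class label with the highest log-probability score."""
    class_counts, value_counts, attribute_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior of the class
        for attribute, value in record.items():
            seen = value_counts[label][attribute][value]
            distinct = len(attribute_values[attribute])
            # Laplace smoothing avoids log(0) for unseen combinations
            score += math.log((seen + 1) / (count + distinct))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training set: past bookings with an assumed class label each.
training_records = [
    {"season": "summer", "budget": "high", "party": "family"},
    {"season": "summer", "budget": "low", "party": "solo"},
    {"season": "winter", "budget": "high", "party": "family"},
    {"season": "winter", "budget": "high", "party": "couple"},
    {"season": "winter", "budget": "low", "party": "solo"},
    {"season": "summer", "budget": "low", "party": "couple"},
]
training_labels = ["beach", "beach", "ski", "ski", "city", "city"]

model = train_naive_bayes(training_records, training_labels)

# A new, unlabeled record is assigned the most probable class.
new_customer = {"season": "winter", "budget": "high", "party": "family"}
print(classify(model, new_customer))  # prints: ski
```

With only six hypothetical records the prediction carries no statistical weight; the sketch only shows the two phases described above, training on labeled data and classifying a new object.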