Topic outline

  • ...
  • This section will teach you how to design and develop workflows for data exploration and pre-processing. In particular, you will learn to load data sets, to summarize categorical, nominal and numeric attributes, to replace missing data, to transform different types of attributes, and to reduce the dimensionality of a given data set.

    ...
  • This section will teach you how to design and develop a workflow to formulate and solve a classification problem, with specific reference to binary classification. You will learn how to build different classification models, including decision trees, logistic regression, artificial neural networks, support vector machines, the naïve Bayes classifier, and Bayesian classifiers.

    ...
  • In this section you will learn how to develop a workflow for evaluating the performance of a classifier using the following performance measures: accuracy, error, precision, and recall. Furthermore, you will learn to develop a workflow to compare different classifiers and to select the "optimal classifier".

    ...
  • This section will teach you about the class imbalance problem, and how to develop a workflow for comparing classifiers in terms of their effectiveness in selecting target customers. You will also learn how to develop a workflow to decide which are the “optimal features” for solving a classification problem. Finally, you will learn how to develop a workflow to solve non-binary classification problems.

    ...
  • In this section you will learn about Cluster Analysis and its main components. In particular, you will learn about the different purposes of clustering and the different types of clustering. Furthermore, you will learn to develop workflows for computing proximity, similarity, and dissimilarity between records consisting of multiple attributes of different types.

    ...
  • In this section you will learn to design and develop a workflow to cluster records of a dataset by using prototype-based, agglomerative hierarchical, density-based and graph-based clustering algorithms.

    ...
  • In this section you will learn about cluster validation, i.e. how to validate the results of Cluster Analysis. In particular, you will learn about internal and external validation measures as well as about relative indices and the fundamental problem of cluster validity.

    ...
  • Association Analysis

    In this section you will learn how to extract association rules from transaction data. You will also learn about the Apriori principle and how to select, evaluate, and compare association rules. Finally, you will learn about Simpson’s paradox.

    ...
  • ...
  • ...
    • SIGN ON THE EXAM LIST - 2020 FEBRUARY 24 Group choice
    • SIGN ON THE EXAM LIST - 2020 JANUARY 30 Group choice
    • PROJECT SUBMISSION - 2020 JANUARY 20 Assignment
  • ...