Course information | Machine Learning M

Course Syllabus

Aims

The student will learn the most effective Machine Learning techniques, understanding the theoretical foundations of each technique and acquiring the know-how to successfully apply them to solving practical problems. An overview of the most innovative solutions for the identification of the best Machine Learning algorithm and its optimal configuration, given a dataset (Automated Machine Learning - AutoML), will also be provided. The reference tool for the course will be R, but some equivalent solutions will also be presented in Python (for example scikit-learn) and Java (for example WEKA, KNIME).

Contents

Machine Learning basics: types of data, instances, features, tasks and scenarios, parameters and hyper-parameters, performance measures
Unsupervised learning techniques
Supervised learning techniques: classification and regression
Modeling non-linearity in data: kernel-based techniques
Automated Machine Learning: automatic configuration of a Machine Learning model

Detailed program

Introduction

Machine Learning scenarios & tasks, useful notations
Types of data and problems: tabular, streams, text, time-series, sequences, spatial, graph, web, social, images, distributions

Unsupervised Learning

Similarity and distance
Distances between points: Minkowski and special cases (Manhattan, Euclidean, Chebyshev/Lagrange), Mahalanobis
Wasserstein: a distance (not a divergence!) between probability distributions and/or point clouds (i.e., datasets)
Clustering: deterministic vs probabilistic approaches; flat vs hierarchical; distance/similarity-based vs density-based
Outlier and anomaly detection
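
As an illustration of the point-to-point distances listed above, the following minimal sketch computes the Minkowski distance and its special cases. It is not course material: the function name `minkowski` and the sample points are assumptions made for this example.

```python
import math


def minkowski(x, y, p):
    """Minkowski distance of order p between two points of equal dimension."""
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        # Chebyshev distance: the limit of the Minkowski distance as p grows
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


# Hypothetical sample points chosen for this illustration
x, y = (0.0, 0.0), (3.0, 4.0)
manhattan = minkowski(x, y, 1)         # p = 1
euclidean = minkowski(x, y, 2)         # p = 2
chebyshev = minkowski(x, y, math.inf)  # p -> infinity
```

Setting p to 1, 2, or infinity recovers the Manhattan, Euclidean, and Chebyshev distances respectively. Mahalanobis and Wasserstein need additional information (a covariance matrix, a transport plan) and are not covered by this sketch.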

Supervised Learning

Foundations of "learning": binary classification, data generation process, concept vs hypothesis, empirical vs generalization error
Classification and regression: metrics and validation techniques (hold-out, k-fold cross-validation, leave-one-out)
Model-free/instance-based approaches and a simple algorithm: k-nearest neighbors (KNN)
Model-based approaches: Support Vector Machine (linear)
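
The validation techniques and the KNN classifier named above can be sketched together in a few lines. This is an illustrative sketch under stated assumptions, not the course's implementation: the helper names, the fold assignment by index stride, and the toy dataset are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to query (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def k_fold_accuracy(X, y, n_folds, k):
    """Average KNN accuracy over n_folds folds (fold of point i is i % n_folds)."""
    n = len(X)
    accuracies = []
    for fold in range(n_folds):
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / n_folds


# Hypothetical toy data: two well-separated clusters with alternating labels
X = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.2), (5.1, 4.9), (0.2, 0.1),
     (4.9, 5.2), (0.3, 0.3), (5.2, 5.1), (0.1, 0.4), (4.8, 4.9)]
y = ["a", "b", "a", "b", "a", "b", "a", "b", "a", "b"]

five_fold = k_fold_accuracy(X, y, n_folds=5, k=3)
leave_one_out = k_fold_accuracy(X, y, n_folds=len(X), k=3)
```

Leave-one-out is the special case where the number of folds equals the number of instances. In practice the data would usually be shuffled before the folds are assigned.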

Supervised Learning for non-linear data

Non-linearity, VC dimension, kernel trick
A brief recap of Decision Trees and Random Forests
Kernel-based learning: kernel-SVM and Gaussian Processes, for classification and regression
Dimensionality reduction: Principal Component Analysis (PCA) and kernel-based PCA (kPCA)
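
The kernel trick mentioned in this part of the program can be illustrated with a small numerical check: a kernel evaluated in the input space gives the same value as an inner product in an explicit feature space. The sketch below assumes a homogeneous polynomial kernel of degree 2 on two-dimensional inputs. The function names and sample points are hypothetical.

```python
import math


def poly2_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: k(x, z) = (x . z)^2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def poly2_features(x):
    """Explicit feature map whose inner product equals poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(u, v):
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical sample points chosen for this illustration
x, z = (1.0, 2.0), (3.0, -1.0)
kernel_value = poly2_kernel(x, z)
explicit_value = dot(poly2_features(x), poly2_features(z))
```

Both values agree up to floating-point rounding, although the kernel never builds the three-dimensional feature vectors. Kernel-SVM, Gaussian Processes, and kPCA rely on this to work in high-dimensional (or infinite-dimensional) feature spaces.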

The connectionist approach

Artificial Neural Networks: learning paradigm
Deep Learning: “a fraction of the connectionist tribe”
Generative neural models: Auto-Encoder (AE) and Variational-AE (VAE), Generative Adversarial Network (GAN) and Wasserstein-GAN (WGAN), Transformer

Automated Machine Learning (AutoML): an overview

Hyperparameter optimization of a Machine Learning algorithm
Selection of the best Machine Learning algorithm and (simultaneous) optimization of its hyperparameters
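
One simple baseline for the combined problem of algorithm selection and hyperparameter optimization is random search over a joint search space. The sketch below is only an illustration under strong assumptions: the search space, the function names, and in particular `toy_validation_error` (an analytic stand-in for training a model and measuring its validation error) are hypothetical and are not taken from the course.

```python
import random

# Hypothetical joint search space: each algorithm has its own hyperparameter range
SEARCH_SPACE = {
    "knn": {"k": (1, 30)},            # integer range
    "svm": {"log10_C": (-3.0, 3.0)},  # continuous range
}


def toy_validation_error(algorithm, params):
    """Stand-in for 'train the model and measure its validation error'."""
    if algorithm == "knn":
        return 0.10 + 0.002 * abs(params["k"] - 7)
    return 0.08 + 0.01 * abs(params["log10_C"] - 1.0)


def sample_configuration(rng):
    """Draw an algorithm, then draw values for its own hyperparameters."""
    algorithm = rng.choice(sorted(SEARCH_SPACE))
    if algorithm == "knn":
        low, high = SEARCH_SPACE["knn"]["k"]
        return algorithm, {"k": rng.randint(low, high)}
    low, high = SEARCH_SPACE["svm"]["log10_C"]
    return algorithm, {"log10_C": rng.uniform(low, high)}


def random_search(n_trials, seed=0):
    """Evaluate n_trials random configurations and return the best one."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_trials):
        algorithm, params = sample_configuration(rng)
        error = toy_validation_error(algorithm, params)
        history.append((error, algorithm, params))
    best = min(history, key=lambda trial: trial[0])
    return best, history


best, history = random_search(50)
```

Random search is used here only because it is short to write. Model-based approaches such as Bayesian optimization replace the blind sampling step with a surrogate model that proposes the next configuration to evaluate.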

Exercises and examples

Prerequisites

Basic knowledge of computer science, applied mathematics, probability, and statistics is recommended.

Teaching form

All teaching is delivered in person. Lectures cover both theory and hands-on practice, specifically the use of software libraries and open data.

Textbooks and teaching resources

Reference textbook: Mehryar Mohri, Afshin Rostamizadeh and Ameet Talwalkar (2018). Foundations of Machine Learning.
Slides and materials provided by the lecturer

Other suggested texts:

Deisenroth, M. P., Faisal, A. A., & Ong, C. S. (2020). Mathematics for machine learning. Cambridge University Press.
Charu C. Aggarwal (2023). Neural Networks and Deep Learning – A Textbook
Robert B. Gramacy (2020). Surrogates: Gaussian Process Modeling, Design, and Optimization for the Applied Sciences.
Charu C. Aggarwal (2015). Data Mining – the Textbook

Semester

First semester - First period

Assessment method

Assessment consists of two parts:

the development of a project along with the preparation of an associated technical report, similar to a scientific paper,
an oral examination (individual and mandatory) aimed at assessing the degree of understanding of the course's topics.

The project can be carried out in a team (max 3 students per group). The datasets will be agreed with the lecturer and selected from open platforms such as OpenML, Kaggle or UCI Repository.
The quality of the project will be assessed on the correct use of ML algorithms and the critical analysis of the results. The oral examination assesses the understanding of theoretical and methodological aspects of ML.
The project accounts for 60% of the final mark, and the oral examination accounts for the remaining 40%.

There are no mid-term tests.

Office hours

By appointment

Sustainable Development Goals

QUALITY EDUCATION

Field of research

INF/01

ECTS

Term

First semester

Activity type

Mandatory to be chosen

Course Length (Hours)

Degree Course Type

2-year Master's Degree

Language

English

Teacher

Antonio Candelieri

Enrolment methods

Manual enrolment
Self enrolment (student)