    All material related to this course is at

    Data mining and machine learning are computational subjects. One does not understand how to treat scientific data by reading equations on the blackboard: you will need to get your hands dirty (and this is the fun part!). Students are required to come to class with a laptop or any other device they can code on (larger than a smartphone, I would say...). Each class will pair theoretical explanations with hands-on exercises and demonstrations. These are the key content of the course, so please engage with them as much as possible.

    • [01-02] 2-3-22. Introduction to class. IT setup. git version control
    • [03-04] 7-3-22. Introduction to probability. Bayes' theorem. Monty Hall problem. Transformation of random variables. [The recording was interrupted at some point because the connection in the building dropped.]
    • [05-06] 8-3-22. Monte Carlo integrations. Descriptive statistics. Mean vs population quantities. Typical distributions.
    • [07-08] 9-3-22. Central limit theorem. Multivariate pdfs. Correlation coefficients. Sampling from arbitrary pdfs.
    • [09-10] 14-3-22. Frequentist vs Bayesian inference. Maximum likelihood estimation. Homoscedastic Gaussian data, heteroscedastic Gaussian data, non-Gaussian data.
    • [11-12] 15-3-22. Maximum likelihood fit. Role of outliers. Goodness of fit. Model comparison. Gaussian mixtures. Bootstrap and jackknife.
    • [13-14] 16-3-22. Hypothesis testing. Comparing distributions, KS test. Histograms. Kernel density estimators.
    • [15-16] 21-3-22. The Bayesian approach to statistics. Prior distributions. Credible regions. Parameter estimation examples. Marginalization. Model comparison: odds ratio. Approximate model comparison.
    • [17-18] 23-3-22. Monte Carlo methods. Markov chains. Burn-in. Metropolis-Hastings algorithm.
    • [19-20] 28-3-22. MCMC diagnostics. Traceplots. Autocorrelation length. Samplers in practice: emcee and PyMC3. Gibbs sampling. Conjugate priors.
    • [21-22] 30-3-22. Evidence evaluation. Model selection. Nested sampling. Samplers in practice: dynesty.
    • [23-24] 4-4-22. Data mining and machine learning. Supervised and unsupervised learning. Overview of scikit-learn. Examples.
    • [25-26] 6-4-22. K-fold cross validation. Unsupervised clustering. K-Means Clustering. Mean-shift Clustering. Correlation functions.
    • [27-28] 20-4-22. Curse of dimensionality. Principal component analysis. Missing data. Non-negative matrix factorization. Independent component analysis.
    • [29-30] 27-4-22. Non-linear dimensional reduction. Locally linear embedding. Isometric mapping. Stochastic neighbor embedding. Data visualization. Recap of density estimation. KDE. Nearest-Neighbor. Gaussian Mixtures. Modern astrostats with Matthew Mould (Birmingham UK) and Riccardo Buscicchio (Milano-Bicocca, Italy). [There are two gaps in the recording because the connection in the building dropped.]
    • [31-32] 2-5-22. What is regression? Linear regression. Polynomial regression. Basis function regression. Kernel regression. Over/under fitting. Cross validation. Learning curves. [And the connection dropped again, so this is another half-recorded lecture. I'm really sorry, but the connection is so unreliable. I'll look for offline recording software.]
    • [33-34] 4-5-22. Regularization. Ridge. LASSO. Non-linear regression. Gaussian process regression. Total least squares.
    • [35-36] 9-5-22. Generative vs discriminative classification. Receiver Operating Characteristic (ROC) curve. Naive Bayes. Gaussian naive Bayes. Linear and quadratic discriminant analysis. GMM Bayes classification. K-nearest neighbor classifier.
    • [37-38] 11-5-22. Logistic regression. Support vector machines. Decision trees. Bagging. Random forests. Boosting.
    • [39-40] 16-5-22. Loss functions. Gradient descent, learning rate. Adaptive boosting. Neural networks. Backpropagation. Layers, neurons, activation functions, regularization schemes.
    • [41-42] 18-5-22. TensorFlow, Keras, and PyTorch. Convolutional neural networks. Autoencoders. Generative adversarial networks.

