Course
Course full name
Astrostatistics
Course ID number
2122-1-F5802Q014
Section outline
All material related to this course is at github.com/dgerosa/astrostatistics_bicocca_2022
Data mining and machine learning are computational subjects. One does not learn how to treat scientific data by reading equations on the blackboard: you will need to get your hands dirty (and this is the fun part!). Students are required to come to class with a laptop or any device you can code on (larger than a smartphone, I would say...). Each class will pair theoretical explanations with hands-on exercises and demonstrations. These are the key content of the course, so please engage with them as much as possible.
- [01-02] 2-3-22. Introduction to class. IT setup. Git version control.
- [03-04] 7-3-22. Introduction to probability. Bayes' theorem. Monty Hall problem. Transformation of random variables. [The recording was interrupted at some point because the connection in the building dropped.]
- [05-06] 8-3-22. Monte Carlo integrations. Descriptive statistics. Mean vs population quantities. Typical distributions.
- [07-08] 9-3-22. Central limit theorem. Multivariate pdfs. Correlation coefficients. Sampling from arbitrary pdfs.
- [09-10] 14-3-22. Frequentist vs Bayesian inference. Maximum likelihood estimation. Homoscedastic Gaussian data, heteroscedastic Gaussian data, non-Gaussian data.
- [11-12] 15-3-22. Maximum likelihood fit. Role of outliers. Goodness of fit. Model comparison. Gaussian mixtures. Bootstrap and jackknife.
- [13-14] 16-3-22. Hypothesis testing. Comparing distributions, KS test. Histograms. Kernel density estimators.
- [15-16] 21-3-22. The Bayesian approach to statistics. Prior distributions. Credible regions. Parameter estimation examples. Marginalization. Model comparison: odds ratio. Approximate model comparison.
- [17-18] 23-3-22. Monte Carlo methods. Markov chains. Burn-in. Metropolis-Hastings algorithm.
- [19-20] 28-3-22. MCMC diagnostics. Traceplots. Autocorrelation length. Samplers in practice: emcee and PyMC3. Gibbs sampling. Conjugate priors.
- [21-22] 30-3-22. Evidence evaluation. Model selection. Nested sampling. Samplers in practice: dynesty.
- [23-24] 4-4-22. Data mining and machine learning. Supervised and unsupervised learning. Overview of scikit-learn. Examples.
- [25-26] 6-4-22. K-fold cross validation. Unsupervised clustering. K-means clustering. Mean-shift clustering. Correlation functions.
- [27-28] 20-4-22. Curse of dimensionality. Principal component analysis. Missing data. Non-negative matrix factorization. Independent component analysis.
- [29-30] 27-4-22. Non-linear dimensionality reduction. Locally linear embedding. Isometric mapping. Stochastic neighbor embedding. Data visualization. Recap of density estimation. KDE. Nearest-neighbor. Gaussian mixtures. Modern astrostats with Matthew Mould (Birmingham, UK) and Riccardo Buscicchio (Milano-Bicocca, Italy). [There are two gaps in the recording because the connection in the building dropped.]
- [31-32] 2-5-22. What is regression? Linear regression. Polynomial regression. Basis function regression. Kernel regression. Over/under fitting. Cross validation. Learning curves. [And the connection dropped again, so this is another half-recorded lecture. I'm really sorry, but this is so unreliable. I'll look for offline recording software.]
- [33-34] 4-5-22. Regularization. Ridge. LASSO. Non-linear regression. Gaussian process regression. Total least squares.
- [35-36] 9-5-22. Generative vs discriminative classification. Receiver Operating Characteristic (ROC) curve. Naive Bayes. Gaussian naive Bayes. Linear and quadratic discriminant analysis. GMM Bayes classification. K-nearest neighbor classifier.
- [37-38] 11-5-22. Logistic regression. Support vector machines. Decision trees. Bagging. Random forests. Boosting.
- [39-40] 16-5-22. Loss functions. Gradient descent, learning rate. Adaptive boosting. Neural networks. Backpropagation. Layers, neurons, activation functions, regularization schemes.
- [41-42] 18-5-22. TensorFlow, Keras, and PyTorch. Convolutional neural networks. Autoencoders. Generative adversarial networks.
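As a taste of the hands-on exercises, here is a minimal bootstrap sketch in plain NumPy in the spirit of lecture [11-12]. The data set, sample size, and number of resamples are illustrative choices, not values from the course material; for the mean, the bootstrap standard error can be checked against the analytic result σ/√N.

```python
import numpy as np

# Illustrative data set (not from the course): 100 draws from N(5, 2^2).
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=100)

def bootstrap_std_error(data, statistic, n_resamples=5000, seed=1):
    """Bootstrap estimate of the standard error of `statistic`."""
    rng = np.random.default_rng(seed)
    n = len(data)
    stats = np.array([
        # Resample the data with replacement and recompute the statistic.
        statistic(rng.choice(data, size=n, replace=True))
        for _ in range(n_resamples)
    ])
    return stats.std()

se_boot = bootstrap_std_error(data, np.mean)
se_analytic = data.std(ddof=1) / np.sqrt(len(data))  # classic sigma/sqrt(N)
print(se_boot, se_analytic)  # the two estimates should roughly agree
```

The same resampling loop works unchanged for statistics with no analytic error formula (median, correlation coefficient, ...), which is where the bootstrap earns its keep.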
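The Metropolis-Hastings algorithm from lecture [17-18] can likewise be sketched in a few lines. This is a didactic toy, not the samplers used in class (emcee, PyMC3): the 1D standard-normal target, proposal scale, chain length, and burn-in cut are all illustrative assumptions.

```python
import numpy as np

def log_target(x):
    """Log of an (unnormalized) standard normal density: the toy target."""
    return -0.5 * x**2

def metropolis_hastings(n_steps, x0=0.0, proposal_scale=1.0, seed=42):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = np.random.default_rng(seed)
    chain = np.empty(n_steps)
    x, logp = x0, log_target(x0)
    for i in range(n_steps):
        # Symmetric proposal: the Hastings ratio reduces to the target ratio.
        x_new = x + proposal_scale * rng.normal()
        logp_new = log_target(x_new)
        # Accept with probability min(1, target(x_new)/target(x)).
        if np.log(rng.uniform()) < logp_new - logp:
            x, logp = x_new, logp_new
        chain[i] = x
    return chain

chain = metropolis_hastings(20000)
burned = chain[5000:]  # discard burn-in, as discussed in class
print(burned.mean(), burned.std())  # should approach 0 and 1
```

Shrinking or inflating `proposal_scale` is a quick way to see the mixing and autocorrelation issues covered in the diagnostics lecture.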
Kaltura Video Resource