- Advanced Statistics
Course Syllabus
Learning objectives
The aim of the course is to introduce some of the main concepts of statistical reasoning. Students will be able to formalise a statistical problem and to identify the probabilistic tools required and the statistical methods best suited to analysing a dataset. The module will focus on the logic underlying statistical methods, so that students can develop an independent critical approach and are able to understand and apply statistical techniques, including ones not covered in the course.
Summary of contents
- Elements of descriptive statistics for summarising datasets
- Essential review of probability theory
- Distributions of sampling statistics
- Estimation methods
- Hypothesis testing
- Linear regression model
- Data analysis with Stata
Prerequisites
Basic calculus and descriptive statistics
Teaching methods
35 classes, including 8 hours of lab sessions.
Assessment methods
Learning will be assessed through a written exam. An additional, optional oral exam may be requested by the student or the instructor.
Textbooks and Reading Materials
Textbook: Sheldon Ross, Introductory Statistics (4th edition)
Other material will be provided during the course
Teaching period
First semester (first cycle)
Teaching language
English
Learning objectives
The course introduces the main concepts of statistical reasoning. When analysing a dataset, students will be able to formalise a statistical problem and to identify the probabilistic tools and statistical methods suited to the data. The module focuses on the rationale underlying basic statistical methods, so that students can develop an independent critical approach and can understand and apply statistical techniques, including ones not covered by the syllabus of this module.
Contents
- Using statistics to summarise datasets
- Essential review of probability theory
- Distribution of sampling statistics
- Estimation
- Testing statistical hypotheses
- Linear regression
- Data analysis with Stata
Detailed program
Using statistics to summarise datasets
- Populations and samples
- Sample mean
- Linearity of sample mean, with proof
- Deviations
- Sample median
- Sample percentile of order 100p
- Sample quartiles
- Sample mode
- Sample variance and standard deviation
- Property of sample variance, with proof
- Sample correlation coefficient and interpretation
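As an informal illustration of the summaries listed above, the sketch below computes them with Python's standard library. Python is used here only as a stand-in (the course labs use Stata), and the data, the constants `a` and `b`, and the helper `sample_correlation` are hypothetical choices made for this example.

```python
import statistics
from math import sqrt

# Hypothetical data, chosen only for illustration.
x = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
y = [1.0, 3.0, 2.0, 5.0, 6.0, 9.0]

mean_x = statistics.mean(x)
median_x = statistics.median(x)
mode_x = statistics.mode(x)
var_x = statistics.variance(x)   # sample variance, divisor n - 1
sd_x = statistics.stdev(x)       # square root of the sample variance

# Linearity of the sample mean: mean(a*x + b) equals a*mean(x) + b.
a, b = 3.0, 1.0
mean_transformed = statistics.mean([a * xi + b for xi in x])

# Deviations from the sample mean always sum to zero.
deviations = [xi - mean_x for xi in x]


def sample_correlation(u, v):
    """Sample correlation coefficient of two equally long lists."""
    u_bar, v_bar = statistics.mean(u), statistics.mean(v)
    s_uv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    s_uu = sum((ui - u_bar) ** 2 for ui in u)
    s_vv = sum((vi - v_bar) ** 2 for vi in v)
    return s_uv / sqrt(s_uu * s_vv)


r = sample_correlation(x, y)
print(mean_x, median_x, mode_x, var_x, sd_x, mean_transformed, r)
```

A value of `r` close to 1 indicates a strong positive linear association between the two hypothetical variables.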
Essential review of probability theory
- random experiment, sample space, events
- defining properties of probability
- frequentist interpretation of probability
- probability of the complement and addition rule
- equally likely outcomes
- independent events
- random variables (RV)
- discrete RVs
- distribution of a RV
- expected value of a discrete RV
- properties of the expectation
- variance of a discrete RV (equivalent definitions)
- Bernoulli RV
- Binomial distribution, with derivation
- Factorials and binomial coefficients
- continuous RVs
- probability density function of X and P(a<X<b)
- uniform random variable
- normal random variable
- standard normal random variable
- computing normal probabilities
- additivity of normal RVs
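Two of the topics above, the binomial distribution and normal probabilities, can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch rather than course material: the parameter values and the helper `binomial_pmf` are assumptions made for the example.

```python
from math import comb
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), using the binomial coefficient."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Hypothetical parameters.
n, p = 10, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Expected value and variance of a discrete RV, from the definitions.
mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
# Equivalent definition of the variance: E[X^2] - (E[X])^2.
var_alt = sum(k ** 2 * pk for k, pk in enumerate(pmf)) - mean ** 2

# P(a < X < b) for a normal RV with hypothetical mean 100 and sd 15.
X = NormalDist(mu=100, sigma=15)
prob = X.cdf(115) - X.cdf(85)

# The same probability after standardising to Z = (X - mu) / sigma.
Z = NormalDist()
prob_z = Z.cdf(1) - Z.cdf(-1)

print(mean, var, var_alt, prob, prob_z)
```

The mean and variance computed from the probability mass function should agree with the closed forms np and np(1 - p), and the two normal probabilities should coincide.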
Distribution of sampling statistics
- sample from a distribution
- sample mean as a RV
- expected value and variance of sample mean
- simulation study with binomial sample
- approximate areas under the normal curve
- central limit theorem
- normal approximation for the sample mean (for large n)
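A simulation study of the kind mentioned above might look like the following Python sketch, which is only an assumption about how such a study could be set up. The parameters, the seed, and the helper `binomial_draw` are hypothetical; the sketch draws many binomial samples and compares the behaviour of the sample mean with what the central limit theorem suggests.

```python
import random
import statistics
from math import sqrt

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical settings for the simulation.
n, p = 20, 0.3          # each observation is Binomial(n, p)
sample_size = 50        # observations per sample
replications = 2000     # number of simulated samples


def binomial_draw(n, p):
    """One Binomial(n, p) observation as a sum of n Bernoulli trials."""
    return sum(random.random() < p for _ in range(n))


sample_means = [
    statistics.mean([binomial_draw(n, p) for _ in range(sample_size)])
    for _ in range(replications)
]

# Theory: E[sample mean] = n*p and Var[sample mean] = n*p*(1-p)/sample_size.
theory_mean = n * p
theory_sd = sqrt(n * p * (1 - p) / sample_size)

# Under the normal approximation, about 95% of the sample means
# should fall within 1.96 standard deviations of the expected value.
within = sum(abs(m - theory_mean) <= 1.96 * theory_sd for m in sample_means)
coverage = within / replications

print(statistics.mean(sample_means), statistics.stdev(sample_means), coverage)
```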
Estimation
- estimator and estimate
- unbiased estimator of a parameter
- point estimator of population mean
- margins of error via normal approximation
- point estimator of the population variance
- sampling proportions from a finite population
- random sample of size n from a population of size N
- approximate Binomial distribution for the sample when N is much larger than n
- sample proportion as estimator for population proportion
- interval estimator
- point and interval estimates for the population proportion
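To make the last items concrete, here is a minimal Python sketch of a point estimate and an approximate interval estimate for a proportion, based on the normal approximation. The function name `proportion_interval`, the 95% confidence level, and the counts are hypothetical, and the sketch assumes n is large and the population is much larger than the sample.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Point estimate and approximate interval for a population proportion."""
    p_hat = successes / n                        # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)   # margin of error
    return p_hat, p_hat - margin, p_hat + margin


# Hypothetical sample: 120 successes out of 400 observations.
p_hat, low, high = proportion_interval(successes=120, n=400)
print(p_hat, low, high)
```

With these made-up counts the point estimate is 0.3 and the margin of error is roughly 0.045.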
Testing statistical hypotheses
- introduction to hypothesis testing
- null and alternative hypotheses
- test statistic
- critical region
- type I and type II errors
- level of significance of a test
- two-sided and one-sided tests
- p-value
- one-sample Z-test (derivation from the distribution of the test statistic)
- T-test
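The two-sided one-sample Z-test can be sketched in Python as follows. This is an illustration under stated assumptions: the population standard deviation `sigma` is treated as known, the data and the 5% significance level are hypothetical, and `z_test` is a helper written for this example. A T-test would instead estimate the standard deviation from the sample and use the Student t distribution, which is not shown here.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test(sample, mu0, sigma):
    """Two-sided one-sample Z-test of H0: mu = mu0, with known sigma."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))   # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    return z, p_value


# Hypothetical measurements.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.4, 5.3, 5.7]
z, p_value = z_test(sample, mu0=5.0, sigma=0.5)

alpha = 0.05                 # significance level (assumed)
reject = p_value < alpha     # True means the data fall in the critical region
print(z, p_value, reject)
```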
Linear regression
- simple linear regression
- independent and dependent variables
- least squares estimates for the parameters of a linear regression model
- estimated regression line
- prediction via the estimated regression line
- confidence intervals for the predicted values
- coefficient of determination (and its rationale)
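The least squares estimates, the estimated regression line, and the coefficient of determination can be illustrated with a short Python sketch. The data and the helper names `fit_line` and `r_squared` are hypothetical, and confidence intervals for predicted values are left out because they require the Student t distribution.

```python
def fit_line(x, y):
    """Least squares estimates (intercept, slope) for y = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Coefficient of determination: share of variation explained by the line."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: x is the independent variable, y the dependent one.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
r2 = r_squared(x, y, intercept, slope)

# Prediction via the estimated regression line at a new value x = 6.
prediction = intercept + slope * 6.0
print(intercept, slope, r2, prediction)
```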
Data analysis with Stata
- Interface of Stata
- Menu system, command windows and Do-files
- Importing and editing datasets
- computing probabilities with Stata
- examples with binomial and normal random variables
- Z-test: practical examples with Stata
- T-test: practical examples with Stata
- test for proportions: practical examples with Stata
- simple linear regression with Stata
- introduction to residual analysis
- multiple linear regression with Stata
Prerequisites
Basic calculus and descriptive statistics
Teaching methods
The course consists of 35 classes in total: 27 regular classes and 8 lab sessions (with Stata).
Assessment methods
Assessment will be based on a written exam. An additional, optional oral exam may be requested by the student or the instructor.
Textbooks and Reading Materials
Textbook: Sheldon Ross, Introductory Statistics (4th edition)
Other material will be provided during the course
Semester
First semester (first cycle)
Teaching language
English