Natural Language Processing
Course ID number: 2324-2-FDS01Q011

Course Syllabus

Aims

The course introduces the foundational elements and the most recent advanced computational models for natural language processing. By the end of the course, students will have acquired knowledge and skills concerning algorithms, tools, and models for processing and analyzing natural language, enabling them to make use of current state-of-the-art NLP systems.

Contents

Foundations of natural language representation
Semantics of words
Large Language Models
NLP applications

Detailed program

  1. Fundamentals

    • Rationalist and Empiricist Approaches to Language
    • The Ambiguity of Language: Why NLP Is Difficult
    • Linguistic Essentials
      • Words, Tokens, Lemmas, Stems
      • Parts of Speech and Morphology
      • Phrase Structure
    • Dirty Hands-on Text (see the first sketch after this program)
      • Lexical resources
      • Word counts
      • Zipf’s laws
      • Collocations
      • Concordances
  2. Vector Semantics

    • Frequentist Representation of Text (TF, TF-IDF, etc.; see the second sketch after this program)
    • Word Embeddings
      • Word2Vec
      • FastText
      • GloVe
    • Visualization of Embeddings
      • Principal Component Analysis (PCA)
      • t-distributed Stochastic Neighbor Embedding (t-SNE)
      • Uniform Manifold Approximation and Projection (UMAP)
  3. Transformers and Large Language Models

    • Attention Mechanisms: Self-Attention and Multi-Head Attention (see the third sketch after this program)
    • Positional Embeddings
    • Transformers as Language Models
    • Pretraining Large Language Models
    • Prompting and Instruction Tuning
    • Interpretability and Explainability of Language Models
  4. NLP Applications

    • Text and Token Classification
    • Chatbots and Dialog Systems
    • Word Sense Disambiguation
    • Topic Modeling
    • Machine Translation
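
The hands-on part of item 1 can be made concrete with a minimal Python sketch of word counts, a rough Zipf's-law check, and PMI-scored collocations. The toy corpus is invented and only the standard library is assumed; it is an illustrative sketch, not part of the official course material:

    # Toy illustration of item 1's hands-on topics: word counts, a rough
    # Zipf's-law check, and PMI-scored collocations. The corpus is invented.
    from collections import Counter
    import math
    import re

    corpus = ("the cat sat on the mat . the dog sat on the rug . "
              "the cat chased the dog and the dog chased the cat .")
    tokens = re.findall(r"[a-z]+", corpus.lower())

    # Word counts; under Zipf's law, frequency * rank stays roughly constant.
    counts = Counter(tokens)
    for rank, (word, freq) in enumerate(counts.most_common(5), start=1):
        print(f"rank {rank}: {word!r}  freq={freq}  freq*rank={freq * rank}")

    # Collocations via pointwise mutual information of adjacent word pairs.
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = len(tokens)

    def pmi(pair):
        p_pair = bigrams[pair] / (total - 1)
        p_w1 = counts[pair[0]] / total
        p_w2 = counts[pair[1]] / total
        return math.log2(p_pair / (p_w1 * p_w2))

    best = max(bigrams, key=pmi)
    print("highest-PMI bigram:", best, round(pmi(best), 2))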
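
For item 2, the following sketch builds TF-IDF document vectors, compares them with cosine similarity, and projects them to two dimensions with PCA for visualization. It assumes scikit-learn is available; the example documents are invented:

    # Sketch of item 2: TF-IDF vectors, cosine similarity, and a 2-D PCA
    # projection for visualization. Assumes scikit-learn is installed.
    from sklearn.decomposition import PCA
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = [
        "natural language processing with transformers",
        "word embeddings capture word meaning",
        "transformers are large language models",
    ]

    # Each document becomes a TF-IDF vector over the corpus vocabulary.
    X = TfidfVectorizer().fit_transform(docs)
    print("similarity doc0/doc2:", cosine_similarity(X[0], X[2])[0, 0])

    # Project to 2 dimensions for plotting; t-SNE (scikit-learn) and UMAP
    # (the umap-learn package) expose the same fit_transform interface.
    coords = PCA(n_components=2).fit_transform(X.toarray())
    print(coords)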
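
For item 3, a sketch of single-head scaled dot-product self-attention in plain NumPy, with random toy inputs. Real transformer layers add learned Q/K/V projections, masking, and multiple heads; this only shows the core computation:

    # Sketch of item 3: single-head scaled dot-product self-attention.
    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, d_model = 4, 8                       # 4 toy tokens, 8-dim vectors
    x = rng.normal(size=(seq_len, d_model))

    # Here Q, K and V simply reuse the inputs; in a trained model they come
    # from learned linear projections of x.
    Q = K = V = x

    scores = Q @ K.T / np.sqrt(d_model)           # token-to-token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    output = weights @ V                          # weighted mix of value vectors

    print(weights.round(2))                       # attention matrix (4 x 4)
    print(output.shape)                           # (4, 8)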

Prerequisites

Useful but not required: machine learning, Python programming

Teaching form

Lectures and classroom exercises.
The course will be given in English.

Textbook and teaching resource

Christopher MANNING and Hinrich SCHÜTZE. Foundations of Statistical Natural Language Processing. MIT Press.
Dan JURAFSKY and James H. MARTIN. Speech and Language Processing. Prentice Hall.

Semester

Second semester

Assessment method

Project and oral exam; there are no intermediate tests.
The project consists of developing a natural language processing tool based on the methods and models presented during the course; it is graded on a scale of 0-24 points.
The oral exam consists of 4 theory questions on the topics listed in the detailed program. Each question is scored between -2 points, for an incorrect or missing answer, and +2 points, for a correct answer, so the oral exam contributes between -8 and +8 points overall.

Office hours

By appointment, arranged with the teacher by email.

Enter

Key information

Field of research: INF/01
ECTS: 6
Term: Second semester
Activity type: Mandatory to be chosen
Course Length (Hours): 42
Degree Course Type: 2-year Master Degree
Language: English

Staff

Teachers: Elisabetta Fersini, Alessandro Raganato
