Dear students,
Let me add some further details. Unlike flat clustering, hierarchical clustering (whether agglomerative or divisive) does not, in principle, require the desired number of clusters to be defined in advance, because the algorithm can simply run to its limiting case: a single cluster containing all documents (agglomerative approach) or as many clusters as there are documents (divisive approach).
In practice, however, these limiting cases are of little use in themselves, and hierarchical algorithms are considerably more computationally expensive than flat clustering algorithms. It is therefore necessary to set criteria that can “stop” the algorithm once a “satisfactory” division into clusters has been obtained.
Several criteria (explained in class) can be considered at this point, including those that Luca correctly suggested.
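To make the idea of a stopping criterion concrete, here is a minimal Python sketch of agglomerative clustering. It is only an illustration, not the procedure seen in class: the function name `agglomerative`, the parameters `n_clusters` and `max_distance`, the choice of single linkage with Euclidean distance, and the toy points in `docs` are all assumptions made for this example. The two criteria shown (a target number of clusters and a distance threshold) are common choices and may differ from the ones discussed in class.

```python
from itertools import combinations
from math import dist


def agglomerative(points, n_clusters=None, max_distance=None):
    """Single-linkage agglomerative clustering with optional stopping criteria.

    Merging stops when only `n_clusters` clusters remain, or when the two
    closest clusters are farther apart than `max_distance`. With neither
    criterion set, merging continues until a single cluster is left.
    """
    # Starting point: every document is its own cluster.
    clusters = [[p] for p in points]

    while len(clusters) > 1:
        # Criterion 1 (assumed): stop at a target number of clusters.
        if n_clusters is not None and len(clusters) <= n_clusters:
            break

        # Find the closest pair of clusters (single linkage: the distance
        # between two clusters is that of their two closest members).
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best

        # Criterion 2 (assumed): stop when the next merge would join
        # clusters that are too far apart.
        if max_distance is not None and d > max_distance:
            break

        # Merge cluster j into cluster i (i < j, so index i stays valid).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    return clusters


# Toy "documents" represented as 2-D points (hypothetical data).
docs = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]

print(len(agglomerative(docs)))                    # no criterion
print(len(agglomerative(docs, n_clusters=2)))      # target cluster count
print(len(agglomerative(docs, max_distance=1.0)))  # distance threshold
```

With no criterion, the merging runs all the way to the limiting case of a single cluster. With `n_clusters=2` it stops at two clusters, and with `max_distance=1.0` it stops at three, because the next merge would join groups that are roughly 7 units apart. The pairwise search inside the loop also shows why this approach is more expensive than flat clustering: every step compares all pairs of clusters.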
Best regards,
Marco Viviani