{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Clustering, Density Estimation, and Principal Component Analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " ### K-Means Clustering\n", " \n", " The K-Means algorithm aims to find groups (**clusters**) of data points in a multidimensional space. As an example, consider the \"Old Faithful\" dataset, which contains measurements of the eruptions of the Old Faithful geyser in Yellowstone National Park, USA. Each record reports the duration of an eruption and the waiting time between that eruption and the next one. Let's load the dataset:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of records: 272\n" ] }, { "data": { "text/html": [ "
| | eruptions | waiting |
|---|---|---|
| 0 | 3.600 | 79 |
| 1 | 1.800 | 54 |
| 2 | 3.333 | 74 |
| 3 | 2.283 | 62 |
| 4 | 4.533 | 85 |
KMeans(n_clusters=2)
| | X | Y | C |
|---|---|---|---|
| 0 | 0.098318 | 0.596025 | 0 |
| 1 | -1.478733 | -1.242890 | 1 |
| 2 | -0.135612 | 0.228242 | 0 |
| 3 | -1.055558 | -0.654437 | 1 |
| 4 | 0.915755 | 1.037364 | 0 |
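
The table above pairs what appear to be standardized coordinates (X, Y) with a cluster label (C). The notebook's own code cell is not reproduced here, so the following is only a minimal sketch of how a table of this shape could be built with scikit-learn. It assumes standardization via `StandardScaler`, and it uses only the five Old Faithful records listed earlier instead of the full 272-row dataset, so its numbers will not match the table. The `n_init` and `random_state` values are illustrative choices.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# First five Old Faithful records (a tiny sample, not the full dataset)
data = pd.DataFrame({
    "eruptions": [3.600, 1.800, 3.333, 2.283, 4.533],
    "waiting": [79, 54, 74, 62, 85],
})

# Standardize each column to zero mean and unit variance so that
# both features contribute comparably to the Euclidean distance
scaled = StandardScaler().fit_transform(data)

# Fit K-Means with two clusters; random_state is fixed only for reproducibility
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scaled)

# Collect the standardized coordinates and the assigned cluster label
result = pd.DataFrame(scaled, columns=["X", "Y"])
result["C"] = kmeans.labels_
print(result)
```

The numeric value of each label (0 or 1) is arbitrary and may swap between runs or library versions; only the grouping of the records is meaningful.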
KMeans(n_clusters=10)
GaussianMixture(n_components=2)
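
The output above is the representation of a fitted two-component Gaussian mixture model. As a minimal sketch of what such a fit provides for density estimation, the example below uses synthetic two-blob data (an assumption, standing in for the notebook's data, which is not shown) and reads off soft cluster assignments and per-point log-densities.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic two-blob data (hypothetical; stands in for the notebook's data)
blob_a = rng.normal(loc=[-1.0, -1.0], scale=0.4, size=(100, 2))
blob_b = rng.normal(loc=[1.0, 1.0], scale=0.4, size=(100, 2))
points = np.vstack([blob_a, blob_b])

# Fit a mixture of two Gaussians; random_state is fixed only for reproducibility
gmm = GaussianMixture(n_components=2, random_state=0).fit(points)

# Soft assignments: one probability per component, each row sums to 1
responsibilities = gmm.predict_proba(points)

# Log-density of each point under the fitted mixture
log_density = gmm.score_samples(points)

print(gmm.weights_)  # mixing proportions
print(gmm.means_)    # component means
```

Unlike K-Means, which assigns each point to exactly one cluster, the mixture model returns a probability for each component, and `score_samples` gives an estimate of the data density itself.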
| | x | y |
|---|---|---|
| 0 | 6.403621 | -0.742478 |
| 1 | 2.782241 | -4.126082 |
| 2 | 6.354109 | -2.308102 |
| 3 | 5.499853 | -1.986180 |
| 4 | 5.492006 | -5.074300 |
PCA()
PCA(n_components=1)
PCA()
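
The outputs above show PCA fitted both with all components (`PCA()`) and with a single component (`PCA(n_components=1)`). The notebook's code is not reproduced here, so the following is only a minimal sketch of the two uses, on synthetic correlated 2D data (an assumption, not the x/y table shown earlier).

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic correlated 2D data (hypothetical; not the notebook's x/y table)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(scale=0.2, size=200)
points = np.column_stack([x, y])

# Full PCA keeps both components, which is useful for inspecting
# how much variance each principal direction explains
pca_full = PCA().fit(points)
print(pca_full.explained_variance_ratio_)

# Keep only the first principal component, then map back to 2D
pca_1d = PCA(n_components=1).fit(points)
projected = pca_1d.transform(points)                 # shape (200, 1)
reconstructed = pca_1d.inverse_transform(projected)  # shape (200, 2)
```

The reconstructed points all lie on the line spanned by the first principal component, so the gap between `points` and `reconstructed` is the information discarded by the reduction to one dimension.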