K-means Clustering in MATLAB


1. Introduction

2. Algorithm

3. Worked example

3.1 Reading the data

3.2 K-means from first principles

3.3 Using the built-in kmeans function

Complete code

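1. Introduction

Clustering is the process of organizing the members of a data set that are similar in some respect into groups. It is a technique for discovering this kind of intrinsic structure, and it is often referred to as unsupervised learning.

K-means is the best-known partitional clustering algorithm; its simplicity and efficiency make it the most widely used of all clustering algorithms. Given a set of data points and a number of clusters K chosen by the user, K-means repeatedly assigns the data to K clusters according to a distance function.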

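2. Algorithm

K-means is a typical distance-based clustering algorithm: it uses distance as the measure of similarity, so the closer two objects are, the more similar they are considered to be. The algorithm treats a cluster as a set of objects that lie close together, so its goal is to obtain compact, well-separated clusters.

The K-means algorithm proceeds as follows:

(1) Randomly select K samples as the initial centers.

(2) Compute the distance from every sample to each of the K centers.

(3) Assign each sample to the center it is closest to.

(4) For each center, compute the mean of the samples assigned to it (the simplest way is to average each dimension) and use it as the new center.

(5) Repeat (2), (3), and (4) until the new centers barely differ from the previous ones; the algorithm then terminates.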
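
To make steps (2) to (4) concrete, here is a minimal sketch of a single iteration on a tiny hand-made data set. The matrix X, the fixed choice of initial centers (instead of a random one, so the run is repeatable), and all variable names are assumptions for illustration only; they are not taken from the article's data.

```matlab
% Toy data: two well-separated groups in 2-D (made-up values)
X = [1 1; 1.2 0.8; 0.9 1.1; 5 5; 5.1 4.9; 4.8 5.2];
centers = X([1 4],:);            % step 1, fixed here instead of random
m = size(X,1);
k = size(centers,1);

% Steps 2 and 3: distance from every sample to every center, then nearest center
D = zeros(m,k);
for i = 1:k
    D(:,i) = sqrt(sum((X - repmat(centers(i,:),m,1)).^2, 2));
end
[~, idx] = min(D, [], 2);        % idx(r) is the cluster label of sample r

% Step 4: new center = per-dimension mean of the samples in each cluster
new_centers = zeros(k, size(X,2));
for j = 1:k
    new_centers(j,:) = mean(X(idx==j,:), 1);
end

% Step 5 compares this shift with a threshold to decide whether to iterate again
shift = norm(new_centers - centers, 'fro');
```

With this data the first three samples end up in cluster 1 and the last three in cluster 2, and each new center is simply the average of its three samples.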

3. Worked example

Data source: Statistical Yearbook

From the data we can see that the actual data are divided into three classes.

3.1 Reading the data

data=xlsread('D:\桌面\kmeans.xlsx')

Here we can see that xlsread did not read the variable names, but the index column was included, so the next step is to remove it:

data=data(:,2:7)
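
In recent MATLAB releases (R2019a and later) xlsread is marked as not recommended, and readmatrix is the suggested replacement. The sketch below shows the same two steps with readmatrix. Because the article's spreadsheet is not available, it first writes a small stand-in file; the file name, its random contents, and the variable names are assumptions for illustration.

```matlab
% Write a small stand-in spreadsheet (made-up values) so the sketch is self-contained
tmpfile = [tempname '.xlsx'];
demo = [(1:4)' rand(4,6)];       % column 1 plays the role of the index column
writematrix(demo, tmpfile);

raw  = readmatrix(tmpfile);      % numeric matrix, one row per sample
data = raw(:, 2:end);            % drop the index column, keep the features
delete(tmpfile);                 % clean up the temporary file
```

Using 2:end instead of 2:7 avoids hard-coding the number of columns.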

3.2 K-means from first principles

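%% K-means from first principles
[m,n]=size(data); % number of rows (samples) and columns (features)
cluster_num=3; % number of clusters, chosen by the user
cluster=data(randperm(m,cluster_num),:); % step 1: random samples as initial centers
epoch_max=1000; % maximum number of iterations
therad_lim=0.001; % threshold on the change of the centers
epoch_num=0;
distance1=zeros(m,cluster_num);
% NOTE: the source listing is truncated between the loop condition and the
% threshold test; that part is reconstructed from the steps in section 2.
while(epoch_num<epoch_max)
    epoch_num=epoch_num+1;
    % step 2: Euclidean distance from every sample to every center
    for i=1:cluster_num
        distance1(:,i)=sqrt(sum((data-repmat(cluster(i,:),m,1)).^2,2));
    end
    % step 3: assign each sample to its nearest center
    [~,index_cluster]=min(distance1,[],2);
    index_cluster=index_cluster'; % row vector of labels in 1..cluster_num
    % step 4: new center = mean of the samples assigned to it
    cluster_new=cluster;
    for j=1:cluster_num
        if any(index_cluster==j) % keep the old center if a cluster is empty
            cluster_new(j,:)=mean(data(index_cluster==j,:),1);
        end
    end
    % step 5: stop once the centers move by no more than the threshold
    if(norm(cluster_new-cluster,'fro')>therad_lim)
        cluster=cluster_new;
    else
        break;
    end
end
%% Plot the clustering result
figure(2)
subplot(2,1,1)
a=unique(index_cluster); % cluster labels that actually occur
C=cell(1,length(a));
for i=1:length(a)
   C(1,i)={find(index_cluster==a(i))};
end
for j=1:length(a)
    data_get=data(C{1,j},:);
    % only the first two features are plotted
    scatter(data_get(:,1),data_get(:,2),100,'filled','MarkerFaceAlpha',.6,'MarkerEdgeAlpha',.9);
    hold on
end
plot(cluster(:,1),cluster(:,2),'kd','LineWidth',2); % cluster centers
hold on
sc_t=mean(silhouette(data,index_cluster')); % mean silhouette coefficient
title_str=['K-means from first principles','  clusters: ',num2str(cluster_num),'  mean silhouette: ',num2str(sc_t)];
title(title_str)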
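
Steps (2) and (3), the distance computation and the nearest-center assignment, can also be written without an explicit loop by using pdist2 from the Statistics and Machine Learning Toolbox (the same toolbox that provides kmeans and silhouette). The sketch below compares both forms on made-up data; X, centers, and the other names are assumptions for illustration.

```matlab
% Toy data and centers (made-up values, not the article's data set)
X = [1 1; 1.2 0.8; 5 5; 5.1 4.9; 9 1];
centers = [1 1; 5 5; 9 1];
m = size(X,1);
k = size(centers,1);

% Explicit loop: one column of distances per center
D_loop = zeros(m,k);
for i = 1:k
    D_loop(:,i) = sqrt(sum((X - repmat(centers(i,:),m,1)).^2, 2));
end
[~, idx_loop] = min(D_loop, [], 2);

% Same computation with pdist2: D_vec(r,c) is the Euclidean distance
% between sample r and center c
D_vec = pdist2(X, centers);
[~, idx_vec] = min(D_vec, [], 2);
```

Here the labels come out as a column vector, which silhouette accepts directly without a transpose.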

3.3 Using the built-in kmeans function

%% MATLAB built-in kmeans function
subplot(2,1,2) % second of the two subplots in the same figure
cluster_num=3; % number of clusters, chosen by the user
[index_km,center_km]=kmeans(data,cluster_num); % labels and cluster centers
a=unique(index_km); % cluster labels that actually occur
C=cell(1,length(a));
for i=1:length(a)
   C(1,i)={find(index_km==a(i))};
end
for j=1:length(a)
    data_get=data(C{1,j},:);
    % only the first two features are plotted
    scatter(data_get(:,1),data_get(:,2),100,'filled','MarkerFaceAlpha',.6,'MarkerEdgeAlpha',.9);
    hold on
end
plot(center_km(:,1),center_km(:,2),'kd','LineWidth',2); % cluster centers
hold on
sc_k=mean(silhouette(data,index_km)); % mean silhouette coefficient
title_str1=['MATLAB built-in kmeans','  clusters: ',num2str(cluster_num),'  mean silhouette: ',num2str(sc_k)];
title(title_str1)

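The result is a figure with two stacked subplots, one per method, each titled with the number of clusters and the mean silhouette coefficient.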

Complete code

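clear;clc;
data=xlsread('D:\桌面\kmeans.xlsx')
data=data(:,2:7)
%% K-means from first principles
[m,n]=size(data); % number of rows (samples) and columns (features)
cluster_num=3; % number of clusters, chosen by the user
cluster=data(randperm(m,cluster_num),:); % step 1: random samples as initial centers
epoch_max=1000; % maximum number of iterations
therad_lim=0.001; % threshold on the change of the centers
epoch_num=0;
distance1=zeros(m,cluster_num);
% NOTE: the source listing is truncated between the loop condition and the
% threshold test; that part is reconstructed from the steps in section 2.
while(epoch_num<epoch_max)
    epoch_num=epoch_num+1;
    % step 2: Euclidean distance from every sample to every center
    for i=1:cluster_num
        distance1(:,i)=sqrt(sum((data-repmat(cluster(i,:),m,1)).^2,2));
    end
    % step 3: assign each sample to its nearest center
    [~,index_cluster]=min(distance1,[],2);
    index_cluster=index_cluster'; % row vector of labels in 1..cluster_num
    % step 4: new center = mean of the samples assigned to it
    cluster_new=cluster;
    for j=1:cluster_num
        if any(index_cluster==j) % keep the old center if a cluster is empty
            cluster_new(j,:)=mean(data(index_cluster==j,:),1);
        end
    end
    % step 5: stop once the centers move by no more than the threshold
    if(norm(cluster_new-cluster,'fro')>therad_lim)
        cluster=cluster_new;
    else
        break;
    end
end
%% Plot the clustering result
figure
subplot(2,1,1) % first of the two subplots in the same figure
a=unique(index_cluster); % cluster labels that actually occur
C=cell(1,length(a));
for i=1:length(a)
   C(1,i)={find(index_cluster==a(i))};
end
for j=1:length(a)
    data_get=data(C{1,j},:);
    % only the first two features are plotted
    scatter(data_get(:,1),data_get(:,2),100,'filled','MarkerFaceAlpha',.6,'MarkerEdgeAlpha',.9);
    hold on
end
plot(cluster(:,1),cluster(:,2),'kd','LineWidth',2); % cluster centers
hold on
sc_t=mean(silhouette(data,index_cluster')); % mean silhouette coefficient
title_str=['K-means from first principles','  clusters: ',num2str(cluster_num),'  mean silhouette: ',num2str(sc_t)];
title(title_str)

%% MATLAB built-in kmeans function
subplot(2,1,2) % second of the two subplots in the same figure
cluster_num=3; % number of clusters, chosen by the user
[index_km,center_km]=kmeans(data,cluster_num); % labels and cluster centers
a=unique(index_km); % cluster labels that actually occur
C=cell(1,length(a));
for i=1:length(a)
   C(1,i)={find(index_km==a(i))};
end
for j=1:length(a)
    data_get=data(C{1,j},:);
    % only the first two features are plotted
    scatter(data_get(:,1),data_get(:,2),100,'filled','MarkerFaceAlpha',.6,'MarkerEdgeAlpha',.9);
    hold on
end
plot(center_km(:,1),center_km(:,2),'kd','LineWidth',2); % cluster centers
hold on
sc_k=mean(silhouette(data,index_km)); % mean silhouette coefficient
title_str1=['MATLAB built-in kmeans','  clusters: ',num2str(cluster_num),'  mean silhouette: ',num2str(sc_k)];
title(title_str1)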

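Because the initial centers are chosen at random, the results differ from run to run. The from-scratch implementation and the built-in function give fairly similar results, but both still differ somewhat from the classification in the original data.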
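
One way to make runs repeatable is to fix the random seed with rng before clustering, and one way to reduce the influence of a poor random start is the 'Replicates' option of kmeans, which reruns the algorithm from several starting points and keeps the solution with the lowest total within-cluster distance. The sketch below illustrates both on synthetic data; the data, the seed values, and the variable names are assumptions for illustration, not the article's setup.

```matlab
% Synthetic stand-in for the article's data: two Gaussian blobs (made-up values)
rng(1);                                        % fix the seed so the data are reproducible
X = [randn(50,2); randn(50,2) + 5];

rng(42);                                       % fix the seed before clustering
[idx1, C1] = kmeans(X, 2, 'Replicates', 10);   % best of 10 random starts

rng(42);                                       % same seed, so the same random starts
[idx2, C2] = kmeans(X, 2, 'Replicates', 10);

same = isequal(idx1, idx2);                    % labels agree across the two runs
```

The same rng call placed before randperm would make the from-scratch version repeatable as well.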


