I saw this in a tutorial that uses R with autoplot. They plot the loadings together with the loading labels:
autoplot(prcomp(df), data = iris, colour = 'Species', loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3)
https://cran.r-project.org/web/packages/ggfortify/vignettes/plot_pca.html
I prefer to use Python 3 with matplotlib, scikit-learn, and pandas for my data analysis. However, I can't figure out how to add these loadings to my plot.
How can I plot these vectors with matplotlib?
I have been reading "Recovering features names of explained_variance_ratio_ in PCA with sklearn", but I haven't figured it out yet.
Here is how I plot it in Python:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn import decomposition
import seaborn as sns; sns.set_style("whitegrid", {'axes.grid' : False})

%matplotlib inline
np.random.seed(0)

# Iris dataset
DF_data = pd.DataFrame(load_iris().data,
                       index = ["iris_%d" % i for i in range(load_iris().data.shape[0])],
                       columns = load_iris().feature_names)

Se_targets = pd.Series(load_iris().target,
                       index = ["iris_%d" % i for i in range(load_iris().data.shape[0])],
                       name = "Species")

# Scaling mean = 0, var = 1
DF_standard = pd.DataFrame(StandardScaler().fit_transform(DF_data),
                           index = DF_data.index,
                           columns = DF_data.columns)

# Sklearn for Principal Component Analysis

# Dims
m = DF_standard.shape[1]
K = 2

# PCA (How I tend to set it up)
Mod_PCA = decomposition.PCA(n_components=m)
DF_PCA = pd.DataFrame(Mod_PCA.fit_transform(DF_standard),
                      columns=["PC%d" % k for k in range(1,m + 1)]).iloc[:,:K]

# Color classes
color_list = [{0:"r",1:"g",2:"b"}[x] for x in Se_targets]

fig, ax = plt.subplots()
ax.scatter(x=DF_PCA["PC1"], y=DF_PCA["PC2"], color=color_list)
Answer:
Try the pca library. It works well with pandas objects, although using them is not required.
First install the package:
pip install pca
The following code plots the explained variance, a scatter plot, and a biplot.
from pca import pca
import pandas as pd

###########################################################
# SETUP DATA
###########################################################

# Load sample data, represent the data as a pd.DataFrame
from sklearn.datasets import load_iris
iris = load_iris()
X = pd.DataFrame(data=iris.data, columns=iris.feature_names)
X.columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
y = pd.Categorical.from_codes(iris.target, iris.target_names)

###########################################################
# COMPUTE AND VISUALIZE PCA
###########################################################

# Initialize the PCA, either reduce the data to the number of
# principal components that explain 95% of the total variance...
model = pca(n_components=0.95)
# ... or explicitly specify the number of PCs
model = pca(n_components=2)

# Fit and transform
results = model.fit_transform(X=X, row_labels=y)

# Plot the explained variance
fig, ax = model.plot()

# Scatter the first two PCs
fig, ax = model.scatter()

# Create a biplot
fig, ax = model.biplot(n_feat=4)
A standard biplot will look like this.
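If you would rather stay with plain matplotlib and scikit-learn, as the question asks, the loading vectors can also be drawn by hand. The following is only a minimal sketch, not the pca library's own method. It assumes that "loadings" means the unit-length eigenvectors stored in scikit-learn's `components_` attribute (the counterpart of `rotation` in R's `prcomp`), and the names `arrow_scale`, `scores`, and `loadings` are illustrative choices of mine, not part of any API.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X = StandardScaler().fit_transform(iris.data)  # mean 0, variance 1 per feature

model = PCA(n_components=2)
scores = model.fit_transform(X)   # shape (150, 2): samples in PC space
loadings = model.components_.T    # shape (4, 2): one row per original feature

# Assumption: a purely cosmetic factor so the arrows are readable next to
# the scattered scores; it carries no statistical meaning.
arrow_scale = 3.0

fig, ax = plt.subplots()
colors = [("r", "g", "b")[t] for t in iris.target]
ax.scatter(scores[:, 0], scores[:, 1], color=colors, alpha=0.5)

# One arrow from the origin per original feature, labelled with its name.
for name, (dx, dy) in zip(iris.feature_names, loadings * arrow_scale):
    ax.arrow(0, 0, dx, dy, color="blue", head_width=0.08,
             length_includes_head=True)
    ax.text(dx * 1.15, dy * 1.15, name, color="blue",
            ha="center", va="center", fontsize=8)

ax.set_xlabel("PC1 (%.1f%%)" % (100 * model.explained_variance_ratio_[0]))
ax.set_ylabel("PC2 (%.1f%%)" % (100 * model.explained_variance_ratio_[1]))
```

In a script, call `plt.show()` afterwards; in a notebook with `%matplotlib inline` the figure is displayed automatically. The same loop should work with the question's `Mod_PCA` object by taking `Mod_PCA.components_[:K].T` as the loadings.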