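I am using the Digits dataset from Sklearn and trying to reduce its dimensionality from 64 to 3 with TSNE (t-distributed Stochastic Neighbor Embedding):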
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# %matplotlib inline
from sklearn.manifold import TSNE
from sklearn.datasets import load_digits
from mpl_toolkits.mplot3d import Axes3D

digits = load_digits()
digits_df = pd.DataFrame(digits.data)
digits_df["target"] = pd.Series(digits.target)

tsne = TSNE(n_components=3)
digits_tsne = tsne.fit_transform(digits_df.iloc[:, :64])
digits_df_tsne = pd.DataFrame(digits_tsne, columns=["Component1", "Component2", "Component3"])
finalDf = pd.concat([digits_df_tsne, digits_df["target"]], axis=1)

# Visualizing 3D
figure = plt.figure(figsize=(9, 9))
axes = figure.add_subplot(111, projection="3d")
dots = axes.scatter(xs=finalDf[:, 0], ys=finalDf[:, 1], zs=finalDf[:, 2],
                    c=digits.target, cmap=plt.cm.get_cmap("nipy_spectral_r", 10))
The data in finalDf looks like this:
Error message:
TypeError: '(slice(None, None, None), 0)' is an invalid key
What went wrong? Can anyone help me?
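Answer:
You are applying NumPy-style slicing (finalDf[:, 0]) to a pandas DataFrame, which is not valid: square brackets on a DataFrame look up column labels, so the tuple (slice(None, None, None), 0) is rejected as a key. Convert the DataFrame to a NumPy array first, then slice it.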
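To see the failure in isolation, here is a minimal sketch on a tiny hand-made DataFrame. The frame `df` and its values are hypothetical stand-ins for finalDf, not the real t-SNE output, and the exact exception class raised depends on the pandas version, so the example catches it broadly.

```python
import pandas as pd

# Hypothetical stand-in for finalDf: three components plus a target column.
df = pd.DataFrame({
    "Component1": [0.1, 0.2, 0.3],
    "Component2": [1.0, 2.0, 3.0],
    "Component3": [-1.0, -2.0, -3.0],
    "target": [0, 1, 2],
})

# df[...] looks up column labels, so a (slice, int) tuple is rejected.
try:
    df[:, 0]
    failed = False
except Exception as exc:  # exact exception class varies across pandas versions
    failed = True
    print(type(exc).__name__, exc)

# The same (slice, int) index is valid once the data is a NumPy array.
first_column = df.to_numpy()[:, 0]
print(first_column)
```

The array returned by `to_numpy()` supports positional `[rows, columns]` indexing, which is why the conversion makes the original slicing expressions work.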
Here is the updated code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# %matplotlib inline
from sklearn.manifold import TSNE
from sklearn.datasets import load_digits
from mpl_toolkits.mplot3d import Axes3D

digits = load_digits()
digits_df = pd.DataFrame(digits.data)
digits_df["target"] = pd.Series(digits.target)

tsne = TSNE(n_components=3)
digits_tsne = tsne.fit_transform(digits_df.iloc[:, :64])
digits_df_tsne = pd.DataFrame(digits_tsne, columns=["Component1", "Component2", "Component3"])
finalDf = pd.concat([digits_df_tsne, digits_df["target"]], axis=1)

# Convert once to a NumPy array so that [:, i] slicing is valid
points = finalDf.to_numpy()

# Visualizing 3D
figure = plt.figure(figsize=(9, 9))
axes = figure.add_subplot(111, projection="3d")
# plt.get_cmap is used here because plt.cm.get_cmap was removed in recent Matplotlib releases
dots = axes.scatter(xs=points[:, 0], ys=points[:, 1], zs=points[:, 2],
                    c=digits.target, cmap=plt.get_cmap("nipy_spectral_r", 10))
plt.show()
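Converting to NumPy is not the only option. As a sketch, assuming the same column layout as finalDf, pandas can also select a column by position with `.iloc` or by label; the small frame `df` below is a hypothetical stand-in used only to check that all three routes return the same values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for finalDf with the same column names.
df = pd.DataFrame({
    "Component1": [0.1, 0.2, 0.3],
    "Component2": [1.0, 2.0, 3.0],
    "Component3": [-1.0, -2.0, -3.0],
    "target": [0, 1, 2],
})

by_position = df.iloc[:, 0]       # positional indexing, the pandas analogue of arr[:, 0]
by_label = df["Component1"]       # label-based column lookup
via_numpy = df.to_numpy()[:, 0]   # the conversion used in the answer

# All three approaches yield the same numbers.
same = np.array_equal(by_position.to_numpy(), via_numpy) and by_position.equals(by_label)
print(same)
```

Selecting by label (for example `finalDf["Component1"]`) tends to read more clearly in plotting code, and Matplotlib generally accepts a pandas Series wherever it accepts an array.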