Training a model with a column of point lists

I want to classify cracks by their depth. To do so, I store the following features in a DataFrame:

# dataForWindowsDf is a list built iteratively from CSV files;
# WindowsDf is the DataFrame built from that list.
WindowsDf = pd.DataFrame(dataForWindowsDf,
                         columns=['IsCrack', 'CheckTypeEncode', 'DepthCrack',
                                  'WindowOfInterest'])

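So my target column is 'DepthCrack', and the other columns make up the feature vector. 'WindowOfInterest' is a column of 2D lists (lists of points). Each list is a graph representing the test that was performed (a function based on the time it takes an electromagnetic wave to return from the surface):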

[[0.9561600000000001, 0.10913097635410397], [0.95621,0.1100000]...]

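My problem is how to train a model with a column of 2D lists (I tried using the column directly, but that did not work). What approach would you suggest?

I have thought about extracting features from the 2D list to obtain 1D features (the integral, and so on).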
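The feature-extraction idea above can be sketched roughly as follows. This is only an illustration: the helper `curve_features`, the `woi_*` column names, and the toy data are hypothetical, and the choice of summary statistics (trapezoidal integral, maximum, mean, location of the maximum) is an assumption rather than something taken from the original data.

```python
import numpy as np
import pandas as pd


def curve_features(points):
    """Reduce one curve, given as [[x, y], ...], to a few scalar features.

    Hypothetical helper: the statistics chosen here are only examples.
    """
    arr = np.asarray(points, dtype=float)
    x, y = arr[:, 0], arr[:, 1]
    # Trapezoidal-rule integral of y over x, written out explicitly.
    area = float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))
    return {
        "woi_area": area,
        "woi_max": float(y.max()),
        "woi_mean": float(y.mean()),
        "woi_x_at_max": float(x[np.argmax(y)]),
    }


# Hypothetical toy data: each row holds its own list of points,
# and the lists may have different lengths.
df = pd.DataFrame({
    "DepthCrack": [0.4, 0.2],
    "WindowOfInterest": [
        [[0.0, 0.0], [1.0, 2.0], [2.0, 0.0]],
        [[0.0, 1.0], [2.0, 1.0]],
    ],
})

# One dict per row -> one scalar column per feature.
features = pd.DataFrame(
    df["WindowOfInterest"].map(curve_features).tolist(), index=df.index
)
df_features = df.drop(columns=["WindowOfInterest"]).join(features)
```

Because every curve is reduced to the same fixed set of scalars, this works even when the point lists differ in length from row to row.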

Answer:

You can convert this feature into two features. For example, WindowOfInterest can become:

WindowOfInterest_x1 and WindowOfInterest_x2

For example, starting from your DataFrame:

>>> import pandas as pd
>>> df = pd.DataFrame({'IsCrack': [1, 1, 1, 1, 1],
...                    'CheckTypeEncode': [0, 1, 0, 0, 0],
...                    'DepthCrack': [0.4, 0.2, 1.4, 0.7, 0.1],
...                    'WindowOfInterest': [[0.9561600000000001, 0.10913097635410397], [0.95621, 0.1100000], [0.459561, 0.635410397], [0.4495621, 0.32], [0.621, 0.2432]]},
...                   index=[0, 1, 2, 3, 4])
>>> df
   IsCrack  CheckTypeEncode  DepthCrack  WindowOfInterest
0  1        0                0.4         [0.9561600000000001, 0.10913097635410397]
1  1        1                0.2         [0.95621, 0.11]
2  1        0                1.4         [0.459561, 0.635410397]
3  1        0                0.7         [0.4495621, 0.32]
4  1        0                0.1         [0.621, 0.2432]

We can split the list like this:

>>> df[['WindowOfInterest_x1', 'WindowOfInterest_x2']] = pd.DataFrame(df['WindowOfInterest'].tolist(), index=df.index)
>>> df
   IsCrack  CheckTypeEncode  DepthCrack  WindowOfInterest                           WindowOfInterest_x1  WindowOfInterest_x2
0  1        0                0.4         [0.9561600000000001, 0.10913097635410397]  0.956160             0.109131
1  1        1                0.2         [0.95621, 0.11]                            0.956210             0.110000
2  1        0                1.4         [0.459561, 0.635410397]                    0.459561             0.635410
3  1        0                0.7         [0.4495621, 0.32]                          0.449562             0.320000
4  1        0                0.1         [0.621, 0.2432]                            0.621000             0.243200

Finally, we can drop the WindowOfInterest column:

>>> df = df.drop(['WindowOfInterest'], axis=1)
>>> df
   IsCrack  CheckTypeEncode  DepthCrack  WindowOfInterest_x1  WindowOfInterest_x2
0  1        0                0.4         0.956160             0.109131
1  1        1                0.2         0.956210             0.110000
2  1        0                1.4         0.459561             0.635410
3  1        0                0.7         0.449562             0.320000
4  1        0                0.1         0.621000             0.243200

You can now use WindowOfInterest_x1 and WindowOfInterest_x2 as features for the model.
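Note that the split above assumes each row holds exactly one [x1, x2] pair. If each row instead holds a whole curve (a list of many points, as the question describes), the lists may differ in length and cannot be expanded into columns directly. One possible workaround, sketched here under assumptions, is to resample every curve onto the same number of points first. The helper `resample_curve`, the grid size `N_SAMPLES`, the `WindowOfInterest_y*` column names, and the toy data are all hypothetical; the sketch also assumes the x values within a curve are distinct.

```python
import numpy as np
import pandas as pd

N_SAMPLES = 4  # hypothetical grid size; choose one suited to your curves


def resample_curve(points, n_samples=N_SAMPLES):
    """Linearly interpolate one [[x, y], ...] curve onto n_samples evenly spaced x values."""
    arr = np.asarray(points, dtype=float)
    order = np.argsort(arr[:, 0])  # np.interp expects increasing x
    x, y = arr[order, 0], arr[order, 1]
    # Assumption: each curve is resampled over its own x-range,
    # so only the shape of the curve is kept, not its x-scale.
    grid = np.linspace(x[0], x[-1], n_samples)
    return np.interp(grid, x, y)


# Hypothetical toy data: curves of different lengths.
df = pd.DataFrame({
    "DepthCrack": [0.4, 0.2],
    "WindowOfInterest": [
        [[0.0, 0.0], [1.0, 2.0], [2.0, 0.0]],
        [[0.0, 1.0], [3.0, 4.0]],
    ],
})

cols = [f"WindowOfInterest_y{i}" for i in range(N_SAMPLES)]
resampled = pd.DataFrame(
    [resample_curve(p) for p in df["WindowOfInterest"]],
    columns=cols,
    index=df.index,
)
df_wide = df.drop(columns=["WindowOfInterest"]).join(resampled)
```

After this step every row has the same N_SAMPLES numeric columns, which can be passed to a model in the same way as WindowOfInterest_x1 and WindowOfInterest_x2. If the absolute x positions matter for crack depth, a shared grid across all rows may be a better choice than a per-curve grid.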
