Isolation Forest – TypeError: invalid type promotion

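I am trying to apply Isolation Forest to data converted from an event log, but I get the error "TypeError: invalid type promotion". Is this caused by the datetime columns? I don't understand what I am doing wrong.

Part of my table (after processing):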

+--------------+----------------------+--------------+--------------------+--------------------+-------------------+-----------------+
| org:resource | lifecycle:transition | concept:name |   time:timestamp   |   case:REG_DATE    | case:concept:name | case:AMOUNT_REQ |
+--------------+----------------------+--------------+--------------------+--------------------+-------------------+-----------------+
|           52 |                    0 |            9 | 2011 10-01 38:44.5 | 2011 10-01 38:44.5 |                 0 |           20000 |
|           52 |                    0 |            6 | 2011 10-01 38:44.9 | 2011 10-01 38:44.5 |                 2 |           20000 |
|           52 |                    0 |            7 | 2011 10-01 39:37.9 | 2011 10-01 38:44.5 |                 0 |           20000 |
|           52 |                    1 |           19 | 2011 10-01 39:38.9 | 2011 10-01 38:44.5 |                 1 |           20000 |
|           68 |                    2 |           19 | 2011 10-01 36:46.4 | 2011 10-01 38:44.5 |                 3 |           20000 |
+--------------+----------------------+--------------+--------------------+--------------------+-------------------+-----------------+

Printing df.info() gives:

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 262200 entries, 0 to 262199
Data columns (total 7 columns):
 #   Column                Non-Null Count   Dtype
---  ------                --------------   -----
 0   org:resource          262200 non-null  int64
 1   lifecycle:transition  262200 non-null  int64
 2   concept:name          262200 non-null  int64
 3   time:timestamp        262200 non-null  datetime64[ns]
 4   case:REG_DATE         262200 non-null  datetime64[ns]
 5   case:concept:name     262200 non-null  int64
 6   case:AMOUNT_REQ       262200 non-null  int32
dtypes: datetime64[ns](2), int32(1), int64(4)
memory usage: 13.0 MB

My code is:

import pandas as pd
from sklearn.ensemble import IsolationForest

contamination = 0.05
model = IsolationForest(contamination=contamination, n_estimators=10000)
model.fit(df)

df["iforest"] = pd.Series(model.predict(df))
df["iforest"] = df["iforest"].map({1: 0, -1: 1})
df["score"] = model.decision_function(df)
df.sort_values("score")

However, I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-5edb86351ac8> in <module>
      4 
      5 model = IsolationForest(contamination=contamination, n_estimators=10000)
----> 6 model.fit(df)
      7 
      8 df["iforest"] = pd.Series(model.predict(df))

~\.conda\envs\process_mining\lib\site-packages\sklearn\ensemble\_iforest.py in fit(self, X, y, sample_weight)
    261                 )
    262 
--> 263         X = check_array(X, accept_sparse=['csc'])
    264         if issparse(X):
    265             # Pre-sort indices to avoid that each individual tree of the

~\.conda\envs\process_mining\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74 

~\.conda\envs\process_mining\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    531 
    532         if all(isinstance(dtype, np.dtype) for dtype in dtypes_orig):
--> 533             dtype_orig = np.result_type(*dtypes_orig)
    534 
    535     if dtype_numeric:

<__array_function__ internals> in result_type(*args, **kwargs)

TypeError: invalid type promotion

Answer:

I found the solution through this answer: Python – linear regression TypeError: invalid type promotion

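The error does come from the datetime columns. As the traceback shows, check_array calls np.result_type on the column dtypes, and NumPy cannot promote datetime64[ns] together with the integer dtypes. Converting the timestamps to ordinals makes every column numeric, and the fit then works. I used the following code for the conversion: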

import datetime as dt

# toordinal() keeps only the calendar date; the time of day is discarded.
df['time:timestamp'] = df['time:timestamp'].map(dt.datetime.toordinal)
df['case:REG_DATE'] = df['case:REG_DATE'].map(dt.datetime.toordinal)
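
Because toordinal() reduces each timestamp to a day number, events that happen on the same day become indistinguishable in those columns. If the time of day matters, one possible alternative is to convert the datetime columns to seconds since the Unix epoch instead. The following is only a sketch under assumptions: the small DataFrame is made-up illustration data (not the original event log), and the helper names promotion_fails and datetimes_to_epoch_seconds are hypothetical, not part of pandas or scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Made-up miniature of the event log, for illustration only.
df = pd.DataFrame({
    "org:resource": [52, 52, 52, 52, 68, 68],
    "time:timestamp": pd.to_datetime([
        "2011-10-01 00:38:44.5", "2011-10-01 00:38:44.9",
        "2011-10-01 00:39:37.9", "2011-10-01 00:39:38.9",
        "2011-10-01 00:36:46.4", "2011-10-02 11:15:00.0",
    ]),
    "case:AMOUNT_REQ": np.array([20000] * 5 + [5000], dtype="int32"),
})


def promotion_fails(frame):
    """Return True if NumPy finds no common dtype for the frame's columns."""
    try:
        np.result_type(*frame.dtypes)
    except TypeError:
        return True
    return False


def datetimes_to_epoch_seconds(frame):
    """Return a copy with datetime columns replaced by float epoch seconds."""
    out = frame.copy()
    for col in out.select_dtypes(include=["datetime"]).columns:
        out[col] = (out[col] - pd.Timestamp("1970-01-01")) / pd.Timedelta(seconds=1)
    return out


# Mixed int/datetime columns reproduce the promotion failure from the traceback.
print("before conversion, promotion fails:", promotion_fails(df))

numeric = datetimes_to_epoch_seconds(df)
print("after conversion, promotion fails:", promotion_fails(numeric))

# With only numeric columns, fitting no longer hits the dtype check error.
model = IsolationForest(contamination=0.05, n_estimators=100, random_state=0)
model.fit(numeric)
labels = model.predict(numeric)          # 1 = inlier, -1 = outlier
scores = model.decision_function(numeric)
```

Scoring on a separate numeric frame, rather than on a df that has already had result columns added to it, also keeps the feature set passed to predict and decision_function identical to the one used in fit.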
