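I have a code sample. The code runs fine, but my problem is that it is not concise and takes up too many lines. I believe the amount of code could be reduced by using a method or a for loop, but I don't know how to do that. The snippets are 90% identical and differ only in the variable names. I'm showing only two snippets here, but there are five of them in my code.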
#KFOLD-1
all_fold_X_1 = pd.DataFrame(columns=['Sentence_txt'])
index = 0
for k, i in enumerate(dfNew['Sentence_txt'].values):
    if k in kFoldsTrain1:
        all_fold_X_1 = all_fold_X_1.append({index:i}, ignore_index=True)
X_train1 = count_vect.fit_transform(all_fold_X_1[0].values)
Y_train1 = [i for k,i in enumerate(dfNew['Sentence_Polarity'].values) if k in kFoldsTrain1]
Y_train1 = np.asarray(Y_train1)

#KFOLD-2
all_fold_X_2 = pd.DataFrame(columns=['Sentence_txt'])
index = 0
for k, i in enumerate(dfNew['Sentence_txt'].values):
    if k in kFoldsTrain2:
        all_fold_X_2 = all_fold_X_2.append({index:i}, ignore_index=True)
X_train2 = count_vect.fit_transform(all_fold_X_2[0].values)
Y_train2 = [i for k,i in enumerate(dfNew['Sentence_Polarity'].values) if k in kFoldsTrain2]
Y_train2 = np.asarray(Y_train2)
Answer:
Since no complete example was provided, I have made some assumptions. It could look something like this:
def train(dataVar, dfNew, kFoldsTrain):
    # kFoldsTrain holds the row positions that belong to this fold
    ret = {}
    index = 0
    for k, i in enumerate(dfNew['Sentence_txt'].values):
        if k in kFoldsTrain:
            # DataFrame.append was removed in pandas 2.0, so this needs an older pandas
            dataVar = dataVar.append({index: i}, ignore_index=True)
    ret['x'] = count_vect.fit_transform(dataVar[0].values)
    ret['y'] = [i for k, i in enumerate(dfNew['Sentence_Polarity'].values) if k in kFoldsTrain]
    ret['y'] = np.asarray(ret['y'])
    return ret

#KFOLD-1
kfold1 = train(pd.DataFrame(columns=['Sentence_txt']), dfNew, kFoldsTrain1)
#KFOLD-2
kfold2 = train(pd.DataFrame(columns=['Sentence_txt']), dfNew, kFoldsTrain2)
You probably get the idea. If the variable 'dfNew' is global, you may not need the second parameter in the function. I'm no Python expert either! ;)
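For what it's worth, the same idea can probably be pushed a bit further so that all five folds go through one loop. The following is only a sketch under a few assumptions: the names `build_fold` and `build_all_folds` are made up for illustration, the fold variables are assumed to hold row positions (as the `if k in kFoldsTrain1` test suggests), and the vectorizer is passed in as an argument instead of being read from a global. It selects rows with `iloc` instead of `DataFrame.append`, which is no longer available in pandas 2.0.

```python
import numpy as np
import pandas as pd


def build_fold(df, fold_positions, vectorizer):
    """Return (X, y) for the rows of df whose positions are in fold_positions."""
    # np.unique sorts and de-duplicates the positions, so the selected rows
    # keep their original DataFrame order, like the enumerate/if loop does.
    positions = np.unique(np.asarray(list(fold_positions), dtype=int))
    texts = df['Sentence_txt'].iloc[positions].values
    labels = df['Sentence_Polarity'].iloc[positions].values
    # As in the question, the vectorizer is re-fitted on every fold.
    X = vectorizer.fit_transform(texts)
    y = np.asarray(labels)
    return X, y


def build_all_folds(df, folds, vectorizer):
    """Run build_fold once per entry in folds; returns a list of (X, y) pairs."""
    return [build_fold(df, fold, vectorizer) for fold in folds]
```

With the question's variables, a call might look like `results = build_all_folds(dfNew, [kFoldsTrain1, kFoldsTrain2], count_vect)`, with the remaining fold variables added to the list; `results[0]` would then play the role of `(X_train1, Y_train1)`.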