How to apply a function (BigramCollocationFinder) to a Pandas DataFrame

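I'm not very experienced with programming and need some help with a problem. I have a .csv file with 4 columns and about 5,000 rows, filled with questions and answers. I want to find word collocations in each cell.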

Starting point: a Pandas DataFrame with 4 columns and about 5,000 rows (Id, Title, Body, Body2).

Goal: a DataFrame with 7 columns (Id, Title, Title-Collocations, Body, Body-Collocations, Body2, Body2-Collocations), with a function applied to every row.

I found an example of bigram collocations in the NLTK documentation.

import nltk
from nltk.collocations import BigramCollocationFinder

bigram_measures = nltk.collocations.BigramAssocMeasures()
# Build the finder first, then drop bigrams that occur fewer than 3 times
finder = BigramCollocationFinder.from_words(nltk.corpus.genesis.words('english-web.txt'))
finder.apply_freq_filter(3)
print(finder.nbest(bigram_measures.pmi, 5))
>>> [('Beer', 'Lahai'), ('Lahai', 'Roi'), ('gray', 'hairs'), ('Most', 'High'), ('ewe', 'lambs')]

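I want to adapt this function to my Pandas DataFrame. I know about the DataFrame apply function, but I can't get it to work.

This is what I tried for one of the columns: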

df['Body-Collocation'] = df.apply(lambda df: BigramCollocationFinder.from_words(df['Body']),axis=1)

But when I print the result for a sample row, I get

print(df['Body-Collocation'][1])
>>> <nltk.collocations.BigramCollocationFinder object at 0x113c47ef0>

I'm not even sure this is the right approach. Can someone point me in the right direction?

Answer:

If you want to apply BigramCollocationFinder.from_words() to each value in the Body column, you need to do this:

df['Body-Collocation'] = df.Body.apply(lambda x: BigramCollocationFinder.from_words(x))

Essentially, apply lets you loop over the rows and passes the corresponding value of the Body column to the function you apply.
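To make the difference between the two apply calls concrete, here is a small sketch. The two-row DataFrame and the word-counting lambda are made up for illustration and are not the asker's data; the point is only what each form of apply passes to the function.

```python
import pandas as pd

# Hypothetical sample data standing in for the real .csv
df = pd.DataFrame({
    "Id": [1, 2],
    "Body": ["how do I sort a list", "use the sorted function"],
})

# Series.apply: the function receives one cell value (a string) at a time
lengths_a = df["Body"].apply(lambda text: len(text.split()))

# DataFrame.apply with axis=1: the function receives a whole row,
# so the column has to be picked out inside the function
lengths_b = df.apply(lambda row: len(row["Body"].split()), axis=1)

print(lengths_a.tolist())
print(lengths_b.tolist())
```

Both forms give the same values here. Applying on the single column is simply the shorter way to write it when the function only needs one column.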

But as suggested in the comments, providing a sample of your data would make it easier to address your specific case.
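Without a data sample, the following is only a sketch of how the whole task might look. It rests on two assumptions about the asker's data: the cells hold raw strings, and a simple regex tokenizer is good enough. from_words() expects a sequence of words, so a raw string would be read character by character unless it is tokenized first, and the finder object has to be asked for its results (here with nbest()) to get the bigrams themselves. The helper name top_collocations, its parameters, and the sample rows are hypothetical.

```python
import pandas as pd
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder
from nltk.tokenize import wordpunct_tokenize

bigram_measures = BigramAssocMeasures()


def top_collocations(text, n=5, min_freq=1):
    """Return up to n bigrams from one cell, ranked by PMI (hypothetical helper)."""
    if not isinstance(text, str):
        # Empty cells are read by pandas as NaN (a float), so skip them
        return []
    # Assumption: lowercasing plus regex tokenization suits the data
    tokens = wordpunct_tokenize(text.lower())
    finder = BigramCollocationFinder.from_words(tokens)
    # Drop bigrams that occur fewer than min_freq times in this cell
    finder.apply_freq_filter(min_freq)
    return finder.nbest(bigram_measures.pmi, n)


# Hypothetical sample rows with the column names from the question
df = pd.DataFrame({
    "Id": [1, 2],
    "Title": ["sorting a list", "reading a csv file"],
    "Body": ["new york is big and new york is old", None],
    "Body2": ["use the sorted function", "use read csv from pandas"],
})

# One new column of collocations per text column
for col in ["Title", "Body", "Body2"]:
    df[col + "-Collocations"] = df[col].apply(top_collocations)

print(df["Body-Collocations"].iloc[0])
```

A single cell is a very small corpus, so a frequency filter of 3 as in the NLTK example may leave nothing, which is why min_freq defaults to 1 in this sketch. If collocations across the whole column matter more than per-cell ones, building one finder from all rows together might be the better fit.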
