Here is my code:
from fuzzywuzzy import fuzz

check = open("text.txt", "a")
MIN_MATCH_SCORE = 30
heard_word = 'i5-1135G7 '
possible_words = check
guessed_word = [word for word in possible_words if fuzz.ratio(heard_word, word) >= MIN_MATCH_SCORE]
print('this one - ', guessed_word)
Expected output:
11th Generation Intel® Core™ i5-1135G7 Processor
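Is it possible to get the whole sentence in the expected output by supplying only 'i5-1135G7 '? Is there another solution that would give me the result I want? Thanks in advance.
Here is the link to text.txt: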
https://drive.google.com/file/d/1Mo3qFmeOAqa3WPPyg8SpeFVSjDx7AQBj/view
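Answer:
To handle longer sentences and require overlap at the word level, use token_set_ratio instead of ratio. If you want the words to match exactly, raise MIN_MATCH_SCORE to a value close to 100.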
from fuzzywuzzy import fuzz

MIN_MATCH_SCORE = 90
heard_word = 'i5-1135G7'
possible_words = [
    '11th Generation Intel® Core™ i5-1135G7 Processor (2.40 GHz,up to 4.20 GHz with Turbo Boost, 4 Cores, 8 Threads, 8 MB Cache)',
    'windows 10 64 bit',
    'intel i7',
]
print([word for word in possible_words if fuzz.token_set_ratio(heard_word, word) >= MIN_MATCH_SCORE])
Output:
['11th Generation Intel® Core™ i5-1135G7 Processor (2.40 GHz,up to 4.20 GHz with Turbo Boost, 4 Cores, 8 Threads, 8 MB Cache)']
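To see why a token-set comparison lets a short query match a long line, here is a rough sketch of the idea using only the standard library. It is not fuzzywuzzy's actual implementation: the function names (tokens, similarity, rough_token_set_score) are made up for illustration, and difflib.SequenceMatcher stands in for the library's string scorer, so the numbers will not match fuzzywuzzy exactly. The point is that when every token of the query also appears in the candidate, the shared part equals the query and the score is 100 no matter how long the candidate is.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    # Lowercase and split on anything that is not a letter or digit,
    # so "i5-1135G7" becomes {"i5", "1135g7"}.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    # Character-level similarity on a 0-100 scale (stand-in scorer).
    return round(100 * SequenceMatcher(None, a, b).ratio())


def rough_token_set_score(query, candidate):
    # Hypothetical helper: approximates the token-set idea only.
    q, c = tokens(query), tokens(candidate)
    if not q or not c:
        return 0
    shared = " ".join(sorted(q & c))
    only_q = " ".join(sorted(q - c))
    only_c = " ".join(sorted(c - q))
    with_q = (shared + " " + only_q).strip()
    with_c = (shared + " " + only_c).strip()
    # Take the best of the three pairwise comparisons. If the query is
    # fully contained in the candidate, shared == with_q and this is 100.
    return max(
        similarity(shared, with_q),
        similarity(shared, with_c),
        similarity(with_q, with_c),
    )


MIN_MATCH_SCORE = 90
heard_word = 'i5-1135G7'
possible_words = [
    '11th Generation Intel® Core™ i5-1135G7 Processor',
    'windows 10 64 bit',
    'intel i7',
]
matches = [w for w in possible_words
           if rough_token_set_score(heard_word, w) >= MIN_MATCH_SCORE]
print(matches)
```

With these sample strings only the first candidate passes the threshold, because both "i5" and "1135g7" appear in it as whole tokens. For real use, fuzz.token_set_ratio from the answer above is the better choice.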