I came across the FastText class while reading an article, which I found here:
https://stackabuse.com/python-for-nlp-working-with-facebook-fasttext-library/#disqus_thread
The author uses an object, "word_tokenized_corpus", that is never defined.
ft_model = FastText(word_tokenized_corpus, size=embedding_size, window=window_size, min_count=min_word, sample=down_sampling, sg=1, iter=100)
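Unsurprisingly, this raises an error. How do I initialize this class correctly?
Answer:
The author has since updated the article and added the following code: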
final_corpus = [preprocess_text(sentence) for sentence in artificial_intelligence if sentence.strip() != ""]
word_punctuation_tokenizer = nltk.WordPunctTokenizer()
word_tokenized_corpus = [word_punctuation_tokenizer.tokenize(sent) for sent in final_corpus]
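In other words, FastText expects a list of sentences, each of which is itself a list of token strings. Here is a minimal, self-contained sketch of that shape. The sample sentences, the simplified preprocess_text, and the hyperparameter values are my own stand-ins, not the article's originals. The tokenize helper uses the same regular expression as nltk.WordPunctTokenizer, so the sketch runs without NLTK. Also note that gensim 4.0 renamed size to vector_size and iter to epochs, so the call from the question fails on recent versions even once the corpus is defined.

```python
import re

# Hypothetical stand-in for the article's `artificial_intelligence` sentence list.
sentences = [
    "Artificial intelligence is intelligence demonstrated by machines.",
    "",
    "Machine learning is a subset of artificial intelligence.",
]


def preprocess_text(sentence):
    """Simplified stand-in for the article's preprocess_text (an assumption)."""
    sentence = sentence.lower()
    # Replace every run of non-letter characters with a single space.
    sentence = re.sub(r"[^a-z]+", " ", sentence)
    return sentence.strip()


def tokenize(sentence):
    """Split into word runs and punctuation runs, using the same pattern
    as nltk.WordPunctTokenizer."""
    return re.findall(r"\w+|[^\w\s]+", sentence)


# Skip blank sentences, clean the rest, then tokenize each one.
final_corpus = [preprocess_text(s) for s in sentences if s.strip() != ""]
word_tokenized_corpus = [tokenize(s) for s in final_corpus]


def train_fasttext(corpus):
    """Train a skip-gram FastText model; requires gensim >= 4.0.

    The hyperparameter values are illustrative only. min_count=1 keeps
    every word, which a corpus this small needs to end up with a vocabulary.
    """
    from gensim.models import FastText

    return FastText(
        corpus,
        vector_size=60,  # was `size` in gensim 3.x
        window=5,
        min_count=1,
        sample=1e-2,
        sg=1,
        epochs=10,       # was `iter` in gensim 3.x
    )


if __name__ == "__main__":
    print(word_tokenized_corpus)
    # Uncomment if gensim is installed:
    # ft_model = train_fasttext(word_tokenized_corpus)
```

If you are still on gensim 3.x, keep size= and iter= as in the question; the corpus construction is the same either way.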