fchollet / deep-learning-with-python-notebooks

Jupyter notebooks for the code samples of the book "Deep Learning with Python"

Ch11 Understanding TF-IDF normalization

intelligencethink opened this issue

The explanation of TF-IDF on page 326 is shown below.

def tfidf(term, document, dataset):
    term_freq = document.count(term)
    doc_freq = math.log(sum(doc.count(term) for doc in dataset) + 1)
    return term_freq / doc_freq

Is this correct? In this formula, the total number of documents in the dataset does not appear anywhere in doc_freq.
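
To make the question concrete, here is a minimal sketch that puts the function quoted above next to the variant I had in mind, where the inverse document frequency is log(N / df), with N the number of documents and df the number of documents containing the term. The names tfidf_book and tfidf_textbook, the token-list representation of documents, and the toy dataset are my own assumptions for illustration; they are not from the book.

```python
import math


def tfidf_book(term, document, dataset):
    # The function as quoted from the book: the term count in the document
    # is divided by the log of (total occurrences across the dataset + 1).
    term_freq = document.count(term)
    doc_freq = math.log(sum(doc.count(term) for doc in dataset) + 1)
    return term_freq / doc_freq


def tfidf_textbook(term, document, dataset):
    # A common textbook formulation (assumed here for comparison only):
    # tf * log(N / df), which uses the total number of documents N.
    term_freq = document.count(term)
    num_docs = len(dataset)
    docs_with_term = sum(1 for doc in dataset if term in doc)
    if docs_with_term == 0:
        return 0.0
    return term_freq * math.log(num_docs / docs_with_term)


# Toy dataset (hypothetical): each document is a list of tokens.
dataset = [
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["the", "cat", "and", "the", "dog"],
]

for term, doc in [("the", dataset[2]), ("barked", dataset[1])]:
    print(term, tfidf_book(term, doc, dataset), tfidf_textbook(term, doc, dataset))
```

In this toy example both versions score the rare term "barked" above the common term "the", but only the second version uses the number of documents, and it gives "the" a score of exactly zero because it appears in every document.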