miso-belica / sumy

Module for automatic summarization of text documents and HTML pages.

Home Page: https://miso-belica.github.io/sumy/

Console being spammed when using library.

Meatfucker opened this issue · comments

I'm using this library in my own project, and it is spamming output to the console that I do not want. I did a bit of digging, and it seems to be coming from this loop in

for name, evaluate_document, evaluate in AVAILABLE_EVALUATIONS:
    if evaluate_document:
        result = evaluate(evaluated_sentences, document.sentences)
    else:
        result = evaluate(evaluated_sentences, reference_sentences)
    print("%s: %f" % (name, result))

I'm pretty sure it's that loop and its print call doing it. If this could be removed or made configurable, that would be great, thanks. I tested locally, and commenting out that loop solves it for me.

Hello, this file is a CLI script, and its purpose is to print results to the console. If you use sumy as a library, I recommend not using anything from the __main__.py files. You can always call just the functions and control the output yourself.
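
One way to follow this advice is to run the same kind of loop yourself and collect the scores instead of printing them. This is a minimal sketch, assuming the evaluation table has the same (name, evaluate_document, evaluate) shape as the loop quoted above. The names `collect_evaluations`, `overlap`, and `EVALUATIONS` are hypothetical and are not part of sumy.

```python
def collect_evaluations(evaluations, evaluated_sentences,
                        document_sentences, reference_sentences):
    """Run each evaluation and return the scores instead of printing them."""
    results = {}
    for name, evaluate_document, evaluate in evaluations:
        # Some metrics compare against the whole document, others against
        # a reference summary, mirroring the branch in the quoted loop.
        target = document_sentences if evaluate_document else reference_sentences
        results[name] = evaluate(evaluated_sentences, target)
    return results


def overlap(evaluated, reference):
    """Toy metric (hypothetical): fraction of evaluated items found in reference."""
    if not evaluated:
        return 0.0
    return sum(1 for s in evaluated if s in reference) / len(evaluated)


# Hypothetical stand-in for the AVAILABLE_EVALUATIONS table.
EVALUATIONS = (
    ("Overlap with document", True, overlap),
    ("Overlap with reference", False, overlap),
)

scores = collect_evaluations(
    EVALUATIONS,
    evaluated_sentences=["a", "b"],
    document_sentences=["a", "b", "c"],
    reference_sentences=["a"],
)
```

The caller then decides what to do with `scores`: log it, return it, or ignore it. Nothing reaches the console unless the caller prints it.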

I am getting these results printed to the console while using it as a library with the example code. This is the code I'm using.

from sumy.parsers.html import HtmlParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lex_rank import LexRankSummarizer as Summarizer
from sumy.nlp.stemmers import Stemmer
from sumy.utils import get_stop_words

parser = HtmlParser.from_url(url, Tokenizer("english"))
stemmer = Stemmer("english")
summarizer = Summarizer(stemmer)
summarizer.stop_words = get_stop_words("english")  # sumy summarizer setup stuff
compileddescription = ""
for sentence in summarizer(parser.document, 4):
    compileddescription = f' {compileddescription} {sentence}'
    sitedescription = f'The URL is a website about the following:{compileddescription}'

That's really weird. What is the name of the file your code is written in and how do you run it? Maybe that's the problem. The above code should never produce any console output.

The project I'm using it in can be found at https://github.com/Meatfucker/metatron, in metatron.py, in the function extract_text_from_url.

@Meatfucker I am really sorry, but I have no idea why this happens. If you debug it and find the cause, I am all ears. The code in https://github.com/Meatfucker/metatron/blob/215026e88671c84b7094da2c10497d7d5e96b186/metatron.py#L230-L238 should not print anything to the console or stderr.
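
For anyone trying to track down where unexpected console output like this comes from, one possible approach is to temporarily wrap `sys.stdout` and record the call stack of every write. The sketch below uses only the standard library. `TracingStream` and `noisy` are hypothetical names for illustration, and `noisy` stands in for whatever call is suspected of printing.

```python
import io
import sys
import traceback


class TracingStream:
    """Wrap a text stream and record the call stack of every non-empty write."""

    def __init__(self, wrapped):
        self.wrapped = wrapped
        self.stacks = []

    def write(self, text):
        if text.strip():
            # Drop the last frame (this method) so the stack ends at the caller.
            self.stacks.append(traceback.format_stack()[:-1])
        return self.wrapped.write(text)

    def flush(self):
        self.wrapped.flush()

    def __getattr__(self, name):
        # Delegate everything else (encoding, isatty, ...) to the real stream.
        return getattr(self.wrapped, name)


def noisy():
    # Hypothetical stand-in for the code suspected of printing.
    print("score: 0.5")


original = sys.stdout
tracer = TracingStream(io.StringIO())
sys.stdout = tracer
try:
    noisy()
finally:
    sys.stdout = original  # always restore the real stream

# The innermost recorded frame names the function that triggered the write.
culprit = tracer.stacks[0][-1]
```

In a real project, the summarizer call would replace `noisy()`, and printing `"".join(tracer.stacks[0])` after restoring `sys.stdout` would show the full path to the offending `print`. This only catches writes that go through Python's `sys.stdout`. Output written directly to the file descriptor by native code would not be recorded.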