huggingface / evaluate

🤗 Evaluate: A library for easily evaluating machine learning models and datasets.

Home Page: https://huggingface.co/docs/evaluate

Loading metrics that are shipped with the evaluate package is slow

harmbuisman opened this issue

Loading metrics that are shipped with the evaluate package takes far too long: a second or more, whereas I would expect it to be near instant.

Repro:
Run the following in a Jupyter notebook (the %%prun cell magic has to be the first line of its own cell):
import evaluate

%%prun
evaluate.load("accuracy")

This outputs the following. Almost all of the time is spent in SSL and socket calls, which suggests that even for a metric that is available in the package itself, evaluate.load sets up communication with the HF Hub:

         22184 function calls (21809 primitive calls) in 1.351 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        2    0.643    0.322    0.643    0.322 {method 'load_verify_locations' of '_ssl._SSLContext' objects}
        2    0.240    0.120    0.240    0.120 {method 'read' of '_ssl._SSLSocket' objects}
        2    0.223    0.112    0.223    0.112 {method 'do_handshake' of '_ssl._SSLSocket' objects}
        2    0.145    0.073    0.145    0.073 {method 'connect' of '_socket.socket' objects}
        2    0.038    0.019    0.038    0.019 {built-in method _socket.getaddrinfo}
       80    0.009    0.000    0.009    0.000 {built-in method nt.stat}
       45    0.006    0.000    0.006    0.000 {built-in method __new__ of type object at 0x00007FF81939AD50}
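For anyone who wants to reproduce this kind of profile outside a notebook, here is a minimal sketch using the standard library's cProfile and pstats. The helper name profile_top is made up for illustration and is not part of evaluate. Sorting by "tottime" should give the same "internal time" ordering that %%prun uses by default.

```python
import cProfile
import io
import pstats


def profile_top(fn, limit=10):
    """Profile a single call to fn() and return the top `limit` rows by internal time.

    `profile_top` is a hypothetical helper name, not an evaluate API.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        fn()
    finally:
        profiler.disable()

    # Collect the report in a string instead of printing it to stdout.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("tottime").print_stats(limit)
    return buffer.getvalue()


# Intended use (requires the `evaluate` package; may touch the network):
#   import evaluate
#   print(profile_top(lambda: evaluate.load("accuracy")))
```

If SSL and socket methods dominate the top rows, as in the output above, the time is going into network round trips rather than into importing or building the metric.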

Yes, the metrics are loaded from the Hub, which is why you see loading take 1-2 seconds. On subsequent loads they should be cached.

evaluate.load("accuracy") loads the sklearn wrapper that is shipped with the package, so it should not need to go to the Hub. See its location within the package: https://github.com/huggingface/evaluate/blob/main/metrics/accuracy/accuracy.py

Every call to evaluate.load takes 1-2 seconds, so there is no speed-up on follow-up calls.
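A quick way to check whether follow-up calls get any faster is to time several calls in a row. The sketch below is only an illustration: time_repeated_calls is a made-up helper name, and the evaluate.load usage is left in comments because it needs the package installed and may hit the network.

```python
import time


def time_repeated_calls(fn, repeats=3):
    """Call fn() `repeats` times and return each call's wall-clock duration in seconds.

    `time_repeated_calls` is a hypothetical helper name, not an evaluate API.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


# Intended use (requires the `evaluate` package; may touch the network):
#   import evaluate
#   timings = time_repeated_calls(lambda: evaluate.load("accuracy"))
#   for i, seconds in enumerate(timings, start=1):
#       print(f"call {i}: {seconds:.3f} s")
```

If caching removed the network round trip, the second and third timings would be expected to drop well below the first. Roughly equal timings would match the behaviour reported here.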