stephenhillier / starlette_exporter

Prometheus exporter for Starlette and FastAPI

Duplicated timeseries in Collector Registry

scotgopal opened this issue

Hi. I'm trying to use this package to collect metrics for a model that I am serving with FastAPI. I was able to reproduce the default metrics, but when I tried to add a custom metric I ran into this error:

ValueError: Duplicated timeseries in CollectorRegistry: {'predict_created', 'predict_total'}

I've scoured the net for a solution, but nothing I found solved my problem. Please help. The following is my source code.

from starlette_exporter import PrometheusMiddleware, handle_metrics
from prometheus_client import Counter

TOTAL_PREDICT_REQUEST = Counter("predict", "Count of predicts", ("from",))

app = FastAPI()
image_classifier = ImageClassifier()

"""Add instrumentation"""
app.add_middleware(
    PrometheusMiddleware,
    app_name = "fastapi",
    prefix = "fastapi",
    filter_unhandled_paths = True
    )
app.add_route("/metrics", handle_metrics)

@app.get("/")
def home():
    return "Hello!"


@app.post("/predict", response_model=ResponseDataModel)
async def predict(file: UploadFile = File(...)):

    TOTAL_PREDICT_REQUEST.labels(endpoint="/predict").inc()

    if file.content_type.startswith("image/") is False:
        raise HTTPException(
            status_code=400, detail=f"File '{file.filename}' is not an image."
        )

    try:
        contents = await file.read()
        image = Image.open(io.BytesIO(contents)).convert("RGB")

        predicted_class = image_classifier.predict(image)

        logging.info(f"Predicted Class: {predicted_class}")

        return {
            "filename": file.filename,
            "content_type": file.content_type,
            "likely_class": predicted_class,
        }

    except Exception as error:
        logging.exception(error)
        e = sys.exc_info()[1]
        raise HTTPException(status_code=500, detail=str(e))


if __name__ == "__main__":
    uvicorn.run("app.main:app", host="127.0.0.1", port=8000, log_level="info")

Full traceback:

Traceback (most recent call last):
  File "d:/CertifAI/deployment-course-labs/day_4/model_monitoring/app/main.py", line 71, in <module>
    uvicorn.run("app.main:app", host="127.0.0.1", port=8000, log_level="info")
  File "C:\Users\USER\miniconda3\envs\day4-demo\lib\site-packages\uvicorn\main.py", line 386, in run
    server.run()
  File "C:\Users\USER\miniconda3\envs\day4-demo\lib\site-packages\uvicorn\server.py", line 49, in run
    loop.run_until_complete(self.serve(sockets=sockets))
  File "C:\Users\USER\miniconda3\envs\day4-demo\lib\asyncio\base_events.py", line 616, in run_until_complete
    return future.result()
  File "C:\Users\USER\miniconda3\envs\day4-demo\lib\site-packages\uvicorn\server.py", line 56, in serve
    config.load()
  File "C:\Users\USER\miniconda3\envs\day4-demo\lib\site-packages\uvicorn\config.py", line 308, in load
    self.loaded_app = import_from_string(self.app)
  File "C:\Users\USER\miniconda3\envs\day4-demo\lib\site-packages\uvicorn\importer.py", line 20, in import_from_string      
    module = importlib.import_module(module_str)
  File "C:\Users\USER\miniconda3\envs\day4-demo\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "d:\certifai\deployment-course-labs\day_4\model_monitoring\app\main.py", line 19, in <module>
    TOTAL_PREDICT_REQUEST = Counter("predict", "Count of predicts", ("from",))
  File "C:\Users\USER\miniconda3\envs\day4-demo\lib\site-packages\prometheus_client\metrics.py", line 107, in __init__      
    registry.register(self)
  File "C:\Users\USER\miniconda3\envs\day4-demo\lib\site-packages\prometheus_client\registry.py", line 27, in register      
    raise ValueError(
ValueError: Duplicated timeseries in CollectorRegistry: {'predict_created', 'predict_total'}

Hi @scotgopal, thanks for reporting an issue. It looks like TOTAL_PREDICT_REQUEST is being created twice when your app starts, so the second Counter tries to register a metric name that is already in the registry.
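
For readers who want to see the error in isolation, here is a minimal sketch, assuming prometheus_client is installed. It uses a private CollectorRegistry (not part of the original code) so that it does not touch the global default registry; the metric name and label are copied from the issue.

```python
from prometheus_client import CollectorRegistry, Counter

# A private registry (an assumption for this sketch), so nothing is added
# to the process-wide default registry.
registry = CollectorRegistry()

# First creation registers the metric name "predict".
Counter("predict", "Count of predicts", ["from"], registry=registry)

try:
    # Same metric name in the same registry: registration is refused.
    Counter("predict", "Count of predicts", ["from"], registry=registry)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)

print(duplicate_error)
```

The second construction should raise the same kind of ValueError as in the traceback, which is what happens when module-level metric definitions are executed more than once in one process.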

I'll have to test and work on this over the weekend, but as a short-term fix, here are some things you can try:

  • change uvicorn.run("app.main:app", ...) to just uvicorn.run(app, ...). I'm not familiar with the "app.main:app" syntax, but I'm wondering if it's causing your main.py to be loaded twice
  • move the metric to another module/file and import it, e.g. from metrics import TOTAL_PREDICT_REQUEST

http://python-notes.curiousefficiency.org/en/latest/python_concepts/import_traps.html#executing-the-main-module-twice
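
The "main module executed twice" trap from that link can be reproduced with the standard library alone. The sketch below is only an illustration: toy_registry and toy_main are hypothetical stand-ins (a tiny registry that rejects duplicate names, and a script that registers one name at import time), not prometheus_client or the code from this issue. The script is first run as __main__ and then imported by its module name, which executes its top-level code a second time.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a process-wide metrics registry.
REGISTRY_SRC = """
_names = set()

def register(name):
    if name in _names:
        raise ValueError(f"Duplicated timeseries in registry: {name}")
    _names.add(name)
"""

# Hypothetical stand-in for main.py: registers a metric at import time.
MAIN_SRC = """
import toy_registry
toy_registry.register("predict")
"""


def run_twice():
    """Run toy_main as __main__, then import it by name; return the error text."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "toy_registry.py").write_text(REGISTRY_SRC)
        main_path = Path(tmp, "toy_main.py")
        main_path.write_text(MAIN_SRC)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            # First execution: like `python main.py`, the module name is __main__.
            runpy.run_path(str(main_path), run_name="__main__")
            try:
                # Second execution: "toy_main" is not in sys.modules yet,
                # so the file's top-level code runs again.
                importlib.import_module("toy_main")
            except ValueError as exc:
                return str(exc)
            return None
        finally:
            # Clean up so the sketch can be run repeatedly.
            sys.path.remove(tmp)
            sys.modules.pop("toy_registry", None)
            sys.modules.pop("toy_main", None)
```

Calling run_twice() returns the duplicate-registration message, because toy_registry is imported once and shared while toy_main runs twice under two different module names.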

Thanks for the reply, @stephenhillier. uvicorn is called that way because the FastAPI script is in a subfolder, app, under the working directory. I'm not sure whether it will work with uvicorn.run(app, ...).

I'll try the suggestions and see if either of them solves the issue. Will report back soon. 👍

Hi there, @stephenhillier. Popping in to report that the code works now using the second method you suggested, with some additional modifications.

I moved the metric into another file, monitoring.py.

from prometheus_client import Counter
TOTAL_PREDICT_REQUEST = Counter("predict", "Count of predicts", ["from"])

Then I imported it into my main script.

from monitoring import TOTAL_PREDICT_REQUEST
...
@app.post("/predict", response_model=ResponseDataModel)
async def predict(file: UploadFile = File(...)):

    TOTAL_PREDICT_REQUEST.labels("/predict").inc()

I was able to start my server without the previous ValueError: Duplicated timeseries in CollectorRegistry: {'predict_created', 'predict_total'}. And I verified that the Counter is doing its job by checking the host:port/metrics endpoint.
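
The counter can also be checked without starting the server. The following is a minimal sketch, assuming prometheus_client is installed; the private CollectorRegistry and the use of generate_latest are choices made for this illustration, not part of the code above. It also shows why the label value is passed positionally: the label is named from, which is a Python keyword and cannot be written as a keyword argument, and a keyword that is not a declared label name is rejected.

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest

# A private registry (an assumption for this sketch) keeps the global
# default registry untouched.
registry = CollectorRegistry()
counter = Counter("predict", "Count of predicts", ["from"], registry=registry)

# "from" is a Python keyword, so labels(from=...) is a syntax error;
# pass the value positionally (or via **{"from": "/predict"}).
counter.labels("/predict").inc()

# A keyword that does not match the declared label names is refused.
try:
    counter.labels(endpoint="/predict")
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

# Text exposition format, the same format a /metrics endpoint serves.
exposition = generate_latest(registry).decode()
print(exposition)
```

The printed text should contain a predict_total sample labelled from="/predict", which is the series name that appears in the original error message.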

This solution works whether the uvicorn server is started with uvicorn.run("app.main:app", ...) or uvicorn.run(app, ...) (yep, both commands are valid).

I will be suggesting changes to the README.md, as some of the example code there results in a syntax error.