neuml / txtchat

💭 Retrieval augmented generation (RAG) and language model powered search applications

Wikisearch pipeline error

hsidky opened this issue · comments

Hi,

I was trying to get txtchat up and running (thanks for the great application!) and I ran into an issue with the wikitalk example. The persona I am using is wikitalk.yml available on HF hub: https://huggingface.co/NeuML/txtchat-personas/blob/main/wikitalk.yml.

The issue arises when trying to initialize the agent. I get the following traceback:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/ubuntu/anaconda3/envs/txtchat/lib/python3.11/site-packages/txtchat/agent/__main__.py", line 21, in <module>
    agent = AgentFactory.create(sys.argv[1])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/anaconda3/envs/txtchat/lib/python3.11/site-packages/txtchat/agent/factory.py", line 34, in create
    return RocketChat(config)
           ^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/anaconda3/envs/txtchat/lib/python3.11/site-packages/txtchat/agent/rocketchat.py", line 30, in __init__
    super().__init__(config)
  File "/home/ubuntu/anaconda3/envs/txtchat/lib/python3.11/site-packages/txtchat/agent/base.py", line 32, in __init__
    self.application = Application(config)
                       ^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/anaconda3/envs/txtchat/lib/python3.11/site-packages/txtai/app/base.py", line 72, in __init__
    self.pipes()
  File "/home/ubuntu/anaconda3/envs/txtchat/lib/python3.11/site-packages/txtai/app/base.py", line 129, in pipes
    self.pipelines[pipeline] = PipelineFactory.create(config, pipeline)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/anaconda3/envs/txtchat/lib/python3.11/site-packages/txtai/pipeline/factory.py", line 55, in create
    return pipeline if isinstance(pipeline, types.FunctionType) else pipeline(**config)
                                                                     ^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/anaconda3/envs/txtchat/lib/python3.11/site-packages/txtchat/pipeline/wikisearch.py", line 32, in __init__
    self.workflow = Workflow([Question(action=application.pipelines["extractor"]), WikiAnswer()])
                                              ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
KeyError: 'extractor'

To my untrained eye, the issue appears to be coming from the Wikisearch pipeline constructor here:
https://github.com/neuml/txtchat/blob/master/src/python/txtchat/pipeline/wikisearch.py#L32

Specifically, the action argument: action=application.pipelines["extractor"]

While txtai.app.Application is creating pipelines, it uses PipelineFactory to create the Wikisearch pipeline here: https://github.com/neuml/txtai/blob/f9229bc27c5160402ffb8caee3ab24620c8e602a/src/python/txtai/app/base.py#L129

self.pipelines[pipeline] = PipelineFactory.create(config, pipeline)

config contains the application key, which is just a reference to the Application instance. Wikisearch tries to access application.pipelines["extractor"], but that entry is not defined yet: self.pipelines is still being populated on the line where the error is thrown.
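The initialization-order problem described above can be reproduced in isolation. The following is only a minimal sketch under stated assumptions: Registry, EagerSearch, LazySearch and extractor_factory are hypothetical names made up for illustration, not txtai or txtchat code, and the lazy lookup is just one possible way around the problem, not necessarily the change that was made upstream.

```python
class Registry:
    """Hypothetical stand-in for an application that builds pipelines in config order."""

    def __init__(self, config):
        self.pipelines = {}
        for name, factory in config.items():
            # Each pipeline is constructed before it is stored, so a constructor
            # only sees the entries created earlier in the loop.
            self.pipelines[name] = factory(self)


class EagerSearch:
    """Looks up its dependency at construction time (the failing pattern)."""

    def __init__(self, application):
        self.action = application.pipelines["extractor"]


class LazySearch:
    """Defers the lookup until the pipeline is called (one possible workaround)."""

    def __init__(self, application):
        self.application = application

    def __call__(self, text):
        return self.application.pipelines["extractor"](text)


def extractor_factory(application):
    # Hypothetical extractor: echoes the question back.
    return lambda text: f"answer to: {text}"


def try_eager():
    """Build a registry where the dependent pipeline is listed first."""
    try:
        Registry({"search": EagerSearch, "extractor": extractor_factory})
    except KeyError as error:
        return repr(error)
    return None


def try_lazy():
    """Same ordering, but the dependency is resolved at call time."""
    app = Registry({"search": LazySearch, "extractor": extractor_factory})
    return app.pipelines["search"]("who wrote txtai?")


if __name__ == "__main__":
    print(try_eager())
    print(try_lazy())
```

In this sketch the eager variant fails with the same kind of KeyError as in the traceback whenever its dependency has not been created yet, while the lazy variant works regardless of the order in which the entries are listed.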

Perhaps the Wikisearch pipeline code is no longer in sync with the latest txtai API?

Any help is appreciated.

Thanks!
Hythem

Hello, thank you for the detailed write-up.

You are correct, there is an issue with how the Wikisearch pipeline is being created. I just checked in a change that fixes the issue. If you install from source, you'll be able to use txtchat as you intend.

pip install git+https://github.com/neuml/txtchat