deepseek-ai / DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Invalid max_token values

audreyeternal opened this issue · comments

I followed the instructions in the README on how to use DeepSeek with LangChain:

from langchain_openai import ChatOpenAI

model = ChatOpenAI(
    model="deepseek-chat",
    openai_api_key=API_KEY,
    openai_api_base="https://api.deepseek.com/v1",
    temperature=0.85,
    max_tokens=8000,
)

However, it seems that max_tokens is still restricted to 4K, and an error is raised when I integrate the model into a chain and invoke it:

from langchain.chains import RetrievalQAWithSourcesChain

qa_with_sources = RetrievalQAWithSourcesChain.from_chain_type(
    llm=model,
    chain_type="stuff",
    retriever=pinecone.as_retriever(),
)
query = "foobar"

result = qa_with_sources.invoke(query)

Traceback:

File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain/chains/base.py", line 153, in invoke
self._call(inputs, run_manager=run_manager)
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain/chains/combine_documents/base.py", line 137, in _call
output, extra_return_dict = self.combine_docs(
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 244, in combine_docs
return self.llm_chain.predict(callbacks=callbacks, **inputs), {}
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain/chains/llm.py", line 316, in predict
return self(kwargs, callbacks=callbacks)[self.output_key]
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain_core/_api/deprecation.py", line 148, in warning_emitting_wrapper
return wrapped(*args, **kwargs)
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain/chains/base.py", line 378, in __call__
return self.invoke(
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain/chains/base.py", line 163, in invoke
raise e
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain/chains/base.py", line 153, in invoke
self._call(inputs, run_manager=run_manager)
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain/chains/llm.py", line 126, in _call
response = self.generate([inputs], run_manager=run_manager)
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain/chains/llm.py", line 138, in generate
return self.llm.generate_prompt(
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 560, in generate_prompt
return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 421, in generate
raise e
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 411, in generate
self._generate_with_cache(
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 632, in _generate_with_cache
result = self._generate(
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/langchain_openai/chat_models/base.py", line 522, in _generate
response = self.client.create(messages=message_dicts, **params)
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/openai/_utils/_utils.py", line 277, in wrapper
return func(*args, **kwargs)
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 590, in create
return self._post(
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/openai/_base_client.py", line 1240, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/openai/_base_client.py", line 921, in request
return self._request(
File "/Users/zhouyu/miniconda3/envs/ob_chatbot/lib/python3.10/site-packages/openai/_base_client.py", line 1020, in _request
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'detail': 'Invalid max_tokens value, the valid range of max_tokens is [0, 4096]'}

@audreyeternal The DeepSeek-V2 API currently supports a 32K context length (input + output), with output capped at a maximum of 4K tokens.

PS: Although the DeepSeek-V2 model itself supports a 128K context length, we have restricted the API to 32K to ensure the efficiency of our services.
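Given the limit above, one workaround on the caller's side is to clamp the requested value into the API's documented range of [0, 4096] before constructing the client. A minimal sketch (the `clamp_max_tokens` helper and the `DEEPSEEK_MAX_OUTPUT_TOKENS` constant are hypothetical, not part of the DeepSeek or LangChain APIs):

```python
# Documented output limit from the maintainers' reply above.
DEEPSEEK_MAX_OUTPUT_TOKENS = 4096


def clamp_max_tokens(requested: int, limit: int = DEEPSEEK_MAX_OUTPUT_TOKENS) -> int:
    """Clamp a requested max_tokens value into the API's valid range [0, limit]."""
    return max(0, min(requested, limit))


# A request for 8000 output tokens would be rejected with a 400 error,
# so it is clamped down to the 4096-token ceiling.
print(clamp_max_tokens(8000))  # 4096
```

The clamped value can then be passed as `max_tokens=clamp_max_tokens(8000)` when constructing `ChatOpenAI`, avoiding the `BadRequestError` shown in the traceback.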