takuseno / d3rlpy

An offline deep reinforcement learning library

Home Page: https://takuseno.github.io/d3rlpy

[BUG] DiscreteDecisionTransformer Inference Problem, AttributeError: 'numpy.ndarray' object has no attribute 'length'

waltersharpWEI opened this issue · comments

⚠️

  • Please don't post bug reports without a minimal code example. Otherwise, it will take a really long time to solve the issue.
  • Please be polite in discussion. This is an open-source project by voluntary contributors.

Describe the bug
After I train the DiscreteDecisionTransformer and then call predict, it raises an AttributeError: 'numpy.ndarray' object has no attribute 'length'.
It seems to me that the TorchTransformerInput dataclass is not working properly; maybe it's linked to the @dataclass decorator, which may behave differently across Python versions. (I am using Python 3.10 on Windows 10 64-bit.)

To Reproduce

  1. Build a discrete trajectory dataset.
  2. Train (fit) the DiscreteDecisionTransformer on it.
  3. Predict with a test state_list.

Env:
python 3.10
d3rlpy 2.5.0
gymnasium 0.29.0

Expected behavior
predict() should return actions without raising an AttributeError on length.

Additional context

algo = d3rlpy.algos.DiscreteDecisionTransformerConfig(
    batch_size=128,
    gamma=0.99,
    context_size=20,
    learning_rate=0.0006,
    num_heads=8,
    num_layers=6,
    attn_dropout=0.1,
    activation_type="gelu",
    embed_activation_type="tanh",
    warmup_tokens=10240,
).create(device="cpu:0")
algo.build_with_dataset(dataset)


td_error_evaluator = d3rlpy.metrics.TDErrorEvaluator(episodes=dataset.episodes)


algo.fit(dataset, n_steps=100)


actions = algo.predict(state_list)

Traceback (most recent call last):
  File "D:\simulation\models\d3rlpy_offline_rl\offline_rl_dt_d3rl.py", line 84, in <module>
    actions = algo.predict(state_list)
  File "D:\venv\lib\site-packages\d3rlpy\algos\transformer\base.py", line 219, in predict
    torch_inpt = TorchTransformerInput.from_numpy(
  File "D:\venv\lib\site-packages\d3rlpy\algos\transformer\inputs.py", line 68, in from_numpy
    if context_size < inpt.length:
AttributeError: 'numpy.ndarray' object has no attribute 'length'
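For context on the traceback: the failing check compares `context_size` against `inpt.length`, so the code expects an object exposing a `length` attribute, while a plain `numpy.ndarray` only has `len()`/`.size`/`.shape`. Below is a minimal sketch of that mismatch; `FakeTransformerInput` and `from_numpy_like` are hypothetical stand-ins, not d3rlpy's real classes.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class FakeTransformerInput:
    """Hypothetical stand-in for the input dataclass d3rlpy expects."""
    observations: np.ndarray

    @property
    def length(self) -> int:
        # number of timesteps buffered in the context window
        return self.observations.shape[0]

def from_numpy_like(inpt, context_size: int) -> bool:
    """Mimics the failing check in from_numpy (inputs.py line 68)."""
    # raises AttributeError when inpt is a raw ndarray rather than an
    # object exposing a .length attribute
    return context_size < inpt.length

states = np.zeros((5, 3))  # raw ndarray, like the reporter's state_list

# the dataclass path works ...
assert from_numpy_like(FakeTransformerInput(states), 3) is True

# ... while the raw-ndarray path reproduces the reported error
try:
    from_numpy_like(states, 3)
except AttributeError as e:
    print(e)  # 'numpy.ndarray' object has no attribute 'length'
```

This is why passing `state_list` straight into `algo.predict` blows up: nothing has wrapped the array into the structured input the transformer code path expects.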

@waltersharpWEI Hi, thanks for the issue. The short answer is that Decision Transformer algorithms don't support any of the evaluators because they're essentially different from Q-learning-based algorithms. Please let me close this issue since it's not a bug.

Sorry, looking at your code again, you're technically not using evaluators. For inference, please check the usage in this documentation:
https://d3rlpy.readthedocs.io/en/v2.5.0/references/algos.html#decision-transformer

# start training (save logs to LOGS_DIR)
dt.fit(
    dataset,
    n_steps=100,
    n_steps_per_epoch=10,
    eval_target_return=0,
    # manually specify action-sampler
    eval_action_sampler=d3rlpy.algos.IdentityTransformerActionSampler(),
)
actor = dt.as_stateful_wrapper(
    target_return=0,
    action_sampler=d3rlpy.algos.IdentityTransformerActionSampler(),
)

# interaction
observation, _ = env.reset()
reward = 0.0
for i in range(100):
    action = actor.predict(observation, reward)
    observation, reward, done, truncated, _ = env.step(action)
    if done or truncated:
        break

print("Complete../")

Thanks for your reply, I modified the code as shown above.
I just have one question: must inference be done using the .as_stateful_wrapper() function?
Does that mean predict() is not supposed to be used for inference?

That's roughly right. This is because of the stateful nature of Decision Transformer, which requires carefully crafted inputs. I don't think users want to deal with this level of complication:

class StatefulTransformerWrapper(Generic[TTransformerImpl, TTransformerConfig]):
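To illustrate the statefulness the wrapper hides, here is a hedged pure-Python sketch; `SketchStatefulActor` and all of its internals are hypothetical, not d3rlpy's actual `StatefulTransformerWrapper` implementation. The idea is that each `predict` call consumes the latest reward, updates the return-to-go, and buffers a rolling context window of (return-to-go, observation, action) triples for the transformer.

```python
from collections import deque
from typing import Deque

class SketchStatefulActor:
    """Hypothetical sketch of a stateful Decision Transformer wrapper."""

    def __init__(self, target_return: float, context_size: int = 20):
        self.target_return = target_return
        self.context_size = context_size
        self._observations: Deque = deque()
        self._actions: Deque[int] = deque()
        self._returns_to_go: Deque[float] = deque()

    def predict(self, observation, reward: float) -> int:
        # update the return-to-go with the reward just received
        last_rtg = (self._returns_to_go[-1] if self._returns_to_go
                    else self.target_return)
        self._returns_to_go.append(last_rtg - reward)
        self._observations.append(observation)
        # a real wrapper would run the transformer over the buffered
        # context here; this sketch returns a placeholder action
        action = 0
        self._actions.append(action)
        # keep only the most recent context_size steps
        for buf in (self._observations, self._actions, self._returns_to_go):
            while len(buf) > self.context_size:
                buf.popleft()
        return action
```

Because each call mutates these buffers, a one-shot, stateless `predict(state_list)` cannot reproduce the context the model was trained on, which is why inference goes through the stateful wrapper instead.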