facebookresearch / Pearl

A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta.

Mixed Discrete + Continuous Action Spaces

kuza55 opened this issue

Hi,

I was wondering if there was an easy way to use Pearl with an environment that had both a discrete and continuous action space at the same time.

  • Alex

Hi there, unfortunately that is not supported at the moment, but could you please explain what your environment/problem is so that we can understand it better?

I don't want to get too far into the weeds on my problem, since the formulation will probably change over time, but I am working on a scheduler where I want to assign tasks to machines and also turn machines off, under a somewhat annoying cost model for how the machines are paid for.

So I want to model the assignment of tasks to machines as a discrete action space, and the time at which to turn each machine off as a continuous action space. That avoids having to call the model repeatedly to ask whether a machine should turn off when nothing has been submitted, and avoids discretizing the continuous time action, in the hope of getting better learning behaviour.

I think the main issue here is supporting dictionary action spaces. The sub-action spaces under that dictionary's keys could then combine, for example, BoxActionSpaces (continuous) and MultiDiscreteActionSpaces (discrete).
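
Just to make that idea concrete, here is a rough sketch of what such a dictionary action space could look like for the scheduler example, written with Gymnasium's space classes rather than Pearl's (since Pearl does not support this yet); the task/machine counts and the 24-hour bound are made-up numbers for illustration.

```python
# Sketch of a dictionary action space mixing discrete and continuous
# sub-actions, using Gymnasium spaces (not Pearl's) for illustration.
import numpy as np
from gymnasium import spaces

NUM_TASKS = 4      # hypothetical problem sizes, for illustration only
NUM_MACHINES = 3

action_space = spaces.Dict(
    {
        # one machine index per submitted task (discrete sub-action)
        "assignment": spaces.MultiDiscrete([NUM_MACHINES] * NUM_TASKS),
        # one turn-off time per machine, in hours (continuous sub-action)
        "turn_off_time": spaces.Box(
            low=0.0, high=24.0, shape=(NUM_MACHINES,), dtype=np.float32
        ),
    }
)

sample = action_space.sample()  # dict with "assignment" and "turn_off_time"
```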

Sorry for getting back a bit late. We just came back from holidays.

If I understand correctly, you want to tell a machine to turn off at a certain time, which involves both a discrete and a continuous action space. However, to formulate your problem as an RL problem, how are discrete time steps defined, i.e., how often do you make decisions?

I think it is not easy to do that with Pearl currently (I expect it will get there in the future).

I wonder if you can work around that by having two agents, one that assigns tasks and another that chooses turn-off times. The two agents would have no knowledge of each other; from each agent's point of view, the changes performed by the other agent look like ordinary environment changes following its last action. You could run them (one turn each) every time new tasks are submitted.
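
To make the workaround concrete, here is a minimal, self-contained sketch of that loop. Everything in it (the `SchedulerEnv` stub, the agent classes, the method names, the placeholder rewards) is a hypothetical illustration, not Pearl's API; in practice each agent could be its own Pearl agent trained on its own sub-action space, with the loop below run whenever new tasks arrive.

```python
# Hypothetical sketch of the two-agent workaround: one discrete-action
# agent assigns tasks, one continuous-action agent picks turn-off times,
# and neither knows the other exists.
import random


class SchedulerEnv:
    """Toy stand-in for the real scheduling environment."""

    num_machines = 3

    def observation(self):
        return {"queue_length": random.randint(0, 5)}  # placeholder state

    def assign(self, machine):
        # Apply a task-to-machine assignment; return new state and reward.
        return self.observation(), -1.0  # placeholder reward

    def schedule_turn_off(self, hours_from_now):
        # Apply a turn-off time; return new state and reward.
        return self.observation(), -0.1 * hours_from_now  # placeholder reward


class TaskAssigner:
    """Discrete-action agent: picks a machine for the submitted task."""

    def act(self, obs, num_machines):
        return random.randrange(num_machines)  # placeholder policy

    def observe(self, obs, reward):
        pass  # learning update would happen here


class TurnOffScheduler:
    """Continuous-action agent: picks a turn-off time in hours."""

    def act(self, obs):
        return random.uniform(0.0, 24.0)  # placeholder policy

    def observe(self, obs, reward):
        pass  # learning update would happen here


def on_tasks_submitted(env, assigner, scheduler):
    # Each agent takes one turn; since neither knows about the other,
    # the other's effect on the state just looks like environment dynamics.
    obs = env.observation()
    obs, reward = env.assign(assigner.act(obs, env.num_machines))
    assigner.observe(obs, reward)

    obs, reward = env.schedule_turn_off(scheduler.act(obs))
    scheduler.observe(obs, reward)


on_tasks_submitted(SchedulerEnv(), TaskAssigner(), TurnOffScheduler())
```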