Collect Resource Game with Reinforcement Learning

Neural Networks

Raycast observation vector returning the distance between the agent and other objects (174)
➡️ (2 * 20[number of raycasts per direction] + 1) * (4[one-hot vector size, equal to the number of detectable tags] + 1[boolean: raycast hit something] + 1[float: distance between agent and other object])
Raycast observation vector returning the distance between the agent and the enemy (174)
➡️ (2 * 20[number of raycasts per direction] + 1) * (1[one-hot vector size, equal to the number of detectable tags] + 1[boolean: raycast hit something] + 1[float: distance between agent and other object])
Backward-facing raycast observation vector (15)
➡️ (2 * 1[number of raycasts per direction] + 1) * (3[one-hot vector size, equal to the number of detectable tags] + 1[boolean: raycast hit something] + 1[float: distance between agent and other object])
Raycast observation vector for jumping (15)
➡️ (2 * 1[number of raycasts per direction] + 1) * (3[one-hot vector size, equal to the number of detectable tags] + 1[boolean: raycast hit something] + 1[float: distance between agent and other object])
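
The raycast sizes above all follow one formula, which can be checked with a short script. This is a minimal sketch, not code from the project: the function and parameter names are illustrative. With the values as written, the first two sensors evaluate to 246 and 123 rather than the stated 174, so the listed ray or tag counts may differ from the actual scene settings.

```python
def ray_observation_size(rays_per_direction: int, num_detectable_tags: int) -> int:
    """Size of one ray sensor's observation vector, per the formula above."""
    # Rays are cast on both sides of the centre ray, plus the centre ray itself.
    num_rays = 2 * rays_per_direction + 1
    # Each ray contributes a one-hot tag vector, a hit flag, and a distance.
    per_ray = num_detectable_tags + 1 + 1
    return num_rays * per_ray


print(ray_observation_size(20, 4))  # 246 (objects sensor, as written)
print(ray_observation_size(20, 1))  # 123 (enemy sensor, as written)
print(ray_observation_size(1, 3))   # 15  (backward and jumping sensors)
```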
Observation vector (15 shared values; 24/18 in total for Collector/Disruptor)
- boolean agent can jump
- Vector3 agent position
- Vector3 agent Euler angles
- Float dot product between forward velocity and forward axis
- Float dot product between right velocity and right axis
- Vector3 destination position
- Float agent dash cooldown
- boolean agent is stunned
- int number of items
- (Collector Observation) Vector3[] positions of items
- (Disruptor Observation) Vector3 position of Collector agent
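
The totals 15, 24 and 18 can be reproduced by summing the component sizes listed above. The sketch below is only an illustration: the dictionary keys and the function are hypothetical names, and the item count of 3 is an assumption inferred from 24 - 15 = 9 floats (three Vector3 values), not a figure stated in the project.

```python
# Sizes (in floats) of the observations shared by both agent types.
SHARED_OBSERVATIONS = {
    "can_jump": 1,              # boolean
    "position": 3,              # Vector3
    "euler_angles": 3,          # Vector3
    "forward_velocity_dot": 1,  # float
    "right_velocity_dot": 1,    # float
    "destination_position": 3,  # Vector3
    "dash_cooldown": 1,         # float
    "is_stunned": 1,            # boolean
    "item_count": 1,            # int
}
VECTOR3_SIZE = 3


def observation_size(role: str, max_items: int = 3) -> int:
    """Total vector observation size for a role ("collector" or "disruptor").

    max_items=3 is an assumed value inferred from the stated total of 24.
    """
    shared = sum(SHARED_OBSERVATIONS.values())
    if role == "collector":
        # One Vector3 position per tracked item.
        return shared + VECTOR3_SIZE * max_items
    if role == "disruptor":
        # A single Vector3: the Collector agent's position.
        return shared + VECTOR3_SIZE
    raise ValueError(f"unknown role: {role!r}")


print(sum(SHARED_OBSERVATIONS.values()))  # 15
print(observation_size("collector"))      # 24
print(observation_size("disruptor"))      # 18
```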

Discrete action (13)

Move direction Z [one-hot vector size 3] (argmax in one-hot [action])
- 0 [no action]
- 1 [forward]
- 2 [backward]
Rotate direction [one-hot vector size 3] (argmax in one-hot [action])
- 0 [no action]
- 1 [right rotate]
- 2 [left rotate]
Move direction X [one-hot vector size 3] (argmax in one-hot [action])
- 0 [no action]
- 1 [left]
- 2 [right]
Jump [one-hot vector size 2] (argmax in one-hot [action])
- 0 [no action]
- 1 [jump]
Dash [one-hot vector size 2] (argmax in one-hot [action])
- 0 [no action]
- 1 [dash]
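
To illustrate how the 13 action outputs map to the five branches above, here is a minimal sketch that splits a flat vector into branches of size 3, 3, 3, 2 and 2 and takes the argmax of each. The branch names and the decode_action helper are hypothetical, chosen for this example; the project itself may receive branch indices directly from the trainer.

```python
# Branches in the order listed above; each option list is indexed by action value.
BRANCHES = [
    ("move_z", ["no action", "forward", "backward"]),
    ("rotate", ["no action", "right rotate", "left rotate"]),
    ("move_x", ["no action", "left", "right"]),
    ("jump", ["no action", "jump"]),
    ("dash", ["no action", "dash"]),
]
TOTAL_SIZE = sum(len(options) for _, options in BRANCHES)  # 3 + 3 + 3 + 2 + 2 = 13


def decode_action(flat):
    """Split a flat 13-value vector into branches and take the argmax of each."""
    if len(flat) != TOTAL_SIZE:
        raise ValueError(f"expected {TOTAL_SIZE} values, got {len(flat)}")
    decoded = {}
    offset = 0
    for name, options in BRANCHES:
        segment = flat[offset:offset + len(options)]
        # Index of the largest value in this branch (first one wins on ties).
        index = max(range(len(segment)), key=segment.__getitem__)
        decoded[name] = options[index]
        offset += len(options)
    return decoded


example = [0, 1, 0,  0, 0, 1,  1, 0, 0,  0, 1,  1, 0]
print(decode_action(example))
```

In this example the decoded result is forward, left rotate, no sideways movement, jump, and no dash.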

CollectAgent.yaml ➡️ CollectAgent behavior

In Scene: