# Bishop
Bishop, named after Washington Bishop, is a Python 2.7 package for modeling Theory of Mind. Given some observable behavior, Bishop infers the cost and reward functions that explain the agent's choices and actions. It does so through Bayesian inference over a rational model of decision making and planning under uncertainty.
Install:
python setup.py install
Uninstall:
pip uninstall Bishop
If you see the following error:
ImportError: No module named 'MDP'
you are probably running Python 3.x, which Bishop does not support. Use Python 2.7 instead.
If you create a new Python 2.7 environment and install Bishop, you may get an error when Django is installed. You can fix this by forcing the installation to use Django 1.11:
pip install django==1.11
The main objects in Bishop are observers. Observers are agents with a Theory of Mind that can infer mental states from actions, predict an agent's future actions, and simulate behaviors.
Simulate agents:
from Bishop import *
Obs = LoadObserver("P_TT_TRE") # Load a ToM agent that is observing the P_TT_TRE map. (Named Obs so it does not shadow the Observer class.)
R = Obs.SimulateAgents(Samples=100) # Have the observer 'imagine' the behavior of 100 random agents.
R.SaveCSV("MySamples.csv") # Save costs, rewards, actions, and state transitions as a CSV file.
R.Display() # Print everything
To see a list of available maps:
ShowAvailableMaps() # Print all maps
ShowAvailableMaps("Flag") # Print maps that contain the word "Flag"
$Bishop --help
$Bishop -m P_TT_TRE -sp 0 -a "R R" -s 5000 -o MySamples -v
This command loads the P_TT_TRE map from Bishop's library and places an agent at location 0 that took two steps to the right. It then infers the cost and reward functions using 5000 samples and stores the output in "MySamples.p".
Obs = LoadObserver("Tatik_T1_L1") # Load a ToM agent observing the Tatik_T1_L1 map.
# InferAgent takes a sequence of actions and runs mental-state inference on them.
# The actions must be given as a list of the following movements: 'U' (up), 'D' (down), 'L' (left), 'R' (right)
# 'UL' ("up-left"; northwest diagonal), 'UR' (northeast diagonal), 'DL' (southwest diagonal), and 'DR' (southeast diagonal).
Res = Obs.InferAgent(['UL'], Samples=100, Feedback=True) #UL (Up-Left) is a diagonal move
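The command-line interface takes actions as a space-separated string (`-a "R R"`), while `InferAgent` takes a list. A small helper can convert between the two and reject unknown codes before a long inference run starts. The sketch below is standalone: `parse_actions` and `VALID_ACTIONS` are hypothetical names for illustration and are not part of Bishop.

```python
# Hypothetical helper, not part of Bishop: validate action codes before inference.
VALID_ACTIONS = {"U", "D", "L", "R", "UL", "UR", "DL", "DR"}


def parse_actions(actions):
    """Return a list of action codes from a list or a space-separated string."""
    if isinstance(actions, str):
        actions = actions.split()
    unknown = [a for a in actions if a not in VALID_ACTIONS]
    if unknown:
        raise ValueError("Unknown action codes: %s" % ", ".join(unknown))
    return list(actions)


print(parse_actions("R R"))        # ['R', 'R']
print(parse_actions(["UL", "D"]))  # ['UL', 'D']
```

The returned list can then be passed as the first argument to `Obs.InferAgent`.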
Observer.InferAgent returns a PosteriorContainer object. This object contains the mental-state and competence inferences, as well as functions to assess the quality of the inference. Here are some things you can do with it:
Res.Summary()
Res.Summary(human=False) # Or print it in csv-format
Res.AnalyzeConvergence() # Visually check if sampling converged
Res.PlotCostPosterior()
Res.PlotRewardPosterior()
Res.LongSummary() # Do everything above.
SaveSamples(Res, "MyResults") # Bishop is sampling based, so you can store the samples with their likelihoods
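Because results are stored as samples with their likelihoods, posterior summaries amount to likelihood-weighted averages over those samples. The sketch below illustrates the idea in plain Python on made-up numbers. It assumes the samples were drawn from the prior, and it does not touch Bishop's `PosteriorContainer`, whose internal layout is not described here. `posterior_mean` is a hypothetical name.

```python
def posterior_mean(samples, likelihoods):
    """Likelihood-weighted mean of samples assumed to be drawn from the prior."""
    total = float(sum(likelihoods))
    if total == 0:
        raise ValueError("All likelihoods are zero; no sample explains the data.")
    weights = [l / total for l in likelihoods]  # normalize so weights sum to 1
    return sum(w * s for w, s in zip(weights, samples))


# Illustrative values only: three sampled rewards and their likelihoods.
rewards = [2.0, 5.0, 8.0]
likelihoods = [0.1, 0.3, 0.6]
print(posterior_mean(rewards, likelihoods))
```

Samples that explain the observed actions poorly receive small weights, so they contribute little to the estimate.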
You can reload the samples and the observer model later with:
Res = LoadSamples("MyResults.p")
Obs = LoadObserverFromPC(Res)
A map consists of two files: An ASCII description, and a .ini configuration file.
ASCII files begin with a map drawing, with each terrain type indicated numerically. After a line break, each terrain name is given on its own line. These are the files for the "FlagSetup" map:
FlagSetup.ini
[MapParameters]
DiagonalTravel: True
MapName: Flag_Map
# Starting point can be overridden later with Observer.SetStartingPoint()
StartingPoint: 2
ExitState: 58
[Objects]
ObjectLocations: 41 49
ObjectTypes: 0 1
ObjectNames: LTreat RTreat
# If the two treats were the same type:
# ObjectTypes: 0 0
# ObjectNames: OnlyOneNameNeeded
[AgentParameters]
Method: Linear # Determines how costs are treated.
# If linear, costs are subtracted from rewards.
# If discount, costs are treated as future discounts over rewards.
# Prior over costs and rewards.
Prior: ScaledUniform
# Force terrain 0 to be always less costly than the rest?
Restrict: False
SoftmaxChoice = False
SoftmaxAction = False
# Softmax parameters
# actionTau = 0.01
# choiceTau = 0.01
# When different from 0, the prior becomes a mixture of the
# prior above and a point mass at 0. The value sets the probability mass on that point.
RNull = 0.2
CNull = 0
# Parameters for priors. Meaning changes depending on the prior. See docstrings
CostParameters = 1
RewardParameters = 10
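To check a configuration file before handing it to Bishop, the values can be read with a standard INI parser. The sketch below uses Python 3's `configparser` on a shortened copy of the file above. How Bishop itself reads the file is not shown in this README, so treat this as an independent inspection tool and run it outside Bishop's Python 2.7 environment. Enabling `inline_comment_prefixes` is an assumption made here so that the trailing comment on the `Method` line is ignored.

```python
import configparser  # Python 3 standard library; independent of Bishop

# Shortened copy of FlagSetup.ini for illustration.
INI_TEXT = """
[MapParameters]
DiagonalTravel: True
StartingPoint: 2
ExitState: 58

[Objects]
ObjectLocations: 41 49
ObjectNames: LTreat RTreat

[AgentParameters]
Method: Linear # Determines how costs are treated.
RNull = 0.2
"""

# Allow trailing "# ..." comments such as the one on the Method line.
parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
parser.read_string(INI_TEXT)

diagonal = parser.getboolean("MapParameters", "DiagonalTravel")
start = parser.getint("MapParameters", "StartingPoint")
locations = [int(x) for x in parser.get("Objects", "ObjectLocations").split()]
names = parser.get("Objects", "ObjectNames").split()
method = parser.get("AgentParameters", "Method")
r_null = parser.getfloat("AgentParameters", "RNull")

print(diagonal, start, locations, names, method, r_null)
```

Both `:` and `=` are accepted as key-value separators by this parser, which matches the mix used in the file above.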
FlagMap
0000011122222
0000011122222
0000011122222
0000011122222
0000011122222
0000011122222
0000011122222
LeftTerrainName
CenterTerrainName
RightTerrainName
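The ASCII format described above (a numeric drawing, a break, then one terrain name per line) can be read with a few lines of Python. This is a minimal standalone sketch, not Bishop's loader. It assumes a blank line separates the drawing from the names and that every terrain code is a single digit. `parse_ascii_map` is a hypothetical name.

```python
# Shortened copy of the FlagMap description for illustration.
MAP_TEXT = """0000011122222
0000011122222
0000011122222

LeftTerrainName
CenterTerrainName
RightTerrainName
"""


def parse_ascii_map(text):
    """Split a map description into a grid of terrain codes and terrain names."""
    drawing, _, names = text.strip().partition("\n\n")
    grid = [[int(ch) for ch in row.strip()] for row in drawing.splitlines()]
    if len(set(len(row) for row in grid)) != 1:
        raise ValueError("All map rows must have the same width.")
    terrain_names = [n.strip() for n in names.splitlines() if n.strip()]
    return grid, terrain_names


grid, terrain_names = parse_ascii_map(MAP_TEXT)
print("%d rows x %d columns" % (len(grid), len(grid[0])))
print(terrain_names)
```

A check like this catches ragged rows or a missing terrain name before the map is loaded into an observer.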
To generate a simple grid-world with one terrain, start with:
MyMap = Map()
MyMap.BuildGridWorld(5,3,Diagonal=True)
This creates a 5 by 3 map that can be navigated diagonally. Terrain type is stored in MyMap.StateTypes. The first terrain has by default a value of 0. New terrains are added through squares:
MyMap.InsertSquare(2, 1, 2, 3, 1)
This adds a 2x3 square with its top-left corner at (2,1). Both coordinates start at 1, and the y-axis is counted from top to bottom. The last argument (1) gives the terrain code. Inserting overlapping squares always overwrites the existing terrain. You can then add terrain names:
MyMap.AddTerrainNames(["Water","Jungle"])
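The coordinate convention used by `InsertSquare` (1-based, y counted from the top) is easy to get wrong, so here is a standalone illustration on a plain list-of-lists grid. It does not use Bishop's `Map` class. The argument order `(x, y, width, height, code)` and the reading of "5 by 3" as 5 columns by 3 rows are assumptions inferred from the example above. `build_grid` and `insert_square` are hypothetical names.

```python
def build_grid(width, height, code=0):
    """Return a grid with `height` rows and `width` columns of one terrain code."""
    return [[code] * width for _ in range(height)]


def insert_square(grid, x, y, width, height, code):
    """Overwrite a width x height block whose top-left corner is at (x, y).

    x and y are 1-based; y is counted from the top row downward.
    """
    for row in range(y - 1, y - 1 + height):
        for col in range(x - 1, x - 1 + width):
            grid[row][col] = code


grid = build_grid(5, 3)
insert_square(grid, 2, 1, 2, 3, 1)
for row in grid:
    print("".join(str(c) for c in row))
# 01100
# 01100
# 01100
```

Calling `insert_square` again on an overlapping region replaces the earlier codes, mirroring the overwrite behavior described above.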
To see what your map looks like, type:
MyMap.PrintMap()
See the docstrings for:
MyMap.AddStartingPoint()
MyMap.AddExitState()
MyMap.InsertObjects()
Once you have a map, you need to create an agent and then use both to create an observer:
MyAgent = Agent(MyMap, CostPrior, RewardPrior, CostPriorParameters, RewardPriorParameters)
MyObserver = Observer(MyMap, MyAgent)
See the docstring of Agent's constructor for the full list of parameters an agent can take and for further details.