This repository contains pre-processed data for Manning JR, Lew TF, Li N, Sekuler R, Kahana MJ (2014) MAGELLAN: a cognitive map-based model of human wayfinding. Journal of Experimental Psychology: General, 143(3): 1314--1330. The raw experimental data may be downloaded here (warning: large file!).
In an unfamiliar environment, searching for and navigating to a target requires that spatial information be acquired, stored, processed, and retrieved. In a study encompassing all of these processes, participants acted as taxicab drivers who learned to pick up and deliver passengers in a series of small virtual towns. We used data from these experiments to refine and validate MAGELLAN, a cognitive map–based model of spatial learning and wayfinding. MAGELLAN accounts for the shapes of participants’ spatial learning curves, which measure their experience-based improvement in navigational efficiency in unfamiliar environments. The model also predicts the ease (or difficulty) with which different environments are learned and, within a given environment, which landmarks will be easy (or difficult) to localize from memory. Using just 2 free parameters, MAGELLAN provides a useful account of how participants’ cognitive maps evolve over time with experience, and how participants use the information stored in their cognitive maps to navigate and explore efficiently.
A total of 108 participants played the role of "taxicab drivers" in a series of virtual environments. Within each environment, each participant performed a series of 15 deliveries to a set of specific targets. Each delivery comprised a "foraging" phase and a "seeking" phase:
- During the foraging phase, the participant drove freely for a short distance, until they were prompted with a notice saying that they had "picked up a passenger." This ended the foraging phase and initiated the "seeking" phase.
- During the seeking phase, the participant was given a "target" destination to drive their "passenger" to.
The order in which each participant encountered target destinations within a given environment was held constant across participants. Every participant also encountered the same 8 environments, but the order in which the environments were visited was counterbalanced across participants, and across two testing sessions. The environments varied in "expected difficulty," as predicted by the MAGELLAN model presented in the paper above:
- Easy: environments A and E
- Medium-easy: environments B and F
- Medium-difficult: environments C and G
- Difficult: environments D and H
Each environment was laid out on a 6 x 6 block square grid, with one landmark (building) centered on each block. Five of the squares in each environment held "stores" that the passengers asked to be delivered to. Each store was selected as a target for a total of 3 deliveries.
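The design described above can be summarized in a few lines of Python. This is only an illustrative sketch: the names `EXPECTED_DIFFICULTY`, `GRID_WIDTH`, `GRID_HEIGHT`, `N_STORES`, and `DELIVERIES_PER_STORE` are hypothetical and are not defined anywhere in this repository, and the landmark count assumes that the five stores occupy five of the 36 blocks.

```python
# Hypothetical summary of the experimental design (illustrative names only;
# none of these are defined in magellan_loader.py).
EXPECTED_DIFFICULTY = {
    'A': 'easy', 'E': 'easy',
    'B': 'medium-easy', 'F': 'medium-easy',
    'C': 'medium-difficult', 'G': 'medium-difficult',
    'D': 'difficult', 'H': 'difficult',
}

GRID_WIDTH, GRID_HEIGHT = 6, 6   # blocks per environment
N_STORES = 5                     # structures that can serve as targets
DELIVERIES_PER_STORE = 3

# One structure is centered on each block
n_structures = GRID_WIDTH * GRID_HEIGHT
# Assumption: stores occupy 5 of the blocks; the rest are never targets
n_landmarks = n_structures - N_STORES
# Each store is targeted 3 times, which accounts for all deliveries
n_deliveries = N_STORES * DELIVERIES_PER_STORE

print(n_structures, n_landmarks, n_deliveries)
```

The product of stores and deliveries per store matches the 15 deliveries performed in each environment.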
The repository is organized into two main folders:
- data: contains a single json file for each experimental participant
- env: contains a single json file for each virtual environment that participants navigated
The simplest way to load in data files is using the pandas Python library. If 'fname.json' is the name of a given file, then its (participant or environment) data can be read in as follows:

```python
import pandas as pd
data = pd.read_json('fname.json')
```

Several convenience functions are provided in magellan_loader.py. To use them, import the module:

```python
import magellan_loader as ml
```

Inputs:
- `fname`: file path to an environment's .json file, specified as a string

Outputs:
- `env`: a pandas DataFrame describing the environment, with one row per structure (specified in the DataFrame's index) and the following columns:
  - `x`, `y`: the x-coordinate (or y-coordinate) of the given structure (in blocks).
  - `type`: either 'store' (if the structure is a potential target) or 'landmark' (if the structure is never used as a target).
  - `delivery 1`, `delivery 2`, `delivery 3`: for stores, specifies the first (or second, or third) delivery number (1-indexed) when the store is selected as a target. For landmarks, these are set to `NaN`.
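To make the environment layout concrete, the sketch below builds a tiny toy DataFrame with the same columns and queries it. The structure names and all values are invented for illustration; real environment DataFrames come from the json files in the env folder.

```python
import numpy as np
import pandas as pd

# Toy stand-in for an environment DataFrame (invented values, 3 structures only)
env = pd.DataFrame(
    {
        'x': [0, 1, 2],
        'y': [0, 3, 5],
        'type': ['store', 'landmark', 'store'],
        'delivery 1': [1, np.nan, 2],
        'delivery 2': [6, np.nan, 7],
        'delivery 3': [11, np.nan, 12],
    },
    index=['bakery', 'house', 'pharmacy'],
)

# Stores are the potential delivery targets
stores = env[env['type'] == 'store']

# Landmarks have NaN in all three delivery columns
delivery_cols = ['delivery 1', 'delivery 2', 'delivery 3']
landmarks = env[env[delivery_cols].isna().all(axis=1)]

# Which structure is the target of delivery 7 (1-indexed, as in the columns)?
target_of_7 = env.index[(env[delivery_cols] == 7).any(axis=1)].tolist()
```

Comparisons against `NaN` evaluate to false, so landmarks never match a delivery number in the last query.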
Inputs:
- `env`: an environment's DataFrame

Outputs:
- `dims`: a tuple whose first element indicates the environment's width (in blocks) and whose second element indicates the environment's height (in blocks)
Note: stores are denoted by black circles; landmarks are denoted by gray squares; and intersections are denoted by gray dots.
Inputs:
- `env`: an environment's DataFrame

Outputs:
- `ax`: a `matplotlib` axis handle for the resulting figure
Inputs:
- `envs`: a dictionary whose keys are environment names and whose values are DataFrames for the corresponding environments. If `root` is the directory containing this repository, then `envs` may be generated as follows:

  ```python
  import os
  from glob import glob as lsdir
  import magellan_loader as ml
  root = 'magellan_data'
  # keys are environment names (file names without the .json extension)
  envs = {os.path.split(e)[-1].split('.')[0]: ml.load_env(e) for e in lsdir(os.path.join(root, 'env', '?.json'))}
  ```
- `size`: a 2-element list or array specifying the number of rows (`size[0]`) and columns (`size[1]`) of subplots to create
- `scale`: optional argument specifying how large to draw each environment in the resulting figure (default: 4)
Inputs:
- `fname`: file path to a subject's .json file, specified as a string
- `freq`: optional argument specifying the sampling frequency of the data (default: use all available data)

Outputs:
- `data`: a pandas DataFrame in which each row holds the data collected at a single timepoint. Each row is indexed by the time the corresponding datapoint was collected. The DataFrame has the following columns:
  - `x`, `y`: the subject's x and y position (in blocks)
  - `heading`: the subject's heading (in degrees; 0 degrees corresponds to "north", 90 degrees corresponds to "west", etc.)
  - `mode`: 'forage' if the datapoint was collected while the subject was searching for a new passenger, or 'seek' if the datapoint was collected while the subject was delivering the passenger to a target (store)
  - `target`: the name of the passenger's destination while in 'seek' mode, specified as a string (set to `None` in 'forage' mode)
  - `subj`: the subject's unique identification code
  - `session`: zero-indexed session number
  - `env_num`: zero-indexed environment number (within the current session)
  - `env`: the current environment (corresponds to the json files for each environment)
  - `delivery`: zero-indexed delivery number (within the current environment)
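As an illustration of how these columns might be used, the sketch below computes the distance traveled in each phase of a delivery and converts a heading into a direction vector. The toy values and the helper names `path_length` and `heading_to_unit_vector` are hypothetical and are not part of magellan_loader.py. The heading conversion additionally assumes that "north" points along +y and "east" along +x, which the description above does not state.

```python
import numpy as np
import pandas as pd

# Toy stand-in for a subject DataFrame (invented values); index = sample times
data = pd.DataFrame(
    {
        'x': [0.0, 1.0, 1.0, 1.0, 4.0, 4.0],
        'y': [0.0, 0.0, 2.0, 2.0, 2.0, 6.0],
        'mode': ['forage', 'forage', 'forage', 'seek', 'seek', 'seek'],
        'delivery': [0, 0, 0, 0, 0, 0],
    },
    index=[0.0, 0.5, 1.0, 1.5, 2.0, 2.5],
)

def path_length(df):
    """Sum of straight-line distances between consecutive samples (in blocks)."""
    steps = np.hypot(df['x'].diff(), df['y'].diff())
    return float(steps.sum())  # the first step is NaN and is skipped by sum()

# Distance traveled for each (delivery, mode) combination
lengths = {key: path_length(group)
           for key, group in data.groupby(['delivery', 'mode'])}

def heading_to_unit_vector(heading_deg):
    """0 degrees = north, 90 degrees = west.

    ASSUMPTION: north is +y and east is +x (not confirmed by the README).
    """
    theta = np.deg2rad(heading_deg)
    return -np.sin(theta), np.cos(theta)
```

Shorter seek paths to the same target over successive deliveries would be one simple indicator of the learning that the paper describes.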
Inputs:
- `data`: a pandas DataFrame with one subject's data, or several stacked DataFrames from multiple subjects
- `columns`: a string, list, or numpy array specifying the list of columns whose values should be read out (these must be a subset of the columns in the subject DataFrames)
- `unique`: an optional argument specifying whether or not to return the unique values for each column. Default: `False` (return values, with potential repeats, in their original orders); if set to `True`, a sorted list of unique values from each column is returned

Outputs:
- The resulting values extracted from `data`
apply_by_condition: apply a function to data corresponding to each combination of unique values from the given columns
Inputs:
- `data`: a pandas DataFrame with one subject's data, or several stacked DataFrames from multiple subjects
- `columns`: a string, list, or numpy array specifying the list of columns whose values should be read out (these must be a subset of the columns in the subject DataFrames)
- `f`: a function that takes a DataFrame (created from a subset of the rows in `data`) as input, and (optionally) produces any output
- `args`: a list of arguments that will be passed to `f` (the same values are passed to every call to `f`)
- `kwargs`: a dictionary of keyword arguments that will be passed to `f` (the same values are passed to every call to `f`)
- Note: each call to `f` looks like: `f(data.loc[inds], *args, **kwargs)`, where `inds` is a binary array specifying which rows from `data` are from the current combination of conditions.

Outputs:
- `results`: a list of results (returned by each call to `f`), one for each unique combination of conditions across the specified columns
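The behavior described above can be sketched as a standalone function. This is an illustrative re-implementation written from the description, not the code in magellan_loader.py; the name `apply_by_condition_sketch` is hypothetical, and the sorted ordering of the results is an assumption.

```python
import numpy as np
import pandas as pd

def apply_by_condition_sketch(data, columns, f, args=None, kwargs=None):
    """Call f once per unique combination of values in `columns`."""
    columns = [columns] if isinstance(columns, str) else list(columns)
    args = [] if args is None else args
    kwargs = {} if kwargs is None else kwargs

    # One row per unique combination (sorted order is an assumption)
    conditions = data[columns].drop_duplicates().sort_values(columns)

    results = []
    for values in conditions.itertuples(index=False, name=None):
        # Boolean mask marking rows that match the current combination.
        # Rows with missing values in `columns` never match.
        inds = np.ones(len(data), dtype=bool)
        for col, val in zip(columns, values):
            inds &= (data[col] == val).to_numpy()
        results.append(f(data.loc[inds], *args, **kwargs))
    return results

# Toy data (invented values): count samples per (delivery, mode) combination
data = pd.DataFrame({
    'mode': ['forage', 'forage', 'seek', 'seek', 'seek'],
    'delivery': [0, 0, 0, 1, 1],
    'x': [0.0, 0.5, 1.0, 1.5, 2.0],
})
counts = apply_by_condition_sketch(data, ['delivery', 'mode'], len)
```

Using a plain boolean array for the mask keeps the selection valid even when the index contains repeated timestamps, as can happen when DataFrames from several subjects are stacked.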
Inputs:
- `data`: the object to process. Must contain `x` and `y` fields or columns (e.g. both `data['x']` and `data['y']` must return one or more to-be-rounded values).

Outputs:
- `rounded_data`: a copy of `data` with the coordinates in `x` and `y` rounded to the nearest integer.
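A minimal sketch of this kind of coordinate rounding is shown below. The function name `round_coords_sketch` is hypothetical, and the repository's own implementation may differ, in particular in how it treats values that lie exactly halfway between two integers.

```python
import numpy as np
import pandas as pd

def round_coords_sketch(data):
    """Return a copy of `data` with 'x' and 'y' rounded to the nearest integer."""
    rounded = data.copy()  # leave the caller's object untouched
    # np.round accepts scalars, arrays, and pandas Series.
    # Note: exact halves are rounded to the nearest even integer (2.5 becomes 2).
    rounded['x'] = np.round(rounded['x'])
    rounded['y'] = np.round(rounded['y'])
    return rounded

# Toy positions (invented values)
positions = pd.DataFrame({'x': [0.2, 1.7, 2.6], 'y': [3.49, 4.51, 0.4]})
snapped = round_coords_sketch(positions)
```

Because structures are centered on blocks, rounding a position in this way gives the nearest integer grid coordinate.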
plot_paths: display behavioral data from each session, environment, and delivery from a single subject
The resulting figure contains one subplot per session and environment, each containing the foraging and delivery data:
- Foraging paths, denoted by `mode == 'forage'`, are colored (by delivery) using the `crest` palette
- Participant's seek paths, denoted by `mode == 'seek'`, are colored (by delivery) using the `flare` palette
- MAGELLAN's seek paths, denoted by `mode == 'autopilot'`, are colored (by delivery) using the `seagreen` palette
Inputs:
- `data`: a DataFrame containing one subject's data. Note: the `subj` column is not checked; strange results may be generated if data from multiple subjects are included.
- `envs`: a dictionary whose keys are environment names and whose values are DataFrames for the corresponding environments.
- `scale`: optional argument specifying how large to draw each environment in the resulting figure (default: 4)

Outputs:
- `fig`, `ax`: handles to the matplotlib Figure and list of axes objects for the subplots, respectively