cmdp We provide an algorithm combining methods in deep reinforcement learning, multi-task learning, and Bayesian optimization for estimating structural models in economics and finance.