To be useful teammates to human partners, robots must be able to follow spoken instructions given in natural language. An important class of instructions involves interacting with people, such as "Follow the person to the kitchen" or "Meet the person at the elevators." These instructions require that the robot fluidly react to changes in the environment, not simply follow a pre-computed plan. We present an algorithm for understanding natural language commands with three components. First, we create a cost function that scores the language according to how well it matches a candidate plan in the environment, defined as the log-likelihood of the plan given the command. Components of the cost function include novel models for the meanings of motion verbs such as "follow," "meet," and "avoid," as well as spatial relations such as "to" and landmark phrases such as "the kitchen." Second, an inference method uses this cost function to perform forward search, finding a plan that matches the natural language command. Third, a high-level controller repeatedly calls the inference method at each timestep to compute a new plan in response to changes in the environment, such as the movement of the human partner or other people in the scene. When a command consists of more than a single task, the controller switches to the next task when an earlier one is satisfied. We evaluate our approach on a set of example tasks that require the ability to follow both simple and complex natural language commands.
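The three components described above (a cost function over candidate plans, forward search, and a controller that replans at every timestep and switches tasks) can be illustrated with a minimal sketch. This is not the authors' implementation: the grid world, the landmark coordinates, the function names, and the hand-written distance costs are all assumptions made for illustration, and the toy costs stand in for the learned log-likelihood models of verbs, spatial relations, and landmarks.

```python
from itertools import product

# Assumption: a toy grid world with 4-connected moves plus "stay in place".
MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
KITCHEN = (5, 0)    # hypothetical landmark location
ELEVATORS = (5, 3)  # hypothetical landmark location


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def follow_cost(person):
    """Toy stand-in for the cost of "follow the person": lower is better."""
    return lambda plan: sum(manhattan(p, person) for p in plan)


def goto_cost(landmark):
    """Toy stand-in for the cost of "go to <landmark>"; ignores the person."""
    return lambda person: (lambda plan: sum(manhattan(p, landmark) for p in plan))


def forward_search(start, cost_fn, horizon=3):
    """Enumerate every move sequence of length `horizon`; keep the cheapest plan."""
    best_plan, best_cost = None, float("inf")
    for seq in product(MOVES, repeat=horizon):
        pos, plan = start, []
        for dx, dy in seq:
            pos = (pos[0] + dx, pos[1] + dy)
            plan.append(pos)
        cost = cost_fn(plan)
        if cost < best_cost:
            best_plan, best_cost = plan, cost
    return best_plan


def run_controller(robot, person_track, tasks):
    """Replan at every timestep against the latest observed person position.

    `tasks` is a list of (make_cost, is_done) pairs; the controller moves on
    to the next task once the current one is satisfied.
    """
    task_idx, trace = 0, [robot]
    for person in person_track:
        if task_idx >= len(tasks):
            break
        make_cost, is_done = tasks[task_idx]
        plan = forward_search(robot, make_cost(person))
        robot = plan[0]  # execute only the first step, then replan
        trace.append(robot)
        if is_done(robot, person):
            task_idx += 1
    return trace, task_idx


# Hypothetical two-task command: follow the person to the kitchen,
# then go to the elevators.
TASKS = [
    (follow_cost, lambda r, p: p == KITCHEN and manhattan(r, p) <= 1),
    (goto_cost(ELEVATORS), lambda r, p: r == ELEVATORS),
]

# The person walks to the kitchen and then stays there.
PERSON_TRACK = [(3, 0), (4, 0)] + [KITCHEN] * 11

trace, tasks_done = run_controller((0, 0), PERSON_TRACK, TASKS)
print(trace, tasks_done)
```

Executing only the first step of each plan and then searching again is what lets the robot react when the person moves. A fixed plan computed at the first timestep would aim at the person's initial position rather than the kitchen.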