Correspondence-Driven Trajectory Warping for Data-Efficient Imitation and Autonomous Play. Liang, W., Wang, S., Wang, H.-J., Bastani, O., Ma, Y. J., & Jayaraman, D. (Under review at NeurIPS), 2025.
Imitation learning has emerged as a promising paradigm for learning robotic manipulation policies, but leading methods require large, expensive datasets of human-collected demonstrations. Towards data-efficient imitation learning, we propose a novel non-parametric policy that produces useful behaviors from a few demonstrations. Given two-view images, it identifies semantic correspondences to anchor warps of demonstration trajectories into new real-world scenes. We show that this design is efficient and robust to significant variations in the spatial and semantic configuration of the scene, such as dramatic positional differences and out-of-distribution objects. As a result, the policy excels at a variety of manipulation tasks involving deformation, complex contacts, articulation, and precision, highlighting its flexibility and generality. In addition, we demonstrate how a bank of such policies can power autonomous multi-task play in the real world via a continuous cycle of task selection, execution, evaluation, and policy improvement, guided by vision-language models. This procedure generates an increasingly diverse demonstration dataset over time for each task, while minimizing the need for manual resets and human interventions. In a real household-like multi-object environment, our method is among the first to bootstrap many hours of diverse play from few demonstrations, autonomously producing hundreds of expert-level trajectories that could be used for downstream policy learning.
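The abstract does not spell out the warping procedure itself; as a rough illustration of the idea of anchoring a demonstration trajectory on semantic correspondences, the sketch below assumes a rigid (SE(3)) warp fit to corresponded 3D keypoints with the Kabsch algorithm in NumPy. The keypoints, the rigid-warp assumption, and all names here are illustrative, not the paper's actual method.

# Minimal sketch: warp a demo trajectory into a new scene by anchoring a
# rigid SE(3) transform on corresponded 3D keypoints (an assumption; the
# paper's actual warping procedure may differ).
import numpy as np

def fit_rigid_transform(src_pts: np.ndarray, dst_pts: np.ndarray):
    """Least-squares rigid transform (Kabsch) mapping src_pts -> dst_pts, both (N, 3)."""
    src_c, dst_c = src_pts.mean(axis=0), dst_pts.mean(axis=0)
    H = (src_pts - src_c).T @ (dst_pts - dst_c)       # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))            # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def warp_trajectory(demo_traj: np.ndarray, demo_kpts: np.ndarray, new_kpts: np.ndarray):
    """Apply the keypoint-anchored warp to every (x, y, z) waypoint of the demo."""
    R, t = fit_rigid_transform(demo_kpts, new_kpts)
    return demo_traj @ R.T + t

# Usage: demo_kpts / new_kpts would come from semantic correspondences between
# the demonstration images and the new two-view observation.
demo_traj = np.array([[0.40, 0.00, 0.10], [0.45, 0.05, 0.05], [0.45, 0.05, 0.02]])
demo_kpts = np.array([[0.45, 0.05, 0.02], [0.50, 0.00, 0.02], [0.40, 0.10, 0.02]])
new_kpts  = demo_kpts + np.array([0.10, -0.05, 0.0])  # object shifted in the new scene
print(warp_trajectory(demo_traj, demo_kpts, new_kpts))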
@article{liang2025trajectory,
  title={Correspondence-Driven Trajectory Warping for Data-Efficient Imitation and Autonomous Play},
  author={Liang, William and Wang, Sam and Wang, Hung-Ju and Bastani, Osbert and Ma, Yecheng Jason and Jayaraman, Dinesh},
  journal={(Under review at NeurIPS)},
  year={2025},
  abstract={Imitation learning has emerged as a promising paradigm for learning robotic manipulation policies, but leading methods require large, expensive datasets of human-collected demonstrations. Towards data-efficient imitation learning, we propose a novel non-parametric policy that produces useful behaviors from a few demonstrations. Given two-view images, it identifies semantic correspondences to anchor warps of demonstration trajectories into new real-world scenes. We show that this design is efficient and robust to significant variations in the spatial and semantic configuration of the scene, such as dramatic positional differences and out-of-distribution objects. As a result, the policy excels at a variety of manipulation tasks involving deformation, complex contacts, articulation, and precision, highlighting its flexibility and generality. In addition, we demonstrate how a bank of such policies can power autonomous multi-task play in the real world via a continuous cycle of task selection, execution, evaluation, and policy improvement, guided by vision-language models. This procedure generates an increasingly diverse demonstration dataset over time for each task, while minimizing the need for manual resets and human interventions. In a real household-like multi-object environment, our method is among the first to bootstrap many hours of diverse play from few demonstrations, autonomously producing hundreds of expert-level trajectories that could be used for downstream policy learning.}
}
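The autonomous play cycle described in the abstract (task selection, execution, evaluation, policy improvement, guided by vision-language models) could be organized roughly as below. Every callable name is a hypothetical placeholder passed in by the caller, not an interface from the paper or its released code.

# Sketch of the play cycle: a VLM proposes a task, a warped demonstration is
# executed, a VLM judges success, and successful rollouts are added back to
# the demonstration bank. All interfaces here are illustrative assumptions.
from typing import Callable, Dict, List

def autonomous_play(
    tasks: List[str],
    demo_bank: Dict[str, list],                        # per-task demonstration trajectories
    select_task: Callable[[List[str], dict], str],     # VLM proposes the next task
    execute_task: Callable[[str, list], tuple],        # warp a demo, run it; returns (trajectory, observation)
    evaluate_success: Callable[[str, object], bool],   # VLM judges success from the final observation
    num_rounds: int = 100,
):
    """Grow each task's demonstration set through repeated autonomous attempts."""
    for _ in range(num_rounds):
        scene = {"demos_per_task": {t: len(demo_bank[t]) for t in tasks}}
        task = select_task(tasks, scene)
        trajectory, observation = execute_task(task, demo_bank[task])
        if evaluate_success(task, observation):
            # Successful rollouts become new demonstrations, diversifying the bank over time
            demo_bank[task].append(trajectory)
    return demo_bank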
