Correspondence-Driven Trajectory Warping for Data-Efficient Imitation and Autonomous Play. Liang, W., Wang, S., Wang, H.-J., Bastani, O., Ma, Y. J., & Jayaraman, D. (Under review at NeurIPS), 2025.
Imitation learning has emerged as a promising paradigm for learning robotic manipulation policies, but leading methods require large, expensive datasets of human-collected demonstrations. Towards data-efficient imitation learning, we propose a novel non-parametric policy that produces useful behaviors from a few demonstrations. Given two-view images, it identifies semantic correspondences to anchor warps of demonstration trajectories into new real-world scenes. We show that this design is efficient and robust to significant variations in the spatial and semantic configuration of the scene, such as dramatic positional differences and out-of-distribution objects. As a result, the policy excels at a variety of manipulation tasks involving deformation, complex contacts, articulation, and precision, highlighting its flexibility and generality. In addition, we demonstrate how a bank of such policies can power autonomous multi-task play in the real world via a continuous cycle of task selection, execution, evaluation, and policy improvement, guided by vision-language models. This procedure generates an increasingly diverse demonstration dataset over time for each task, while minimizing the need for manual resets and human interventions. In a real household-like multi-object environment, our method is among the first to bootstrap many hours of diverse play from few demonstrations, autonomously producing hundreds of expert-level trajectories that could be used for downstream policy learning.
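The abstract does not spell out the warping procedure itself; as a rough illustration of the idea of anchoring a demonstration trajectory on semantic correspondences, the sketch below assumes a rigid (SE(3)) warp fit to corresponded 3D keypoints with the Kabsch algorithm in NumPy. The keypoints, the rigid-warp assumption, and all names here are illustrative, not the paper's actual method.

# Minimal sketch: warp a demo trajectory into a new scene by anchoring a
# rigid SE(3) transform on corresponded 3D keypoints (an assumption; the
# paper's actual warping procedure may differ).
import numpy as np

def fit_rigid_transform(src_pts: np.ndarray, dst_pts: np.ndarray):
    """Least-squares rigid transform (Kabsch) mapping src_pts -> dst_pts, both (N, 3)."""
    src_c, dst_c = src_pts.mean(axis=0), dst_pts.mean(axis=0)
    H = (src_pts - src_c).T @ (dst_pts - dst_c)       # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))            # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def warp_trajectory(demo_traj: np.ndarray, demo_kpts: np.ndarray, new_kpts: np.ndarray):
    """Apply the keypoint-anchored warp to every (x, y, z) waypoint of the demo."""
    R, t = fit_rigid_transform(demo_kpts, new_kpts)
    return demo_traj @ R.T + t

# Usage: demo_kpts / new_kpts would come from semantic correspondences between
# the demonstration images and the new two-view observation.
demo_traj = np.array([[0.40, 0.00, 0.10], [0.45, 0.05, 0.05], [0.45, 0.05, 0.02]])
demo_kpts = np.array([[0.45, 0.05, 0.02], [0.50, 0.00, 0.02], [0.40, 0.10, 0.02]])
new_kpts  = demo_kpts + np.array([0.10, -0.05, 0.0])  # object shifted in the new scene
print(warp_trajectory(demo_traj, demo_kpts, new_kpts))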
@article{liang2025trajectory,
  title={Correspondence-Driven Trajectory Warping for Data-Efficient Imitation and Autonomous Play},
  author={Liang, William and Wang, Sam and Wang, Hung-Ju and Bastani, Osbert and Ma, Yecheng Jason and Jayaraman, Dinesh},
  journal={(Under review at NeurIPS)},
  year={2025},
  abstract={Imitation learning has emerged as a promising paradigm for learning robotic manipulation policies, but leading methods require large, expensive datasets of human-collected demonstrations. Towards data-efficient imitation learning, we propose a novel non-parametric policy that produces useful behaviors from a few demonstrations. Given two-view images, it identifies semantic correspondences to anchor warps of demonstration trajectories into new real-world scenes. We show that this design is efficient and robust to significant variations in the spatial and semantic configuration of the scene, such as dramatic positional differences and out-of-distribution objects. As a result, the policy excels at a variety of manipulation tasks involving deformation, complex contacts, articulation, and precision, highlighting its flexibility and generality. In addition, we demonstrate how a bank of such policies can power autonomous multi-task play in the real world via a continuous cycle of task selection, execution, evaluation, and policy improvement, guided by vision-language models. This procedure generates an increasingly diverse demonstration dataset over time for each task, while minimizing the need for manual resets and human interventions. In a real household-like multi-object environment, our method is among the first to bootstrap many hours of diverse play from few demonstrations, autonomously producing hundreds of expert-level trajectories that could be used for downstream policy learning.}
}
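The autonomous play cycle described in the abstract (task selection, execution, evaluation, policy improvement, guided by vision-language models) could be organized roughly as below. Every callable name is a hypothetical placeholder passed in by the caller, not an interface from the paper or its released code.

# Sketch of the play cycle: a VLM proposes a task, a warped demonstration is
# executed, a VLM judges success, and successful rollouts are added back to
# the demonstration bank. All interfaces here are illustrative assumptions.
from typing import Callable, Dict, List

def autonomous_play(
    tasks: List[str],
    demo_bank: Dict[str, list],                        # per-task demonstration trajectories
    select_task: Callable[[List[str], dict], str],     # VLM proposes the next task
    execute_task: Callable[[str, list], tuple],        # warp a demo, run it; returns (trajectory, observation)
    evaluate_success: Callable[[str, object], bool],   # VLM judges success from the final observation
    num_rounds: int = 100,
):
    """Grow each task's demonstration set through repeated autonomous attempts."""
    for _ in range(num_rounds):
        scene = {"demos_per_task": {t: len(demo_bank[t]) for t in tasks}}
        task = select_task(tasks, scene)
        trajectory, observation = execute_task(task, demo_bank[task])
        if evaluate_success(task, observation):
            # Successful rollouts become new demonstrations, diversifying the bank over time
            demo_bank[task].append(trajectory)
    return demo_bank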
