Training Language Models to Follow Instructions with Human Feedback. Ouyang, L., Wu, J., Jiang, X., & others. In Advances in Neural Information Processing Systems, 2022.
@inproceedings{ouyang2022rlhf,
  title={Training Language Models to Follow Instructions with Human Feedback},
  author={Ouyang, Long and Wu, Jeffrey and Jiang, Xu and others},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}
