Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron.

Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron. Zhao, Y., Zhang, W., Xie, Y., Goyal, A., Kawaguchi, K., & Shieh, M. In ICLR, 2025. OpenReview.net.

@inproceedings{conf/iclr/00060XGKS25,
  added-at = {2025-05-15T00:00:00.000+0200},
  author = {Zhao, Yiran and Zhang, Wenxuan and Xie, Yuxi and Goyal, Anirudh and Kawaguchi, Kenji and Shieh, Michael},
  biburl = {https://www.bibsonomy.org/bibtex/2ed98a12bdb834a43e255800db61056f9/dblp},
  booktitle = {ICLR},
  crossref = {conf/iclr/2025},
  ee = {https://openreview.net/forum?id=yR47RmND1m},
  interhash = {06e7a13e129905936d2a53d9e93f0c7d},
  intrahash = {ed98a12bdb834a43e255800db61056f9},
  keywords = {dblp},
  publisher = {OpenReview.net},
  timestamp = {2025-05-19T07:11:53.000+0200},
  title = {Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron.},
  url = {http://dblp.uni-trier.de/db/conf/iclr/iclr2025.html#00060XGKS25},
  year = 2025
}
