STAIR Research Group | Scalable & Trustworthy AI Research
"Not Aligned" is Not "Malicious": Being Careful about Hallucinations of Large Language Models' Jailbreak
Lingrui Mei, Shenghua Liu, Yiwei Wang, Baolong Bi, Jiayi Mao, Xueqi Cheng
January 2025
Type: Conference paper
Publication: Proceedings of the International Conference on Computational Linguistics (COLING 2025), pages 2144–2162