"Not Aligned" is Not "Malicious": Being Careful about Hallucinations of Large Language Models' Jailbreak

Publication
Proceedings of the International Conference on Computational Linguistics (COLING 2025), pages 2144–2162