[Submitted on 20 Mar 2023]
Authors:Zhengliang Liu, Xiaowei Yu, Lu Zhang, Zihao Wu, Chao Cao, Haixing Dai, Lin Zhao, Wei Liu, Dinggang Shen, Quanzheng Li, Tianming Liu, Dajiang Zhu, Xiang Li
Abstract: The digitization of healthcare has facilitated the sharing and re-using of
medical data but has also raised concerns about confidentiality and privacy.
HIPAA (Health Insurance Portability and Accountability Act) mandates removing
re-identifying information before the dissemination of medical records. Thus,
effective and efficient solutions for de-identifying medical data, especially
those in free-text forms, are highly needed. While various computer-assisted
de-identification methods, including both rule-based and learning-based, have
been developed and used in prior practice, such solutions still lack
generaliz