Bilingual Autoencoder-based Efficient Harmonization of Multi-source Private Data for Accurate Predictive Modeling

Author(s)
Lee, Taek-Ho; Lee, Junghye; Jun, Chi-Hyuck
Issued Date
2021-08
DOI
10.1016/j.ins.2021.03.064
URI
https://scholarworks.unist.ac.kr/handle/201301/52735
Fulltext
https://www.sciencedirect.com/science/article/pii/S0020025521003157?via%3Dihub
Citation
INFORMATION SCIENCES, v.568, pp.403 - 426
Abstract
Sharing electronic health record data is essential for advanced analysis, but may put sensitive information at risk. Several studies have attempted to address this risk using contextual embedding, but with many hospitals involved, they are often inefficient and inflexible. Thus, we propose a bilingual autoencoder-based model to harmonize local embeddings in different spaces. Cross-hospital reconstruction of embeddings makes encoders map embeddings from hospitals to a shared space and align them spontaneously. We also suggest two-phase training to prevent distortion of embeddings during harmonization with hospitals that have biased information. In experiments, we used medical event sequences from the Medical Information Mart for Intensive Care-III dataset and simulated the situation of multiple hospitals. For evaluation, we measured the alignment of events from different hospitals and the prediction accuracy of a patient's diagnosis in the next admission in three scenarios in which local embeddings do not work. The proposed method efficiently harmonizes embeddings in different spaces, increases prediction accuracy, and gives flexibility to include new hospitals, so is superior to previous methods in most cases. It will be useful in predictive tasks to utilize distributed data while preserving private information.
Publisher
ELSEVIER SCIENCE INC
ISSN
0020-0255
Keyword (Author)
Distributed EHR; Contextual embedding; Space alignment; Autoencoder; Predictive tasks
Keyword
REGRESSION; SECURE
