Hi! I am currently pursuing my second master's degree at the School of Biomedical Informatics, University of Texas Health Science Center at Houston, with plans to continue into a PhD program.
Previously, I earned my first master's degree in Computer Science & Engineering at Korea University under Dr. Jaewoo Kang, after completing a bachelor's degree in Biotechnology at Korea University.
I have also conducted research as a visiting scholar at Boston Children's Hospital and Harvard Medical School, advised by Dr. Timothy A. Miller.
My research focuses on natural language processing (NLP) for biomedical and clinical applications, with particular interest in multimodal large language models and in developing interpretable, trustworthy AI for healthcare and scientific discovery.
My work has been presented at venues such as ACL, EMNLP, and NAACL.
About me
Research Interests
- Clinical/Medical NLP
- Scientific AI
- Multimodality
- Large Language Models (LLMs)
Education
M.S. Student in Biomedical Informatics
[Aug 2025 - ] University of Texas Health Science Center at Houston
@ Houston, Texas, USA
M.S. in Computer Science and Engineering
[Mar 2023 - Feb 2025] Korea University
Data Mining and Information Systems (DMIS) Lab (Advisor: Dr. Jaewoo Kang)
GPA: 4.0/4.0
@ Seoul, South Korea
Exchange Student in School of Biology
[Feb 2018 - Jul 2018] University of Lausanne
@ Lausanne, Switzerland
B.S. in Biotechnology
[Mar 2016 - Feb 2022] Korea University
Controlled Environment for Plant Production Lab (Advisor: Dr. Jongyun Kim)
GPA: 3.6/4.0
@ Seoul, South Korea
Academic Experience
Boston Children's Hospital
(Harvard Medical School)
[Oct 2024 - Apr 2025]
Visiting Scholar
Machine Learning for Medical Language (MLML) Lab
(Advisor: Dr. Timothy A. Miller)
@ Boston, Massachusetts, USA
Korea University
[Dec 2022 - Feb 2023] Research Intern
Data Mining and Information Systems (DMIS) Lab
(Advisor: Dr. Jaewoo Kang)
@ Seoul, South Korea
Korea University
[Feb 2021 - Feb 2021] Research Intern
Lab of Matrix Biology (Advisor: Dr. Chungho Kim)
@ Seoul, South Korea
KIST (Korea Institute of Science and Technology)
[Jan 2019 - Feb 2019] Research Intern
Theragnosis Lab (Advisor: Dr. Hyunkyung Kim)
@ Seoul, South Korea
Industry Experience
Silvia Health
[Jun 2025 - Aug 2025] Independent Researcher
Identifying voice-based biomarkers for Alzheimer's disease, continuing as an academic–industry collaboration
@ Seoul, South Korea
Kolon Lifescience
[Jan 2022 - Oct 2022] Clinical Trial Project Manager
Long-term follow-up team
@ Seoul, South Korea
BIO division at CJ CheilJedang
[Nov 2021 - Dec 2021] Research Intern
Purification group
@ Seoul, South Korea
Samsung Bioepis
[Jul 2021 - Aug 2021] Research Intern
Purification group
@ Incheon, South Korea
Projects
Classifying Alzheimer’s Disease with Voice and Transcript Data
[Jun 2025 – Ongoing]
– Classified Alzheimer’s disease and healthy controls using real-world Korean patient voice recordings and transcripts
– Compared multimodal LLMs against modality-specific models, analyzing acoustic and textual contributions
– Expanding the study to distinguish Alzheimer’s disease from mild cognitive impairment (MCI)
Integrating Radiology Findings for Mortality Prediction
[Oct 2024 – Ongoing]
– Compared VLMs that combine discharge notes with either CXR images or radiology reports for mortality prediction
– Found that CXRs provide stronger signals than radiology reports and complement them, with the findings validated by a radiologist
– In preparation as “Integration of Radiology Findings for Mortality Prediction: Radiology Reports vs. CXR Images”
Aspect-Oriented Summarization for Psychiatric Readmission Prediction
[Oct 2024 – Aug 2025]
– Developed LLM-based, aspect-oriented summarization pipelines for 30-day psychiatric readmission prediction
– Published at EMNLP 2025: “Aspect-Oriented Summarization for Psychiatric Short-Term Readmission Prediction”
ANGEL Framework
[Oct 2023 – May 2025]
– Enhanced a generative biomedical entity-linking model via preference optimization with negative samples
– Published at ACL 2025 Findings: “Learning from Negative Samples in Biomedical Generative Entity Linking”
– Released source code on GitHub and checkpoints on Hugging Face
Human-in-the-loop LLMs for Biomedical Hypothesis Generation
[Jul 2023 – Sep 2024]
– Designed RAG-based approaches for biomedical hypothesis generation to support human-in-the-loop LLMs
AI Platform for Precision Medicine in Diabetes Using Big Data
[May 2023 – Sep 2024]
– Fine-tuned LLMs to deliver expert-level explanations for diabetes care by integrating clinical guidelines
Open-source LLM for Explaining Korean Cultural Heritage
[2023 – 2024]
– Pretrained and fine-tuned multilingual LLMs and integrated RAG to improve contextual understanding and generate personalized storytelling about Korean heritage
Publications
[In Preparation] [2025]
, W. Yoon, H. Lee, J. O. Lee, J. Kang, T. Miller
[EMNLP 2025 Main] [2025]
W. Yoon, B. Ren, S. Thomas,
, G. Savova, M. Hall, T. Miller
[ACL 2025 Findings] [2025]
, H. Kim*, S. Park, J. Lee, M. Sung, J. Kang (*Equal contribution)
[NAACL 2024 ClinicalNLP Workshop] [2024]
H. Kim*,
, H. Lee, K. Jang, J. Lee, K. Lee, G. Kim, J. Kang (*Equal contribution)
[IEEE BIBM 2023] [2023]
S. Lee, G. Jang,
, S. Park, K. Yoo, J. Kim, S. Kim, J. Kang
[BioCreative VIII Track 3] [2023]
H. Kim,
, J. Sohn, T. Beck, M. Rei, S. Kim, T. Simpson, J. Posma, A. Lain, M. Sung, J. Kang
[ACL 2023 BioNLP Workshop] [2023]
G. Kim, H. Kim, L. Ji, S. Bae,
, M. Sung, H. Kim, K. Yan, E. Chang, J. Kang
Honors
Korea University - KIAT Scholarship Program
[2024]
Awarded to support a research secondment at Harvard Medical School, funded by the Ministry of Trade, Industry and Energy (MOTIE), Korea
Top 9 at [HackGlobal Grand Finals]
[2024]
AI-powered Social Gathering Platform Based on Hotel Guests' Individual Preferences
[Certificate]
[News Article (English)]
1st prize at [HackSeoul 2024 Social Responsibility Track]
[2024]
AI-Powered Express Bus Reservation Service Designed for Digitally Excluded Elderly
[News Article (Korean)]
[News Article (English)]
4th prize at [NAACL 2024 ClinicalNLP Workshop]
[2024]
Reliable Text-to-SQL Modeling on Electronic Health Records
[Challenge paper]
3rd prize at [BioCreative VIII Track 3]
[2023]
Genetic Phenotype Extraction and Normalization from Dysmorphology Physical Examination Entries
[Challenge paper]
1st prize at [ACL 2023 BioNLP Workshop]
[2023]
RadSum23: Multi-modal and Multi-anatomical Radiology Report Summarization
[Challenge paper]
[News Article (Korean)]