Jenny Li © 2025

Projects

Thyroid carcinoma is the most common endocrine malignancy in children, and current guidelines recommend total thyroidectomy for nearly all pediatric cases. While effective, the procedure carries higher complication risks in children, including hypoparathyroidism and nerve injury. Improved preoperative diagnostics could reduce unnecessary surgeries and lifelong hormone dependence. Existing imaging-based approaches are subjective and variable. In this study, we demonstrate that genome-wide DNA methylation profiling robustly captures molecular features of pediatric thyroid carcinoma, including invasiveness and driver mutations. These findings support the potential of DNA methylation as a preoperative prognostic tool to inform treatment decisions and minimize surgical risk.

2025-26

Clonal hematopoiesis in germline BRCA1/2 carriers

Python

R

Shell scripting

PLINK

LPC

Clonal hematopoiesis of indeterminate potential (CHIP) arises from somatic mutations in hematopoietic stem cells that confer a growth advantage, and is influenced by germline variation in DNA damage response (DDR) genes. Whether germline BRCA1/2 (gBRCA1/2) carrier status predisposes individuals to CHIP independently of cancer diagnosis or genotoxic therapy exposure remains unclear. Whole-exome sequencing data from the Penn Medicine Biobank (PMBB) were processed through somatic and germline variant calling pipelines to identify CHIP variants and gBRCA1/2 carriers, respectively. gBRCA1/2 carrier status was independently associated with increased CHIP prevalence in a propensity-matched population, suggesting that inherited DNA repair deficiency may promote clonal hematopoiesis in the absence of genotoxic therapy.

Breast cancer is a leading cause of cancer morbidity among women worldwide. Large biobanks linked to electronic health records enable ancestry-aware genetic discovery at scale, helping to address the limited ancestral diversity of earlier studies. In this study, I used imputed genotype data from the Penn Medicine Biobank to perform an ancestry-stratified genome-wide association study of breast cancer risk. Employing clinical ICD-9/10 codes for case/control selection and genetic ancestry inference, I analyzed breast cancer susceptibility using REGENIE, a program for whole-genome regression modeling of large genome-wide association studies. These results contributed to the NCI-led Confluence Project, a large international consortium that has conducted the largest and most ancestrally diverse GWAS of breast cancer to date, nearly tripling the effective sample size of previous GWAS and substantially increasing sample diversity.
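As an illustration of ICD-based case/control selection, here is a minimal Python sketch. It is not the phenotype definition used in the study, which is not described here. The code prefixes (ICD-10 C50, ICD-9 174 and 175, all malignant neoplasm of breast), the record layout, and the function names are assumptions made for the example; a real definition would likely add control exclusions and other criteria.

```python
# Assumed prefixes for malignant neoplasm of breast (illustrative only).
BREAST_CANCER_PREFIXES = {
    "ICD10": ("C50",),
    "ICD9": ("174", "175"),
}


def is_breast_cancer_code(system, code):
    """Return True if `code` in coding system `system` matches an assumed prefix."""
    prefixes = BREAST_CANCER_PREFIXES.get(system, ())
    return code.strip().upper().startswith(prefixes)


def assign_case_status(records):
    """Map each person to 1 (case) or 0 (control).

    `records` is an iterable of (person_id, coding_system, code) tuples,
    a hypothetical flattened view of diagnosis data. A person is a case
    if any of their codes matches; everyone else seen is a control.
    """
    status = {}
    for person_id, system, code in records:
        hit = int(is_breast_cancer_code(system, code))
        status[person_id] = max(status.get(person_id, 0), hit)
    return status
```

Calling `assign_case_status` on a list of diagnosis tuples returns a dictionary of person IDs to 0/1 labels, which could then be written out as a phenotype file for an association tool.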

FALL 2025

Exome-wide association study of breast cancer in the Penn Medicine Biobank

Python

R

Shell scripting

REGENIE

PLINK

Exome

LPC

Breast cancer susceptibility is not only influenced by common genetic variation but also by rare, protein-altering variants not well captured by traditional genome-wide association studies. Whole-exome sequencing in large, clinically linked biobanks enables systematic interrogation of these rare variants at scale. In this study, we use whole-exome sequencing data from the Penn Medicine Biobank to perform an exome-wide association study of breast cancer risk using both single-variant and gene-based aggregation approaches. Employing ICD-9/10 codes for case/control selection, functional variant annotation, and multiple burden masks within the REGENIE mixed-model framework, we assess the contribution of rare coding variation to breast cancer susceptibility. This work establishes a scalable and reproducible exome analysis pipeline and contributes to the SIMPLEXO breast cancer project, supporting gene-level discovery in large biobank cohorts.
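To make the idea of a burden mask concrete, the sketch below collapses rare variants into a per-gene, per-sample burden score. It only illustrates the aggregation step and is not REGENIE's implementation or the pipeline used in this project. The annotation labels, the 1% allele-frequency cutoff, and the data layout are assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical masks: each names the annotation classes it admits.
MASKS = {
    "pLoF": {"pLoF"},
    "pLoF_or_missense": {"pLoF", "missense"},
}


def qualifying_variants(variants, mask_annotations, max_af):
    """Group variant IDs by gene, keeping those that pass the mask and AF cutoff."""
    by_gene = defaultdict(list)
    for v in variants:
        if v["annotation"] in mask_annotations and v["af"] <= max_af:
            by_gene[v["gene"]].append(v["id"])
    return dict(by_gene)


def burden_scores(genotypes, variants, mask_annotations, max_af=0.01):
    """Sum alternate-allele counts over qualifying variants per gene and sample.

    `genotypes` maps sample -> {variant_id: alt allele count}; variants
    missing for a sample are treated as 0 (an assumption of this sketch).
    """
    by_gene = qualifying_variants(variants, mask_annotations, max_af)
    return {
        gene: {
            sample: sum(calls.get(vid, 0) for vid in ids)
            for sample, calls in genotypes.items()
        }
        for gene, ids in by_gene.items()
    }
```

Running `burden_scores` once per entry in `MASKS` yields one score table per mask; each table would then be tested for association with case status in place of individual variants.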

Conferences & Presentations

APRIL 2026

Second author, Poster Presentation at the AACR Annual Meeting; Cancer Research

SUMMER 2025

Methylation-based Prognosis of Pediatric Thyroid Carcinoma Invasiveness

American Physician Scientists Association Mid-Atlantic Conference

SPRING 2025

DNA methylation-based stratification of pediatric thyroid tumor invasiveness

University of Pennsylvania Spring Research Exposition & Women in STEM Symposium

FALL 2023

Cross-platform DNA methylome-based cancer classification

Mid-Atlantic Bioinformatics Conference

FALL 2023

Exploration of feature selection strategies in DNA methylome-based cancer classification

University of Pennsylvania Fall Research Exposition