Jenny Li © 2025

Projects

2023-25

First author, Clinical Cancer Research

R

Shell scripting

EPICv2 Methylation Arrays

HPC

Thyroid carcinoma is the most common endocrine malignancy in children, and current guidelines recommend total thyroidectomy for nearly all pediatric cases. While effective, the procedure carries higher complication risks in children, including hypoparathyroidism and nerve injury. Improved preoperative diagnostics could reduce unnecessary surgeries and lifelong hormone dependence. Existing imaging-based approaches are subjective and variable. In this study, we demonstrate that genome-wide DNA methylation profiling robustly captures molecular features of pediatric thyroid carcinoma, including invasiveness and driver mutations. These findings support the potential of DNA methylation as a preoperative prognostic tool to inform treatment decisions and minimize surgical risk.

FALL 2025

Exome-Wide Association Study of Breast Cancer Risk in Penn Medicine Biobank

Python

R

Shell scripting

REGENIE

PLINK

Exome

LPC

Breast cancer susceptibility is influenced not only by common genetic variation but also by rare, protein-altering variants with potentially large effects that are not well captured by traditional genome-wide association studies. Whole-exome sequencing in large, clinically linked biobanks enables systematic interrogation of these rare variants at scale. In this study, we use whole-exome sequencing data from the Penn Medicine Biobank to perform an exome-wide association study of breast cancer risk using both single-variant and gene-based aggregation approaches. Using ICD-9/10 codes for case/control selection, functional variant annotation, and multiple burden masks within a mixed-model framework, we assess the contribution of rare coding variation to breast cancer susceptibility. This work establishes a scalable and reproducible exome analysis pipeline and contributes rare variant association results to the SIMPLEXO breast cancer project, supporting gene-level discovery in large biobank cohorts.

SUMMER 2025

Ancestry-Stratified Genome-Wide Association Study of Breast Cancer Risk in Penn Medicine Biobank

Python

R

Shell scripting

REGENIE

PLINK

LPC

Breast cancer is a leading cause of cancer morbidity among women worldwide. Large biobanks linked to electronic health records enable ancestry-aware genetic discovery at scale. In this study, I used imputed genotype data from the Penn Medicine Biobank to perform an ancestry-stratified genome-wide association study of breast cancer risk. With cases and controls selected by clinical ICD-9/10 codes and participants grouped by inferred genetic ancestry, I analyzed breast cancer susceptibility across diverse populations with REGENIE, a mixed-model framework that accounts for population structure, relatedness, and technical confounders. These results establish a reproducible pipeline for integrating biobank genomic data with clinical phenotypes and contribute ancestry-informed association results to the Confluence Project, which supports robust discovery of genetic risk factors underlying breast cancer.

2025-26

Clonal hematopoiesis in germline BRCA1/2 carriers

Python

R

Shell scripting

PLINK

LPC

Clonal hematopoiesis (CH) is the age-related expansion of blood cell clones harboring somatic mutations and is associated with increased risk of hematologic malignancies, cardiovascular disease, and inflammatory conditions. In cancer patients, genotoxic therapies such as platinum chemotherapy and PARP inhibitors preferentially select for CH mutations in DNA damage response genes (e.g., TP53, PPM1D, ATM, CHEK2), in addition to common age-related epigenetic regulators (DNMT3A, TET2, ASXL1), because these clones resist DNA damage–induced apoptosis. Germline BRCA1/2 (gBRCA1/2) mutation carriers have impaired double-strand break repair and genomic instability, which may predispose them to CH even in the absence of cancer or therapy; however, prior studies have largely focused on individuals with cancer or on therapy-related myeloid neoplasms, making it difficult to separate inherited risk from treatment effects. This study therefore aims to determine whether gBRCA1/2 mutation carriers exhibit increased CH prevalence independent of cancer diagnosis or exposure to genotoxic therapy.

Presentations

SUMMER 2025

Methylation-based Prognosis of Pediatric Thyroid Carcinoma Invasiveness

American Physician Scientists Association Mid-Atlantic Conference

SPRING 2025

DNA methylation-based stratification of pediatric thyroid tumor invasiveness

University of Pennsylvania Spring Research Exposition & Women in STEM Symposium

FALL 2023

Cross-platform DNA methylome-based cancer classification

Mid-Atlantic Bioinformatics Conference

FALL 2023

Exploration of feature selection strategies in DNA methylome-based cancer classification

University of Pennsylvania Fall Research Exposition