Discrepancies are Virtue: Weak-to-Strong Generalization through Lens of Intrinsic Dimension [Slides for Flatiron CCM ML Seminar]
Toward Fast and Provable Data Selection under Low Intrinsic Dimension [Slides for JMM 2025 ILAS Special Session on Randomness in Numerical Linear Algebra] [Slides for University of Delaware Numerical Analysis and PDE Seminar]
Robust Blockwise Random Pivoting: Fast and Accurate Adaptive Interpolative Decomposition [Slides for SIAM PP24 minisymposium on Randomized Methods in Linear Solvers and Matrix Decompositions]
Efficient Bounds and Estimates for Canonical Angles in Randomized Subspace Approximations [Slides for ICIAM 2023 minisymposium on Randomized Numerical Linear Algebra]
Conference Publications
Discrepancies are Virtue: Weak-to-Strong Generalization through Lens of Intrinsic Dimension. Yijun Dong, Yicheng Li, Yunai Li, Jason D. Lee, Qi Lei. ICML 2025.
Greedy Output Approximation: Towards Efficient Structured Pruning for LLMs Without Retraining. Jianwei Li, Yijun Dong, Qi Lei. CPAL 2025.
Sketchy Moment Matching: Toward Fast and Provable Data Selection for Finetuning. Yijun Dong*, Hoang Phan*, Xiang Pan*, Qi Lei. NeurIPS 2024. [GitHub]
Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering. Yijun Dong*, Kevin Miller*, Qi Lei, Rachel Ward. NeurIPS 2023. [GitHub]
Adaptively Weighted Data Augmentation Consistency Regularization for Robust Optimization under Concept Shift. Yijun Dong*, Yuege Xie*, Rachel Ward. ICML 2023. [GitHub, poster]
Sample Efficiency of Data Augmentation Consistency Regularization. Shuo Yang*, Yijun Dong*, Rachel Ward, Inderjit Dhillon, Sujay Sanghavi, Qi Lei. AISTATS 2023.
Journal Publications
Robust Blockwise Random Pivoting: Fast and Accurate Adaptive Interpolative Decomposition. Yijun Dong, Chao Chen, Per-Gunnar Martinsson, Katherine Pearce. SIAM Journal on Matrix Analysis and Applications 2025. [GitHub]
Adaptive Parallelizable Algorithms for Interpolative Decompositions via Partially Pivoted LU. Katherine J. Pearce, Chao Chen, Yijun Dong, Per-Gunnar Martinsson. Numerical Linear Algebra with Applications 2025. [GitHub]
Efficient Bounds and Estimates for Canonical Angles in Randomized Subspace Approximations. Yijun Dong, Per-Gunnar Martinsson, Yuji Nakatsukasa. SIAM Journal on Matrix Analysis and Applications 2024. [GitHub]
Simpler is better: A comparative study of randomized algorithms for computing the CUR decomposition. Yijun Dong, Per-Gunnar Martinsson. Advances in Computational Mathematics 2023. [GitHub]
Quantifying Biofilm Formation of Sinorhizobium meliloti Bacterial Strains in Microfluidic Platforms by Measuring the Diffusion Coefficient of Polystyrene Beads. Chen Cheng*, Yijun Dong*, Matthew Dorian*, Farhan Kamili*, Effrosyni Seitaridou. Open Journal of Biophysics 2017.
Workshop Publications
Randomly Pivoted V-optimal Design: Fast Data Selection under Low Intrinsic Dimension. Yijun Dong*, Xiang Pan*, Hoang Phan*, Qi Lei. Workshop on Machine Learning and Compression @ NeurIPS 2024.
(* denotes equal contribution or alphabetical order)