Understanding Post-training through the Lens of Intrinsic Dimension. [Slides]
Discrepancies are Virtue: Weak-to-Strong Generalization through Lens of Intrinsic Dimension. [Slides]
Toward Fast and Provable Data Selection under Low Intrinsic Dimension. [Slides]
Robust Blockwise Random Pivoting: Fast and Accurate Adaptive Interpolative Decomposition. [Slides]
Efficient Bounds and Estimates for Canonical Angles in Randomized Subspace Approximations. [Slides]
Conference Publications
Does Weak-to-strong Generalization Happen under Spurious Correlations? Chenruo Liu*, Yijun Dong*, Qi Lei. ICLR 2026.
Discrepancies are Virtue: Weak-to-Strong Generalization through Lens of Intrinsic Dimension. Yijun Dong, Yicheng Li, Yunai Li, Jason D. Lee, Qi Lei. ICML 2025.
Greedy Output Approximation: Towards Efficient Structured Pruning for LLMs Without Retraining. Jianwei Li, Yijun Dong, Qi Lei. CPAL 2025.
Sketchy Moment Matching: Toward Fast and Provable Data Selection for Finetuning. Yijun Dong*, Hoang Phan*, Xiang Pan*, Qi Lei. NeurIPS 2024. [GitHub]
Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering. Yijun Dong*, Kevin Miller*, Qi Lei, Rachel Ward. NeurIPS 2023. [GitHub]
Adaptively Weighted Data Augmentation Consistency Regularization for Robust Optimization under Concept Shift. Yijun Dong*, Yuege Xie*, Rachel Ward. ICML 2023. [GitHub]
Sample Efficiency of Data Augmentation Consistency Regularization. Shuo Yang*, Yijun Dong*, Rachel Ward, Inderjit Dhillon, Sujay Sanghavi, Qi Lei. AISTATS 2023.
Journal Publications
Robust Blockwise Random Pivoting: Fast and Accurate Adaptive Interpolative Decomposition. Yijun Dong, Chao Chen, Per-Gunnar Martinsson, Katherine Pearce. SIAM Journal on Matrix Analysis and Applications 2025. [GitHub]
Adaptive Parallelizable Algorithms for Interpolative Decompositions via Partially Pivoted LU. Katherine J. Pearce, Chao Chen, Yijun Dong, Per-Gunnar Martinsson. Numerical Linear Algebra with Applications 2025. [GitHub]
Efficient Bounds and Estimates for Canonical Angles in Randomized Subspace Approximations. Yijun Dong, Per-Gunnar Martinsson, Yuji Nakatsukasa. SIAM Journal on Matrix Analysis and Applications 2024. [GitHub]
Simpler is better: A comparative study of randomized algorithms for computing the CUR decomposition. Yijun Dong, Per-Gunnar Martinsson. Advances in Computational Mathematics 2023. [GitHub]
Quantifying Biofilm Formation of Sinorhizobium meliloti Bacterial Strains in Microfluidic Platforms by Measuring the Diffusion Coefficient of Polystyrene Beads. Chen Cheng*, Yijun Dong*, Matthew Dorian*, Farhan Kamili*, Effrosyni Seitaridou. Open Journal of Biophysics 2017.
Preprints & Other Publications
A Task-Centric Theory for Iterative Self-Improvement with Easy-to-Hard Curricula. Chenruo Liu, Yijun Dong, Yiqiu Shen, Qi Lei.
Randomized Time Stepping of Nonlinearly Parametrized Solutions of Evolution Problems. Yijun Dong, Paul Schwerdtner, Benjamin Peherstorfer.
Balanced Locality-sensitive Hashing for Online Data Selection. Hoang Phan, Yijun Dong, Andrew Gordon Wilson, Qi Lei. Optimization for Machine Learning Workshop @ NeurIPS 2025.
Randomly Pivoted V-optimal Design: Fast Data Selection under Low Intrinsic Dimension. Yijun Dong*, Xiang Pan*, Hoang Phan*, Qi Lei. Workshop on Machine Learning and Compression @ NeurIPS 2024.
* denotes equal contribution or alphabetical order. For a complete list of my publications, please visit Google Scholar.