Hi! 👋🏻 I’m Yuxuan (Leo) Lu, a Ph.D. student at Northeastern University. I’m currently working as an applied scientist intern at Amazon. Before that, I received my B.E. in Computer Science and Technology, graduating with honors, from Beijing University of Technology. I’m advised by Prof. Dakuo Wang. My research interests include Human-Computer Interaction and Natural Language Processing, especially training, running, and utilizing Large Language Models (LLMs) efficiently and effectively. In the past, I worked as a Machine Learning Researcher in a joint program between LinkedIn and Microsoft Research Asia. I’ve also worked as an intern research assistant at the THUNLP lab, supervised by Prof. Zhiyuan Liu (刘知远).
Picture of me, taken at Sayram Lake (赛里木湖)
Education
I’m currently pursuing my Ph.D. in Computer Science at the Khoury College of Computer Sciences, Northeastern University, advised by Prof. Dakuo Wang.
I received my B.E. in Computer Science and Technology, graduating with honors, from Beijing University of Technology. Before that, I completed junior and senior high school at Beijing National Day School (北京市十一学校).
Preprints
2024
RECOVER: Designing a Large Language Model-based Remote Patient Monitoring System for Postoperative GI Cancer Care
Ziqi Yang*, Yuxuan Lu*, Jennifer Bagdasarian, Vedant Das Swain, Ritu Agarwal, Collin Campbell, Waddah Al-Refaire, Jehan El-Bayoumi, Guodong (Gordon) Gao, Dakuo Wang, and Bingsheng Yao
In Submission to CHI 2025, 2024
UXAgent: A Large Language Model-based Agent System for Usability Testing of Web Design
In Submission to CHI 2025, 2024
Characterizing LLM-Empowered Personalized Story Reading and Interaction for Children: Insights From Multi-Stakeholders’ Perspective
In Submission to CHI 2025, 2024
From Dark Data to Open Data: Challenges and Practices for Data Integrators of Data-Driven Open Science Projects in Geoscience
In Submission to CSCW 2025, 2024
Exploring Domain Adaptation with LLMs for Real-World Augmented Question Answer Generation (RA-QAG) in Children Storytelling
In Submission to EMNLP 2024, 2024
ALERTS: Active Learning and Ensemble LLM Real-Time Switch for Real-World Data Drift Challenges
In Submission to EMNLP 2024, 2024
2023
Human Still Wins over LLM: An Empirical Study of Active Learning on Domain-Specific Annotation Tasks
arXiv preprint arXiv:2311.09825, 2023
Large Language Models (LLMs) have demonstrated considerable advances, and several claims have been made about their exceeding human performance. However, in real-world tasks, domain knowledge is often required. Low-resource learning methods like Active Learning (AL) have been proposed to tackle the cost of domain expert annotation, raising this question: Can LLMs surpass compact models trained with expert annotations in domain-specific tasks? In this work, we conduct an empirical experiment on four datasets from three different domains comparing SOTA LLMs with small models trained on expert annotations with AL. We found that small models can outperform GPT-3.5 with a few hundred labeled examples, and they achieve higher or comparable performance to GPT-4 despite being hundreds of times smaller. Based on these findings, we posit that LLM predictions can be used as a warmup method in real-world applications and human experts remain indispensable in tasks involving data annotation driven by domain-specific knowledge.
Publications
2024
More Samples or More Prompt Inputs? Exploring Effective In-Context Sampling for LLM Few-Shot Prompt Engineering
In Findings of the Association for Computational Linguistics: NAACL 2024, 2024
While most existing works on LLM prompt-engineering focus only on how to select a better set of data samples inside one single prompt input (In-Context Learning or ICL), why can’t we design and leverage multiple prompt inputs together to further improve the LLM performance? In this work, we propose In-Context Sampling (ICS), a low-resource LLM prompt-engineering technique to produce the most confident prediction results by optimizing the construction of multiple ICL prompt inputs. Extensive experiments with two SOTA LLMs (FlanT5-XL and Mistral-7B) on three NLI datasets (e-SNLI, Multi-NLI, and ANLI) illustrate that ICS can consistently enhance LLM’s prediction performance and confidence. An ablation study suggests that a diversity-based ICS strategy may further improve LLM’s performance, which sheds light on a new yet promising future research direction.
Professional Network Matters: Connections Empower Person-Job Fit
Hao Chen,
Lun Du,
Yuxuan Lu, Qiang Fu, Xu Chen, Shi Han, Yanbin Kang, Guangming Lu, and Zi Li
In Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024
Online recruitment platforms typically employ Person-Job Fit models in the core service that automatically match suitable job seekers with appropriate job positions. While existing works leverage historical or contextual information, they often disregard a crucial aspect: job seekers’ social relationships in professional networks. This paper emphasizes the importance of incorporating professional networks into the Person-Job Fit model. Our innovative approach consists of two stages: (1) defining a Workplace Heterogeneous Information Network (WHIN) to capture heterogeneous knowledge, including professional connections and pre-training representations of various entities using a heterogeneous graph neural network; (2) designing a Contextual Social Attention Graph Neural Network (CSAGNN) that supplements users’ missing information with professional connections’ contextual information. We introduce a job-specific attention mechanism in CSAGNN to handle noisy professional networks, leveraging pre-trained entity representations from WHIN. We demonstrate the effectiveness of our approach through experimental evaluations conducted across three real-world recruitment datasets from LinkedIn, showing superior performance compared to baseline models.
Exploring Parent’s Needs for Children-Centered AI to Support Preschoolers’ Storytelling and Reading Activities
Proc. ACM Hum.-Comput. Interact., 2024
Interactive storytelling is vital for preschooler development. While children’s interactive partners have traditionally been their parents and teachers, recent advances in artificial intelligence (AI) have sparked a surge of AI-based storytelling technologies. As these technologies become increasingly ubiquitous in preschoolers’ lives, questions arise regarding how they function in practical storytelling scenarios and, in particular, how parents, the most critical stakeholders, experience and perceive these technologies. This paper investigates these questions through a qualitative study with 17 parents of children aged 3-6. Our findings suggest that even though AI-based storytelling technologies provide more immersive and engaging interaction, they still cannot meet parents’ expectations due to a series of interactive, functional, and algorithmic challenges. We elaborate on these challenges and discuss the possible implications of future AI-based storytelling technologies for preschoolers. We conclude by highlighting the design implications for future AI-based storytelling technologies.
Rethinking Human-AI Collaboration in Complex Medical Decision Making: A Case Study in Sepsis Diagnosis
Shao Zhang, Jianing Yu, Xuhai Xu, Changchang Yin, Yuxuan Lu, Bingsheng Yao, Melanie Tory, Lace M. Padilla, Jeffrey Caterino, Ping Zhang, and Dakuo Wang
In Proceedings of the CHI Conference on Human Factors in Computing Systems, 2024
Today’s AI systems for medical decision support often succeed on benchmark datasets in research papers but fail in real-world deployment. This work focuses on decision making for sepsis, an acute life-threatening systemic infection that requires an early diagnosis with high uncertainty from the clinician. Our aim is to explore the design requirements for AI systems that can support clinical experts in making better decisions for the early diagnosis of sepsis. The study begins with a formative study investigating why clinical experts abandon an existing AI-powered sepsis predictive module in their electronic health record (EHR) system. We argue that a human-centered AI system needs to support human experts in the intermediate stages of a medical decision-making process (e.g., generating hypotheses or gathering data), instead of focusing only on the final decision. Therefore, we build SepsisLab based on a state-of-the-art AI algorithm and extend it to predict the future projection of sepsis development, visualize the prediction uncertainty, and propose actionable suggestions (i.e., which additional laboratory tests can be collected) to reduce such uncertainty. Through heuristic evaluation with six clinicians using our prototype system, we demonstrate that SepsisLab enables a promising human-AI collaboration paradigm for the future of AI-assisted sepsis diagnosis and other high-stakes medical decision making.
StorySpark: Expert-Annotated QA Pairs with Real-World Knowledge for Children Storytelling
In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Interactive story reading is a common parent-child activity, where parents expect to teach both language skills and real-world knowledge beyond the story. While an increasing number of storytelling and reading systems have been developed for this activity, they often fail to infuse real-world knowledge into the conversation. This limitation can be attributed to the existing question-answering (QA) datasets used for children’s education, upon which the systems are built, failing to capture the nuances of how education experts think when conducting interactive story reading activities. To bridge this gap, we design an annotation framework, empowered by an existing knowledge graph, to capture experts’ annotations and thinking process, and leverage this framework to construct the StorySparkQA dataset, which comprises 5,868 expert-annotated QA pairs with real-world knowledge. We conduct automated and human expert evaluations across various QA pair generation settings to demonstrate that our StorySparkQA can effectively support models in generating QA pairs that target real-world knowledge beyond story content. StorySparkQA is available at https://huggingface.co/datasets/NEU-HAI/StorySparkQA.
2023
Beyond Labels: Empowering Human Annotators with Natural Language Explanations through a Novel Active-Learning Architecture
Bingsheng Yao, Ishan Jindal, Lucian Popa, Yannis Katsis, Sayan Ghosh, Lihong He, Yuxuan Lu, Shashank Srivastava, Yunyao Li, James Hendler, and Dakuo Wang
In Findings of the Association for Computational Linguistics: EMNLP 2023, Dec 2023
Real-world domain experts (e.g., doctors) rarely annotate only a decision label in their day-to-day workflow without providing explanations. Yet, existing low-resource learning techniques, such as Active Learning (AL), that aim to support human annotators mostly focus on the label while neglecting the natural language explanation of a data point. This work proposes a novel AL architecture to support experts’ real-world need for label and explanation annotations in low-resource scenarios. Our AL architecture leverages an explanation-generation model to produce explanations guided by human explanations, a prediction model that utilizes generated explanations toward prediction faithfully, and a novel data diversity-based AL sampling strategy that benefits from the explanation annotations. Automated and human evaluations demonstrate the effectiveness of incorporating explanations into AL sampling and the improved human annotation efficiency and trustworthiness with our AL architecture. Additional ablation studies illustrate the potential of our AL architecture for transfer learning, generalizability, and integration with large language models (LLMs). While LLMs exhibit exceptional explanation-generation capabilities for relatively simple tasks, their effectiveness in complex real-world tasks warrants further in-depth study.
Improving Biomedical Question Answering by Data Augmentation and Model Weighting
Yongping Du, Jingya Yan, Yuxuan Lu, Yiliang Zhao, and Xingnan Jin
IEEE/ACM Transactions on Computational Biology and Bioinformatics, Dec 2023
2022
Contextual Embedding and Model Weighting by Fusing Domain Knowledge on Biomedical Question Answering
Yuxuan Lu, Jingya Yan, Zhixuan Qi, Zhongzheng Ge, and Yongping Du
In Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, Dec 2022
Biomedical Question Answering aims to obtain an answer to a given question from the biomedical domain. Due to its high requirement of biomedical domain knowledge, it is difficult for a model to learn domain knowledge from limited training data. We propose a contextual embedding method that combines the open-domain QA model AoA Reader with the BioBERT model pre-trained on biomedical domain data. We adopt unsupervised pre-training on a large biomedical corpus and supervised fine-tuning on a biomedical question answering dataset. Additionally, we adopt an MLP-based model weighting layer to automatically exploit the advantages of the two models to provide the correct answer. The public BioMRC dataset, constructed from the PubMed corpus, is used to evaluate our method. Experimental results show that our model outperforms the state-of-the-art system by a large margin.
2021
Dual Model Weighting Strategy and Data Augmentation in Biomedical Question Answering
Yongping Du, Jingya Yan, Yiliang Zhao, Yuxuan Lu, and Xingnan Jin
In 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Dec 2021
Research Experience
My current research interests include human-AI collaboration and interaction, especially in the area of Large Language Models (LLMs).
I’m currently working as an applied scientist intern at Amazon.
Before that, I worked as a Machine Learning Researcher in a joint program between LinkedIn and Microsoft Research Asia, where I studied LinkedIn’s social network data. I’ve also worked as an intern research assistant at the THUNLP lab, supervised by Prof. Zhiyuan Liu (刘知远), where my research focused on knowledge embedding.
Open source communities
I’ve participated in many open-source communities. I’m the maintainer of the VSCode extension LaTeX-Utilities, and the founder and maintainer of the EduOJ project. I’ve also contributed to many open-source projects, including GitLab, UniversalOJ, OI-Wiki, Nix, and others.
I served as a mentor and community leader in the Open Source Promotion Plan (OSPP) 2021, where all three of my students successfully finished their projects. I also participated as a student in OSPP 2020 with the UniversalOJ community and successfully completed my project.
Learn more about my open-source experience here.