Background
Before graduate school, I spent ~3 years as a software engineer and early member of the engineering / data science organizations at two high-growth startups: Roadie (acquired by UPS for $500M) and Intelligent Flying Machines / OneTrack.AI. There, I led efforts to scale data infrastructure to match growth and worked on a range of challenging problems, including embedded deep learning, fault-tolerant distributed systems, real-time adaptive pricing, and data pipelines.
Outside of research, I lift weights, read (here's my Goodreads profile), watch mixed martial arts, and sometimes wonder whether randomness is real.
Research
Current Work
- Automatic skill-targeted data / environment generation: DataEnvGym frames data generation as an RL-style sequential decision-making problem.
The goal is to build agents that automate the process of identifying a model's weak skills and generating training data to improve them (a minimal sketch follows this list).
It builds on EnvGen, which generates training environments that help an agent learn the skills it is weak at.
- LLM-driven exploration and planning: MutaGReP explores large code repositories to find realizable plans for complex, multi-step user requests. We use LLM-guided mutations and informed tree search to explore plan space, and symbol retrieval to keep plans grounded in the codebase.
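The DataEnvGym loop above can be pictured as a short teacher/student episode. The sketch below is illustrative only, assuming hypothetical placeholders (student, data_agent, evaluate_by_skill) rather than the actual DataEnvGym interfaces.

```python
# A minimal sketch of the teacher/student loop described above; `student`,
# `data_agent`, and `evaluate_by_skill` are hypothetical placeholders, not
# the actual DataEnvGym API.

def data_generation_episode(student, data_agent, evaluate_by_skill, n_rounds=5):
    """One episode: the agent targets weak skills, generates data, and is
    rewarded by the student's improvement on those skills."""
    history = []
    skill_scores = evaluate_by_skill(student)           # e.g. {"counting": 0.41, ...}
    for _ in range(n_rounds):
        weak_skills = sorted(skill_scores, key=skill_scores.get)[:3]
        new_data = data_agent(weak_skills, history)      # action: generate training examples
        student = student.train(new_data)                # student consumes the data
        new_scores = evaluate_by_skill(student)          # re-evaluate per-skill performance
        reward = sum(new_scores[s] - skill_scores[s] for s in weak_skills)
        history.append((weak_skills, new_data, reward))  # feedback returned to the agent
        skill_scores = new_scores
    return student, history
```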
Prior Work
- Self-training / self-improvement
- Using uncertainty during reasoning and decision-making
- Vision-language representation learning
Some papers are highlighted.
MutaGReP: Execution-Free Repository-Grounded Plan Search for Code-Use
Zaid Khan,
Ali Farhadi,
Ranjay Krishna,
Luca Weihs,
Mohit Bansal,
Tanmay Gupta
arXiv, 2025
project page
/
arXiv
Neural tree search for repo-level code-use planning. MutaGReP explores plan space through LLM-guided mutations, while grounding plans to functionality in the codebase with a symbol retriever.
Learning to Generate Unit Tests for Automated Debugging
Archiki Prasad*,
Elias Stengel-Eskin*,
Justin Chih-Yao Chen,
Zaid Khan,
Mohit Bansal
arXiv, 2025
code
/
arXiv
Testing is a critical part of software engineering: what if we could automatically discover inputs that break your code? We show how to train small language models (Qwen2.5-7B and Llama3.1-8B) to generate unit tests that break code and are useful for debugging.
* Equal contribution
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
Zaid Khan,
Elias Stengel-Eskin,
Jaemin Cho,
Mohit Bansal
ICLR, 2025   (Spotlight Presentation)
project page
/
arXiv
A testbed of teaching environments for RL-style data generation agents, aimed at automating post-training: the process of improving a model on diverse, open-ended tasks based on automatically discovered model skills and weaknesses.
Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement
Zaid Khan,
Vijay Kumar BG,
Samuel Schulter,
Yun Fu,
Manmohan Chandraker
CVPR, 2024
project page
/
arXiv
We show how to improve an LLM's program synthesis ability using execution feedback, and apply this to create a 7B model that writes programs orchestrating other models to solve computer vision tasks.
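Roughly, the recipe is a generate-execute-filter-finetune loop. The sketch below is a simplification under assumed helper functions (generate_programs, execute, finetune), not the paper's implementation.

```python
# A minimal sketch of self-training from execution feedback, assuming
# hypothetical helpers `generate_programs`, `execute`, and `finetune`;
# this is not the paper's actual training code.

def self_train_on_execution_feedback(llm, tasks, generate_programs, execute,
                                     finetune, n_iters=3, n_samples=8):
    for _ in range(n_iters):
        accepted = []
        for task in tasks:
            for program in generate_programs(llm, task, n_samples=n_samples):
                result = execute(program, task)   # run the program (e.g. calling vision models)
                if result.success:                # keep only programs whose execution succeeds
                    accepted.append((task, program))
        llm = finetune(llm, accepted)             # fine-tune on self-generated successes
    return llm
```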
Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering
Zaid Khan,
Yun Fu
CVPR, 2024
arXiv
We show how to identify unreliable responses from multimodal LLMs by examining the consistency of their responses over the neighborhood of a visual question, without requiring access to the model's internals.
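The idea can be summarized as: answer only when the model agrees with itself across rephrasings of the question. A rough sketch, with ask_model and rephrase_question as assumed black-box helpers (the paper's neighborhood construction and scoring are more involved than simple majority agreement):

```python
# A minimal sketch of consistency-based abstention; `ask_model` and
# `rephrase_question` are hypothetical helpers, not the paper's code.
from collections import Counter

def answer_or_abstain(image, question, ask_model, rephrase_question,
                      n_neighbors=5, min_agreement=0.8):
    """Answer only if the model is consistent over a neighborhood of the question."""
    neighborhood = [question] + rephrase_question(question, n=n_neighbors)
    answers = [ask_model(image, q) for q in neighborhood]
    top_answer, count = Counter(answers).most_common(1)[0]
    if count / len(answers) >= min_agreement:
        return top_answer   # consistent neighborhood -> treat the response as reliable
    return None             # inconsistent -> abstain rather than risk a wrong answer
```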
Exploring Question Decomposition for Zero-Shot VQA
Zaid Khan,
Vijay Kumar BG,
Samuel Schulter,
Manmohan Chandraker,
Yun Fu
NeurIPS, 2023
project page
/
arXiv
We show how to selectively decompose complex questions into simpler sub-questions to improve zero-shot performance on challenging multimodal reasoning tasks.
Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!
Zaid Khan,
Vijay Kumar BG,
Samuel Schulter,
Xiang Yu,
Yun Fu,
Manmohan Chandraker
CVPR, 2023
code
/
arXiv
Getting labels for a multimodal dataset can be expensive. We show how you can use unlabeled images to improve performance on data-scarce multimodal tasks.
Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning
Zaid Khan,
Yun Fu
ICLR, 2023
code
/
arXiv
We explore creating CLIP-like models by minimally updating already-trained vision and language models, finding that updating less than 7% of parameters can match full model training.
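As a rough picture of this family of recipes (not the paper's exact parameterization), one can freeze both pretrained towers and train only a small alignment module. The sketch below assumes PyTorch and uses linear projection heads as an illustrative stand-in for the actual trainable parameter subset.

```python
# A minimal sketch: frozen pretrained towers plus a small trainable alignment
# module. The projection heads are an assumed stand-in, not the paper's method.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LockedTowerAlignment(nn.Module):
    def __init__(self, vision_model, language_model, vision_dim, text_dim, embed_dim=512):
        super().__init__()
        self.vision_model, self.language_model = vision_model, language_model
        for tower in (self.vision_model, self.language_model):
            for p in tower.parameters():
                p.requires_grad = False               # freeze the pretrained towers
        self.vision_proj = nn.Linear(vision_dim, embed_dim)   # small trainable pieces
        self.text_proj = nn.Linear(text_dim, embed_dim)

    def forward(self, images, texts):
        with torch.no_grad():                         # towers stay fixed
            v = self.vision_model(images)
            t = self.language_model(texts)
        # contrastive alignment happens in the shared embedding space
        return F.normalize(self.vision_proj(v), dim=-1), F.normalize(self.text_proj(t), dim=-1)
```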
Single-Stream Multi-Level Alignment for Vision-Language Pretraining
Zaid Khan,
Vijay Kumar BG,
Xiang Yu,
Samuel Schulter,
Manmohan Chandraker,
Yun Fu
ECCV, 2022
project page
/
arXiv
We demonstrate a very data-efficient way to align vision and language by learning to reconstruct each modality from the other.
Exploiting BERT for Multimodal Target Sentiment Classification Through Input Space Translation
Zaid Khan,
Yun Fu
ACM MM, 2021   (Oral Presentation)
code
/
arXiv
Understanding the emotional content of social media posts is difficult for traditional sentiment analysis models.
We show that language models handle this well when the post is translated into a natural input space for them.
One Label, One Billion Faces: Usage and Consistency of Racial Categories in Computer Vision
Zaid Khan,
Yun Fu
ACM FAccT, 2021
arXiv
Are notions of algorithmic fairness based on racial categories meaningful?
We study computer vision datasets that use racial categories, and empirically show that the racial categories encoded in each dataset are often highly inconsistent with each other and with human intuitions.