Projects

Does AI Improve Code Quality? Here's What the Data Actually Says

GitHub | Staff User Researcher — GitHub Copilot: Code Quality

In 2023, organizations like GitClear published analyses claiming that GitHub Copilot was degrading code quality. The claims spread quickly. I led a research program to find out whether they were true.

The centerpiece was a randomized controlled trial with 243 developers — a rigorous study design that allowed us to establish a causal relationship between Copilot use and code quality, rather than relying on correlational data. Code was evaluated across five dimensions (functionality, readability, reliability, maintainability, conciseness) and assessed blind.

Copilot improved code quality across all five dimensions. The more interesting finding was a trade-off: a 25% increase in AI adoption was associated with a 3.4% increase in code quality and a 1.5% decrease in delivery throughput. The code quality question was settled. The throughput question was not, and nobody was talking about it yet.

That reframe opened a new phase of research focused on the full developer workflow, ultimately laying the groundwork for Copilot Code Review.

Read the published research →

Code Review in the Age of AI: Why Developers Will Always Own the Merge Button

GitHub | Staff User Researcher — GitHub Copilot: Developer Workflow & Code Review

Following the code quality RCT, I designed a follow-on research program to understand why teams using AI were delivering software more slowly. The hypothesis: the bottleneck had moved downstream to code review.

Counting the code quality RCT as the first phase, the research unfolded in three phases: the RCT, a diary survey study with 19 engineers followed by 60-minute interviews with 12 participants, and finally intercept surveys on Copilot Code Review adoption. Analysis combined thematic coding with K-means clustering and correlation analysis of survey data.

The findings were clear: code review had become the bottleneck. Code review frequency decreased by 12.9%, time spent on review increased by 30 minutes per week, and developers rated review their most frustrating activity on 40.7% of days. AI-assisted authoring was producing larger, more complex pull requests without the contextual scaffolding that human-authored PRs carry.

Copilot Code Review launched in public preview in February 2025, directly addressing these bottlenecks. Within its first year, its users merged 41% more pull requests than users of any other Copilot feature, 72.6% said it improved their effectiveness, and usage grew 10x. It now accounts for more than one in five code reviews on GitHub.

Read the published research →