
By Salary Hub · Updated June 2026

AI Productivity Multiplier by Role: What Studies Actually Show (2026)

Real, peer-reviewed numbers on how much faster generative AI makes you at 15 specific jobs — from a 14% gain for customer support agents to 55.8% for developers using Copilot.

By Salary Hub — AI Impact on Work · Updated 2026-06-20 · Educational only — not career, tax, or legal advice.


Every CEO memo from 2024 to 2026 has promised that AI will make workers "10x" more productive. The actual research tells a more interesting story: the productivity multiplier is real, but it varies wildly by role, by tool, and, critically, by the experience level of the worker. A novice customer support agent gains 34% throughput with a generative AI assistant. A top-decile agent gains almost nothing. A developer using GitHub Copilot finishes a controlled task 55.8% faster. A management consultant solving problems outside the model's competence zone produces worse work with AI than without it. These are not vibes. They come from randomized controlled trials by researchers at NBER, MIT, BCG, GitHub, and Microsoft Research between 2023 and 2025.

This page collects the peer-reviewed productivity gains for 15 specific roles into one comparison table, then unpacks what "productivity" actually means in each study — and where the headline number hides a more complicated reality. If you want to estimate your own AI productivity multiplier (and translate it into a freelance rate, a salary negotiation, or a hiring decision), see our freelance AI rate calculator and AI tool cost vs salary savings calculator. For a forward look at which jobs disappear rather than just get faster, see AI-replaceable jobs by 2030.

The honest summary up front: across white-collar knowledge work, the median peer-reviewed productivity gain from generative AI in 2023–2025 controlled studies sits between 25% and 40% for tasks the AI is good at. Software engineering with Copilot is the outlier high end at 55.8%. Customer support averages 14%. Writing tasks average 37% faster with 18% higher quality. Consulting tasks gain 25% speed and 40% quality — but only inside the model's competence frontier; outside it, AI users perform worse than non-users. There is no "10x engineer" study. There is no peer-reviewed paper showing 5x gains on real-world end-to-end jobs. The closer the measurement gets to controlled lab tasks, the higher the gain; the closer it gets to a real workday, the lower.

Sources are linked at the bottom of this page. Every number in the table below is from a public, citable study. If you see a number anywhere on the internet that doesn't match one of these, treat it as a marketing claim, not a research finding.

AI productivity gains by role — peer-reviewed studies, 2023–2025

| Role | AI tool tested | Productivity gain | Quality change | Source |
|---|---|---|---|---|
| Customer support agent (overall) | Generative AI assistant | +14% issues resolved per hour | Higher customer sentiment | Brynjolfsson, Li & Raymond (NBER, 2023) |
| Customer support agent (novice, bottom quartile) | Generative AI assistant | +34% issues resolved per hour | Approaches top-quartile quality | Brynjolfsson, Li & Raymond (NBER, 2023) |
| Customer support agent (top quartile) | Generative AI assistant | Near zero measurable gain | No significant change | Brynjolfsson, Li & Raymond (NBER, 2023) |
| Software developer (controlled task) | GitHub Copilot | +55.8% faster task completion | Comparable code passing tests | Peng et al. (GitHub/MIT/Microsoft Research, 2023) |
| Software developer (enterprise field) | GitHub Copilot | +26% completed tasks per week | No drop in code quality flagged | GitHub / Accenture / Microsoft Research RCT (2024) |
| Professional writer / copywriter | ChatGPT (GPT-3.5) | +37% faster, ~40% time saved | +18% quality score | Noy & Zhang (Science, 2023) |
| Management consultant (in-frontier tasks) | GPT-4 | +25% faster, +12% more tasks done | +40% quality vs control | Dell'Acqua et al., BCG/HBS "Jagged Frontier" (2023) |
| Management consultant (out-of-frontier tasks) | GPT-4 | Similar speed | −19 percentage points correctness | Dell'Acqua et al., BCG/HBS "Jagged Frontier" (2023) |
| Lawyer / legal analyst (drafting tasks) | GPT-4 | ~30% faster on contracts & memos | Quality maintained or improved | Choi et al., Minnesota Law (2023); Goldman Sachs / Thomson Reuters (2024) |
| Accountant / financial analyst | Microsoft 365 Copilot | ~29% faster on Excel & summarization | Mixed; high-stakes outputs need review | Microsoft Work Trend Index (2024) |
| Sales rep (B2B prospecting & email) | ChatGPT / Copilot for Sales | +12–15% sales productivity, est. | Higher reply rates reported | McKinsey "The State of AI" (2024) |
| Doctor / clinician (note drafting) | Ambient AI scribes (Nuance DAX, Abridge) | −1 hour/day documentation, +1–2 patients/day reported | Maintained or improved note quality (peer ratings) | Microsoft / Nuance health system pilots (2024); JAMA case studies (2024) |
| Teacher (lesson planning & grading) | ChatGPT, Khanmigo | ~30% time saved on prep tasks (self-report) | Variable; needs subject-matter review | RAND Educator survey (2024); Microsoft Work Trend (2024) |
| Marketer (content & creative) | ChatGPT, Jasper, Midjourney | 30–40% time saved on first drafts | Higher output volume; quality varies | McKinsey "State of AI" (2024); Microsoft Work Trend (2024) |
| Designer (visual / UX) | Midjourney, Figma AI, ChatGPT | ~30% faster ideation & iteration | More variants explored | McKinsey "Economic potential of generative AI" (2023) |
| Data analyst | ChatGPT / Code Interpreter, Copilot | 30–40% faster on SQL & viz tasks | Quality depends on prompt & review | McKinsey (2024); Microsoft Work Trend (2024) |
| Recruiter / HR | ChatGPT, Microsoft 365 Copilot | ~25% faster on job descriptions & screens | Higher candidate engagement reported | Microsoft Work Trend Index (2024) |
| Executive / manager (email, meeting prep) | Microsoft 365 Copilot | ~29% faster information tasks | Quality maintained; less context-switching | Microsoft Work Trend Index (2024) |

All percentages are from randomized controlled trials, large-N field studies, or vendor-published telemetry from peer-reviewed deployments. Where a range is given, the high end is controlled-task speed and the low end is real-world weekly throughput. Self-reported gains (teachers, marketers) come from surveys; the teacher row is marked as self-report in the gain column.

What "productivity" actually means in these studies

Most studies measure one of three things: (1) tasks completed per unit of time in a controlled lab, (2) tickets, pull requests, or contracts closed per week in a live workplace, or (3) self-reported time savings on surveys. These are not interchangeable. A controlled-task speed-up of 55.8% (the famous GitHub Copilot result from Peng et al.) does not mean developers ship 55.8% more features per quarter. It means that when 95 developers were assigned an identical HTTP server task, the Copilot group finished in about 71 minutes on average, versus about 161 minutes for the control group.

The most rigorous real-world number we have for developers comes from a 2024 randomized controlled trial across Microsoft, Accenture, and an anonymous Fortune 100 firm, with 4,867 developers. Copilot users completed 26.08% more tasks per week on average. That is a real, deployed, field-study number — and it is less than half the controlled-task gain. The gap between lab and field is the single most important fact on this page. When a vendor cites a study number, check whether it was measured in a lab or in production. The honest planning multiplier is the field number, not the lab number.
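The lab and field figures can be put side by side with a few lines of arithmetic. The sketch below is illustrative only. The function and constant names are made up for this example, and the completion times are the rounded figures for the Peng et al. task (about 71 minutes with Copilot, about 161 without). Dividing a time-based lab gain by a throughput-based field gain is also a rough comparison, not a like-for-like one.

```python
def percent_time_saved(control_minutes: float, treated_minutes: float) -> float:
    """Reduction in completion time, as a percentage of the control group's time."""
    if control_minutes <= 0:
        raise ValueError("control_minutes must be positive")
    return (control_minutes - treated_minutes) / control_minutes * 100


# Rounded average completion times for the Peng et al. HTTP server task.
LAB_CONTROL_MINUTES = 161
LAB_COPILOT_MINUTES = 71

# Field-study gain in completed tasks per week (three-firm RCT).
FIELD_GAIN_PCT = 26.08

lab_gain_pct = percent_time_saved(LAB_CONTROL_MINUTES, LAB_COPILOT_MINUTES)
lab_to_field_ratio = lab_gain_pct / FIELD_GAIN_PCT

print(f"Lab speed-up: {lab_gain_pct:.1f}%")
print(f"Lab-to-field ratio: {lab_to_field_ratio:.1f}x")
```

With the rounded inputs the lab figure lands near 56%, in line with the published 55.8%, and a little over twice the field gain.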

Self-reported survey numbers (Microsoft Work Trend Index, RAND educator survey) tend to land in the 25–40% time-savings range. They are useful as a cross-check, not as primary evidence. Workers consistently overestimate AI time savings on diaries and surveys versus what objective task-completion data shows.

Why novices gain more than experts (and what that means for hiring)

The single most replicated finding across the 2023–2025 literature is that generative AI compresses the skill distribution. In the Brynjolfsson, Li, and Raymond NBER study of 5,179 customer support agents at a Fortune 500 software firm, the overall productivity gain was 14%. But that average hid a stark pattern: novice and low-skilled agents gained 34%, while the most experienced agents showed essentially no measurable gain. The AI was teaching the bottom of the distribution to perform like the top.

The Noy & Zhang ChatGPT writing study (Science, 2023) found a similar compression: workers with weaker baseline writing ability gained the most. So did the BCG/HBS consulting study — below-average consultants moved up to or past the control group's average performance when using GPT-4 on in-frontier tasks. This has clear implications for staffing models: AI is most valuable as a floor-raiser for less experienced staff, not as a ceiling-raiser for senior experts. If you are budgeting AI tools by seat, the highest ROI per seat is at the junior end of your org chart.

There is one important counter-finding. In the BCG study, when consultants used GPT-4 on tasks outside the model's competence frontier — problems requiring numerical reasoning the model gets wrong — they performed 19 percentage points worse than the control group. Junior staff are the most likely to miss when a task is outside the frontier. That is the trade-off: AI raises the floor for in-frontier work and lowers it for out-of-frontier work.

Software engineering: the 55.8% number, in context

The 55.8% Copilot speed-up from Peng et al. (2023) is the most-cited AI productivity statistic in the world. The study itself is small (95 developers) and narrow (a single HTTP server task in JavaScript). Treat it as an upper bound for what tightly-scoped boilerplate work looks like with AI assistance, not as a forecast of weekly throughput.

The 2024 GitHub / Microsoft Research / Accenture three-firm RCT is the better number for planning. With 4,867 developers in production, Copilot users completed 26% more tasks per week, opened more pull requests, and showed no significant drop in code quality across the metrics studied (build success, code review pass-through, defect rates). GitHub's Octoverse 2024 report adds context: developers using Copilot accept roughly 30% of suggestions, and AI-generated code now accounts for a meaningful share of new commits in observed repos.

For a fuller breakdown of which AI coding tool gives the best return — and whether Cursor's larger context window beats Copilot's tighter IDE integration — see Copilot vs Cursor for developers ROI. For the workday-level view of where the time actually goes, see how much time AI saves by task in 2026.

Customer support: the 14% headline and what's underneath

Brynjolfsson, Li, and Raymond's NBER paper "Generative AI at Work" remains the gold-standard real-world deployment study. Over a year, they observed 5,179 customer support agents at a Fortune 500 enterprise software company. Agents using a generative-AI-based conversational assistant resolved 13.8% more customer issues per hour. Customers reported better experiences and were less likely to ask for a supervisor. Employee attrition fell among agents who used the tool.

The breakdown by skill level is what changed the field: novice agents (in their first two months) gained 34% productivity and reached the performance level of agents with more than six months of tenure. The most experienced agents — top quartile — showed no statistically meaningful gain. The AI was effectively codifying and distributing the implicit knowledge of senior agents to newer ones. For workforce planning, this means a contact center can ramp new hires roughly twice as fast, but does not get much extra leverage out of its veterans.
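To see how workforce mix changes the expected gain, here is a minimal sketch of a headcount-weighted average. It rests on assumptions that are not in the study: the workforce is collapsed into just two bands (novice at +34%, veteran at roughly 0%), and both example contact centers are hypothetical. The study itself reports a continuum of tenure and skill, so treat the output as a rough planning aid.

```python
# Gains in issues resolved per hour, by band, as reported in the study.
# Collapsing the workforce into two bands is an assumption of this sketch.
GAIN_BY_BAND = {"novice": 34.0, "veteran": 0.0}


def blended_gain(mix: dict[str, float], gain_by_band: dict[str, float] = GAIN_BY_BAND) -> float:
    """Headcount-weighted average gain (percent) for a given workforce mix."""
    total_share = sum(mix.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("workforce shares must sum to 1.0")
    return sum(share * gain_by_band[band] for band, share in mix.items())


# Hypothetical contact centers, not data from the study.
high_turnover = {"novice": 0.8, "veteran": 0.2}
mostly_veteran = {"novice": 0.2, "veteran": 0.8}

print(f"High-turnover center: {blended_gain(high_turnover):.1f}%")
print(f"Mostly veteran center: {blended_gain(mostly_veteran):.1f}%")
```

Under these assumptions, a center with mostly new hires lands in the high twenties, while a veteran-heavy center lands well below the 14% headline.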

If you're trying to translate this into a staffing or savings number for your own contact center, see AI tool cost vs salary savings.

Knowledge workers: consulting, writing, and the "Jagged Frontier"

The Dell'Acqua et al. "Navigating the Jagged Technological Frontier" study (HBS / BCG / Wharton / MIT, 2023) is the most important paper on the limits of AI productivity. 758 BCG consultants were randomized to use GPT-4 (some with light training, some without) or no AI on 18 realistic consulting tasks. On tasks inside the model's competence frontier — creative brainstorming, structured writing, persuasive memos — consultants using GPT-4 completed 12.2% more tasks, did them 25.1% faster, and produced output rated 40% higher in quality.

On tasks outside the frontier — a quantitative business problem requiring numerical reasoning the model handled poorly — GPT-4 users were 19 percentage points less likely to produce a correct answer than the control group. The model gave confident, fluent, wrong answers, and consultants trusted them. The lesson is not that AI hurts; it is that AI shifts the bottleneck from execution to judgment about when to use it.

The Noy & Zhang Science paper (2023) on writing tasks landed in a cleaner part of the frontier. 453 college-educated professionals wrote business documents — press releases, short reports, analysis plans. The ChatGPT group finished 37% faster and produced output rated 18% higher in quality by blinded evaluators. As with customer support, the productivity distribution compressed: weaker writers gained the most. For a broader view of which tool to reach for in each profession, see best AI tools by profession in 2026.

Roles where AI helps less (or not at all, yet)

Not every role gains 25–55% from generative AI. The gains shrink when the work is highly physical, deeply tacit, heavily regulated, or dominated by judgment under uncertainty. McKinsey's "Economic potential of generative AI" (2023) and follow-up "State of AI in 2024" estimate the smallest near-term productivity impact in roles like skilled trades, construction supervision, surgical and physical-procedure medicine, on-site field service, and frontline manufacturing. The 2023 report estimates generative AI could add 0.1–0.6 percentage points to annual labor productivity growth through 2040, with the bulk concentrated in knowledge-worker functions.

Within white-collar work, the studies suggest AI helps least where the bottleneck is interpersonal judgment (senior management, complex client negotiation), highly regulated work where output must be verifiable line-by-line (audit sign-off, FDA submissions, certain legal opinions), and any task where hallucination cost exceeds drafting savings (medical diagnosis without a clinician in the loop, financial planning advice). For all of these, AI can still draft, but the post-AI review time can equal or exceed the time saved.

If you want a longer view of which roles eventually shrink rather than just speed up, see AI-replaceable jobs by 2030.

Methodology: how to read these numbers without getting fooled

When you encounter an AI productivity claim, ask four questions. First, is the gain measured in a controlled lab task or in production over a multi-week window? Lab numbers are roughly 1.5–2x production numbers. Second, what skill level is the worker? A 34% novice gain and a near-zero expert gain blend into a 14% overall average, and that 14% is the headline. Third, is the task inside the model's frontier? A study that handpicked in-frontier tasks will look better than your real workload. Fourth, who paid for the study? Vendor-funded studies (some Microsoft 365 Copilot numbers) tend to land higher than independent academic RCTs.

A reasonable planning rule: take the largest peer-reviewed gain for a role, halve it for real-world weekly throughput, and halve it again if your team is mostly senior. A mostly senior development team using Copilot might plan for 13% more shipped tasks per week, half of the 26% field-study gain, even though the lab study says 55.8%. A senior consulting team using GPT-4 might plan for 6–12% throughput gain, not the 25% headline. These are conservative numbers, but they are the ones that survive contact with reality.
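The halving rule is simple enough to write down as a function. This is a sketch of the rule of thumb described above, not a model from any of the cited studies; the function name and its two flags are invented for illustration.

```python
def planning_gain(headline_pct: float, *, from_lab: bool, mostly_senior: bool) -> float:
    """Apply the rule of thumb: halve lab numbers, halve again for senior teams."""
    gain = headline_pct
    if from_lab:
        gain /= 2  # controlled-task numbers overstate weekly throughput
    if mostly_senior:
        gain /= 2  # experienced workers tend to gain less
    return gain


# Developer lab headline, mixed-seniority team.
print(planning_gain(55.8, from_lab=True, mostly_senior=False))  # 27.9
# Developer field number, mostly senior team.
print(planning_gain(26.0, from_lab=False, mostly_senior=True))  # 13.0
# Consulting headline, mostly senior team.
print(planning_gain(25.0, from_lab=True, mostly_senior=True))   # 6.25
```

Halving the 55.8% lab figure once gives about 28%, close to the 26% measured in the field, which is a useful sanity check on the rule itself.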

How AI productivity gains translate into pay and pricing

A 25% productivity gain does not automatically translate into a 25% pay raise or a 25% lower rate. The split between worker, employer, and customer is set by market structure. In tight labor markets where AI-fluent workers are scarce (senior ML engineers, AI-native product managers), more of the gain accrues to the worker through higher salary. In commoditized markets (basic copywriting, level-1 support), the gain often accrues to the buyer through lower prices.

For freelancers and contractors, the practical question is whether to keep your hourly rate and finish faster (more clients, more income), or lower your rate and capture more market share. The right answer depends on whether your bottleneck is demand or supply. See the freelance AI rate calculator for a model that walks through both strategies, and the AI tool cost vs salary savings calculator for the employer-side view.

How to estimate your own AI productivity multiplier

  1. Pick the closest peer-reviewed role from the table

    Start with the row in the table above that most closely matches your day-to-day work, not your job title. A "product manager" who mostly writes specs and emails is a knowledge worker in the Noy & Zhang writing study (37% faster). A PM who runs SQL queries is closer to the data analyst row (30–40% faster). A PM who negotiates contracts is closer to the lawyer drafting row.

  2. Halve the lab number to get a planning number

    Controlled-task gains are roughly 1.5–2x larger than field-deployment gains. If the headline for your role is 50% faster, plan for 25%. If it's 30%, plan for 15%. This is the conservative number that survives a quarter of real-world use.

  3. Adjust for your skill percentile

    If you are in the top quartile of your role, halve the planning number again — the Brynjolfsson NBER pattern of near-zero gain for experts applies broadly. If you are in the bottom half, keep the full planning number. AI is a floor-raiser more than a ceiling-raiser.

  4. Subtract the time you spend reviewing AI output

    For any high-stakes deliverable (legal, medical, financial, customer-facing copy), measure the time you spend verifying AI output. Subtract it from the gross time savings. For a contract draft, this can erase 30–50% of the gain. For an internal Slack summary, it's negligible.

  5. Multiply by the share of your week the task covers

    Your productivity multiplier is task-weighted. If AI makes you 40% faster at writing but writing is only 30% of your week, your weekly productivity gain is roughly 12%, not 40%. Run the math for each major bucket of your week and sum.

  6. Track real output for 4 weeks

    Pick a measurable unit (tickets closed, PRs merged, pages drafted, decks shipped). Track weekly output for two weeks without AI, then two weeks with AI on the same workload. The delta is your actual multiplier. Most teams find it lands 30–50% below their self-reported survey estimate.
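The steps above can be sketched as a small calculator. Everything named here (`TaskBucket`, `bucket_gain`, `estimated_weekly_gain`, `measured_gain`) is hypothetical, and the example week and output counts are made-up inputs, not study data. Review time is modeled as a fraction of the gross saving, which is one reasonable reading of step 4 rather than the only one.

```python
from dataclasses import dataclass


@dataclass
class TaskBucket:
    name: str
    share_of_week: float       # fraction of weekly hours (0 to 1)
    headline_gain: float       # fractional gain from the closest table row (step 1)
    from_lab: bool             # True if the headline is a controlled-task number
    review_share: float = 0.0  # fraction of the gross saving spent verifying output


def bucket_gain(bucket: TaskBucket, top_quartile: bool) -> float:
    """Contribution of one task bucket to the weekly gain (steps 2 to 5)."""
    gain = bucket.headline_gain
    if bucket.from_lab:
        gain /= 2                       # step 2: lab number to planning number
    if top_quartile:
        gain /= 2                       # step 3: experts gain less
    gain *= 1 - bucket.review_share     # step 4: subtract review time
    return gain * bucket.share_of_week  # step 5: weight by share of week


def estimated_weekly_gain(buckets: list[TaskBucket], top_quartile: bool = False) -> float:
    """Sum of task-weighted gains across the week."""
    return sum(bucket_gain(b, top_quartile) for b in buckets)


def measured_gain(baseline_weeks: list[float], ai_weeks: list[float]) -> float:
    """Step 6: fractional change in average weekly output, with AI vs without."""
    baseline = sum(baseline_weeks) / len(baseline_weeks)
    with_ai = sum(ai_weeks) / len(ai_weeks)
    return (with_ai - baseline) / baseline


# Hypothetical week: 30% writing, 20% contract drafts with heavy review.
week = [
    TaskBucket("writing", share_of_week=0.30, headline_gain=0.40, from_lab=False),
    TaskBucket("contract drafts", share_of_week=0.20, headline_gain=0.30,
               from_lab=False, review_share=0.40),
]

print(f"Estimated weekly gain: {estimated_weekly_gain(week):.1%}")
# Hypothetical tracked output: two weeks without AI, then two weeks with AI.
print(f"Measured gain: {measured_gain([40, 44], [48, 50]):.1%}")
```

The writing bucket reproduces the 12% figure from step 5; the contract bucket shows how review time shrinks a 30% headline before it is weighted into the week.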


Frequently asked questions

How much faster is AI for software developers?

In the most-cited lab study (Peng et al. 2023, GitHub / MIT / Microsoft Research), 95 developers using GitHub Copilot completed an HTTP server task 55.8% faster than the control group. In the better real-world study — a 2024 randomized controlled trial across Microsoft, Accenture, and an anonymous Fortune 100 firm, with 4,867 developers — Copilot users completed 26.08% more tasks per week on average. The honest planning number for a dev team is closer to 26% than 55.8%. The 55.8% figure is what happens on tightly-scoped boilerplate tasks in a lab; the 26% figure is what shows up in production over a quarter. Senior developers gain less; junior and mid-level developers tend to gain more, consistent with the broader pattern of AI compressing skill distributions.

Does ChatGPT really make writers and copywriters faster?

Yes — and the quality usually goes up, not down. In Shakked Noy and Whitney Zhang's 2023 study published in Science, 453 college-educated professionals were randomized to write business documents with or without ChatGPT (GPT-3.5). The ChatGPT group finished 37% faster, with output rated 18% higher in quality by blinded evaluators. The gains were largest for workers with weaker baseline writing ability, who closed most of the gap with stronger writers. For real-world copywriting work — first drafts of landing pages, ad copy variants, blog outlines — the practical multiplier most agencies report is in the 30–40% time-saved range, consistent with the study. Highly polished long-form journalism shows smaller gains because the editing and fact-check time dominates the drafting time.

What's the productivity gain from GitHub Copilot, specifically?

Two numbers matter. First, the 2023 lab study by Peng et al. measured 55.8% faster task completion on a controlled HTTP server task with 95 developers. Second, the larger 2024 randomized field experiment across 4,867 developers at Microsoft, Accenture, and a Fortune 100 firm measured 26.08% more completed tasks per week, plus more pull requests opened and no measurable drop in code quality on the metrics studied. GitHub's Octoverse 2024 reports that developers accept roughly 30% of Copilot suggestions on average. For planning, use the 26% field-study number; for understanding what's possible on greenfield boilerplate, the 55.8% number is the ceiling. Senior developers report smaller gains than junior developers, consistent with the broader pattern in the literature.

How much faster does AI make lawyers?

Roughly 30% faster on drafting work, with quality maintained. The Choi et al. 2023 Minnesota Law study, the Thomson Reuters / Goldman Sachs studies in 2024, and several large-firm internal pilots all land in the 20–35% range for tasks like contract drafts, memos, deposition summaries, and discovery review. The gain shrinks sharply for high-stakes opinion work, regulatory filings, and tasks where citation hallucination is a hard failure — there, the review time can erase the drafting savings. For litigation discovery and document review, AI-assisted workflows show much larger gains (often 50%+) because the work is high-volume and verifiable. Most large firms now treat AI as a first-draft tool with mandatory attorney review rather than as an autonomous workflow.

Does AI make customer support agents 14% or 34% more productive?

Both, and the gap is the point. In the Brynjolfsson, Li, and Raymond NBER study of 5,179 agents at a Fortune 500 software company, the average productivity gain was 14% (issues resolved per hour). But novice agents in their first two months gained 34% and reached the performance level of agents with more than six months of tenure. Top-quartile experienced agents showed essentially no measurable gain. The AI was effectively transmitting the implicit knowledge of veterans to newer agents. If your contact center is mostly experienced staff, plan for the 14% headline at most. If you have heavy seasonal hiring or high turnover, the effective gain is closer to 25–34% because most of your headcount is in the novice band that benefits most.

How much does AI help management consultants?

It depends entirely on whether the task is inside or outside the model's competence frontier. The Dell'Acqua et al. "Jagged Frontier" study (HBS / BCG, 2023) randomized 758 BCG consultants on 18 realistic consulting tasks. Inside the frontier — brainstorming, structured writing, persuasive memos — GPT-4 users were 25% faster, completed 12% more tasks, and produced output rated 40% higher in quality. Outside the frontier — a quantitative business problem the model handled poorly — GPT-4 users were 19 percentage points less likely to produce a correct answer than the control group. The model gave confident, wrong answers, and consultants trusted them. The practical takeaway: AI is a strong accelerant inside its competence zone and an active liability outside it. The skill that matters is recognizing which side of the line a given task sits on.

Which roles gain the least from generative AI?

Roles dominated by physical work, deeply tacit hands-on skill, in-person judgment, or strict verifiability requirements gain the least. McKinsey's 2023 and 2024 analyses estimate the smallest near-term productivity impact in skilled trades, construction supervision, surgical and physical-procedure medicine, on-site field service, and frontline manufacturing. Within white-collar work, the gains shrink for senior management whose bottleneck is interpersonal judgment, regulated audit and certain legal opinion work where every line must be verifiable, and any task where the cost of a hallucinated answer exceeds the drafting savings. For these roles, AI can draft, summarize, and search, but post-AI review time often equals or exceeds the time saved on the draft itself.

Do these AI productivity gains compound over time?

Modestly, not exponentially. The 2023–2025 studies that re-measured the same population over months — including the Brynjolfsson NBER paper and the Accenture/Microsoft Copilot RCT — show gains that hold steady or grow slightly as workers learn prompting and integration patterns, but do not compound the way a Moore's-Law curve would. The bigger source of compounding is workflow redesign: teams that rebuild their work around AI (template libraries, agent loops, eval-driven prompts) extract more gain than teams that just bolt a chatbot onto the existing process. There is no peer-reviewed evidence so far for the "10x" or "100x" worker claims that appear in CEO memos and pitch decks. The honest planning curve is a one-time step up of 15–30% in throughput for in-frontier work, plus a small annual improvement as tooling matures.

How do I figure out my own AI productivity multiplier?

Use the role table on this page to find a peer-reviewed baseline, then run the four adjustments from the "How to estimate your own AI productivity multiplier" steps: halve the lab number for field reality, halve again if you are in the top quartile of your role, subtract verification time on high-stakes tasks, and weight by the share of your week each task covers. Then track real output for four weeks (two without AI, two with) to validate. Most workers' tracked gains land 30–50% below their self-reported gains on surveys; the Microsoft Work Trend Index numbers in the 30% range tend to drop to the high teens or low twenties when objectively measured. For a structured worksheet that turns the multiplier into a rate or salary number, see the freelance AI rate calculator.

Does using AI mean I should accept lower pay or charge a lower rate?

Not automatically. A productivity gain is split between worker, employer, and end customer based on market structure, not study results. In tight labor markets where AI-fluent workers are scarce — senior ML engineers, AI-native product managers, prompt engineers in regulated industries — workers capture most of the gain as higher salary. In commoditized markets — basic copywriting, level-1 support, generic data entry — the gain flows to the buyer as lower prices, and workers must either move up the value chain or accept lower rates. For freelancers, the strategic question is whether to keep your rate and serve more clients (capturing the gain as income) or lower your rate to win share (capturing it as growth). See AI tool cost vs salary savings for the employer-side analysis.

Sources
