UWorld vs AMBOSS for Step 2 CK
Both question banks measured against the real exam scores of the students who used them, across 2,426 score reports from r/Step2.
The verdict
Among 2,426 score reports from r/Step2, 88% mention UWorld and 8% mention the AMBOSS Qbank; 1,972 and 48 of them pair a percent correct with a real exam score. Small AMBOSS sample, but the patterns are consistent enough to take positions:
- AMBOSS is probably the harder bank, question for question. The reports never say so directly; the usage patterns give it away. Details below.
- For predicting your score, the banks are interchangeable. A 70% student is headed for roughly 256 to 258 whichever logo is on the questions.
- Question volume is overrated. Across 472 reports, total questions completed had essentially zero relationship with the final score. Accuracy is the game, with one exception: a second pass adds a few points when your first-pass accuracy is under 75%.
- A second qbank buys a few points at mid accuracy and almost nothing at high accuracy. It is a refinement, not a requirement.
Same average, same destination
Fit each bank's reports separately and ask what a given average implies, and the answer is the same curve twice: every estimate from 60% to 80% lands within 3 points of the other bank's.
| Qbank average | Step 2 if it is UWorld | Step 2 if it is AMBOSS |
|---|---|---|
| 60% | 250 | 248 |
| 65% | 254 | 252 |
| 70% | 258 | 256 |
| 75% | 262 | 261 |
| 80% | 266 | 265 |
These are fitted averages, not personal predictions, and they reflect each bank as people actually used it, settings and timing included. The UWorld column rests on 1,972 reports, the AMBOSS column on 48.
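To read the table at an in-between percentage, one rough option is straight-line interpolation between its rows. The sketch below does only that; the names `FITTED`, `fitted_score` and `largest_gap` are made up for illustration, and this is not the site's fitting code, just arithmetic on the published rows.

```python
from bisect import bisect_right

# Fitted averages copied from the table above: (qbank %, Step 2 score).
FITTED = {
    "uworld": [(60, 250), (65, 254), (70, 258), (75, 262), (80, 266)],
    "amboss": [(60, 248), (65, 252), (70, 256), (75, 261), (80, 265)],
}


def fitted_score(bank: str, pct: float) -> float:
    """Linearly interpolate between table rows; refuses to extrapolate."""
    rows = FITTED[bank]
    xs = [x for x, _ in rows]
    if not xs[0] <= pct <= xs[-1]:
        raise ValueError("the table only covers 60% to 80%")
    # Index of the row at or just below pct, capped so rows[i + 1] exists.
    i = min(bisect_right(xs, pct) - 1, len(rows) - 2)
    (x0, y0), (x1, y1) = rows[i], rows[i + 1]
    return y0 + (y1 - y0) * (pct - x0) / (x1 - x0)


def largest_gap() -> int:
    """Biggest UWorld-minus-AMBOSS difference across the table rows."""
    pairs = zip(FITTED["uworld"], FITTED["amboss"])
    return max(u - a for (_, u), (_, a) in pairs)
```

Calling `fitted_score("amboss", 72.5)` sits halfway between the 70% and 75% rows. The result is still a group average, with all the caveats above.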
For a real prediction, the full predictor combines your qbank average with practice exams and timing.
Every report on one chart
Each dot is one real score report; teal dots are AMBOSS Qbank averages. The dashed line is the UWorld fit, the solid line the AMBOSS fit. They run nearly on top of each other.
Real scores by qbank average
UWorld (n = 1,972)
| UWorld average | Average Step 2 | Median | Reports |
|---|---|---|---|
| 50–54% | 243 | 244 | 56 |
| 55–59% | 248 | 249 | 182 |
| 60–64% | 252 | 253 | 337 |
| 65–69% | 256 | 257 | 465 |
| 70–74% | 260 | 261 | 397 |
| 75–79% | 264 | 264 | 297 |
| 80%+ | 268 | 269 | 229 |
AMBOSS Qbank (n = 48)
| AMBOSS average | Average Step 2 | Median | Reports |
|---|---|---|---|
| 60–69% | 250 | 252 | 19 |
| 70–79% | 260 | 261 | 19 |
| 80–89% | 270 | 270 | 6 |
AMBOSS rows are small samples, hence the wider bands; rows under 5 reports are hidden.
Is AMBOSS actually harder? Probably.
The raw numbers say no. Among the 36 students who reported a percent on both banks, the median gap was 0.5 points and exactly 50% scored higher on AMBOSS. If that were the whole story, the difficulty debate would be a myth.
It is not the whole story. Ten of the 36 explicitly describe doing AMBOSS after UWorld, switching banks midway or saving AMBOSS for dedicated; none describe the reverse. That stated-order group averaged +7.5 points on AMBOSS relative to their own UWorld, which is what a practice effect looks like. On top of that, score reports never mention session settings, but filtering out AMBOSS's hardest "5-hammer" tier is common practice, so plenty of reported AMBOSS percents were likely earned on an easier subset of the bank.
Put it together: students reach AMBOSS at their strongest, often trim its hardest questions, and still only match their earlier UWorld percentage. That is not what an equally hard bank looks like. The best reading of this data is that AMBOSS runs a few points harder question for question, and the gap is hidden by when and how people use it. It is an inference from 36 reports, not a controlled experiment, but the arrows all point one way. AMBOSS's own self-assessment fits the pattern, running about 17 points below the real exam (see the table below). For score prediction none of this matters, because the percent-to-score mappings above already absorb it.
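The paired comparison behind those figures is simple enough to sketch. Each student who reported both banks contributes one gap (AMBOSS percent minus UWorld percent), and the stated-order students are summarised separately. The function name and the rows below are hypothetical, for illustration only; they are not the 36 real reports.

```python
from statistics import mean, median


def paired_summary(rows):
    """rows: (uworld_pct, amboss_pct, stated_amboss_after_uworld) per student."""
    gaps = [amboss - uworld for uworld, amboss, _ in rows]
    later = [amboss - uworld for uworld, amboss, after in rows if after]
    return {
        "median_gap": median(gaps),
        "share_higher_on_amboss": sum(g > 0 for g in gaps) / len(gaps),
        # Mean gap among students who said AMBOSS came second, if any did.
        "mean_gap_stated_order": mean(later) if later else None,
    }


# Made-up rows, NOT taken from the dataset.
EXAMPLE_PAIRS = [
    (70, 69, False),
    (65, 64, False),
    (80, 80, False),
    (75, 74, False),
    (62, 70, True),
    (68, 74, True),
]

summary = paired_summary(EXAMPLE_PAIRS)
```

In this toy set the overall median gap is slightly negative while the stated-order subgroup shows a large positive gap, the same shape of pattern the paragraph describes.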
Does doing both banks help?
Students who used both banks averaged 261 against 257 for UWorld-only, a 4.2-point gap. Most of that gap is not the second bank: the kind of student who finishes one bank and starts another was always going to score well. Holding UWorld accuracy constant, the gap shrinks fast as accuracy rises:
| UWorld average | UWorld only | Both banks | Difference |
|---|---|---|---|
| 55–64% | 250 | 255 | +5.1 |
| 65–74% | 258 | 261 | +3.9 |
| 75–89% | 265 | 266 | +0.8 |
The same pattern shows against practice exams: students who used both banks finished about 1.8 points above what their NBME scores predicted. So the honest number for the second qbank is a couple of points, mostly for students not already at the top. At 75%+ on UWorld the second bank adds +0.8 points, which is noise. Finish and review one bank first; add the second only if your accuracy has plateaued and the calendar allows it.
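"Holding UWorld accuracy constant" means comparing the two groups only within the same accuracy band. A minimal sketch of that stratified comparison follows; the band edges mirror the table above, while the function names and the sample rows are hypothetical and are not real reports.

```python
from collections import defaultdict
from statistics import mean

# Band edges taken from the table above; the upper edge is exclusive.
BANDS = [(55, 65, "55-64%"), (65, 75, "65-74%"), (75, 90, "75-89%")]


def band_of(uworld_pct):
    for low, high, label in BANDS:
        if low <= uworld_pct < high:
            return label
    return None  # outside the bands the table covers


def gap_by_band(rows):
    """rows: (uworld_pct, used_both_banks, step2_score) per report."""
    scores = defaultdict(lambda: {True: [], False: []})
    for pct, used_both, score in rows:
        label = band_of(pct)
        if label is not None:
            scores[label][used_both].append(score)
    # Both-banks mean minus UWorld-only mean, per band with both groups present.
    return {
        label: mean(groups[True]) - mean(groups[False])
        for label, groups in scores.items()
        if groups[True] and groups[False]
    }


# Made-up rows, NOT taken from the dataset.
EXAMPLE_ROWS = [
    (58, False, 248), (61, True, 252),
    (70, False, 256), (72, True, 258),
    (80, False, 266), (83, True, 266),
]

gaps = gap_by_band(EXAMPLE_ROWS)
```

A raw comparison would pool all six rows; the stratified version reports one difference per band, which is what the table above shows.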
Question volume is overrated
Combine each report's completion percentages with the banks' sizes (about 4,300 UWorld and 3,400 AMBOSS questions) and you get total questions completed for 472 reports. The correlation with the final score is r = 0.02. Zero, for practical purposes:
| Questions completed | Average Step 2 | Median | Reports |
|---|---|---|---|
| Under 1,500 | 257 | 258 | 63 |
| 1,500–2,500 | 256 | 257 | 98 |
| 2,500–3,500 | 256 | 258 | 177 |
| 3,500–4,300 | 258 | 257 | 80 |
| 4,300+ | 257 | 256 | 54 |
The grind from 1,500 to 4,300+ questions buys nothing visible. Holding accuracy constant changes little either: among students at 65–75% on UWorld, finishing under half the bank versus over 80% of it moved the average by about 2.1 points. Compare all of that with accuracy, where the correlation is r = 0.57, and the conclusion writes itself: the score follows the share of questions you get right, not how many you get through.
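The volume figure is plain arithmetic: completion percentage times bank size, summed over the banks a report mentions. Here is a sketch using the approximate bank sizes quoted above. The function names are illustrative, and treating each table row's lower edge as inclusive is an assumption, since the text does not say which side the boundaries fall on.

```python
# Approximate bank sizes quoted in the text above.
BANK_SIZE = {"uworld": 4300, "amboss": 3400}

# Lower edge of each row in the volume table, highest first.
# Assumption: edges are lower-inclusive.
BUCKETS = [
    (4300, "4,300+"),
    (3500, "3,500-4,300"),
    (2500, "2,500-3,500"),
    (1500, "1,500-2,500"),
    (0, "Under 1,500"),
]


def total_questions(completion_pct):
    """completion_pct: percent of each bank done, e.g. {"uworld": 80, "amboss": 25}."""
    return sum(BANK_SIZE[bank] * pct / 100 for bank, pct in completion_pct.items())


def volume_bucket(questions):
    """Map a question count to its row in the volume table."""
    for lower_edge, label in BUCKETS:
        if questions >= lower_edge:
            return label
    raise ValueError("question count cannot be negative")
```

For example, 80% of UWorld plus 25% of AMBOSS comes to 3,440 + 850 = 4,290 questions, which lands in the 3,500–4,300 row.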
The obvious objection, as with study time: maybe struggling students grind more questions, and a real benefit hides inside a flat average. That story fails both checks the data allows. First, question volume barely relates to where students started (r = +0.08 against the earliest practice test taken 45+ days out, n = 148); if anything, slightly stronger students do slightly more questions, which should make volume look better than it is, and it still shows nothing. Second, improvement says the same: from that earliest test to the real exam, students under 1,500 questions gained +28 points on average while students past 4,300 gained +24, and comparing students who started from the same score, each additional 1,000 questions moved the final score by -0.5 points. Self-reported completion keeps this short of a controlled experiment, but the "did you finish UWorld twice" arms race has no support here. Do enough questions to cover the content, review them until the percentage moves, and spend the rest of the time on practice exams.
One kind of volume does help: repeating your weak material. At the same first-pass accuracy, students who did a second pass of UWorld outscored single-pass students by +5.0 points when the first pass landed at 55–64%, by +3.4 at 65–74%, and by essentially nothing (-0.0) at 75% and up, across 248 second-pass reporters. The comparison is between students who scored the same on their first pass, so this is not just stronger students choosing to repeat. Pushing further into a bank stops helping early; repeating weak material keeps helping until your accuracy crosses roughly 75%, and past that nothing does except practice exams.
Their self-assessments, compared
Both companies also sell full-length self-assessments, which beat any qbank percentage as predictors. Two things worth knowing about each: how far it typically lands from the real score after calibration, and how much it underestimates you out of the box.
| Self-assessment | Reports | Typical error | Real exam vs test, median |
|---|---|---|---|
| UWSA 1 | 1,621 | ±6.6 | +14 |
| UWSA 2 | 1,773 | ±6.0 | +5 |
| UWSA 3 | 356 | ±6.0 | +17 |
| AMBOSS Self-Assessment | 561 | ±7.2 | +17 |
The AMBOSS SA underestimates by about as much as UWSA 1 and UWSA 3; only UWSA 2 lands close to reality, which is why it anchors most prediction folklore. As single predictors they are within a point or two of each other. AMBOSS does deserve one credit here: its free score-prediction service was the most accurate third-party predictor in the head-to-head benchmark, though it still trailed this site's model. Conversion guides: UWSA 1, UWSA 2, UWSA 3 and AMBOSS SA.
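As a back-of-envelope reading of the table, you can shift a self-assessment score by its median gap and wrap the typical error around the result. The sketch below does exactly that and nothing more. Treating calibration as a constant shift with a symmetric band is an assumption made here for illustration; the conversion guides and the full predictor may well do something more refined. `shifted_estimate` is a made-up name.

```python
# (median real-minus-test gap, typical error) copied from the table above.
SELF_ASSESSMENTS = {
    "UWSA 1": (14, 6.6),
    "UWSA 2": (5, 6.0),
    "UWSA 3": (17, 6.0),
    "AMBOSS SA": (17, 7.2),
}


def shifted_estimate(test, score):
    """Return (low, centre, high) after adding the test's median gap.

    Assumption: a constant shift plus a symmetric error band, which is
    a simplification of whatever calibration the table's errors reflect.
    """
    gap, error = SELF_ASSESSMENTS[test]
    centre = score + gap
    return centre - error, centre, centre + error
```

A 250 on UWSA 2 becomes a centre of 255 with a band of 249 to 261; the same 250 on UWSA 3 centres on 267, which is the out-of-the-box underestimate in action.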
Which should you use?
Default to UWorld first. Not because its questions teach better, which this data cannot judge, but because it is the bank the entire prediction ecosystem is calibrated around: 1,972 of the reports here pair a UWorld percentage with a real score, so every conversion is sharper. Pick AMBOSS if you want harder stems or the integrated library, or if you already burned through UWorld during clerkships. Either way the playbook is the same: one bank, finished and reviewed properly, accuracy over volume, and practice exams as the real yardstick. When you want the number, the free predictor takes your qbank average, whichever bank it came from, and weighs it alongside everything else.
Common questions
Is the AMBOSS Qbank harder than UWorld?
Probably, by a few points. Reported percentages come out nearly even (median gap 0.5 among the 36 students reporting both), but every student who states an order reached AMBOSS after UWorld, when they were stronger, and filtering out its hardest "5-hammer" questions is common practice. Matching your earlier UWorld percentage under those conditions points to harder questions underneath. None of this matters for prediction: a reported percent maps to about the same real score on either bank.
What does 70% on the AMBOSS Qbank mean for Step 2?
About 256 on the real exam, essentially identical to what 70% on UWorld implies (258). The AMBOSS estimate rests on 48 reports, so it is rougher, but it lands in the same place.
Do I need to do both UWorld and AMBOSS?
No. Students who used both banks scored about 4 points higher on average, but part of that is who they are rather than what the second bank did: at the same UWorld percentage the gap is about 4 to 5 points for students under 75% and under a point for those at 75% and up. Finish one bank properly first; add the second only if your accuracy has plateaued and you still have time.
How many practice questions are enough for Step 2?
Fewer than the arms race suggests. Across 472 reports with completion data, total questions completed had no measurable relationship with the final score (r = 0.02); students under 1,500 questions averaged the same as students past 4,300, and improvement from the earliest practice test was no larger for heavy grinders. Accuracy, not volume, predicts the score. The exception is a second pass of your weak material, worth several points if your first pass was under 75% and nothing above that. Do enough questions to cover the content, push your percentage up, then stop counting.
Which predicts Step 2 better, UWorld or AMBOSS?
Neither has an edge: r = 0.57 for UWorld and r = 0.58 for AMBOSS against the real score. The UWorld mapping is better calibrated because it rests on 1,972 reports versus 48. Practice exams beat both, which is why the full predictor weighs NBMEs and UWSAs more heavily.
Get a real prediction, not a rule of thumb
The full predictor combines all of your practice tests with their timing, shows calibrated probability ranges, and was the most accurate of every predictor tested against real scores. See the head-to-head comparison.