Step 2 score improvement

Q: How much can I realistically improve before Step 2?

The average student in 1,006 tracked reports improved +24 points from their first practice test (about 75 days out) to the real exam. Students starting at 200–224 gained +35 on average, while students starting at 255 or higher gained +6.

Common study choices compared with final exam scores in 2,426 r/Step2 score reports.

Main findings

Built from 2,426 real score reports, including 1,006 where the first and last practice tests could be dated. Two patterns appeared most consistently:

The average student gained +24 points between their first practice test and the real exam. Students starting at 200–224 gained +35.
Qbank choice, questions past ~2,500, reported CMS average and additional study weeks each had little independent association with the final score. Targeted review and practice-test use showed larger associations.

Observed improvement

Each row tracks the same students from their first practice test (taken 35+ days out) through their last practice test to the real exam (n = 1,006 reports).

Starting score	First test	Last test	Real exam	Total gain	Reports
200–224	215	243	250	+35	266
225–239	233	251	257	+24	312
240–254	246	259	263	+17	268
255+	262	267	268	+6	118
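
The table's arithmetic can be checked row by row. The sketch below copies the four rows as printed, splits each total gain into its practice-period and final-test-to-exam parts, and computes a report-weighted mean across the listed bands. The function names are illustrative and are not part of the original analysis.

```python
# Rows copied from the "Observed improvement" table:
# (starting band, first test, last test, real exam, reports)
BANDS = [
    ("200-224", 215, 243, 250, 266),
    ("225-239", 233, 251, 257, 312),
    ("240-254", 246, 259, 263, 268),
    ("255+", 262, 267, 268, 118),
]


def split_gain(first, last, real):
    """Return (practice-period gain, final-test-to-exam gain, total gain)."""
    return last - first, real - last, real - first


def weighted_mean_gain(bands):
    """Report-weighted mean total gain and report count for the listed bands."""
    total_reports = sum(n for *_, n in bands)
    total_points = sum((real - first) * n for _, first, _, real, n in bands)
    return total_points / total_reports, total_reports


for label, first, last, real, n in BANDS:
    practice, final, total = split_gain(first, last, real)
    print(f"{label}: +{practice} in practice, +{final} to exam, +{total} total (n={n})")

mean_gain, covered = weighted_mean_gain(BANDS)
print(f"Listed bands: {covered} reports, weighted mean gain {mean_gain:.1f}")
```

The four listed bands account for 964 reports, so their weighted mean (about +22.9) does not have to equal the +24 figure quoted for all 1,006 tracked reports. The table does not show where the remaining reports fall.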

Of the average +24-point gain, +19 appeared between the first and last practice tests, and +5 was the average increase from the final practice test to the real exam. Looking only at students whose first and last tests were both NBMEs, the gain is +18 points on the exact same scale. Gains were largest among students with the lowest baseline scores. Some of the lowest group's increase may reflect regression toward the mean after an unusually low first test.
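
Regression toward the mean can be illustrated with a small simulation. In the sketch below every simulated student has a fixed ability and no learning occurs between two tests, yet the group selected for a low first score still scores higher on the second test. All parameters (mean 240, ability spread 12, test noise 8, cutoff 225) are invented for illustration and are not estimated from the score reports.

```python
import random
from statistics import mean


def simulate_low_starters(n=20000, noise_sd=8.0, cutoff=225, seed=0):
    """Mean first and second score for students whose first score is below cutoff.

    Ability is fixed per student, so any rise in the selected group
    comes from test noise and selection, not from learning.
    """
    rng = random.Random(seed)
    first_low, second_low = [], []
    for _ in range(n):
        ability = rng.gauss(240, 12)               # hypothetical true level
        first = ability + rng.gauss(0, noise_sd)   # noisy first test
        second = ability + rng.gauss(0, noise_sd)  # noisy second test, same ability
        if first < cutoff:                         # select on a low first test
            first_low.append(first)
            second_low.append(second)
    return mean(first_low), mean(second_low)


first_avg, second_avg = simulate_low_starters()
print(f"Selected group: first {first_avg:.1f}, second {second_avg:.1f}")
```

The size of the artificial rise depends entirely on the assumed noise and spread, so the simulation shows the direction of the effect rather than how much of the lowest group's +35 it explains.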

Estimated association for each study choice

Each row gives the estimated range of the association with score improvement and the evidence behind it:

Choice	Estimated change	Evidence in these reports
Second pass of your weak material	+3 to +5	Two-pass students beat one-pass students with the same first-pass accuracy by +5.0 points (first pass 55–64%) and +3.4 (65–74%). At 75%+, the observed difference was 0.0 points. n = 248 two-pass reports.
Adding a second qbank	+1 to +4	At the same UWorld accuracy (65–74%), students who added AMBOSS scored +3.9 higher. The gap shrinks to about zero at 75%+. This observational comparison also reflects who chooses to complete two banks.
More practice exams	+½ to +1 each	Comparing students with the same starting score and timeline, each added practice test was associated with up to a point of additional improvement. The 4-plus group beat the 3-or-fewer group by +0.7 points (n = 221). Practice tests also provide current readiness estimates.
More questions past ~2,500	none observed	Total questions completed vs final score: r = 0.02 across 472 reports. Flat from 1,500 to 4,300+, and still flat when comparing students who started from the same score.
Reported CMS average	none observed	US MD/DO reports with a CMS average showed a mean practice-score change of +21.1; reports with no CMS data or mention showed +21.1 (n = 112 vs 179). Non-reporting does not establish non-use, so this cannot isolate the forms' learning effect. CMS averages were weak prediction variables, while the forms can still support shelf review and NBME-style reasoning.
Switching banks (UWorld vs AMBOSS)	none observed	The same percent correct points to the same real score on either bank, within about 2 points everywhere.
More dedicated weeks	not measurable	Longer study periods were associated with smaller gains, likely because students who were struggling extended their preparation. These reports cannot isolate the effect of added time.

These are observational comparisons from self-reported data. The analyses match students on baseline measures such as first-pass accuracy or first practice-test score and compare subsequent improvement. They cannot establish that a study choice caused the difference.
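
A matched comparison of this kind can be sketched as follows: group reports by a baseline band, then compare the two study choices only within the same band. The field names (`first_pass_pct`, `second_pass`, `score`) and the example rows are hypothetical, and this is a minimal sketch under those assumptions, not the analysis code behind the table.

```python
from collections import defaultdict
from statistics import mean


def accuracy_band(pct):
    """Bucket first-pass accuracy into the bands used in the table above."""
    if pct >= 75:
        return "75%+"
    if pct >= 65:
        return "65-74%"
    if pct >= 55:
        return "55-64%"
    return "<55%"


def matched_difference(reports):
    """Per band: mean score of two-pass reports minus mean of one-pass reports."""
    groups = defaultdict(lambda: {True: [], False: []})
    for r in reports:
        band = accuracy_band(r["first_pass_pct"])
        groups[band][r["second_pass"]].append(r["score"])
    diffs = {}
    for band, g in groups.items():
        if g[True] and g[False]:  # both choices must be present in the band
            diffs[band] = mean(g[True]) - mean(g[False])
    return diffs


# Invented example rows, not taken from the dataset.
EXAMPLE = [
    {"first_pass_pct": 60, "second_pass": True, "score": 250},
    {"first_pass_pct": 58, "second_pass": False, "score": 244},
    {"first_pass_pct": 62, "second_pass": False, "score": 246},
    {"first_pass_pct": 70, "second_pass": True, "score": 258},
    {"first_pass_pct": 68, "second_pass": False, "score": 255},
    {"first_pass_pct": 80, "second_pass": False, "score": 265},
]
print(matched_difference(EXAMPLE))
```

Bands with only one of the two choices are skipped rather than compared against a different band. Matching on a baseline narrows the comparison but does not remove self-selection into each choice.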

What this analysis cannot isolate

Nearly everyone in these reports completed weeks of questions, review and practice exams. The 1st percentile of question volume is 430 questions, nobody skipped the dedicated study period, and nobody skipped practice exams entirely. Because there is no comparison group that skipped this work, its contribution cannot be estimated from these data. The tables can compare differences in resources and study patterns, but they cannot estimate the effect of dedicated study itself.

The measurable differences include qbank choice, question volume past 2,500, reported CMS average, and additional study weeks. Those variables add little predictive information once practice scores are known. The available measures all reflect medical knowledge: NBME scores, UWorld percentages, UWSA scores and the real exam all rise and fall together (correlations of 0.6–0.8). The reports cannot determine whether UWorld and AMBOSS teach equally well because students did not choose resources at random. CMS averages were also weak standalone predictors, while CMS forms remain useful for practicing NBME-style clinical reasoning; that learning value is not measured here. Across resources, students generally answered questions, reviewed errors and revisited weak topics.
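
The r values quoted on this page are correlation coefficients. Assuming they are Pearson correlations (the page does not name the method), the calculation for two paired lists looks like the sketch below. The example numbers are invented and do not come from the score reports.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Invented example: qbank percent correct and final score for five students.
accuracy = [55, 62, 68, 74, 80]
final_score = [238, 249, 251, 262, 266]
print(f"r = {pearson_r(accuracy, final_score):.2f}")
```

A value near 0, like the 0.02 reported for question volume, means the two lists share almost no linear relationship. A value near 1 means they rise together almost in lockstep.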

Study choices supported by these data

Choose one primary qbank and review it carefully. The data did not show a clear score advantage for UWorld or AMBOSS.
Follow percent correct as well as coverage. Accuracy correlated with the final score at r = 0.57; question volume correlated at r = 0.02.
Consider a targeted second pass below 75% first-pass accuracy. Repeating weak material was associated with 3 to 5 additional points in this group.
Shift toward practice exams at 75% and above. Additional questions, a second bank and another pass showed little association with final scores among higher-accuracy students.
Space at least four practice exams through dedicated. Each additional test was associated with roughly half a point to one point of improvement.
Use CMS forms for NBME-style reasoning. They can reinforce shelf content, target weak subjects, and provide practice with NBME clinical-question wording. CMS average did not add much score-prediction value once NBME and UWSA scores were known.
Use practice-score trends when considering more study time. Students who studied longer did not gain more on average. A flat trajectory may call for a change in review strategy. See the score-swing analysis.

The free predictor combines your own practice exams and their timing to estimate your score and prediction interval.

Common questions

How much can I realistically improve before Step 2?

The average student in 1,006 tracked reports improved +24 points from first practice test to the real exam. Students starting at 200–224 gained +35 on average, while students starting at 255 or higher gained +6. Among students whose first and last assessments were both NBMEs, the average increase was +18 points.

Does it matter which question bank I use for Step 2?

The same percent correct maps to approximately the same real score on UWorld and AMBOSS. After NBME scores are included, qbank percentage adds little to a score prediction. Choose the bank that fits your study needs and review it thoroughly. See the comparison.

Which study activities were associated with improvement?

Complete question blocks, review each miss, revisit weak topics, and measure progress with practice exams. In this analysis, a targeted second pass was associated with 3 to 5 additional points when first-pass accuracy was below 75%. Students who took at least four practice exams also improved more. Percent correct correlated with the real score at r = 0.57; question count correlated at r = 0.02.

Why can a popular resource show little independent effect here?

Students improved about +24 points while using many different resources. The common element was sustained question review and practice testing. Differences in qbank choice, volume past ~2,500 questions and add-on resources had little independent association with the final score in this dataset.
