In January, The Data Times released an early poll tracker for Virginia’s upcoming gubernatorial election. A few months later, we temporarily shut it down because too few polls were being conducted at the time. Now that the race is starting to pick up, we are launching a new version of the model, which includes county projections and simulations of the election. This is our most complex plurality-vote model yet, and we intend to update it regularly alongside our Trump Approval Tracker.
Vote Share Projections
The statewide vote share projections are calculated similarly to our other models, using a weighted average of polls that takes into account recency, sample size, and pollster quality. As with each of our previous model launches, we’ve made small changes to how some of these variables are calculated. The Recency Score, which typically relies on an adjusted cubic equation, now also uses an exponential equation and a linear equation (with the cubic one holding the most weight). These equations are combined to form a more contextualized picture for any given time frame of polls, as some weeks will have many high-quality polls while others won’t.
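As a rough illustration of how three decay curves might be blended into a single Recency Score, here is a minimal sketch. The function name, the 60-day window, the decay constant, and the blend weights are all assumptions made for the example; the only detail taken from the description above is that the cubic term carries the most weight.

```python
import math

def recency_score(days_old, window=60.0, weights=(0.5, 0.25, 0.25)):
    """Blend cubic, exponential, and linear decay into one score in [0, 1].

    All parameters are illustrative assumptions, not the model's real values.
    """
    # Fraction of the window that has elapsed, clamped to [0, 1].
    t = min(max(days_old / window, 0.0), 1.0)
    cubic = 1.0 - t ** 3               # stays high early, drops sharply late
    exponential = math.exp(-3.0 * t)   # drops fastest for the newest polls
    linear = 1.0 - t                   # steady decline across the window
    w_cubic, w_exp, w_lin = weights    # cubic weight is the largest
    return w_cubic * cubic + w_exp * exponential + w_lin * linear
```

With these placeholder settings, a poll released today scores 1.0 and the score falls steadily as the poll ages, approaching zero at the end of the window.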
Additionally, we’ve added four new polling firms to our pollster database: SoCal Strategies, Roanoke College, Christopher Newport University, and Virginia Commonwealth University. The Sample Score has also been adjusted to reflect typical Virginia polling, where mid-range polls usually have a sample size of 800. After these scores have been calculated, a final score for each survey is created from all three metrics. The Recency Score is the single most influential variable, followed by the Poll Score and finally the Sample Score.
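The scoring and averaging steps can be sketched as follows. This is only an illustration: the weights (0.5, 0.3, 0.2) merely respect the stated ordering of Recency, Poll, and Sample Score, the shape of `sample_score` is a guess centered on the 800-respondent midpoint, and the Poll Score is assumed to be a pollster-quality rating between 0 and 1.

```python
def sample_score(n, midpoint=800):
    """Hypothetical Sample Score: 0.5 at the mid-range size, approaching 1 above it."""
    return n / (n + midpoint)

def poll_weight(recency, pollster, sample, weights=(0.5, 0.3, 0.2)):
    """Combine the three scores; recency counts most, sample size least."""
    w_recency, w_pollster, w_sample = weights
    return w_recency * recency + w_pollster * pollster + w_sample * sample

def weighted_vote_share(polls):
    """Weighted average of a candidate's share.

    polls: list of dicts with 'share' (percent) and 'weight' (final poll score).
    """
    total_weight = sum(p["weight"] for p in polls)
    return sum(p["share"] * p["weight"] for p in polls) / total_weight
```

For example, a poll showing 50% with weight 1 and a poll showing 40% with weight 3 average to 42.5%, since the second poll counts three times as much.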
County Projections
For our county projections, we apply a proportional swing from the 2021 election results to each jurisdiction. Each projection is then automatically checked to see whether the favored candidate (currently Abigail Spanberger) has been forecasted to receive more support than their party has received in recent years.
Currently, county results are being cross-referenced with the results of the 2017 election. If the projection made by a proportional swing exceeds Northam’s vote share in 2017, a logistic regression tapers off the excess support to avoid overestimating Spanberger. If Earle-Sears were to take the lead, this same methodology would be used to prevent Spanberger from being underestimated in counties that don’t typically swing heavily from year to year.
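The two county steps can be sketched like this. The swing formula is the standard proportional form; the tapering function is only a stand-in that uses a logistic-shaped curve to compress support above the 2017 ceiling, since the fitted regression itself is not described here. The `max_excess` cap and both function names are hypothetical.

```python
import math

def proportional_swing(county_2021, state_2021, state_projected):
    """Scale a county's 2021 share by the statewide change since 2021."""
    return county_2021 * (state_projected / state_2021)

def taper_excess(raw_share, ceiling_2017, max_excess=3.0):
    """Compress any support above the 2017 benchmark (all values in percent).

    Assumption: a scaled logistic curve that tracks small excesses closely
    and never adds more than max_excess points above the ceiling.
    """
    if raw_share <= ceiling_2017:
        return raw_share  # within the benchmark, leave untouched
    excess = raw_share - ceiling_2017
    squashed = max_excess * (2.0 / (1.0 + math.exp(-2.0 * excess / max_excess)) - 1.0)
    return ceiling_2017 + squashed
```

With made-up numbers, a county at 40% in 2021 under a statewide move from 48% to 54% swings to 45%. If that county's 2017 benchmark were 44%, the taper would pull the projection back to a little under 45%.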
Election Simulation & Odds
To simulate the election results while adequately accounting for uncertainty, we use a modified Monte Carlo simulation. Rather than basing the deviation on the combined raw sample of every poll included in the model, we multiply each survey’s sample size by that poll’s score and sum the products; the initial margin of error is based on this weighted sample. The margin of error is then increased further based on the time remaining until Election Day. Lastly, the vote share projections used in the simulation assume an even split of undecided voters rather than splitting them proportionally based on current projections.
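The simulation steps above can be sketched as follows. Several pieces are assumptions made for illustration: the 95% margin of error formula for a proportion, the linear `daily_inflation` factor, normally distributed draws, and the treatment of the race as a two-way contest in which the Democrat wins with more than half of the vote. Shares are expressed as fractions rather than percentages.

```python
import math
import random

def effective_sample(polls):
    """Sum of sample size times poll score; polls is a list of (n, score) pairs."""
    return sum(n * score for n, score in polls)

def margin_of_error(share, n_eff, days_left, daily_inflation=0.01):
    """95% margin of error, widened linearly with days left (assumed form)."""
    base = 1.96 * math.sqrt(share * (1.0 - share) / n_eff)
    return base * (1.0 + daily_inflation * days_left)

def simulate(dem, rep, polls, days_left, runs=10_000, seed=0):
    """Return the fraction of simulated elections the Democrat wins."""
    undecided = max(1.0 - dem - rep, 0.0)
    dem_adj = dem + undecided / 2.0          # even split of undecided voters
    n_eff = effective_sample(polls)
    sigma = margin_of_error(dem_adj, n_eff, days_left) / 1.96
    rng = random.Random(seed)                # seeded for repeatable runs
    wins = 0
    for _ in range(runs):
        dem_draw = rng.gauss(dem_adj, sigma)
        if dem_draw > 0.5:                   # two-way race assumption
            wins += 1
    return wins / runs
```

A call such as `simulate(0.52, 0.44, [(800, 0.9), (1000, 0.7)], days_left=60)` returns a win probability between 0 and 1. Raising `days_left` widens the spread of the draws and pulls that probability toward 50%.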
Every time the forecast needs to be updated, the election is simulated 10,000 times. This number is likely to increase as we approach Election Day. And as more polls are released over the next couple of months, our methodology will naturally continue evolving. Currently, The Data Times categorizes this race as Likely Democrat. This designation indicates that although Spanberger is favored to win, a major upset could still alter the outcome. To understand how these labels are assigned, view the table below.
| Categorization | Chance of Victory | Margin of Victory |
| Tilt/Tossup | <64% | +0-2% |
| Lean | 64-82% | +2-5% |
| Likely | 82-99% | +5-14% |
| Safe/Solid | >99% | >+14% |
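The Chance of Victory column in the table above can be expressed as a small lookup. The table does not say which category owns a boundary value such as exactly 64% or 82%, so this sketch assumes the lower bound of each range is inclusive; the function name is hypothetical.

```python
def categorize(win_probability):
    """Map a win probability (0 to 1) to a rating label from the table."""
    if win_probability > 0.99:
        return "Safe/Solid"
    if win_probability >= 0.82:   # assumed inclusive lower bound
        return "Likely"
    if win_probability >= 0.64:   # assumed inclusive lower bound
        return "Lean"
    return "Tilt/Tossup"
```

Under these assumptions, a candidate winning 90% of simulations would be rated Likely, and one winning 60% would be rated Tilt/Tossup.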