It’s a live, auto-refreshing dashboard that you can use on Election Night 2020 to get more in-depth insights on where counted votes are coming from and how many might be left to count.
Great question! I'll walk through the charts you'll see on the Dashboard:
This map shades each county based on the margin the leading candidate currently holds there. You'll see these on other sites, too, and these types of charts offer a good overview of where each county stands at a point in time. The darker the shade of blue, the larger the Democratic candidate's lead; the darker the shade of red, the larger the Republican candidate's lead. The mouseover gives a bit more detail on what is happening, with stats for exact margin, votes counted, and expected votes remaining.
There are a few things happening here, but I think this chart excels at illustrating how trends between the 2016 and 2012 elections are shaping up.
The bubbles represent the 2020 vote counted so far for each county, with size corresponding with # of votes (more votes=bigger bubbles) and the color corresponding to the category of that county.
The X-axis represents Biden's current margin in each county. Note that counties to the right of zero are ones that he is winning, and those to the left of zero he is losing. The closer a bubble falls to zero on the X-axis, the closer the margin is.
The Y-axis represents either Clinton's or Obama's margin in 2016/2012, depending on which you select to view. Interpreting this works the same way as the X-axis, just rotated: bubbles above the zero line are counties that Clinton/Obama won, and those below are the ones that they lost.
The real magic to this chart is the ability to quickly interpret the data to see where Biden is outperforming/underperforming his Democratic predecessors. If you see bubbles sitting further to the right than they sit on the vertical axis, that would indicate the vote is trending towards Biden. If those "purple" bubbles stay on the left of the vertical zero line, then Trump is still holding onto counties that he had flipped from Obama four years ago.
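To make the "further to the right than they sit on the vertical axis" idea concrete, here is a minimal sketch of the arithmetic behind one bubble. It assumes margins are measured in percentage points of the two-party vote, which is an assumption on my part (the dashboard may well divide by total votes instead), and the function names and vote counts are made up for illustration.

```python
def dem_margin(dem_votes: int, rep_votes: int) -> float:
    """Democratic margin in percentage points of the two-party vote.

    Positive means the Democrat leads, negative means the Republican leads.
    Using the two-party total as the denominator is an assumption.
    """
    total = dem_votes + rep_votes
    if total == 0:
        return 0.0  # nothing counted yet, so no margin to report
    return 100.0 * (dem_votes - rep_votes) / total


def swing(margin_2020: float, margin_prior: float) -> float:
    """How far the bubble sits to the right of where it sits vertically.

    Positive means Biden is running ahead of the earlier Democrat.
    """
    return margin_2020 - margin_prior


# Hypothetical county: Clinton lost it by 4 points, Biden currently leads by 10.
x = dem_margin(5500, 4500)   # Biden's current margin (X-axis)
y = dem_margin(4800, 5200)   # Clinton's 2016 margin (Y-axis)
print(x, y, swing(x, y))
```

A bubble with a positive swing is one where the vote is trending towards Biden relative to the earlier election; a negative swing means he is underperforming.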
This is maybe my favorite chart in this bunch. The question I always have when I follow the data on election night is: where are these counted votes coming from, and what do they mean?? I am not an expert on each and every swing state county. It's not always clear what it means when pundits exclaim that there are still so many votes left to count from Broward County!
These are simple stacked bars, with each segment of a bar representing the number of votes the candidate is earning from each county. The color represents county category. Whichever bar stacks the highest shows the candidate currently leading in the counted vote.
Ah, but look to the "Remaining" bar! Counties often report very unevenly, with more densely populated counties often taking longer to report. If Broward is slow to count on election night, you'll see a big blue chunk sitting here waiting to be distributed. (Caveat: the "Remaining" vote count is an estimate, explained further below.)
One important thing to note when reading this column: the color of the county does not mean that all the remaining votes there are "red" or "blue" votes, just that they are coming from red or blue counties. Don't look at that remaining Broward segment as 100% for Biden! It will be divided up between the candidates at some...unknown rate.
This is a VERY simple estimation of how many votes could be remaining at the state/county levels at any point in time on election night. Don't take it literally, but it should provide a ballpark estimate of how many votes are left to count. It's simply a calculation of the 2016 turnout % (at the County level) against the total registered voters prior to the 2020 general election.
Will turnout be exactly the same in each county as it was in 2016? Clearly not. If I had to guess, I think it could be a little higher. But I wanted to keep this exercise simple on the forecasting side, so I think this method is transparent and easy to follow. If you have a different assumption of turnout, then let that carry into your reading of these charts!
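The estimate described above boils down to a couple of lines of arithmetic. This is a minimal sketch under stated assumptions: the function name and the numbers are hypothetical, and clamping the result at zero when a county counts more votes than expected is my own choice, not something the dashboard is confirmed to do.

```python
def estimate_remaining(registered_2020: int, turnout_2016: float, counted: int):
    """Ballpark the votes left to count in one county.

    registered_2020: registered voters before the 2020 general election
    turnout_2016:    2016 turnout as a fraction of registered voters (0.75 = 75%)
    counted:         votes counted so far in 2020
    """
    expected = round(registered_2020 * turnout_2016)
    # Assumption: if turnout beats 2016, report zero remaining rather than
    # a negative number.
    remaining = max(expected - counted, 0)
    return expected, remaining


# Hypothetical county: 200,000 registered voters, 75% turnout in 2016,
# 90,000 votes counted so far.
expected, remaining = estimate_remaining(200_000, 0.75, 90_000)
print(expected, remaining)
```

If you think turnout will run higher than 2016, passing a larger `turnout_2016` value is the simplest way to carry that assumption through.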
Much of the story of the 2016 election can be told through the counties that had previously voted for Obama but had flipped to Trump. The categories represent this movement or lack of movement, and are a simple way of thinking about how that county has voted in the past two presidential elections:
Solid Blue/Solid Red: These are counties that consistently voted for the same party in the 2012 and 2016 general elections
Obama-Trump: These are counties that voted for Obama in 2012 and then voted for Trump in 2016
Romney-Clinton: These are counties that voted for Romney in 2012 and then voted for Clinton in 2016
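The four categories above can be derived from two numbers per county. Here is a small sketch, assuming each county's history is stored as a Democratic margin for 2012 and 2016 (positive means the Democrat won). The function name is hypothetical, and treating an exact tie as a Republican win is an arbitrary choice made only to keep the example short.

```python
def county_category(dem_margin_2012: float, dem_margin_2016: float) -> str:
    """Label a county by how it voted in the 2012 and 2016 general elections."""
    obama_won = dem_margin_2012 > 0     # assumption: a tie counts as a loss
    clinton_won = dem_margin_2016 > 0
    if obama_won and clinton_won:
        return "Solid Blue"
    if not obama_won and not clinton_won:
        return "Solid Red"
    if obama_won:
        return "Obama-Trump"            # Obama in 2012, then Trump in 2016
    return "Romney-Clinton"             # Romney in 2012, then Clinton in 2016


# Hypothetical margins, in percentage points.
for margins in [(12.0, 8.5), (-20.0, -31.0), (3.2, -4.1), (-1.5, 2.0)]:
    print(margins, county_category(*margins))
```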
I’m pulling the data directly from their respective state election data feeds. These public files are updated throughout the night as votes are counted. I’ve designed this app to refresh the data every two minutes on the back-end. Each time a dashboard on this website is loaded or refreshed, it is pulling the processed data from my AWS files, NOT directly from these government sites. The request load to these sites will therefore be minimal and controlled.
I am only one person and not actually very good at programming—it took a lot of effort just to get four states up and running! That said, I chose what I consider to be three east coast bellwether states plus one midwestern state. Once results start coming in for these four states on election night these dashboards should start to give a clear idea of where the evening is headed. I had created a very rudimentary version of this for Florida in 2016 and it started to raise some red flags (showing Trump driving a lot of turnout in red counties) fairly quickly.
I think it’s neat! I love elections and data visualization and hobby-level python projects. I had a very rudimentary Excel-based version of this running on my computer the past few elections and I knew I wanted to do it again for 2020. With all of the extra time indoors during the pandemic, I thought it would be cool to make a more sophisticated version of this. While I didn’t originally expect it to be this sophisticated, it snowballed into something I was really proud of and wanted to share.
More specifically, I always find myself frustrated when I become a refresh button junkie on election night on the dashboards set up by major media outlets. The information there is very high level: you see the topline vote counts, the county counts, and some vague count of precincts remaining. Precincts are not a good measure of expected vote remaining because they vary greatly in size. I want to know where the big stashes of remaining votes are, especially as election night goes on and many votes have already been counted. And county names are just words on a page without some additional context. What happened in that county in 2016? In 2012? My dashboard is built around THOSE questions, and seeks to answer them in real time in a way that other dashboards don't.
It’s my semi-anonymous internet handle. You can probably figure out who I am with some info on this page and some sleuthing….email me if you do! grackle@grackle.live
Great question! I am a hobbyist when it comes to Python and while I have taken on some projects over the years, this is by far the most sophisticated thing I have ever done. I’ll describe how this all works below, and if you actually know things, I’m open to feedback on better ways to accomplish what is here. Or just tell me that I’m great and did a fantastic job (grackle@grackle.live).
There are several components here. I’ll start with the data and move up the food chain from there:
Simply stated, I pulled the archival data from the PA/FL state election websites and processed them in Python using the pandas module. Pandas lets you do “Excel stuff” in Python very easily. Using it, I can pull in data and process/pivot it to get it into a table that can be read for graphs and charts. When the data updates live, Python processes the new information and appends it to existing historical data tables to create these dashboard tables. Imagine a spreadsheet with counties as rows and a few dozen variables in columns such as “2016 Clinton Margin,” “2016 Trump Votes,” “2020 Registered Democrats”, “County Category”, etc.
The data comes in from Florida as a very tall CSV/TXT file with each row representing a County, Race, and Candidate with that candidate’s counted votes in that county. A simple pivot and filter refines that to a table showing cumulative votes for each candidate by county.
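Here is roughly what that "pivot and filter" step looks like in pandas. This is a sketch, not the actual script: the column names (`County`, `Race`, `Candidate`, `Votes`), the race label, and the sample rows are all invented for illustration, and the real Florida file will have its own layout.

```python
import io

import pandas as pd

# Hypothetical "tall" feed: one row per county, race, and candidate.
SAMPLE_FEED = """County,Race,Candidate,Votes
Broward,President,Biden,1000
Broward,President,Trump,400
Broward,State Attorney,Candidate X,50
Sumter,President,Biden,200
Sumter,President,Trump,600
"""


def votes_by_county(feed_text: str, race: str = "President") -> pd.DataFrame:
    """Filter the tall feed to one race, then pivot candidates into columns."""
    tall = pd.read_csv(io.StringIO(feed_text))
    one_race = tall[tall["Race"] == race]
    return one_race.pivot_table(
        index="County",       # one row per county
        columns="Candidate",  # one column per candidate
        values="Votes",
        aggfunc="sum",
        fill_value=0,
    )


table = votes_by_county(SAMPLE_FEED)
print(table)
```

The result has counties as rows and candidates as columns, which is the shape the charts want.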
Pennsylvania uses XML for their data, which is trickier for me to extract. I was able to read the text into BeautifulSoup (a python module typically used for data scraping) to get just the data I was looking for. This might be foreshadowing, but on election night I will have to assess the new file the state puts up and make any adjustments to my script on-the-fly. I’m hoping it doesn’t change too much from the 2020 Primary file, but we shall see.
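To give a feel for the XML extraction step, here is a small sketch. Two loud caveats: it uses the standard library's `xml.etree.ElementTree` as a stand-in rather than BeautifulSoup, purely so the example runs without extra installs, and the element and attribute names (`County`, `Candidate`, `name`, `votes`) are invented, since the real Pennsylvania file layout isn't shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed layout; the real state file will differ.
SAMPLE_XML = """
<Results>
  <County name="Allegheny">
    <Candidate name="Biden" votes="1200"/>
    <Candidate name="Trump" votes="800"/>
  </County>
  <County name="Luzerne">
    <Candidate name="Biden" votes="300"/>
    <Candidate name="Trump" votes="450"/>
  </County>
</Results>
"""


def parse_results(xml_text: str) -> list:
    """Flatten the nested XML into one dict per county and candidate."""
    root = ET.fromstring(xml_text)
    rows = []
    for county in root.iter("County"):
        for candidate in county.iter("Candidate"):
            rows.append({
                "county": county.get("name"),
                "candidate": candidate.get("name"),
                "votes": int(candidate.get("votes")),
            })
    return rows


rows = parse_results(SAMPLE_XML)
print(rows[0])
```

Once the data is in flat rows like these, it can go through the same kind of pivot as the Florida file. If the state changes tag names on election night, only the names inside `parse_results` would need to change.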
Surprisingly, this was probably the easiest component of this entire process. I guess it’s typical to spend 10% of your time doing the fun stuff and 90% of your time getting the fun stuff up and running. I wrote the code for these charts using the Plotly module for Python sitting in a Dash environment that displays these charts on a website. Plotly provides a library of beautiful graphs and charts that are easy to implement in Python, and Dash brings it all to the web. As someone who is more of a hobbyist than an expert, I found it all surprisingly easy to learn and use to bring my vision to life. In fact, it was so easy that I was inspired to keep advancing this project beyond "just some graphs that I can update on my laptop."
So how could I share my creation with the world? As someone whose knowledge of web design peaked around the first dotcom boom, I didn't expect to easily get this up and running. Several Plotly/Dash tutorials I found pointed me in the direction of a service called Heroku, a platform for building and launching web applications.
This sounded complicated but with the help of some great documentation and Google searching, I was able to deploy my app from my personal GitHub page. Each time I pushed a new version of the app to GitHub, Heroku would pick it up and build a new version of the application. Personally, it was very exciting to see my dashboard go from something existing only on my computer to something live on the web! Heroku was surprisingly easy to use for a novice like me. I upgraded to a Hobbyist level account so things would run more smoothly on election night.
All the files I use for the dashboard are hosted on Amazon AWS S3. This service operates almost like a more sophisticated Dropbox/Google Drive. My scripts can read and write to the file bucket I created for this project. Once I got set up, it was surprisingly simple to adjust my scripts to go from file management on a local directory to using S3. The boto3 module for Python went a long way in keeping this simple.
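For a sense of how small the change from local files to S3 can be, here is a sketch of a read helper and a write helper. `put_object` and `get_object` are real boto3 S3 client calls; everything else (the helper names, the bucket and key, and the tiny in-memory stand-in client that lets the example run without AWS credentials) is made up for illustration.

```python
import io


def write_table(s3_client, bucket: str, key: str, csv_text: str) -> None:
    """Upload a CSV string to S3 (replaces writing a file to a local folder)."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=csv_text.encode("utf-8"))


def read_table(s3_client, bucket: str, key: str) -> str:
    """Download a CSV string from S3 (replaces opening a local file)."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read().decode("utf-8")


class InMemoryS3:
    """Stand-in that mimics the two client methods used above, for demo only."""

    def __init__(self):
        self._objects = {}

    def put_object(self, Bucket, Key, Body):
        self._objects[(Bucket, Key)] = Body

    def get_object(self, Bucket, Key):
        return {"Body": io.BytesIO(self._objects[(Bucket, Key)])}


# In real use the client would come from: boto3.client("s3")
client = InMemoryS3()
write_table(client, "example-bucket", "dashboard/fl.csv", "County,Biden,Trump\nBroward,1000,400\n")
print(read_table(client, "example-bucket", "dashboard/fl.csv"))
```

Passing the client in as an argument keeps the helpers easy to try out locally before pointing them at a real bucket.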
Getting the back-end data refresh running on AWS Lambda might have been the most frustrating part of this entire process. I had actually used Lambda before to launch a very simple Twitter bot a few years ago. Essentially, Lambda lets you run simple scripts from the web, the same way you might run a simple script from your command line. I wrote a script that would pull the data from the FL and PA sites, process the data (as described above), append the new data to the historical data and refresh the static “dashboard” files in my S3 bucket. Separately, the web app “checks” these files every couple of minutes and reloads the data into the visuals. By using Lambda, I separated the process of getting the raw data from the process of refreshing the dashboard, eliminating the possibility that this application would stress the state election servers. The Lambda script will run every two minutes on election night, but I can adjust as needed.
The frustrating part was setting up Lambda to have all of the modules that I needed in Python. Typically you would write those modules to a zip file for Lambda to load. What I learned is that since Lambda runs on its own Linux build, certain modules (pandas, numpy, lxml) would need the “Linux version,” so to speak, in order to run on Lambda. This took a few attempts and most of a weekend to resolve but I was able to build the modules as layers using an AWS EC2 instance thanks to some tutorials and this extremely helpful video.
Like I said, I am awful at web design and I don't really care to learn more. Since I registered my domain on Google, using Google Sites was the fastest way to get the non-Dashboard pages you see here up and running. The dashboard is using a very simple and elegant stylesheet called water.css.
Some final thoughts here, but it’s pretty amazing what a hobbyist python programmer can do with some spare time and an itch to scratch. I mentioned earlier that this grew from a germ of an idea: that I should move my election night data tool onto Python. It kept growing as I realized that I could add another state, that I could launch it onto the web, that I could automate the back-end, and that it could all come together to become a cohesive and functional tool that I can share more widely. The tools out there are usually free or very low-cost, with robust communities and a wealth of knowledge just a search away. I’ve been “teaching myself Python” for most of the last decade. There’s a lot I can’t do but I’m proud to reflect on what I built here and say that this is one of the things I can do.