Today is the first day of DEF CON 27, arguably the world's best-known hacker convention. Each year, thousands of people interested in security (and/or the hacking thereof) travel to Las Vegas to learn and to gather with a like-minded community. Some also attend Black Hat, a related conference that is typically scheduled right before DEF CON, also in Las Vegas. Not everyone who identifies as a hacker or is part of hacker culture writes code or uses Stack Overflow, but we would expect a significant proportion to do so. Well over 25,000 people attended DEF CON in 2018, all gathered in one city. Can we see any differences in traffic to Stack Overflow during the days of DEF CON? What can we learn about the hacker community from traffic during that time?
DEF CON 2018
Last year, DEF CON took place from August 9 to August 12. What did traffic from Las Vegas look like during the month of August? Let's look at Stack Overflow traffic as a proportion of US question views as a whole, and also look at another city for comparison. The scales on the y-axis are different so that we can see weekly variation in both cities.
Well, that is pretty clear, if you ask me. See that big spike for Las Vegas on August 9 - 11? There is about a 50% increase in the traffic we see from Las Vegas during the days of DEF CON (at least its first three days), in terms of proportion of US traffic or raw sessions (not shown here). What conclusions can we draw from this?
- A lot of people involved in DEF CON didn't use VPNs to mask their location when visiting Stack Overflow. Given the security-minded nature of DEF CON attendees, I wasn't sure whether this would be the case!
- We don't see much proportional increase in traffic because of Black Hat, which took place from August 4 to August 9, and is also a large conference. I would conclude that there is more hands-on coding happening at DEF CON than Black Hat.
Notice the difference in weekly traffic patterns between New York and Las Vegas, keeping in mind that these are showing proportion of US traffic. Most cities around the world look like New York, with proportionally more traffic during the week and less during the weekends; more people commute into cities during the week and stay outside of cities on the weekends. Las Vegas is the opposite! It is unusual for a city (I guess this is true of Las Vegas in a lot of ways) in that we see proportionally more traffic on weekends than on weekdays.
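The measures used above (a city's daily share of US question views, the relative increase during the conference, and the weekend-to-weekday balance) can be sketched in a few lines. This is a minimal illustration, not the analysis behind the plots: the function names are hypothetical, and the input is assumed to be two plain mappings from a date to a raw view count.

```python
from datetime import date
from statistics import mean


def daily_share(city_views, us_views):
    """Return {day: city views / US views} for days present in both mappings."""
    return {
        day: city_views[day] / us_views[day]
        for day in city_views
        if us_views.get(day, 0) > 0
    }


def relative_uplift(share, event_days):
    """Mean share on event days relative to the mean on all other days, minus 1.

    A result of 0.5 corresponds to a 50% increase over the baseline.
    """
    event = [s for d, s in share.items() if d in event_days]
    baseline = [s for d, s in share.items() if d not in event_days]
    return mean(event) / mean(baseline) - 1


def weekend_weekday_ratio(share, exclude=frozenset()):
    """Mean weekend share divided by mean weekday share.

    Days in `exclude` (for example conference days) are left out so that
    they do not distort the usual weekly pattern.
    """
    weekend = [s for d, s in share.items() if d.weekday() >= 5 and d not in exclude]
    weekday = [s for d, s in share.items() if d.weekday() < 5 and d not in exclude]
    return mean(weekend) / mean(weekday)
```

A ratio above 1 from `weekend_weekday_ratio` would describe a city like Las Vegas, with proportionally more traffic on weekends, and a ratio below 1 would describe a city like New York.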
Hackers gonna hack
Not only can we detect this increase in the amount of traffic, we can also measure differences in the kind of traffic we see from Las Vegas during DEF CON compared to before and after the convention. We can look at question views during August 2018 and compare traffic during DEF CON to the rest of the month.
This plot shows the proportion of Las Vegas traffic that went to the top 20 tags for the city during last August, comparing the proportion during DEF CON to the proportion the rest of the month. Notice that there are three categories of tags here:
- Some tags, like Python, Android, strings, and Linux, saw increases during DEF CON in their proportion of Las Vegas traffic. These are the technologies that the DEF CON participants used more.
- Other tags, notably web development technologies, saw decreases during DEF CON in their proportion of Las Vegas traffic. These are the technologies that the DEF CON participants used less.
- A few of these top tags didn't change much at all, like Java and git. Both groups visited questions about these tags at about the same rate.
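The comparison described above can be sketched as follows, assuming question-view counts per tag are available for the DEF CON days and for the rest of the month. The function names and the 10% `tolerance` cutoff are hypothetical choices for illustration, not values taken from the analysis.

```python
def tag_shares(tag_views):
    """Convert raw view counts per tag into proportions of the total."""
    total = sum(tag_views.values())
    return {tag: n / total for tag, n in tag_views.items()}


def compare_periods(during, rest, tolerance=0.10):
    """Label each tag by how its share of traffic changed between periods.

    `tolerance` is an assumed cutoff: relative changes smaller than this
    are treated as "similar" rather than as a real shift.
    """
    during_share = tag_shares(during)
    rest_share = tag_shares(rest)
    labels = {}
    for tag, base in rest_share.items():
        if base == 0:
            continue  # no baseline traffic, so a relative change is undefined
        change = during_share.get(tag, 0.0) / base - 1
        if change > tolerance:
            labels[tag] = "increased"
        elif change < -tolerance:
            labels[tag] = "decreased"
        else:
            labels[tag] = "similar"
    return labels


# Made-up counts, purely to show the shape of the input.
during_defcon = {"python": 30, "pandas": 20, "java": 50}
rest_of_month = {"python": 20, "pandas": 30, "java": 50}
labels = compare_periods(during_defcon, rest_of_month)
```

Comparing shares rather than raw counts matters here because total traffic from Las Vegas rises during the conference, so raw counts would go up for almost every tag.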
We can view this same information in a different way to get another perspective on these shifts in traffic.
Here, we have the overall proportion of traffic to each tag on the x-axis and the weighted log odds of visiting during DEF CON on the y-axis. An odds ratio compares the odds of an event under two conditions; here, it compares the odds that a Las Vegas question view goes to a given tag during DEF CON with the odds during the rest of the month. In this graph, the technologies above the line are visited more during DEF CON and the technologies below the line are visited less during DEF CON, compared to typical values for Las Vegas. This plot shows the top 50 tags in terms of absolute value of log odds ratio.
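To make the quantity on the y-axis more concrete, here is a minimal sketch of a plain log odds ratio for a single tag. It is not the weighted variant used for the plot; the function name and the add-one-half `smoothing` term are assumptions chosen to keep the example short and to avoid dividing by zero for rare tags.

```python
import math


def log_odds_ratio(tag_during, total_during, tag_rest, total_rest, smoothing=0.5):
    """Log of (odds of a view going to the tag during the event) over
    (odds of a view going to the tag during the rest of the month).

    Positive values mean the tag is visited relatively more during the
    event, negative values mean relatively less, and zero means no change.
    """
    odds_during = (tag_during + smoothing) / (total_during - tag_during + smoothing)
    odds_rest = (tag_rest + smoothing) / (total_rest - tag_rest + smoothing)
    return math.log(odds_during / odds_rest)
```

With this definition, a tag that receives 30 of 100 views during the event and 20 of 100 otherwise lands above the line, and swapping the two periods flips the sign while keeping the same magnitude.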
What can we learn here?
- Python is a huge winner during DEF CON, with enormously elevated levels of traffic. Notice that people are not using Python for data analysis, though, as pandas and dataframe are below the line (my own preferred data analysis language, R, is also below the line). Instead, it's likely that Python is being used at DEF CON as a general scripting tool. Bash, shell, and terminal are also visited more.
- We again see that web development technologies are visited at lower levels during DEF CON. The tools that developers use to build the web are not the same tools used for hacking projects.
- What do we see that did increase? Linux, the Android tag (but not iOS), tags that involve dealing with strings and how they are encoded, low-level languages like C and its compilers, assembly, Docker, and number systems like hexadecimal and binary are all among the tags that were visited more during DEF CON.
Taken as a whole, we now have a glimpse into the kind of coding work done by the community of DEF CON. As DEF CON gets underway today and we hear about this year's talks and hacking projects, whether that's more like last year's exploration of voting equipment vulnerabilities or how to publish a fake paper in a fake journal, I'm glad to know more about what happens at this conference from a technical perspective.