Everybody loves to talk about Moore’s law—which states that transistor counts on a microprocessor double about every two years—when talking about speed and productivity improvements in computing. But what about the gains that come from new algorithms? Those gains may exceed ones from hardware improvements.
Algorithmic improvements make more efficient use of existing resources and allow computers to do a task faster, cheaper, or both. Think of how much easier the smaller MP3 format made music storage and transfer. That compression is the work of an algorithm. The MIT study discussed below cites Google’s PageRank, Uber’s routing algorithm, and Netflix video compression as examples.
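To make the comparison with hardware concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the study: the choice of linear versus binary search, the list size, and the helper names (`linear_search_steps`, `binary_search_steps`, `doubling_years`) are all assumptions made for the example. It also treats each hardware doubling as a doubling in speed, which is a simplification of Moore's law.

```python
import math


def linear_search_steps(items, target):
    """Scan left to right; return (index, number of comparisons)."""
    comparisons = 0
    for i, value in enumerate(items):
        comparisons += 1
        if value == target:
            return i, comparisons
    return -1, comparisons


def binary_search_steps(items, target):
    """Search a sorted list by halving the range; return (index, comparisons)."""
    lo, hi = 0, len(items) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1  # counts one probe of items[mid] per loop iteration
        if items[mid] == target:
            return mid, comparisons
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, comparisons


def doubling_years(speedup, years_per_doubling=2.0):
    """Years of repeated doubling needed to reach the given speedup factor.

    Assumption: one doubling equals a 2x speedup, a simplification.
    """
    return years_per_doubling * math.log2(speedup)


# Hypothetical workload: find the last item in a sorted list of one million.
items = list(range(1_000_000))
target = items[-1]

_, linear = linear_search_steps(items, target)
_, binary = binary_search_steps(items, target)
speedup = linear / binary

print(f"linear search comparisons: {linear}")
print(f"binary search comparisons: {binary}")
print(f"equivalent years of hardware doubling: {doubling_years(speedup):.1f}")
```

On this toy workload the better algorithm needs about 20 comparisons instead of a million. Under the simplified doubling assumption, hardware alone would need roughly three decades to deliver the same factor.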
Many of these algorithms come from publicly available sources—research papers and other publications. This “algorithmic commons”—like the digital commons of open source software and publicly available information—helps all programmers produce better software.
But where do these algorithms come from and who provides the research to develop them? New research from MIT sheds light on who’s building our algorithmic commons and why. We dug into the research and then asked the study’s principal researcher, Neil Thompson, research scientist at MIT's Computer Science and Artificial Intelligence Lab (CSAIL), about what this commons means to the field of computer science.
Algorithms: Made in the USA
The research looked at where each algorithm’s creator was born and where they were working when they produced it. The US came out on top in both.
38% of all algorithmic improvements came from researchers born in the United States, but a whopping 64% of improvements were made by researchers working in the United States. That means that, while the US education system has nurtured lots of computer scientists, it’s even better at attracting top talent to research institutions. Outside the US, richer countries generally made more contributions to the commons.
The researchers note that there is room for change and improvement here. A nation’s contribution to the algorithmic commons is not related to population, but to per capita GDP. “This suggests that algorithm development may exhibit the ‘lost Einsteins’ problem seen in the creation of patentable inventions (Bell, Chetty, Jaravel, Petkova, & Van Reenen, 2019), wherein those with natural talent never get to benefit the world because of a lack of opportunity.”
A positive feedback loop?
The vast majority of algorithmic improvements came from public and non-profit institutions like colleges and governments. The top contributors were those universities that ranked highest in computer science: Stanford, MIT, UC Berkeley, and Carnegie Mellon.
Private institutions made contributions, too, including IBM, Bell Labs, and the RAND Corporation. But one wonders why a for-profit institution would give away its competitive advantage and let competitors have access to its research.
Q&A with Neil Thompson
Q: Of the algorithms that you looked at, how many were proprietary and how many were published and freely available?
A: To be included in our data, algorithms needed to be made publicly known at some point so they could make their way into textbooks and the academic literature. We don’t know how quickly this happened for cases, like IBM, where the algorithms were initially developed privately.
Q: For algorithms developed at public institutions, who funds them? Is there any expectation that the funder will get the benefits of the algorithm?
A: This is something that we are looking into, but we don’t yet know the answer. Since many algorithms are developed by mathematicians and computer science theorists, we would expect them to be funded through a mix of overall university funding that provides academics the time to work on these problems, and organizations like the National Science Foundation.
Q: For proprietary algorithms, do you think there's a benefit to sharing them publicly for the creators?
A: To answer this question, we can draw on the analogy of open source software, where firms also disclose information that could otherwise have been kept proprietary. In open source, one of the big benefits is having the community coalesce around your topic and contribute to developing it. So, for algorithms, I would expect the biggest benefits of public disclosure to come when a firm wants to recruit outside algorithm designers to further improve upon the firm’s work.
It’s also quite likely that firms would get a reputational boost from making important algorithms public, which might help with recruiting and retaining talented computer scientists.
Q: Why do you think people and firms contribute to the algorithmic commons?
A: There is clearly a mix of motivations for contributing to the algorithm commons. One motivation is pragmatic: there is a problem that the discoverer needs a better algorithm to solve. This is probably the most important innovation for many companies, but is also common amongst user-innovators across many fields. Another motivation is intrinsic: there is something fun about teasing apart a tough question and finding a clever solution.
Q: When a group of related firms share the same algorithms, what determines which software is better?
A: Having better algorithms is like a chef having better ingredients. It isn’t a guarantee that you’ll get a better meal, but if the chef uses them right, you will. The same is true for algorithms. Give a software designer better algorithms and they have the potential to build something better than a software designer who doesn’t have them.
Q: What, in your opinion, are the most important algorithms for industry productivity?
A: This is an excellent question, but one we don’t yet know the answer to. Check back with me in a year or so – I hope to have an answer then!