What if all code was open source? Whoa! Now that’s a can of worms! Six months ago I began the journey of removing/reducing big-tech from my life, and the more I learned, the more I’ve come to the conclusion that the only technology I can truly embrace is Open Source. Furthermore, I believe that companies switching to a business model based on Open Source would be more just and healthier for the economics of our society.
I know this wouldn’t make everyone happy, and there are many challenges with incentivizing this kind of shift. But I’ve begun to see this as a necessary change for technology to serve society rather than society serving tech companies. While I’m not an economist—and don’t expect my ideas to necessarily solve complex economic issues—I hope I can contribute to a discussion through which we can figure out how tech should look in a just and sustainable future.
It seems like privacy is often a bit misunderstood when it comes to tech. I, like most people, felt that privacy wasn’t that important to me because I “had nothing to hide…” One day while I was searching my Google Photos for “hiking,” I realized Google was almost certainly using this machine learning auto-categorization for other purposes than providing convenient photo search results. Google is constantly crawling through all of my most personal and cherished moments. It can tell when I’m with friends, what I’m doing, and perhaps even read trends in my emotions by analyzing facial expressions. Of course we don’t have proof this is happening. But as a developer, I know if I worked for a company that made money by selling incredibly targeted ads, the value of using this data to more closely target ads would immediately be obvious to me.
No longer is it simply the content you search and view online that determines what is shown to you in ads and future search results. It would also be the subtle visual cues that exist in photos of your most personal moments; like Sherlock Holmes, figuring out what makes you tick simply by observing the quality of your shirt, or the type of mud spattered on your shoes. Do you wear Nike shoes in your pictures, or a flannel shirt? This information might inform different assumptions about what you value as a person, and affect the ads and content you are exposed to during your online experience. I’m not saying these things are definitely happening. I’m saying the technology is there, the incentives are there, and if I were working on these products, exploiting this technology would be an obvious move for business goals.
After my realization about my Google Photos library, I started doing a little research about privacy at some of these big tech companies. While the Netflix documentary “The Social Dilemma” is a little melodramatic and not terribly technical in my opinion, it made some interesting points. It suggests that the machine learning models these companies have built about us are so accurate they could not only predict who we are right now, but they could also subtly and gradually direct who we become, and what we will value in the future. Humans have evolved, to varying degrees, the ability to discern when another human is manipulating us, but the film argues we have very little ability to resist the kind of big-data driven machine learning models we are now confronted with. The model is so accurate it can predict exactly what kind of content will trigger us to become interested in a new product, or take on a new belief.
In a chapter of his recent book, “21 Lessons for the 21st Century”, Yuval Noah Harari outlines the potential dangers of an unexamined plunge into a post-privacy world. He argues that a combination of unprecedented data collection, analytical power, and neuro-psychology hacks may not only revive fears of a 1984-like authoritarian surveillance state, but—more disturbing still—will also challenge core beliefs regarding human free will, conscience, autonomy, and dignity. He also suggests that, as automation makes massive swaths of the population economically obsolete, our personal data may be the most valuable item we have to trade in the economy—making our seeming willingness to currently give it away for free all the more troubling. Regardless of how accurate such warnings may turn out to be—or whether the outcome will eventually be deemed positive or negative—the current reality in which we are rapidly giving away reams of personal data with little thought of the likely irreversible consequences should be cause for concern.
Based on the reporting by Jeff Horwitz and Deepa Seetharaman of the Wall Street Journal, and former Facebook employee Frances Haugen speaking recently on 60 Minutes, we know that Facebook already has algorithms that “exploit the human brain’s attraction to divisiveness,” and favor “more and more divisive content in an effort to gain user attention & increase time on the platform.”
While “the algorithm” isn’t explained precisely, I wonder if it is fueled by ever-growing personalized data-mining capabilities. Money is made when we engage with content. And if the algorithm can lead us to become interested in new content, perhaps a slightly more extreme viewpoint of an already held personal belief, a whole new world of content and marketing opportunity opens up, driven by profit as the goal.
Even for something as controversial as gun control, there are many who are quite moderate: a hunter from Minnesota who believes in going through all the proper safety training to use a firearm to provide food for his family, or a suburban family from Connecticut for whom the idea of individuals owning firearms seems unnecessary and dangerous. When constantly exposed to material that reinforces and builds upon a previously held belief, or material that demonizes other perspectives, what was once a moderate personal belief can become an extreme viewpoint that accelerates distrust between opposing groups.
So is this an issue intrinsic to ad-based business models? Perhaps, but we can’t just make this kind of business model go away. People like free things, and are used to seeing ads, so there would always be an incentive for ad-based revenue. Besides, I don’t think all ads are bad. I use the DuckDuckGo search engine, and they claim their ads are based entirely on the search query itself, and not on anything they’ve gathered about me personally. (I unfortunately have to take their word for it because not all their code is Open Source.) That’s a business model I find acceptable, so I choose to use their product. For someone else it might not be acceptable, and they would choose not to use it.
Even if we did move away from ad-based revenue, that wouldn’t completely alleviate my privacy concerns. The news is full of data leaks caused by negligence: private user data like social security numbers stored unencrypted in databases, and in some cases personal information exposed publicly through API endpoints secured only by obscurity. Obviously these companies weren’t intentionally trying to compromise users’ privacy, but the larger issue is that consumers have no way of knowing how secure their data is.
My solution for these concerns is Open Source. For the moment ignoring all the business questions this raises, let’s examine how Open Source could solve or improve the issues of privacy negligence or outright manipulation.
For one, Open Source code means you as a user would know exactly what the product you are using does. Even if you don’t understand the code yourself, someone does, and any questionable software behavior will soon come into the public eye. Machine learning can be awesome: if it’s used to analyze your pictures and make them easier to search, that’s great, as long as that’s all it does! If the application is also feeding that data into a central user model, or selling it off to third parties, that likely isn’t acceptable to the majority of users. With Open Source code, you would know, and could choose not to use that product, boycott it, or request changes to its behavior. You have the power of choice based on facts, bringing tech privacy practices under the market influence of informed consumer preference.
In the second scenario regarding paid services that simply have negligent security practices, open-sourcing the code would immediately bring public attention to these vulnerabilities. The company would be incentivized to fix them immediately, or better yet, the community would submit fixes that the company could simply review and include in their software. A common misconception is that Open Source code is less secure because hackers can read the code and exploit vulnerabilities without needing to reverse-engineer the implementation, but in practice there are so many more good actors constantly reviewing the code that vulnerabilities are usually found and fixed before a bad actor even knows they exist. A great example of this is the Linux kernel and the many Open Source operating system distributions built on it; it is widely regarded as among the most secure operating systems, even though all of its code is publicly available. Countless developers worldwide share the responsibility of catching vulnerabilities and keeping it secure, rather than the couple dozen you might have at a proprietary software company.
Now to deal with the elephant in the room: Open Sourcing all code would wreak havoc on most of the large tech companies out there. Ad-based tech companies would lose a massive amount of customers as people became aware of how their information was being used, and paid products might suddenly lose business to other companies that copy their code and start offering the service as their own. What business model would allow a company to write code that is all Open Source and still survive? Two possibilities come to mind: change our model to selling expertise rather than code, and only require companies over a certain size to Open Source.
By definition, Open Source means anyone has the right to use the software, including for profit; however, copyleft licenses like the GPL require that any derivatives of your software must also be Open Source. This essentially means that nobody can “steal” your code, since it’s freely available to everyone to begin with. This is where the big paradigm shift comes.
Executives like to think of code as the asset they are selling. In my experience, it is the domain expertise of its developers that constitutes an organization’s real value. And this is how businesses like Nextcloud make a profit. While they offer official, hosted Nextcloud accounts for a monthly fee for those who are less technically inclined, their main source of revenue comes from selling their expertise to large organizations that self-host the software on their own infrastructure and need support or focused development work tailored to their needs.
I’m a big fan of this business model because it addresses another problem I see in tech. The idea that a SaaS business can scale infinitely with nothing more than a little extra infrastructure cost and a bit of extra support staffing doesn’t seem healthy. The Nextcloud business model follows a more traditional pattern where the number of people they employ to develop and support their product directly correlates to the amount of revenue they generate. Since the software is available for free, the number of “users” they have is irrelevant; instead, they sell their expertise, almost as consultants.
This may not be a popular business model among the tech community since it likely means our “Silicon Valley” salaries and lifestyles might have to come down a little. While the current economic philosophy has moved on from Mercantilism—and we believe wealth can increase without being taken from somewhere else—from a moral standpoint it doesn’t feel fair that we can make infinite amounts of money without much extra input. This feels especially true when I have not personally seen that wealth travel very far into the rest of a society facing a very concerning growth in income inequality, gentrification, and the resentment and division this inevitably inspires.
Open Source may be a bridge too far for small companies just getting off the ground. Perhaps small companies can keep Closed Source code as they build their business, keeping things simple, and allowing them to reap the benefits of their hard work immediately. But once they’ve “made it” and hit a certain size or revenue, regulation or professional norms could kick in, requiring them to Open Source their code and focus their business model on serving their customers.
As for ad-revenue-based companies, I think this is still a viable business model under Open Source, as long as these companies stick to transparent practices that users find acceptable, helpful, and non-manipulative. Everyone will have a slightly different tolerance for what is acceptable, but it is their right to choose the products they use knowing what they are getting, like a grocery store shopper checking the organic, fair trade, or locally grown status of their food. DuckDuckGo’s business model works here: open code would prove they only serve ads based on the search query itself, and as a user, you could decide whether the service is appropriate for you. This variety of needs would likely lead to more competition targeting different audiences, which is probably not a bad thing for a healthy tech economy increasingly facing allegations of monopolistic behavior.
I’m just a guy who wants to know what my photos are being used for and what the phone in my pocket is doing. I know my analysis and proposed solutions have holes, but I hope I’ve at least sparked some interest and further discussion about the future and our part in it. I look forward to engaging in this conversation and hearing your perspectives and ideas for building a more sustainable future for tech.