- Microsoft and Open Source
- Community Reception
- Userbase
- Company Attachment
- Project Popularity
- Scale and Funding
- Features and Intuitive UI
- Integrations
- CI/CD
- Communication
- So What's That Give Us?
- Conclusion
I generally don't write on opinionated topics much, but I'd like to have a discussion on GitHub. At its foundation GitHub was an independent entity that ran its own ship. It gained popularity with intuitive UI features around git such as pull requests. This began to pull in developers from the open source community and made their projects easier to manage and take contributions for. Then on June 4, 2018, Microsoft announced that it was acquiring GitHub. In this article I'll discuss the control that has been established, as well as the hurdles facing alternatives for those concerned about that amount of control.
Microsoft and Open Source
In its early days Microsoft had quite a bit of friction with open source software and hobbyists. This can be seen in the infamous An Open Letter To Hobbyists dating back to 1976. There's also Steve Ballmer's thoughts on Linux being a cancer. History took an interesting turn, however, as Microsoft began working with open source and opening up several previously proprietary products as well. One notable event was Microsoft providing source code to the Samba project. This was fairly unheard of, as Samba had already had some interesting history with Microsoft.
This became even more interesting as Microsoft started to open source core technology such as PowerShell, the .NET platform, their internal Linux distribution, and much more. With all of this hosted on GitHub, the acquisition also made sense.
Community Reception
Considering the history open source has had with Microsoft, it's not surprising that there were those reluctant to accept this at face value. GitLab and Bitbucket both saw increases in user migration following the announcement.
It's also important to consider that working with open source needs to be viable as a business opportunity. Given how many companies run AWS or Google Cloud primarily on Linux servers, trying to push away Linux would not be a good idea. Having things like PowerShell available across multiple operating systems can help provide a "one stop" style solution as well. The recent developments with Terraform's license change have also kindled uncertainty about trusting commercial entities with open source related content.
So how can open source projects detach themselves from this kind of situation? Well, it's a bit complicated.
Userbase
This is probably the most crucial component out of everything. There is the danger of burnout among volunteer open source developers who can't get support or adoption for their code. With GitHub's last publicly announced count of 100 million users, that's a lot of potential adopters and contributors. The largest alternative, GitLab, is estimated to be at around 30 million users.
An ideal alternative would have either a substantial userbase or a very high number of contributors to projects to be viable
Company Attachment
Bitbucket and GitLab both have organizational backing, Bitbucket even more so. GitLab does offer an open source core for those who want to self-host. Though I must say the "Contact Sales" link at the top navigation banner is quite interesting. The GNU project does offer hosting services, though it's very much conditional on buying into their philosophy. There are also some hosted options such as Heptapod and Codeberg.
Finding a hosting solution backed by a considerable user base without a company behind it is difficult
Project Popularity
Part of the major userbase pull in GitHub revolves around hosting a considerable number of popular projects including Angular, React, Kubernetes, CPython, Ruby, TensorFlow, and, well, even Forem, the software that powers this site.
On the other hand, you may find large ecosystems revolving around their own dedicated hosting. GNOME and KDE both have their own self-hosted GitLab instances. For Ubuntu related projects there's Launchpad. Finally, as already touched on, there's GNU Savannah for GNU/FLOSS type projects. That said, having several logins to contribute is not quite ideal.
A number of popular projects can help contribute to growth of an alternative
Scale and Funding
I mostly put these together as they're directly related. In order to handle a large number of users, you need a decent amount of money or someone donating hosting/hardware. That support is either sponsored by large companies or funded from a mix of contribution sources. This is especially difficult if you're trying to avoid being too attached to companies. It really depends on whether there are strings attached to contributions.
There's also the issue of keeping the lights on. With company backed solutions there are at least dedicated teams dealing with infrastructure issues. Funded projects need enough money to hire capable individuals, or need the hosting solution to offload the infrastructure workload. Otherwise people might give up contributing because of frustrations with any issues that arise (in particular, Heptapod was fairly slow when I attempted to use it).
An alternative should run reliably and be able to handle operating expenses
Features and Intuitive UI
These two are also somewhat bundled together. Changes made difficult by awkward UI workflows will frustrate users who simply want to contribute. Losing functionality such as pull requests, built-in CI/CD, fine grained permissions management, dependency tracking, security features, etc. might also make them not want to migrate elsewhere. For open source volunteer project members, making it as easy as possible for others to contribute helps prevent burnout and project abandonment.
An alternative would benefit highly from a well designed user interface and features for streamlining common tasks
Integrations
A good majority of developer related tools that deal with source code management have integration with GitHub. Some may even use it to host components (Go packages). When developers go to use tools, they feel comfort in knowing that GitHub support is a fairly standard offering. That said, there are some packages which support standard git without relying on a platform like GitHub. It's still nice being able to handle both situations.
GitLab has a fairly competitive list of integration offerings. This would make it ideal for those who want to switch their workloads without a substantial migration burden. The open source solution Gitea also provides a fairly reasonable set of integrations.
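To make the hosting point a bit more concrete, here's a minimal Python sketch, assuming the usual Go convention where an import path starts with the name of the host serving the code. The function names and the example path are hypothetical and only there for illustration.

```python
def split_go_import_path(import_path):
    """Split a Go-style import path into (host, remainder)."""
    host, _, remainder = import_path.partition("/")
    return host, remainder


def rehost_import_path(import_path, new_host):
    """Return the same path under a different host (hypothetical helper)."""
    _, remainder = split_go_import_path(import_path)
    return f"{new_host}/{remainder}"


# Hypothetical module path, not a real project.
path = "github.com/example-user/example-lib"
print(split_go_import_path(path))
print(rehost_import_path(path, "gitlab.com"))
```

Since the host is part of the identifier, anything importing the old path would likely need the same rewrite, unless the project already sits behind its own import domain. That's part of why moving hosts isn't free.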
An alternative would benefit highly from integrations with popular development tools
CI/CD
While CI/CD is both a feature and an integration, its importance is on a larger scale. Automated checks on PRs, deployments, scanning/analysis, etc. provided by CI/CD help ease the burden of volunteer based open source software. This in turn helps mitigate the burnout situation. GitHub originally didn't have a built-in CI/CD solution, and many projects used external solutions such as CircleCI and Travis CI. It wasn't until 2019 that GitHub Actions started to become a "don't need another external site" solution.
While there certainly are external solutions such as the mentioned CircleCI and Travis CI, it's still another service to log in to and manage. GitLab does have its own CI/CD system, but it isn't fully compatible with GitHub Actions for migration purposes. It also lacks the library of existing solutions that third party GitHub Actions provide (you can see how this ends up here). Gitea based solutions can rely on Gitea Actions, which is meant to be compatible with GitHub Actions (and thus shared actions). That said, it's in a relatively new state and will require polishing to get certain actions to work.
An alternative should have a built in CI/CD system to ease maintainer burdens and not require another login
Communication
Being a collaborative effort requires ways to streamline communication. This includes PRs, issues, discussions, etc. It also means being able to moderate such discussions so maintainers are not overwhelmed by communication overload. While I've noticed GitLab instances with wikis, issues, and PRs, there doesn't seem to be a discussions-like feature to filter basic questions or feature proposals out of issues. Discussion can happen elsewhere, but that's still one extra step in the contribution process.
An alternative should make it easier for developers and contributors to collaborate
So What's That Give Us?
To be quite honest, a unicorn solution that meets all of these requirements isn't very feasible. In order for users to want to give up what GitHub offers, one of two things would need to be true:
- There would need to be an alternative with enough of the above implemented to make them want to switch
- Something in their belief/morals system drives them enough to not care as much about the rough edges and move away
Also a mass exodus isn't something you're going to see unless Microsoft does something really brash with GitHub. Some scenarios that would make this possible:
- Microsoft does something aggressive to open source projects such as deleting repositories or forcing payment
- GitHub suddenly starts charging for core functionality that was free before (GitHub Actions)
- Microsoft does a 180 on their open source efforts and shuts down their entire GitHub presence without an alternative location for their previously open source projects (PowerShell, .NET, VS Code, etc.)
- GitHub is found to be spying on private repositories, which would not be a huge issue for open source where everything is open, but it would make people worried about what other information is being collected
The first option doesn't really make any business sense for Microsoft. Payment for core service is a possibility, though it would depend on the extent. Shutting down their open source offerings is possible, though it would probably require a substantial structural change in the company. The last one we really won't know about until someone calls it out with considerable evidence. It's one of those "even if they were, they're not going to flat out tell us" situations. So assuming one of these were to happen, what could we possibly expect as the outcome?
Mass Migration To GitLab
In this situation a majority of individuals feel that they want the large community in a central location with a solid backing. This would most likely make GitLab unstable for a period of time while it deals with the mass influx. However, it would also mean there would be fewer options, with a major competitor put in a bad spot. This could lead to sales and business culture taking over within the company, causing them to not make great decisions for open source.
Mass Migration To Community Solution
Another option is that a majority feels like migrating to GitLab may end up repeating the same problem. Instead they decide to utilize a still centralized, but community owned solution. This would probably be one of the more difficult choices, as it requires duct taping something together quickly and then iterating on it to improve usability. It would all depend on how motivated the community is.
Ecosystem Fragmentation
It also could end up with major ecosystems branching out à la Twitter's situation, where people become wary of having everything in a centralized location. This could lead to community sites for container ecosystems, JS ecosystems, programming language ecosystems, etc. While larger and more popular ecosystems might actually benefit from this focus wise, smaller ecosystems would lose discoverability. This would lead to higher abandonment of low adoption projects that might have turned into something good.
Fediverse Like Solution
The other solution is that the communities are fragmented but are still able to interact with each other. This would require something similar to what ActivityPub is for the Fediverse. It should also support a shared authentication system to allow users to hop between different ecosystems with ease. A discovery mechanism for promoting smaller projects would also contribute greatly to helping the overall community. It does share an issue with the centralized community solution though, which is getting something up and running quickly for people to migrate to.
Conclusion
As it is right now, I don't think we'll see the current situation change much. A good number of developers want things easy, and Microsoft hasn't done anything brash enough to warrant people thinking about it. Even then, there's still a chance that another solution pops up that developers organically migrate to over time, surpassing GitHub at some point.
I think another part of it is that the free software philosophy needs to take a more modern approach to things. GNU websites are often stuck in the 90s style wise. Monetization approaches such as selling CD-ROM distributions aren't as viable anymore. Really the approach should focus on making users' lives easier first and foremost, followed by the philosophy of the source being open/free.
Top comments (2)
A big unknown still is what impact federation may have. It is now backed by GitLab, and it was also the point of contention that led to the Gitea/Forgejo fork. But it isn't actually delivered anywhere yet. When it is, it may change the landscape. Or, it may not.
Another big unknown is the legal landscape, particularly in respect to algorithmic copying, such as through AI. A precedent that clearly removes copyright liability from algorithmic copying could make many, even proprietary vendors, fear using GitHub hosting, and even larger alternatives like GitLab. In that scenario both the ease of migration (and Gitea is extremely easy to migrate to) and the ability to scale through federation may prove a winning combination in the end.
I haven't really seen anything big with federation yet. I mentioned a few self-hosted GitLabs like GNOME and KDE, along with Launchpad for Ubuntu related development. The main issue is there's no glue I'm aware of that's holding them together, much like ActivityPub holds Fediverse traffic together. Without that, it's a big step backwards versus GitHub because it disperses the userbase.
I definitely think Gitea looks like an interesting contender (I actually have a local install of it), but Gitea Actions needs some polish in the GitHub Actions compatibility area. Honestly, enough to where I think that should be the primary development focus. Being able to easily re-use others' actions to reduce the amount of code required is pretty huge. It's something GitLab could certainly do better with.
The legal implications of AI are still there, just in a different form. If AI slurps up non-proprietary friendly licensed open source code and plops it into a proprietary code base close enough to be recognizable... things won't end well.