Previous snapshots in this series covered how GitHub facilitates software development, how open source software benefits projects, and how open source sites like GitHub offer an additional pathway for development in key technology areas. GitHub stands as the most popular platform for sharing and developing software, in both private and public projects. To better understand how organizations across the globe use public projects on GitHub, it helps to revisit the potential benefits of open source software for organizations.
Software hosting sites like GitHub act as a focal point between the developers of a project and its downstream users. Internally, a team of developers can track changes to the code and coordinate responses to software issues. The public repository also provides an easy way to showcase project documentation, including Frequently Asked Questions, Start Up Guides, and Known Issues. Externally, developers interact directly with users through the issues board and the pull request system. A public issue board gives users access to information about the state of the project and provides an easy means of communicating with the project’s developers. Forks and merged pull requests allow direct community input into the project. Together, these channels allow major issues to reach the development team quickly, accelerating awareness, and ideally mitigation, of software flaws. Overall, open source software gives users visibility into how the code they incorporate into their projects works behind the scenes.
Beyond software development, open source software provides additional benefits to organizations. As covered in a previous snapshot, Microsoft can market its paid services by showcasing its open source tools. Introducing such tools to potential users through GitHub provides direct-to-consumer marketing. Furthermore, organizations can market themselves to potential employees through their GitHub pages. Demand for top software engineering talent continues to increase, and GitHub offers both candidates and organizations a means to promote their project history and skill level.
Organizations that wish to take advantage of the above benefits can do so through GitHub accounts. GitHub offers three types of accounts: individual, organization, and enterprise.1 According to GitHub’s documentation, “Every person who uses GitHub signs into a personal account. An organization account enhances collaboration between multiple personal accounts, and an enterprise account allows central management of multiple organizations.” Groups with teams of developers therefore use the organization account type to further ease the process of software development. Organizations can take the additional step of verifying their account, which confirms that they own the domains listed on their profile.
Figure 1. The Vast Majority of Verified Organizations on GitHub Are United States-Based
Figure 1 maps the number of verified organizations hosting public repositories on GitHub within our dataset. The dataset pulls all repositories that either carry a topic tag relating to AI (as curated by CSET) or are mentioned in CSET’s merged corpus of scholarly literature, which includes Digital Science’s Dimensions, Clarivate’s Web of Science, Microsoft Academic Graph, China National Knowledge Infrastructure, arXiv, and Papers With Code.2 As a result, this dataset is not representative of all organizations or of all repositories on GitHub. While both individual and organizational users can provide a location on their account pages, these locations tend to be unreliable and inconsistent. To generate a more reliable set of locations, each verified organization was matched to an organization found in Crunchbase, LinkedIn, or ITJuzi. Figure 1 shows that the United States leads in the number of verified organization accounts with public repositories. This is unsurprising: GitHub was developed in the heart of Silicon Valley back in 2007. Furthermore, alternatives to GitHub exist across the world and may be more dominant in their home regions. Gitee, the Chinese competitor to GitHub, launched in 2013.
Figure 2. Percentage Change of Verified GitHub Organization Accounts over the Past Three Years
However, the last three years of organizational presence on GitHub tell a slightly different story. Figure 2 maps the percentage change in each country’s verified organizations with public projects on GitHub from 2019 to the time of writing, relative to before 2019.3 While the United States still leads in the overall number of verified organizations, non-U.S.-based organizations have joined at a higher relative rate over the past three years. The lower joining rate for U.S.-based organizations with open repositories might be explained by the high rate of U.S. organizations using GitHub over the past decade: many major U.S.-based organizations created their profiles prior to 2019 and have maintained a presence ever since. The increase in countries such as Nigeria, Saudi Arabia, and India shows the growth of global interest in GitHub and suggests that more organizations are realizing the benefits of open source software.
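To make the calculation behind Figure 2 concrete, the ratio described in the third footnote can be sketched in a few lines of Python. This is an illustrative sketch, not the pipeline used to produce the figure: the sample records, the `growth_by_country` function name, and the January 1, 2019 cutoff date are assumptions introduced here, and the sample values are invented rather than drawn from the dataset.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (country, account creation date) for each verified
# organization. These values are invented for illustration only.
SAMPLE_ORGS = [
    ("United States", date(2012, 3, 14)),
    ("United States", date(2015, 7, 2)),
    ("United States", date(2017, 11, 20)),
    ("United States", date(2018, 6, 5)),
    ("United States", date(2020, 1, 9)),
    ("India", date(2016, 4, 18)),
    ("India", date(2019, 8, 30)),
    ("India", date(2021, 2, 11)),
    ("Nigeria", date(2021, 5, 27)),
]

# Assumed boundary between "before 2019" and "2019 onward".
CUTOFF = date(2019, 1, 1)


def growth_by_country(orgs, cutoff=CUTOFF):
    """Return {country: accounts created on/after cutoff / accounts created before}.

    Countries with no accounts before the cutoff are skipped, because the
    ratio is undefined when the denominator is zero.
    """
    before = Counter()
    since = Counter()
    for country, created in orgs:
        if created < cutoff:
            before[country] += 1
        else:
            since[country] += 1
    # Counter returns 0 for missing keys, so countries with no new accounts
    # get a ratio of 0.0 rather than raising a KeyError.
    return {country: since[country] / count for country, count in before.items()}


if __name__ == "__main__":
    for country, ratio in sorted(growth_by_country(SAMPLE_ORGS).items()):
        # Multiply by 100 to express the ratio as a percentage.
        print(f"{country}: {ratio * 100:.0f}%")
```

One consequence of this ratio is visible even in the toy data: a country with many long-standing accounts needs many new accounts to move its figure, while a country with only a handful of earlier accounts can show a large percentage from a few additions. A country with no accounts before the cutoff has no defined ratio, and this sketch simply omits it.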
Tune in for our next Data Snapshot exploring an organization that has become especially active in AI development on GitHub and beyond!
- Our dataset contains only individual and organizational account types.
- CSET’s merged corpus of scholarly literature includes Digital Science’s Dimensions, Clarivate’s Web of Science, Microsoft Academic Graph, China National Knowledge Infrastructure, arXiv, and Papers With Code. Data sourced from Dimensions, an inter-linked research information system provided by Digital Science (http://www.dimensions.ai). All China National Knowledge Infrastructure content is furnished for use in the United States by East View Information Services, Minneapolis, MN, USA.
- (# of verified organizations with account creation date from 2019 to 2022) / (# of verified organizations with account creation date before 2019)