Introduction
I created Daily Stars Explorer out of curiosity to track the star trends of GitHub repositories. Currently, GitHub does not offer a graph showing the daily star changes for a repository.
Moreover, the most popular star-tracking tool doesn't currently show the daily number of stars, and it caps the cumulative history at 40k stars, drawing a straight line from 40k to the current total.
I recognize that using stars as the sole measure of a repository's relevance can be risky. Various factors, beyond just quality, can influence the number of stars a repository receives:
- Popularity contests: Sometimes, projects gain stars simply because they become popular due to factors like marketing, promotion, or being featured in articles, blogs, or social media posts. This popularity can snowball, leading to more stars, even if the project itself may not offer substantial value.
- Trendiness: Projects related to trendy technologies, buzzwords, or topics may attract attention and stars, regardless of their actual quality or usefulness. For example, projects related to AI, Rust, or cryptocurrency may receive a significant number of stars due to the hype surrounding these fields.
- Novelty: Projects that introduce novel or unique ideas, even if they are not particularly useful in practice, may attract attention and stars simply because they are different. However, novelty does not always translate to long-term usefulness or sustainability.
- Community support: Projects with active and engaged communities may accumulate stars through contributions, feedback, and endorsements from community members. Even if the project itself is not outstanding, a supportive community can drive its popularity and star count.
- Historical significance: Some projects may have gained stars over time due to their historical significance or influence on subsequent projects, even if they are outdated or no longer actively maintained. These projects may serve as references or inspirations for newer projects, leading to continued star accumulation.
- Biased evaluation: Users may star projects for reasons unrelated to their quality or usefulness, such as personal preferences, curiosity, or experimentation. This can lead to inflated star counts for projects that may not deserve them based on objective criteria.
- Awesome lists: Those tend to accumulate a high number of GitHub stars, and one reason behind this phenomenon could be the perception of stars as bookmarks. Users might star repositories with the intention of revisiting them later for reference or exploration. Additionally, the sheer number of stars often acts as a social signal, prompting more users to star the repository, thus perpetuating the cycle of popularity.
- Buying stars: It sounds crazy, but it is also possible to buy stars from fake or even real accounts. This blog post analyses this phenomenon in detail.
Having said that, I understand that observing star trends can provide valuable insights into a project's perception and its popularity trajectory. Additionally, I notice that many projects actively solicit GitHub stars, highlighting their significance within the community.
The idea behind my project is to treat the daily star count as an intriguing time series that can be analyzed with statistical tools. Whether you find this valuable or not is for you to decide.
Features
Full History of Stars
My project offers you the ability to access the full history of stars for a GitHub repository. It not only shows you the stars per day but also provides a cumulative stars graph. This way, you can visualize how a repository's popularity has evolved over time.
Generate CSV and JSON
Easily save the star history as CSV or JSON files, with a daily and cumulative star count for each day since the repository's creation. You can then analyse the time series with the tools of your choice.
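As a rough illustration of the export format (not the app's actual code), here is a minimal Python sketch that turns daily counts into rows with a running total and serializes them as CSV and JSON. The column names and the sample data are assumptions made up for the example.

```python
import csv
import io
import json
from itertools import accumulate

# Hypothetical sample: (date, new stars on that day)
daily = [("2024-01-01", 3), ("2024-01-02", 0), ("2024-01-03", 5)]

FIELDS = ["date", "daily_stars", "cumulative_stars"]  # assumed column names


def to_rows(daily):
    """Attach a running total to each (date, daily_stars) pair."""
    totals = accumulate(count for _, count in daily)
    return [
        {"date": day, "daily_stars": count, "cumulative_stars": total}
        for (day, count), total in zip(daily, totals)
    ]


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = to_rows(daily)
csv_text = to_csv(rows)
json_text = json.dumps(rows, indent=2)
```

Days without new stars are kept as explicit zero rows, so the series has no gaps when it is loaded into another tool.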
Caching and Data Refresh
To keep things efficient, I've implemented a caching mechanism. Once you've fetched the history of stars, the data is cached for ten days. During this period, you have the option to refresh the data up to the current day. Please note that the graph will display data up to the last complete UTC day.
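The two rules above (a ten-day cache and a graph that stops at the last complete UTC day) can be sketched in Python as follows. This is only an illustration under my own naming; `is_cache_fresh` and `last_complete_utc_day` are hypothetical helpers, not functions from the project.

```python
from datetime import datetime, timedelta, timezone

CACHE_TTL = timedelta(days=10)  # the ten-day window described above


def is_cache_fresh(fetched_at, now):
    """True while the cached history is younger than the TTL."""
    return now - fetched_at < CACHE_TTL


def last_complete_utc_day(now):
    """The most recent UTC calendar day that has fully elapsed."""
    return now.astimezone(timezone.utc).date() - timedelta(days=1)


now = datetime(2024, 5, 15, 0, 30, tzinfo=timezone.utc)
fetched_at = datetime(2024, 5, 6, 12, 0, tzinfo=timezone.utc)

fresh = is_cache_fresh(fetched_at, now)   # fetched less than ten days ago
cutoff = last_complete_utc_day(now)       # 2024-05-14, since 2024-05-15 is still in progress
```

Converting to UTC before taking the date matters: a timestamp that is already "tomorrow" in a local time zone can still belong to the current, incomplete UTC day.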
Compare Repositories
For those curious about how two repositories stack up against each other, my project offers a comparison feature. I wasn't sure whether to add it, so please keep in mind the factors above that can influence the number of stars.
Aggregates and trends
In the Transform drop-down, you can select different levels of aggregation and also see the trend of the time series (computed with the FB Prophet library).
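To give an idea of what such transforms do, here is a small standard-library Python sketch of a monthly aggregation and a simple smoothing step. It is not the app's implementation: the monthly bucket is just one possible aggregation level, and the trailing moving average is a deliberately simple stand-in for a trend line, not what Prophet computes.

```python
from collections import defaultdict
from datetime import date


def aggregate_monthly(daily):
    """Sum daily star counts into (year, month) buckets, in chronological order."""
    buckets = defaultdict(int)
    for day, count in daily:
        buckets[(day.year, day.month)] += count
    return dict(sorted(buckets.items()))


def moving_average(values, window):
    """Trailing mean over up to `window` points (a stand-in for a real trend model)."""
    out = []
    running = 0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the point that left the window
        out.append(running / min(i + 1, window))
    return out


# Hypothetical sample data
daily = [
    (date(2024, 1, 30), 4),
    (date(2024, 1, 31), 6),
    (date(2024, 2, 1), 10),
    (date(2024, 2, 2), 2),
]

monthly = aggregate_monthly(daily)
smoothed = moving_average([count for _, count in daily], window=2)
```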
Patterns noticed
Spikes
Using Log Y-Axis
Constant growth
The project was started 10 years ago and is showing an interesting constant growth in the number of daily stars.
Limits
- Using one GitHub personal access token (PAT), the app can query up to 500k stars per hour. If this limit has already been reached, you will need to wait until the next hourly refresh.
- Rate limit: There is also a maximum of 60 requests per hour for the full star history API. That should be enough for a human, and it helps prevent bots from running and using up all the resources including
- Fetching Time: The time it takes to retrieve all star history (if not already cached) depends on the total number of stars. To overcome the 40,000-star limit, I leveraged the GitHub GraphQL API. Unfortunately, its cursor-based pagination doesn't allow for parallel requests, because each page can only be requested once the previous one has returned. The workaround is to fetch the first half of the stars from the beginning and the other half from the end simultaneously, which can still be time-consuming for large repositories. Retrieving the complete star history for Kubernetes (~108k stars) typically takes about 3 minutes.
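The two-ended fetch can be sketched in Python like this. It is a simplified illustration, not the app's code: `FakeStargazers` is a hypothetical stand-in for a cursor-paginated API (its cursors are plain list indexes), and the page size of 100 reflects the maximum GitHub's GraphQL connections accept per page.

```python
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 100  # maximum items per page on GitHub GraphQL connections


class FakeStargazers:
    """Hypothetical cursor-paginated source; cursors are just list indexes."""

    def __init__(self, items):
        self.items = items

    def first(self, n, after=None):
        """Return up to n items following the `after` cursor, plus the end cursor."""
        start = 0 if after is None else after + 1
        page = self.items[start:start + n]
        return page, start + len(page) - 1

    def last(self, n, before=None):
        """Return up to n items preceding the `before` cursor, plus the start cursor."""
        end = len(self.items) if before is None else before
        start = max(0, end - n)
        return self.items[start:end], start


def walk_forward(api, want):
    """Page from the oldest star onwards until `want` items are collected."""
    got, cursor = [], None
    while len(got) < want:
        page, cursor = api.first(min(PAGE_SIZE, want - len(got)), after=cursor)
        if not page:
            break
        got.extend(page)
    return got


def walk_backward(api, want):
    """Page from the newest star backwards until `want` items are collected."""
    got, cursor = [], None
    while len(got) < want:
        page, cursor = api.last(min(PAGE_SIZE, want - len(got)), before=cursor)
        if not page:
            break
        got = page + got  # keep chronological order
    return got


def fetch_all(api, total):
    """Fetch both halves at the same time and join them in order."""
    head = (total + 1) // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        front = pool.submit(walk_forward, api, head)
        back = pool.submit(walk_backward, api, total - head)
        return front.result() + back.result()


stars = list(range(250))  # hypothetical star events, oldest first
result = fetch_all(FakeStargazers(stars), len(stars))
```

Each walker is still strictly sequential, so the best case is roughly a 2x speedup. Assuming 100 stars per page, ~108k stars is about 1,080 pages, or around 540 round trips per direction; three minutes for that works out to roughly a third of a second per request.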
Conclusion
This tool has been an interesting exercise for me to experiment with different technologies. You might find it valuable or completely useless. I'm curious to read your thoughts in the comments.