In this article I would like to analyze the differences between two tools that seem to overlap: GitHub Copilot and ChatGPT. What are the fundamental differences between the two? Which one should you choose? And do you really have to choose?
ChatGPT
Let's start by analyzing ChatGPT. It is a web portal where you can start a chat with a Large Language Model (LLM). There are several ChatGPT tiers:
With the free tier, we have a fairly minimal experience that allows us to interact with the GPT-3.5 model.
Things start to get interesting from the Plus tier, which offers the possibility to interact with the GPT-4 model and also gives us access to the web. Access to the web is important because, without it, the model can only answer based on the "native knowledge" derived from its training. If the model has been trained with data from the web up to 2021 and we ask it who the Prime Minister of the United Kingdom is, it will answer Boris Johnson (the Prime Minister in office at that time). If we give the same model access to the web, it is able to give us the correct answer: Rishi Sunak (the Prime Minister in office at the time of writing this article).
The third tier, Team, in addition to interaction with other models such as DALL-E, adds the guarantee that the data sent through requests will not be used to retrain the model.
GitHub Copilot
GitHub Copilot is a fine-tuning of the GPT-4 model for code. Fine-tuning refers to the ability to train a model by specializing it for a specific scenario, in this case working on code. The basic capabilities are therefore those of GPT-4, which is already highly capable of working with code, plus a specific specialization on this task.
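Purely to illustrate the general concept of fine-tuning (this is not how GitHub Copilot itself is trained, and the file name and base model below are placeholder assumptions), a fine-tuning job against the OpenAI API looks roughly like this:

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical JSONL file: each line is an example conversation from the
# target domain (here, code-related questions and answers).
training_file = client.files.create(
    file=open("code_examples.jsonl", "rb"),
    purpose="fine-tune",
)

# Start the fine-tuning job: the base model gets specialized on our examples.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",  # placeholder base model
)
print(job.id, job.status)
```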
Just like ChatGPT, GitHub Copilot also offers different pricing tiers.
We can observe that, for the Individual and Business tiers, the difference in features is mostly related to "Management and policies". The Individual tier is aimed at individual users, while the Business tier targets corporate scenarios, where centralized user and policy management provides a significant advantage for tool administrators.
I will dedicate a separate paragraph to the Enterprise tier later in this article.
Terms and Conditions
Another fundamental difference between the two tools can be found in the Terms & Conditions (T&C). GitHub Copilot's terms and conditions of use ensure that the underlying model will not be retrained using user-inputted data. Essentially, even in the Individual tier, when GitHub Copilot analyzes your code to provide you with answers and suggestions, it does not use the portions of code analyzed to retrain its algorithm, thereby preserving intellectual property.
Regarding ChatGPT, this applies starting from the Team tier.
From the perspective of Copilot in Edge, however, Commercial Data Protection is guaranteed only for the types of accounts listed in the first paragraph of this link, and only when signed in with the company account rather than a personal account.
For professional use, I would never recommend a tier that does not offer data protection. For this reason, from now on the comparison will be between the different tiers of GitHub Copilot and ChatGPT Team.
IDE Integration
The main advantage of GitHub Copilot is its integration with the IDE: it was born as a tool that suggests code in real time as the developer writes. It infers from the context of what has already been written, and what is currently being written, to proactively suggest entire portions of code.
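As a quick illustration of what this proactive suggestion feels like (the completion shown is only a plausible example of what Copilot might propose, not a guaranteed output), a comment and a function signature are often enough to get the whole body as ghost text:

```python
# Return the total price of an order, applying a 10% discount
# to customers flagged as premium.
def order_total(items: list[dict], is_premium: bool) -> float:
    # Everything below is the kind of completion Copilot may propose:
    total = sum(item["price"] * item["quantity"] for item in items)
    if is_premium:
        total *= 0.90
    return total
```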
Over time, GitHub Copilot has evolved by adding several features, in addition to the Code Completion we just talked about: Chat and Smart Actions.
We can imagine the Chat feature as an implementation of a ChatGPT-like scenario. However, since the model is specialized in code, its field is restricted: if we ask GitHub Copilot who the Prime Minister of the United Kingdom is, it will respond:
If ChatGPT can answer both code and general questions, what's the advantage of using GitHub Copilot over ChatGPT?
Keeping in mind that this question compares a feature of a larger product (GitHub Copilot is not just its chat) to a complete product (ChatGPT is a chat service), the strengths of GitHub Copilot lie in its integration with the IDE.
Without leaving the Visual Studio Code screen (or Visual Studio or JetBrains), we can select portions of code and ask direct questions to our pair programming assistant. From the backend perspective, the question posed to the model will therefore contain:
- our context: the selected code
- our question, for example "explain to me what this portion of code does"
- the system prompt: a base prompt defined and written on the backend that wraps the question we have asked; it contains the basic instructions. In the simplest cases, we can think of the system prompt as a set of basic directions such as "You are a very helpful virtual assistant. You always try to explain things in a complete but never verbose way, and you are able to schematize complex concepts in order to make them easier to understand". This is a remarkably simple system prompt; GitHub Copilot's will clearly be more complex and will contain instructions such as "Only respond to questions related to the programming world", which generates responses like the one in the screenshot above.
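A minimal sketch of how such a request could be assembled on the backend, assuming a plain call to the OpenAI chat API (the model name, system prompt, and helper function are illustrative assumptions, not GitHub Copilot's actual implementation):

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical system prompt: the real one used by GitHub Copilot is far richer.
SYSTEM_PROMPT = (
    "You are a programming assistant. Only answer questions related to "
    "software development, and keep explanations concise."
)

def ask_about_code(selected_code: str, question: str) -> str:
    """Combine the system prompt, the selected code (context) and the user question."""
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"{question}\n\n{selected_code}"},
        ],
    )
    return response.choices[0].message.content
```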
This system prompt is one of the strengths of GitHub Copilot: the code generated by the tool is not directly passed on to the end user, but is filtered and double-checked in order to avoid scenarios of prompt-injection (a concept similar to SQL injection, but which applies to prompt engineering scenarios).
Even more important than the system prompt is GitHub Copilot's ability to access the extended context of our IDE. The context is formed by the files open in tabs and the files present in the open folder. In fact, through the @workspace keyword, it is possible to ask broader questions about the entire repository that is open.
In the two screenshots above, we can see how GitHub Copilot Chat is able to analyze the entire structure of the folders, without having to specifically select portions of code, and provide me with the exact answers. In the second case, it is even able to understand the intended usage of certain services that I have described and how the APIs that I have defined work. It can also generate links to files so that they can be accessed directly without having to navigate the structure of my repository.
If we take as an example other Visual Studio Code extensions that integrate a GPT model into our IDE, such as Visual chatGPT Studio - Visual Studio Marketplace, the features are inherently more limited: in this case they cover only a subset of functionality, namely selecting code sections and asking questions about them.
But let's analyze an even more complex scenario than what we have seen so far: let's assume that we have two microservices communicating with each other through a queue. In addition to the "@workspace" tag, I also use the tag "#file:" to enrich the context of the chat by inserting, in addition to the selected code, another file. This way, I can ask how the event managed by microservice 2 is formatted inside microservice 1:
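To make the scenario concrete, here is a hypothetical version of what such an event contract might look like in microservice 1 (the event name, fields, and queue client are all invented for illustration); the chat question essentially asks Copilot to reconstruct this format across the files of the two services:

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical event published by microservice 1 on the queue
# and consumed by microservice 2.
@dataclass
class OrderCreatedEvent:
    order_id: str
    customer_id: str
    total_amount: float

def publish_order_created(queue_client, event: OrderCreatedEvent) -> None:
    # queue_client stands in for whatever messaging SDK is actually in use.
    queue_client.send_message(json.dumps(asdict(event)))
```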
What is meant by "programming questions"?
It is interesting to focus on GitHub Copilot's narrowed operational context. When it's said that the tool is able to answer programming questions, we should not only think about the world of code in the narrowest sense.
We are also able to ask questions about frameworks, third-party tools we use in the code, and the hosting architecture of an application, such as "How can I create a secure AKS cluster?"
Therefore, general questions such as "What are the general principles of quantum mechanics?" are excluded: questions that do not pertain in any way, not even from an architectural point of view, to the writing of code.
However, if we need answers on such topics in order to conduct an analysis for the code we are writing, do we have alternatives to obtain them? We can safely use more general tools such as ChatGPT, or tools that natively have access to the web, such as Copilot in Edge.
GitHub Copilot Enterprise
In the Enterprise tier, the feature gap grows. In addition to the normal Chat and Code Completion (also present in the first two tiers), some truly interesting features are added:
- Knowledge of my organization's knowledge base: GitHub Copilot is able to index the repositories in my organization and answer questions related to all the files contained within them. A RAG (Retrieval-Augmented Generation) scenario, or Chat With Your Data, focused on code: it offers the possibility to ask the chat questions about projects other than the ones I am currently working on (for example, a shared function library used within my organization). A minimal sketch of the RAG idea follows this list.
- Pull Request Diff Analysis: the ability to have GitHub Copilot analyze the individual commits that make up a Pull Request, highlighting the fundamental changes it introduces.
- Web Search Powered by Bing (beta): we can ask questions about the latest updates to a framework and the answer will be generated by searching online content. Note that here, too, we are talking about code-related questions.
- Fine-Tuned Models (coming soon): a feature not yet generally available that will allow fine-tuning of the GitHub Copilot model on our own repositories. How does it differ from indexing? Simply put: the model will not have to search an indexed information base for the answer to a user's question; the answer will be built into the model itself. Just as ChatGPT in the free tier can tell us who the Prime Minister of the United Kingdom is without searching for the answer, GitHub Copilot will natively know which shared libraries exist within our organization. Suppose we have a library used to interact with the OpenAI APIs: when GitHub Copilot suggests (even proactively!) code to call the OpenAI APIs, it will not suggest a generic HTTP request, but rather the appropriate library invocation.
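Since the first bullet above mentions RAG, here is a deliberately simplified sketch of the idea (the snippets, model name, and prompt are assumptions for illustration): relevant pieces of the indexed repositories are retrieved first and injected into the prompt, instead of being baked into the model weights as fine-tuning would do.

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical snippets already extracted and indexed from the organization's repositories.
INDEXED_SNIPPETS = [
    "shared_http/client.py: helper library wrapping retries and auth for internal APIs",
    "openai_wrapper/chat.py: thin wrapper our teams use to call OpenAI chat models",
]

def answer_with_rag(question: str) -> str:
    # 1. Retrieve: a real system would run a similarity search over a vector store;
    #    here we naively pass every snippet to keep the sketch short.
    context = "\n".join(INDEXED_SNIPPETS)
    # 2. Augment and 3. Generate: put the retrieved snippets into the prompt
    #    and let the model answer using that context.
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder
        messages=[
            {"role": "system", "content": "Answer using only the provided repository context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```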
Conclusion
From a proactivity standpoint, GitHub Copilot is the only one of the two tools that suggests entire portions of code without an explicit request, and the native integration of the Chat with the IDE makes it significantly easier to use than ChatGPT-based services, which require more manual work to construct the context.
From a cost standpoint, ChatGPT Team has a higher cost compared to GitHub Copilot Business, which offers more advanced programming features. What is lost by not having the ability to do online searches - replaceable, as we have seen, with other tools - is gained in terms of proactivity and ease of use. GitHub Copilot Enterprise, on the other hand, has a higher cost than ChatGPT Team but offers a series of truly interesting additional features for Enterprise scenarios.