DEV Community

Artem
Artem

Posted on

Thinking about Chain-of-Thoughts

The main issues of Large Language Models (LLM):

  • Solving complex logical problems (Finding implicitly specified information)
  • Security (Resistance to hacks and the ethics of behavior)
  • Hallucinations (Generation of new information that does not correspond to reality)

All problems are complex and interrelated. When solving a complex task, we expect the model to produce something new, something we don't know and haven't written in the request: that is, the model should generate information that is not represented in the request. When we ask the model to find something, we expect it to write information that is present in the request. Often, a complex task involves searching for information within the request: here, the contradiction with which the model is confronted becomes apparent.

Introduction

OpenAI has recently made a breakthrough with a new model in addressing these issues (link). The new model uses the Chain-of-Thoughts (CoT) technique to solve tasks.

Preview of ChatGPT
Here is how a dialogue with ChatGPT o1-preview looks. A user's request initiates a whole chain of actions, in which data is synthesized by the model. It's unknown whether all data is displayed. In the end, all the deliberations are hidden, and the user receives a composed response.

Meanwhile, the company continues to adhere to the principle of a minimalistic interface:

  • The user inputs a prompt
  • The model internally performs stepwise actions
  • The user receives a summarized answer, which significantly reduces user effort

This is accompanied by a cool animation that shows the stages of "thinking" by the model, making it more intuitively understandable. Based on several statements, one can draw certain conclusions:

  • The internal dialogue will be hidden from the user in the future
  • Increased thinking time is considered beneficial, suggesting more in-depth and thorough information processing.
  • The model does not aim to be polyglot, possibly for token usage optimization or dataset specialization. However, it works quite well with Russian.
  • The model consumes an enormous amount of tokens compared to existing ones, and the cost is biting, not to mention the API access restrictions.
  • The model works better with direct and clear instructions.

Deep Thinker
Sometimes the time it may take to solve the issue can be... long.

Abstractly speaking, the model contains a loop through which the input data is passed. At each stage of the loop, the data is enriched with synthetic information. In two stages: the first generates "some" instruction, and the second produces the model's response. There is a certain mechanism for exiting the loop. Some or all of the information is passed through summarization. How exactly this is implemented in the model is unknown to me, but the logic of the process is quite clear:

  1. The loop can be realized either inside the model or as an external tool
  2. The instruction can be fixed, selectable, or generated by the model
  3. The model's response can be generated by an internal or external model
  4. The loop can be controlled by the model or by some external instrument
  5. Summarization can be controlled by an internal or external model

The five points above are unknown variables that will affect the quality of the final answer. The question arises: should all five points be synthesized by the model, or not? If not, how many is it better to leave for synthesis? Should non-synthetic information be added at some stage or not? Should the user see the process of thinking, or not, or only part of it?

Regardless of the model's effectiveness, this approach will have long-term consequences for the entire industry: how much data will be expected from the user, will synthetic data be shown to the user, etc.

The concept of "Chain-of-Thoughts" (CoT) in the context of language models can be understood as a structured approach to interacting with the model, where the conversation unfolds in a series of steps, with each response from the model guiding the next question or query. This method differs from simply providing a long prompt to the model all at once, as it allows for a more interactive and dynamic dialogue. Below is an explanation of the CoT process:

What is Chain-of-Thoughts?

Chain-of-thoughts can be implemented with a variety of models, not just with the latest cutting-edge ones.

Here's how it works:

Structurally, CoT is a sequence of messages that are sent to the model consecutively. The key feature is that the model's responses are included in this sequence. In a typical scenario, the chain lengthens with each new user query and the model's response. The question is: how does this differ from sending a single, long prompt to the model at once?

Image description
Chain of questions for creating a chain of thoughts
Image description
On the left (A) is the numbering of comparison positions for GPT-4o-mini. On the right (B) through a dialog/chain of thoughts. In the first case, an answer was also obtained. However, the chain allows to get a more structured answer.

The difference lies in several aspects:

  • Structured Approach: Instead of a single response, a structure is created that allows for a conversation flow.
  • Step-by-Step Process: Interactions become more detailed, with each step possibly affecting subsequent ones.
  • Interactivity: Each stage can be independently modified, leading to a more iterative process.

This is essentially a dialogue between the user and a conversational AI, much like any other chat interaction.

An example sequence of messages in a CoT process might look like this:

  • Here is my problem. How should I approach it?
  • Response 1
  • What are the drawbacks of this solution?
  • Response 2
  • How can I overcome these drawbacks?
  • Response 3
  • Provide the final solution considering the feedback.
  • Response 4

Image description
An example of Chain-of-Thoughts
Image description
On the left (A), pattern generation is shown using a single query to GPT-4o-mini. On the right (B), through a dialog/chain-of-thoughts. It is obvious that the answer on the right is more saturated with examples and more specific.

It's important to note that the questions are quite general but are given specific context by the user's descriptions of the problem, the situation, and the constraints involved. The model adapts to the conversation context and deepens its understanding of the problem with each response.

The CoT is fed to the model incrementally: first, the initial question is sent, then the first pair of question-response, and so on. The dialogue becomes more complex and data-rich with each step. Technically, CoT is more token-intensive than a single prompt, making it more expensive and slower in terms of processing time. It also works poorly with asynchronous interactions because the next message cannot be sent until the response to the previous one is received. Therefore, I've rarely used GPT-4 until the advent of GPT-4o-mini.

Chain-of-Thoughts (CoT) can help us understand the model better:

  • I have this problem. How do I solve it? On the second stage, it will be sent not just the second message, but the first message, the response to it, and the second message:

  • I have this problem. How do I solve it?

  • Response Model 1

  • Instruction to add synthetic data from the model. Data that helps understand "how the model thinks"

  • Response Model 2

  • Request for a final solution

Image description
Chain of questions for generating synthetic information)
Image description
On the left (A), pattern generation is shown using a single query to GPT-4o-mini. On the right (B), it is through a dialogue/chain-of-thoughts. In both cases, a problem with the order of output has been noted. However, the response on the left is significantly more detailed.

For the model, the CoT would appear as a text with requests, responses, and system instructions marked accordingly. It is designed to accept more and more of these in a growing sequence.

The duration of the cycle and the point at which the cycle ends are dependent on the number of questions asked. Additionally, the user can control the model's operation at each stage of the dialogue. Instructions can be either fixed or dynamically generated with the help of the language model.

Chain-of-Thoughts, or CoT, should be considered a method of breaking down information presented to the model. Any problem you have when interacting with the model has certain nuances that are important specifically to you. These nuances need to be provided to the model to get the most qualitative answer possible.

This information consists of several parts:

  • Your instructions in Chain-of-Thoughts can vary from simple requests to complex, multi-level tasks. They may include requests for text creation, data analysis, formulation of arguments, etc. The key here is the clarity and accuracy of your instructions to ensure the model correctly understands your requirements.
  • Your data and examples play a crucial role in Chain-of-Thoughts. They help the model understand what kind of responses you expect, create the necessary context for contemplation. These can be "good" examples that show the desired result, as well as "bad" that point to problem areas.

In fact, any model can be used to create a Chain-of-Thoughts, provided the context size allows it. This is an interesting and multifaceted tool for model management. To successfully use Chain-of-Thoughts, it is important to consider the context. This may include previous requests, information about the user, current circumstances, etc. Finally, Chain-of-Thoughts is a dynamic process. You can add new instructions and examples as needed, adjust the model's responses, and improve the results.

Top comments (0)