David Mohl

Posted on Dec 4, 2022 • Originally published at david.coffee on Dec 4, 2022

Refactoring code with ChatGPT

#chatgpt #ai #go

From my blog: https://david.coffee/refactoring-code-with-chat-gpt/

ChatGPT is really damn impressive and the more I read about it, the more it feels like withcraft. I see people using it for game development, coming up with titles, improving prompts for MidJourney and and and. It's shaping up to be one of those universally useful tools that are great to keep around.

Here's one thing I had on my mind: Can it help me when I code?

I was working on a little pet project and had some pretty inefficient code I slapped together for a middleware while watching TV:

package middleware

import (
    "context"
    "encoding/json"
    "net/http"
)

var parsedJsonBodyContextKey struct{}

func ParseJsonBody(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        var data map[string]interface{}
        err := json.NewDecoder(r.Body).Decode(&data)
        if err != nil {
            ctx := context.WithValue(r.Context(), parsedJsonBodyContextKey, nil)
            next.ServeHTTP(w, r.WithContext(ctx))
            return
        }

        ctx := context.WithValue(r.Context(), parsedJsonBodyContextKey, nil)
        next.ServeHTTP(w, r.WithContext(ctx))
    })
}

The code takes whatever is passed into the request body, tries to JSON decode it, then puts the result into context. If no body exists or isn't JSON, it will pass nil instead.

I wanted to shorten and make it a bit more efficient but though it'd be a cool experiment to let ChatGPT do it:

Me: Can you refactor the following code to make it more efficient?

<code from above here>

ChatGPT: Here are a few ways the provided Go code can be imporved:

The ParseJsonBody() function defines a data variable to store the parsed JSON body, but it doesn't appear to be used anywhere in the function. This variable can be removed.

The ParseJsonBody() function creates a new context.Context with a key-value pair for the parsed JSON body, but it sets the value to nil in both branches of the if statement. This means that the value of the parsed JSON body will always be nil, regardless of whether the JSON body was parsed successfully or not. The value of the parsed JSON body should be set to the parsed JSON data if parsing was successful, and nil if parsing failed.

The ParseJsonBody() function passes the original http.Request object to the next handler, even if the JSON body was parsed successfully. In this case, the http.Request object should be updated with the new context.Context that contains the parsed JSON body before being passed to the next handler.

Here is how the code can be improved:
package middleware

import (
  "context"
  "encoding/json"
  "net/http"
)

var parsedJsonBodyContextKey struct{}

func ParseJsonBody(next http.Handler) http.Handler {
  return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
      var data map[string]interface{}
      err := json.NewDecoder(r.Body).Decode(&data)
      if err != nil {
          ctx := context.WithValue(r.Context(), parsedJsonBodyContextKey, nil)
          next.ServeHTTP(w, r.WithContext(ctx))
          return
      }

      ctx := context.WithValue(r.Context(), parsedJsonBodyContextKey, data)
      next.ServeHTTP(w, r.WithContext(ctx))
  })
}

Huh, turns out I copy pasted and forgot to change the normal branch to actually add data into the context - whoops. ChatGPT spotted this, understood what this was supposed to do and fixed it.

Can we make it more compact though?

Me: Can you make the code more compact and more DRY?

ChatGPT: Yes, the provided code can be made more compact and DRY (Don't Repeat Yourself) by removing the duplicate code and using a single context.Context object.

Here is how the code can be improved:

package middleware

import (
  "context"
  "encoding/json"
  "net/http"
)

var parsedJsonBodyContextKey struct{}

func ParseJsonBody(next http.Handler) http.Handler {
  return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
      var data map[string]interface{}
      err := json.NewDecoder(r.Body).Decode(&data)

      ctx := context.WithValue(r.Context(), parsedJsonBodyContextKey, data)
      next.ServeHTTP(w, r.WithContext(ctx))
  })
}

This is exactly what I had in mind for refactoring this code: No need for the if condition at all because var data map[string]interface{} is nil by default. Only nitpick would be that we assign err and never use it, so the compiler will complain about this code.

Let's add a function to get the data out of the context again. Can ChatGPT summon this function for us?

Me: Please add a function that takes the body out of the context again

ChatGPT: To extract the parsed JSON body from the context.Context object, a new function can be added to the middleware package. This function can take a context.Context object as an argument and return the parsed JSON body, if it exists in the context, or nil otherwise.

Here is an example implementation of this function:

package middleware

import (
  "context"
  "encoding/json"
)

var parsedJsonBodyContextKey struct{}

func ParseJsonBody(next http.Handler) http.Handler {
  return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
      var data map[string]interface{}
      err := json.NewDecoder(r.Body).Decode(&data)

      ctx := context.WithValue(r.Context(), parsedJsonBodyContextKey, data)
      next.ServeHTTP(w, r.WithContext(ctx))
  })
}

func GetParsedJsonBody(ctx context.Context) map[string]interface{} {
  val := ctx.Value(parsedJsonBodyContextKey)
  if val == nil {
      return nil
  }
  return val.(map[string]interface{})
}

To use this function, you can pass a context.Context object as an argument and check the return value for nil to determine whether the JSON body was parsed successfully or not.

Here is an example of how this function can be used:

package main

import (
  "context"
  "fmt"
  "net/http"

  "github.com/myproject/middleware"
)

func main() {
  http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
      parsedBody := middleware.GetParsedJsonBody(r.Context())
      if parsedBody == nil {
          w.Write([]byte("JSON body was not parsed"))
          return
      }
      w.Write([]byte(fmt.Sprintf("

Short and concise. It understood what I wanted it to do, it delivered a solution and even gave an explanation with examples how to use it!

After a bit of hacking I noticed that using map[string]interface{} is kind of a PITA because we have to do many type assertions and checks if keys exist. I got curious if we can tell it to swap out the type into something custom that has a nicer interface:

Me: Instead of using map[string]interface{}, let's use a custom type that has methods for getting the value for a given key, or a default

ChatGPT: To use a custom type that has methods for getting the value for a given key or a default value, you can define a new ParsedJsonBody type that satisfies the context.Context interface. This type can be a struct that contains a map[string]interface{} field to store the parsed JSON body.

Here is an example implementation of the ParsedJsonBody type:
type ParsedJsonBody struct {
  data map[string]interface{}
}

func (p *ParsedJsonBody) Value(key interface{}) interface{} {
  if key == parsedJsonBodyContextKey {
      return p
  }
  return nil
}

func (p *ParsedJsonBody) Get(key string) interface{} {
  return p.data[key]
}

func (p *ParsedJsonBody) GetString(key string) string {
  val := p.data[key]
  if val == nil {
      return ""
  }
  return val.(string)
}

func (p *ParsedJsonBody) GetInt(key string) int {
  val := p.data[key]
  if val == nil {
      return 0
  }
  return val.(int)
}
The ParseJsonBody() middleware function can be updated to use the ParsedJsonBody type instead of map[string]interface{}:

Hmm okay, this is where things get a bit weird, but that could have also been my way of explaining what I wanted.

The code looks ok on first skimming, but it ignored my instruction to have a default. I think it interpreted that as "we'll return a default", but not one I can specify.

There's lots of type assertions in there that don't have any checks, so this will panic if we were to try to GetInt something that isn't an int. Also, you can define a new ParsedJsonBody type that satisfies the context.Context interface is nonsense, and so is the Value() method.

What I actually wanted was a custom type like type MyType map[string]interface{}, not a struct with a type field, but whatever.

Let's ask directly if it thinks anything is wrong with that code:

Me: Thanks for the code, do you see any problems with this implementation?

ChatGPT: There is a potential problem with the provided implementation of the ParsedJsonBody type. The Value() method returns a pointer to the ParsedJsonBody struct, but the Get(), GetString(), and GetInt() methods return values of the underlying data field. This means that if the ParsedJsonBody struct is modified after it is added to the context.Context object, the modifications will be reflected in the context.Context, but not in the return values of the Get(), GetString(), and GetInt() methods.

To fix this problem, the Get(), GetString(), and GetInt() methods should return copies of the values in the data field, rather than the values themselves.

Here is an updated implementation of the ParsedJsonBody type that fixes this problem:
type ParsedJsonBody struct {
  data map[string]interface{}
}

func (p *ParsedJsonBody) Value(key interface{}) interface{} {
  if key == parsedJsonBodyContextKey {
      return p
  }
  return nil
}

func (p *ParsedJsonBody) Get(key string) interface{} {
  val, ok := p.data[key]
  if !ok {
      return nil
  }
  return val
}

func (p *ParsedJsonBody) GetString(key string) string {
  val, ok := p.data[key].(string)
  if !ok {
      return ""
  }
  return val
}

func (p *ParsedJsonBody) GetInt(key string) int {
  val, ok := p.data[key].(int)
  if !ok {
      return 0
  }
  return val
}

The good: It fixed the type assertions and now has a checks, neat!

The bad: The explanation it gave on why it added those doesn't make sense. It didn't actually change those to return copies of values. Let's do one more optimization pass.

Me: Please further optimize this code

ChatGPT: Here are a few ways the provided implementation of the ParsedJsonBody type can be further optimized:

The Value() method currently returns p if the key argument matches parsedJsonBodyContextKey, and nil otherwise. Since parsedJsonBodyContextKey is a global variable that is only used in the ParseJsonBody() middleware function, this check is unnecessary. The Value() method can simply return p without checking the key argument.

The Get(), GetString(), and GetInt() methods currently check if the specified key exists in the data field and return a default value if the key does not exist. This check can be removed, because the data field is a map[string]interface{} and trying to access a non-existent key in a map will return the zero value for the type of the map's value (interface{} in this case, which is nil).

The GetString() and GetInt() methods currently use the ok return value of the type assertion to determine whether the value for the specified key is of the expected type. This check can be removed, because

Me: Please show me an implementation with those points

ChatGPT:
type ParsedJsonBody struct {
  data map[string]interface{}
}

func (p *ParsedJsonBody) Value(key interface{}) interface{} {
  // The key argument is not used, so we can simply return p
  return p
}

func (p *ParsedJsonBody) Get(key string) interface{} {
  // We don't need to check if the key exists, because trying
  // to access a non-existent key in a map will return the zero
  // value for the type of the map's value (nil in this case)
  return p.data[key]
}

func (p *ParsedJsonBody) GetString(key string) string {
  // We don't need to check the type of the value, because trying
  // to assert a non-string value to a string type will return
  // the zero value for the string type ("" in this case)
  return p.data[key].(string)
}

func (p *ParsedJsonBody) GetInt(key string) int {
  // We don't need to check the type of the value, because trying
  // to assert a non-int value to an int type will return the
  // zero value for the int type (0 in this case)
  return p.data[key].(int)
}

It got more compact for sure, but the code is wrong. It got rid of the type checks with the confident explanation that type-asserting a non-int value to an int type will return the zero value for the int type, but this is not correct and will panic. Here's the code in Golang playground to try out: https://play.golang.com/p/8jjBJyrJ1uC

While impressive, at this point it's internal state of my program and what I'm trying to do is a bit too messed up, and the confidently wrong explanation+implementation can be dangerous. For a moment I self-doubted my own knowledge of Golang and had to try this out in the playground to make sure.

The network isn't specifically trained on code, so I guess it just doesn't fully understand the inner-workings of Golang, without providing enough context to it first.

Generating unit tests

Let's do something fun - unit tests (groans)! If it can spit out functional unit tests for stuff I'm too lazy to write, that would already be very helpful. After all, it seems to understand code and is able to give explanations for what's going on.

The code here is a snippet from a JWT token parser. It validates the token, extracts sid/uid scopes (UUID) and checks if those are actually valid UUIDs.

Me: Consider the following Golang code:

// other code here
func (c *Client) validateToken(t jwt.Token) (*Token, error) {
  if err := jwt.Validate(t); err != nil {
      return nil, err
  }

  // parse sid
  sid, ok := t.Get("sid")
  if !ok {
      return nil, domain.ErrTokenInvalidSID
  }
  sidUUID, err := uuid.Parse(sid.(string))
  if err != nil {
      return nil, domain.ErrTokenInvalidSID
  }

  // parse uid
  uid, ok := t.Get("uid")
  if !ok {
      return nil, domain.ErrTokenInvalidUID
  }
  uidUUID, err := uuid.Parse(uid.(string))
  if err != nil {
      return nil, domain.ErrTokenInvalidUID
  }

  return &Token{
      SessionID: sidUUID,
      UserID:    uidUUID,
  }, nil
}
// other code here

Please generate some testcases for when sid or uid are invalid

ChatGPT: Here are some test cases for when the sid or uid values in the token are invalid:

func TestValidateTokenInvalidSID(t *testing.T) {
 // Create a new client
 client := NewClient("some-secret-key")

 // Create a test token
 token := jwt.New()
 token.Set("uid", uuid.New().String())
 // Set the sid value to something invalid
 token.Set("sid", "invalid-sid")
 token.Set(jwt.ExpirationKey, time.Now().Add(time.Minute * 5).Unix())

 // Sign the token
 signedToken, err := jwt.Sign(token, jwt.WithKey(client.alg, client.hmacSecret))
 if err != nil {
     t.Errorf("unexpected error signing token: %s", err)
 }

 // Validate the token
 _, err := client.validateToken(signedToken)
 if err != domain.ErrTokenInvalidSID {
     t.Errorf("expected error to be %s, got %s", domain.ErrTokenInvalidSID, err)
 }
}

func TestValidateTokenInvalidUID(t *testing.T) {
 // Create a new client
 client := NewClient("some-secret-key")

 // Create a test token
 token := jwt.New()
 // Set the uid value to something invalid
 token.Set("uid", "invalid-uid")
 token.Set("sid", uuid.New().String())
 token.Set(jwt.ExpirationKey, time.Now().Add(time.Minute * 5).Unix())

 // Sign the token
 signedToken, err := jwt.Sign(token, jwt.WithKey(client.alg, client.hmacSecret))
 if err != nil {
     t.Errorf("unexpected error signing token: %s", err)
 }

 // Validate the token
 _, err := client.validateToken(signedToken)
 if err != domain.ErrTokenInvalidUID {
     t.Errorf("expected error to be %s, got %s", domain.ErrTokenInvalidUID, err)
 }
}

Sweet as. Sadly it's missing a bit of context here too, so I can't copy-paste the test as is. jwt.Sign() isn't returning a jwt.Token, but the jwt token in signed form, so it can't get passed to validateToken, but there's no way that it could have known that, without feeding the the jwt package into it first. Besides that it looks good, and the tests run after fixing that!

And to go full circle, now that we have more unit tests, let's ask it to optimize my validateToken function as well:

One possible optimization for the validateToken function would be to move the code that parses the sid and uid values into a separate function, and then call that function from validateToken instead of duplicating the code. This would make the code easier to read and maintain, and would also avoid repeating the same error-checking code multiple times.

Here is an example of how the optimized validateToken function could look like:

func (c *Client) validateToken(t jwt.Token) (*Token, error) {
 // Validate the token
 if err := jwt.Validate(t); err != nil {
     return nil, err
 }

 // Parse the sid and uid values from the token
 sid, err := c.parseSID(t)
 if err != nil {
     return nil, err
 }
 uid, err := c.parseUID(t)
 if err != nil {
     return nil, err
 }

 return &Token{
     SessionID: sid,
     UserID:    uid,
 }, nil
}

func (c *Client) parseSID(t jwt.Token) (uuid.UUID, error) {
 // parse sid
 sid, ok := t.Get("sid")
 if !ok {
     return uuid.UUID{}, domain.ErrTokenInvalidSID
 }
 sidUUID, err := uuid.Parse(sid.(string))
 if err != nil {
     return uuid.UUID{}, domain.ErrTokenInvalidSID
 }

 return sidUUID, nil
}

func (c *Client) parseUID(t jwt.Token) (uuid.UUID, error) {
 // parse uid
 uid, ok := t.Get("uid")
 if !ok {
     return uuid.UUID{}, domain.ErrTokenInvalidUID
 }
 uidUUID, err := uuid.Parse(uid.(string))
 if err != nil {
     return uuid.UUID{}, domain.ErrTokenInvalidUID
 }

 return uidUUID, nil
}

I was hoping it'd catch the naked .(string) type assertion, but code splitting is also nice :D

Although this iteration doesn't really avoid repeating the same error-checking code multiple times like it claims.

Collaborative hacking with an AI buddy

I use Copilot quite a lot so I'm already used to having an AI pair-programming buddy, but I strongly believe that as these tools get smarter, they will drastically change the way how we as engineers work.

While not there yet, AI will be able to generate more efficient code faster than us, spot errors we'd miss, or flat-out generate entire applications based on spec input.

When Stable Diffusion came out, some people integrated it into their photo editors to quickly generate content based on a prompt. I am sure we'll see something like this, but for code, in no time. Imagine reading through some code, hitting an "AI" button, then ask it to refactor a routine, or give an explanation what that code does. The plugin will then send all the necessary context to the neural net and ask it to do the task.

The next generation of AI powered plugins could also do things like quickly provide feedback when it strongly believes a mistake has been made, or when it has a high confidence score for a completion / refactoring.

I think that would be really dang cool!