yangbongsoo

WebClient Encoding Duplication Issue

WebClient's uriBuilder encodes query string values itself. If you pass it a value that is already encoded, it encodes that value again and the request goes out doubly encoded. In other words, there is no need to encode in advance.

// Built this way, the final request is sent doubly encoded.
this.webClient
  .get()
  .uri(uriBuilder -> uriBuilder
    .path("/api")
    // .query("name=양봉수")  // the raw Korean value; uriBuilder would encode this itself
    .query("name=%EC%96%91%EB%B4%89%EC%88%98")  // already encoded, so it is encoded a second time
    .build()
  )
  .retrieve()
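To make the effect concrete, here is a minimal JDK-only sketch of what double encoding does to the value. It uses `URLEncoder` as a stand-in for the encoding step (an assumption: WebClient's own encoder is not called here, and the class name `DoubleEncodingDemo` is hypothetical). The input is the Korean name 양봉수, written with Unicode escapes.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class DoubleEncodingDemo {

    // Percent-encodes a value once as UTF-8.
    // Assumption: URLEncoder stands in for the encoding that uriBuilder applies.
    static String encodeOnce(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "\uC591\uBD09\uC218";  // the Korean name from the example above
        String once = encodeOnce(raw);      // what the caller encoded in advance
        String twice = encodeOnce(once);    // what results when it is encoded again

        System.out.println(once);   // %EC%96%91%EB%B4%89%EC%88%98
        System.out.println(twice);  // %25EC%2596%2591%25EB%25B4%2589%25EC%2588%2598
    }
}
```

The second pass turns every `%` into `%25`, which is why the receiving side no longer sees the original value after a single decode.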

However, if the query string value is JSON, it must be encoded in advance; otherwise an error occurs.

Putting JSON in a query string may seem odd, but this was a rewrite of an existing system and backward compatibility had to be preserved.

At first I thought encoding was unnecessary because I knew uriBuilder encodes internally. The error, however, is thrown before that point, in the expand step of this.uriComponentsBuilder.build().expand(uriVars).

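public class DefaultUriBuilderFactory implements UriBuilderFactory {

    // Excerpt only. Per the stack trace below, build() belongs to the nested DefaultUriBuilder class.
    @Override
    public URI build(Object... uriVars) {
        ...

        // The exception is thrown here, while template variables are expanded.
        UriComponents uric = this.uriComponentsBuilder.build().expand(uriVars);

        ...
    }
}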

The cause was URI template processing. The curly braces in the JSON were recognized as a URI template variable, so the expand step looked for a value to substitute for a variable named "name" (quotes included). No such value had been supplied, so it threw.

In other words, the expected template has the form /test?key={name}, with a value passed for name. The JSON looks like a template variable but comes with no matching value, so expansion fails.
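The following is a simplified, JDK-only approximation of how template variables might be detected, to show why JSON in a query string is mistaken for one. The regex and the rule that cuts the name at the first colon are assumptions modeled on the behavior described here, not Spring's actual source, and the class name `TemplateVariableSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateVariableSketch {

    // Assumption: any text between '{' and '}' is treated as a template variable.
    private static final Pattern VARIABLE = Pattern.compile("\\{([^/]+?)\\}");

    // Returns the variable names that an expand step would try to resolve.
    static List<String> variableNames(String template) {
        List<String> names = new ArrayList<>();
        Matcher matcher = VARIABLE.matcher(template);
        while (matcher.find()) {
            String match = matcher.group(1);
            // Assumption: text after the first colon is not part of the variable name.
            int colon = match.indexOf(':');
            names.add(colon != -1 ? match.substring(0, colon) : match);
        }
        return names;
    }

    public static void main(String[] args) {
        // A real template: one variable called name.
        System.out.println(variableNames("/test?key={name}"));
        // Raw JSON: the braces look like a variable whose name is "name" with the quotes.
        System.out.println(variableNames("/test?jsonParam={\"name\":\"ybs\"}"));
        // Pre-encoded JSON: no braces left, so nothing to expand.
        System.out.println(variableNames("/test?jsonParam=%7B%22name%22%3A%22ybs%22%7D"));
    }
}
```

Under these assumptions the raw JSON yields a variable named `"name"`, which lines up with the quoted name in the error message, while the pre-encoded form yields no variables at all.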

The error message is as follows.

msg: Not enough variable values available to expand '"name"'
java.lang.IllegalArgumentException: Not enough variable values available to expand '"name"'

at org.springframework.web.util.UriComponents$VarArgsTemplateVariables.getValue(UriComponents.java:370)
at org.springframework.web.util.HierarchicalUriComponents$QueryUriTemplateVariables.getValue(HierarchicalUriComponents.java:1093)
at org.springframework.web.util.UriComponents.expandUriComponent(UriComponents.java:263)
at org.springframework.web.util.HierarchicalUriComponents.lambda$expandQueryParams$5(HierarchicalUriComponents.java:456)
at java.base/java.util.Map.forEach(Map.java:713)
at org.springframework.web.util.HierarchicalUriComponents.expandQueryParams(HierarchicalUriComponents.java:452)
at org.springframework.web.util.HierarchicalUriComponents.expandInternal(HierarchicalUriComponents.java:441)
at org.springframework.web.util.HierarchicalUriComponents.expandInternal(HierarchicalUriComponents.java:53)
at org.springframework.web.util.UriComponents.expand(UriComponents.java:172)
at org.springframework.web.util.DefaultUriBuilderFactory$DefaultUriBuilder.build(DefaultUriBuilderFactory.java:403)
at org.springframework.web.util.DefaultUriBuilderFactory.expand(DefaultUriBuilderFactory.java:154)
at org.springframework.web.reactive.function.client.DefaultWebClient$DefaultRequestBodyUriSpec.uri(DefaultWebClient.java:194)

To work around this, I encoded the JSON in advance before handing it to WebClient. WebClient then encoded it again, so the request went out doubly encoded and the receiving server had to decode twice.

It makes no difference whether the path and query string are combined into one string and passed to uri(), or built with uriBuilder: both hit the same error unless the JSON is encoded in advance.
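As a rough illustration of that round trip, the sketch below encodes the JSON twice and decodes it twice using only the JDK. `URLEncoder` stands in for both the caller's encoding and the encoding WebClient adds, which is an assumption; the class and method names are hypothetical.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class DoubleDecodeSketch {

    static String encode(String value) {
        // URLEncoder writes spaces as '+'; convert them to %20 to match the values in this post.
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    static String decode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String json = "{\n  \"name\": \"ybs\"\n}";

        String preEncoded = encode(json);        // done by the caller in advance
        String onTheWire = encode(preEncoded);   // stand-in for the second encoding by the client

        String decodedOnce = decode(onTheWire);  // still percent-encoded
        String decodedTwice = decode(decodedOnce);

        System.out.println(preEncoded);
        System.out.println(decodedOnce.equals(preEncoded));
        System.out.println(decodedTwice.equals(json));
    }
}
```

A single decode only recovers the pre-encoded string, so a server that decodes once would receive percent-encoded text rather than JSON.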

// path is /test?jsonParam=%7B%0A%20%20%22name%22%3A%20%22ybs%22%0A%7D
// jsonParam is the JSON below, encoded once:
// {
//   "name": "ybs"
// }
this.webClient
  .post()
  .uri(path)

// path is /test
// query is %7B%0A%20%20%22name%22%3A%20%22ybs%22%0A%7D
this.webClient
  .post()
  .uri(uriBuilder -> {
    return uriBuilder
      .path(path)
      .query(query)
      .build();
  })

Looking for other solutions, I found that building the URI as shown below bypasses the problematic expand logic, so no error occurs.

With this approach, however, the encoded JSON comes out slightly different.

// Method 1: build the URI with UriComponentsBuilder inside the lambda.
// The uriBuilder argument is not used.
this.webClient
  .post()
  .uri(uriBuilder -> {
    URI uri = UriComponentsBuilder.fromUriString(path)
      .queryParam("query", query)
      .build()
      .toUri();
    return uri;
  })

// Method 2: skip the uriBuilder lambda and pass a URI object directly.
URI uri = UriComponentsBuilder.fromUriString(path)
  .queryParam("query", query)
  .build()
  .toUri();

return this.webClient
  .post()
  .uri(uri)

In the query string of the resulting URI object, the colon (:) is not encoded. URL-encoding a colon normally yields %3A, but when the encoding is done internally by the URI, the colon is left as is.

1. %7B%0A%20%20%22name%22:%20%22ybs%22%0A%7D (The result encoded internally by the URI)
2. %7B%0A%20%20%22name%22%3A%20%22ybs%22%0A%7D (The result of encoding in advance)
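The two results can be reproduced without Spring. The sketch below is an assumption-laden stand-in: it uses the multi-argument `java.net.URI` constructor, which quotes illegal characters itself, in place of the URI built through UriComponentsBuilder, and `URLEncoder` for the encoding done in advance. The host, port, and class name are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class ColonEncodingSketch {

    // Lets java.net.URI quote the query; the colon is legal there, so it is left alone.
    static String encodedByUri(String json) {
        try {
            // Hypothetical host and port, used only to form a complete URI.
            URI uri = new URI("http", null, "localhost", 8080, "/test", "query=" + json, null);
            return uri.getRawQuery().substring("query=".length());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Encodes the value in advance; every non-alphanumeric character, including ':', is escaped.
    static String encodedInAdvance(String json) {
        return URLEncoder.encode(json, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        String json = "{\n  \"name\": \"ybs\"\n}";
        System.out.println(encodedByUri(json));      // %7B%0A%20%20%22name%22:%20%22ybs%22%0A%7D
        System.out.println(encodedInAdvance(json));  // %7B%0A%20%20%22name%22%3A%20%22ybs%22%0A%7D
    }
}
```

Both strings decode to the same JSON; they differ only in whether the colon was escaped.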

RFC 3986, the Uniform Resource Identifier (URI) specification, permits a colon (:) as a character within the query component.

query       = *( pchar / "/" / "?" )

pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
            / "*" / "+" / "," / ";" / "="
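The grammar above can be turned into a small character check. This is only a sketch of the query rule for single characters (the class and method names are hypothetical); a percent-encoded triplet is three characters, so a lone `%` is deliberately not accepted here.

```java
class QueryCharSketch {

    private static final String UNRESERVED_MARKS = "-._~";
    private static final String SUB_DELIMS = "!$&'()*+,;=";

    // True if the character may appear literally in a query per the RFC 3986 grammar:
    // query = *( pchar / "/" / "?" ), ignoring pct-encoded triplets.
    static boolean isAllowedInQuery(char c) {
        boolean alpha = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
        boolean digit = c >= '0' && c <= '9';
        boolean unreserved = alpha || digit || UNRESERVED_MARKS.indexOf(c) >= 0;
        boolean pchar = unreserved || SUB_DELIMS.indexOf(c) >= 0 || c == ':' || c == '@';
        return pchar || c == '/' || c == '?';
    }

    public static void main(String[] args) {
        System.out.println(isAllowedInQuery(':'));   // true
        System.out.println(isAllowedInQuery('{'));   // false
        System.out.println(isAllowedInQuery('"'));   // false
        System.out.println(isAllowedInQuery(' '));   // false
    }
}
```

Applied to the JSON value, the braces, quotes, spaces, and newlines must be percent-encoded, while the colon may stay as it is.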

Whether a character is permitted in the query and whether it needs URL encoding (percent-encoding, in the RFC's terms) are separate questions. The reserved characters defined below are the set of characters that, as part of the URI syntax, may act as delimiters separating different pieces of data within the URI.

reserved    = gen-delims / sub-delims

gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
            / "*" / "+" / "," / ";" / "="

RFC 3986 states that data corresponding to reserved characters should be percent-encoded, unless those characters are specifically allowed to represent data in that component. It adds that a reserved character with no known delimiting role in a component must be interpreted as plain data.

How a given component is interpreted can vary, but my understanding is that leaving the colon (:) unencoded in the query is acceptable and causes no problems.

https://www.rfc-editor.org/rfc/rfc3986#section-2.2

URI producing applications should percent-encode data octets that
correspond to characters in the reserved set unless these characters
are specifically allowed by the URI scheme to represent data in that
component.  If a reserved character is found in a URI component and
no delimiting role is known for that character, then it must be
interpreted as representing the data octet corresponding to that
character's encoding in US-ASCII.
