Sunday, February 8, 2015

Go Case Sensitivity With JSON Decoding

TL;DR

  • The order of fields in JSON matters when decoding to a Go struct: {"id": 1, "ID": 124452} will map the id as 124452 for a struct with only ID int `json:"id"` defined.
  • Explicitly defining the case of the field in the json tag will not necessarily help

Details

About two months before this post was published, one of my coworkers changed some development code, which resulted in an interesting break to a monitoring component we have for our data ingest pipeline. The monitoring component is written in Go and sends an event through the data pipeline at specified intervals. Each component in the data pipeline then sends back a message acknowledging that it received the event, along with some other information.

What my coworker changed was the position of a piece of transformation code that appends an ID to the JSON event we receive (if one does not exist), so that it ran before the monitoring code kicked off. The net effect was that the monitoring event's JSON gained an ID field before the monitoring component saw it, which broke our monitoring component! We fixed this particular issue by having the monitoring component send out the proper id field, but I was curious why this broke our monitoring component.

I ended up playing around with different forms of the JSON that was being sent and how each one decoded into structs with variations of the JSON fields. The experiment and its output are available at the Go Playground link.
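The kind of experiment described above can be sketched roughly as follows. This is not the original Playground program; the Event type, the decodeID helper, and the list of inputs are names and values chosen here for illustration, assuming the struct has a single field tagged with lowercase "id".

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a hypothetical stand-in for the monitoring struct:
// one field, explicitly tagged with the lowercase key "id".
type Event struct {
	ID int `json:"id"`
}

// decodeID unmarshals raw into an Event and returns the ID it ended up with.
func decodeID(raw string) (int, error) {
	var e Event
	if err := json.Unmarshal([]byte(raw), &e); err != nil {
		return 0, err
	}
	return e.ID, nil
}

func main() {
	// The same two keys in both orders, plus each key on its own.
	inputs := []string{
		`{"id": 1}`,
		`{"ID": 124452}`,
		`{"id": 1, "ID": 124452}`,
		`{"ID": 124452, "id": 1}`,
	}
	for _, in := range inputs {
		id, err := decodeID(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-26s -> ID = %d\n", in, id)
	}
}
```

With the standard encoding/json package, the two single-key inputs decode to 1 and 124452, and the two-key inputs decode to whichever value appears last in the document.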

What does this demonstrate? It shows that the ordering of fields with the same name (but different cases) can alter what you expect to receive when unmarshalling JSON. Receiving JSON that looks like {"id": 1, "ID": 124452} can materially affect your program if you are only unmarshalling to a struct with an ID field defined like this:
ID int `json:"id"`
Rather than getting 1 you will get 124452 - not exactly what you would expect when you explicitly set the JSON tag to "id" not "ID"!

After tracing through the encoding/json package, it turned out that Go's behavior when unmarshalling JSON into structs is to use the bytes package's EqualFold function when it doesn't find an exact field name match. You can see that in the encoding/json package's decode.go file, in the object method on the decodeState struct. For every JSON object, this object method iterates through the keys and assigns the data to the appropriate struct field. If there is an exact match for a field name, then that field is what the data is assigned to; however, if there is not an exact match, then the EqualFold function is employed, which is case insensitive. Both "id" and "ID" therefore resolve to the same struct field, and since each key is assigned as it is encountered, the last one wins. This means that the order in which fields are encountered in the JSON matters, as I found in my experiment.
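To make that lookup concrete, here is a simplified model of it. It is not the encoding/json source; findField, decodeInOrder, and the pair type are hypothetical names invented for this sketch, and the only standard library call it relies on is bytes.EqualFold.

```go
package main

import (
	"bytes"
	"fmt"
)

// pair is a hypothetical key/value entry, listed in JSON document order.
type pair struct {
	key string
	val int
}

// findField models the lookup described above: prefer an exact,
// case-sensitive name match, otherwise fall back to a case-insensitive one.
// It returns the index of the matching field name, or -1 if none matches.
func findField(fieldNames []string, key []byte) int {
	for i, name := range fieldNames {
		if name == string(key) {
			return i
		}
	}
	for i, name := range fieldNames {
		if bytes.EqualFold([]byte(name), key) {
			return i
		}
	}
	return -1
}

// decodeInOrder assigns each value to the field its key resolves to,
// in document order, so a later key overwrites an earlier one.
func decodeInOrder(fieldNames []string, pairs []pair) []int {
	out := make([]int, len(fieldNames))
	for _, p := range pairs {
		if i := findField(fieldNames, []byte(p.key)); i >= 0 {
			out[i] = p.val
		}
	}
	return out
}

func main() {
	fields := []string{"id"} // the name taken from the json tag

	fmt.Println(decodeInOrder(fields, []pair{{"id", 1}, {"ID", 124452}})) // [124452]
	fmt.Println(decodeInOrder(fields, []pair{{"ID", 124452}, {"id", 1}})) // [1]
}
```

Since the struct has no separate field named "ID", the exact-match preference never gets a chance to keep the two keys apart: both land on the same field.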

At this time I am not really sure if this would be considered a bug or not. At minimum, I think it's definitely an edge case but one that could cause serious problems.

Update: Here is the link to the Google Group post I made on this.