Adventure of Linq and Postgres in Go
How to do SQL in Go code?
Published on 25 March 2020. Posted under category: Development

LINQ (.NET Language-Integrated Query) is a general-purpose query facility available to the .NET language family. It is a cool language feature that lets programmers write SQL-like queries directly in code. LINQ also provides a consistent syntax for querying relational databases and most kinds of collections.
I’m particularly interested in having static type checking when building queries. Go is a statically typed language, and Postgres (the relational database we use at Tokopedia) also has fairly rigid types, so why should we defer type checks all the way to runtime? I’m also looking at the possibility of streamlining a large code base that accesses our database. Does a LINQ implementation need almost 1:1 SQL compatibility before it can be deemed good enough for production code?
The Goal
Go’s sql package has the right idea: query results are a collection of structs (it’s actually called a slice in Go, but I can’t dismiss the feeling that slice sounds like a part, while we’re really talking about a whole here), so convenience functions like struct scan are provided by wrapper packages such as sqlx. My goal is to serve structs directly from query functions. For example, I want to be able to do this:
type Fields struct {
	ID   int64  `db:"id"`
	Name string `db:"title"`
}

data, err := table.Select().
	Where(Name.Like("Leben")).
	Or(Name.Equal("Asa"))
if err != nil {
	return err
}
for _, d := range data {
	fmt.Println(d.Name)
}
That snippet’s syntax is actually quite far from LINQ’s. I’m just clueless about how to translate Go functions into their SQL counterparts.
One of my self-imposed requirements is to avoid the empty interface as a value placeholder and to avoid strings as field identifiers at all costs. I want static typing to do the job of ensuring each supplied (or retrieved) value in Go matches its Postgres counterpart. Having Go’s reflection do the heavy lifting for interfaces, or using strings to pick the correct fields, means more checks and more code. The goal is to code less by leaving the heavy lifting to tools.
Spoiler ahead: the solution does look like that snippet.
Getting the Right Data
My adventure started by figuring out how to link Postgres types to Go types. I did a lot of struct scanning when coding in Go, so I had a hunch that the struct scan function holds some kind of type mapping to ensure Postgres data can be stored correctly in the supplied struct. Thanks to the open source nature of the Go community, this kind of sniffing and reverse engineering is possible (and even encouraged).
I found the definitive clue in the package github.com/lib/pq, the driver library for working with Postgres through Go’s standard sql package. The package actually has another package inside it (oid), which contains computer-generated code (produced with the go generate command) that lists all Postgres types as Go constants. From there, it’s easy to find out how lib/pq actually maps Postgres types to Go types.
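For illustration, here is a minimal sketch of the kind of mapping those oid constants make possible; the pgToGo table below is my own illustrative (and far from exhaustive) example, not something lib/pq exports:

package main

import (
	"fmt"

	"github.com/lib/pq/oid"
)

// pgToGo is an illustrative mapping from Postgres type OIDs to Go types.
var pgToGo = map[oid.Oid]string{
	oid.T_int8:        "int64",
	oid.T_int4:        "int32",
	oid.T_text:        "string",
	oid.T_varchar:     "string",
	oid.T_bool:        "bool",
	oid.T_timestamptz: "time.Time",
}

func main() {
	// Look up the Go type for a Postgres text column.
	fmt.Println(pgToGo[oid.T_text]) // string
}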
With the type-mapping technique under my belt, I moved forward by investigating how to enumerate the field types of a Postgres table. This is important because Tokopedia has a lot of tables and I simply don’t have the strength (nor the will) to write data structs for each of them. Fortunately, the creators of Postgres left a hidden treasure in psql. If you connect to a database using the -E flag, psql will print the actual queries it uses for commands such as \d (yet another reason why people who love GUI database browsers should switch to psql). The queries to enumerate fields are really cryptic, but they can still be deciphered.
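As a hedged sketch, this is roughly how such a catalog query can be run from Go; the query below is a simplified cousin of what psql -E reveals, and the table name "users" is hypothetical:

package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

func main() {
	db, err := sql.Open("postgres", "dbname=example sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// A simplified version of the catalog query psql runs for \d:
	// list every live column of a table along with its type OID.
	rows, err := db.Query(`
		SELECT a.attname, a.atttypid
		FROM pg_catalog.pg_attribute a
		WHERE a.attrelid = $1::regclass
		  AND a.attnum > 0
		  AND NOT a.attisdropped`, "users")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var name string
		var typeOID uint32
		if err := rows.Scan(&name, &typeOID); err != nil {
			log.Fatal(err)
		}
		fmt.Println(name, typeOID)
	}
}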
Those findings are quite useful for making tools, or even for putting together a runtime query library. I put together a little gist to demonstrate my findings so far and created a package called pqprobe to retrieve useful information from a Postgres database. I was pretty occupied with work after this point. After some time, and a humble holiday to Singapore to watch the Final Fantasy Distant Worlds orchestra, the adventure continued.
A Tale of Generics
Before working with Go, I spent about four years doing C++. I am fully aware of the increase in quality of life that comes with switching to Go. While most of my C++ projects could be compiled with short commands, such as make build (or, my favorite, qbs build), the initial pain of making those short commands work was quite time-consuming. I also had my fair share of long compile times due to heavy use of template metaprogramming. Those oft-cited benefits of Go? They are real. For some time, I missed almost nothing from my C++ life.
That is, until I worked on this project. Go is very honest with programmers: template metaprogramming is just fancy text generation, so you should roll your own generator instead of relying on the compiler. This approach, as we’ll find out, actually works quite well, but I can understand why people are demanding an easier route.
My challenge at this point was to figure out how to create various SQL comparison operators that would not accept input of the wrong type. For example, in the previous snippet you’d find expressions such as Name.Equal("Leben"). I want to make sure ridiculous inputs such as Name.Equal(time.Now()) won’t even compile. The problem is, I didn’t know how to change a function argument’s type based on a Postgres field’s type in Go (please enlighten me).
Fortunately, in the previous leg of the journey I had already figured out how to avoid writing data structs for every Tokopedia table by hand. My solution was to write a simple Go source file and replace the interesting identifiers with Go’s template notation. From there, I wrote a simple tool that retrieves field type information from Postgres and slots the data nicely into my template source code. That was actually the first time I had ever written my own template metaprogramming generator. The conflicting feelings of accomplishment, and slight annoyance about having to do it in the first place, were quite pleasing.
So that was my answer. Instead of figuring out how to adapt a function argument’s type to a Postgres field type, I simply turned the function argument’s type into a template parameter and let the generator write that code for me. I put those early prototypes under the example directory inside the pqprobe package, lest someone mistake them for an actual tool.
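To make the idea concrete, here is a minimal sketch of that approach, assuming a hypothetical Filter type; the template below is my own illustration, not pqlinq’s actual template:

package main

import (
	"log"
	"os"
	"text/template"
)

// operatorTmpl is ordinary Go source code with the moving parts
// (field name, column name, Go type) replaced by template notation.
const operatorTmpl = `// Code generated for field {{.Field}}; do not edit.
type {{.Field}}Field struct{}

// Equal only accepts {{.GoType}}, so a mismatched
// input fails at compile time instead of runtime.
func ({{.Field}}Field) Equal(v {{.GoType}}) Filter {
	return Filter{expr: "{{.Column}} = $1", args: []interface{}{v}}
}
`

func main() {
	t := template.Must(template.New("op").Parse(operatorTmpl))
	// In the real generator, this data would come from probing Postgres.
	err := t.Execute(os.Stdout, map[string]string{
		"Field":  "Name",
		"GoType": "string",
		"Column": "title",
	})
	if err != nil {
		log.Fatal(err)
	}
}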
Putting the Pieces Together
I kept talking about the tool, the generator that would make everyone’s life easier; now is finally the time to create it. Like the earlier prototypes, I started with actual, working source code, then turned the moving parts into Go template notation. This let me validate ideas quickly and gave me the entire Go toolchain (linter, compiler, auto-complete, refactoring tools) during development. I skipped unit testing because I wanted to try as many patterns as possible in my limited weekend time.
The solution is divided into three parts: the table model, the (pseudo) LINQ implementation, and the field comparison operators. The table model provides the entry point to the library’s features. The LINQ implementation is actually still just a query builder, but with the added benefit of static typing. The field comparison operators enable filter expressions and are generated on a per-field basis.
The table model works like a contract. You bind it to a database connection and it provides functions that (hopefully) make table access and modification easier. Since it’s a table model, not a database model, it only knows about the table hardcoded into it. I separated the data struct from the type that provides actions on a particular record, to ensure immutability for the data type. Since this is the main feature, pqlinq (the generator) gives it the same name as the package.
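A minimal sketch of that contract idea might look like the following; the names (fields, Table, Bind) are my own illustration, not pqlinq’s actual output:

package fields

import "database/sql"

// Table is the generated model. It knows about exactly one
// table, whose name is hardcoded at generation time.
type Table struct {
	db *sql.DB
}

// Bind attaches a live connection to the model, fulfilling
// the contract; every query goes through this handle.
func Bind(db *sql.DB) Table {
	return Table{db: db}
}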
The LINQ implementation only provides basic SELECT statements at the time of writing. It’s basically just a query builder. All of its logic is hosted in the Query type, but I used an interface trick to limit the possible actions after certain calls (you can’t call table.Select().Where(…).Where(…), for example; well, you can, but at least not naturally). Query has a BuildSQL method, in case you want to check what it will send to Postgres. You can find the generated LINQ code in files ending with _linq.go.
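To illustrate the interface trick (with my own names, not the generated code’s), each step returns a narrower interface so the next step only exposes what should come next:

package main

import "fmt"

type Query struct {
	table  string
	filter string
}

// Whereable is what Select returns: you may filter once, or build.
type Whereable interface {
	Where(filter string) Buildable
	Buildable
}

// Buildable is what Where returns: the only action left is BuildSQL,
// so a second Where won't compile naturally.
type Buildable interface {
	BuildSQL() string
}

func Select(table string) Whereable {
	return Query{table: table}
}

func (q Query) Where(filter string) Buildable {
	q.filter = filter
	return q
}

func (q Query) BuildSQL() string {
	sql := "SELECT * FROM " + q.table
	if q.filter != "" {
		sql += " WHERE " + q.filter
	}
	return sql
}

func main() {
	fmt.Println(Select("fields").Where("title = $1").BuildSQL())
	// Select("fields").Where("a").Where("b") // does not compile
}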
I didn’t put SQL UPDATE or INSERT in the LINQ implementation because I think such expressions are concise enough to live inside the table model (I’m planning the same for DELETE). This does force people to select the target records before they can call UPDATE, which means extra network calls. But I just haven’t found batch UPDATEs occurring frequently enough in production code, and even when the need arises it sounds like a task for a tool like psql, not for some server-side application. This is just my opinion, obviously; I’m open to suggestions should anyone find LINQ on transactions useful.
The final piece is the filter expression builder. This piece contains comparison operators with the correct type for each field. Since Go doesn’t allow operator overloading, I opted to provide long-named operators such as Equal, NotEqual, and GreaterThan, along with their short-named counterparts (Eq, Neq, Gt). They create filter expression objects that can be chained using And or Or. I could only fit the basic operators in at the time of writing. Postgres has a lot of operators, and some of them only work for certain types, so there is a lot to improve here.
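Here is a hedged sketch of how such generated operators and chaining could fit together; Filter, nameField, and the placeholder handling are illustrative assumptions, not the generated code itself:

package main

import "fmt"

// Filter pairs a SQL fragment with its arguments. Real code would
// renumber the $n placeholders when fragments are combined.
type Filter struct {
	expr string
	args []interface{}
}

func (f Filter) And(g Filter) Filter {
	return Filter{"(" + f.expr + " AND " + g.expr + ")", append(f.args, g.args...)}
}

func (f Filter) Or(g Filter) Filter {
	return Filter{"(" + f.expr + " OR " + g.expr + ")", append(f.args, g.args...)}
}

// nameField is generated per field; its operators only accept string,
// so the empty interface never appears at the call site.
type nameField struct{}

func (nameField) Equal(v string) Filter {
	return Filter{"title = $1", []interface{}{v}}
}

func (nameField) Like(v string) Filter {
	return Filter{"title LIKE $1", []interface{}{v}}
}

var Name nameField

func main() {
	f := Name.Like("Leben").Or(Name.Equal("Asa"))
	fmt.Println(f.expr, f.args) // (title LIKE $1 OR title = $1) [Leben Asa]
}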
Speaking of improvements, there are several challenges ahead. I’m particularly wary of how to implement SQL JOINs within this lexical framework. I’m thinking about composing two or more data structs, but I’m still not quite sure how that would behave. Another interesting problem to tackle is how to handle nullable fields. Coalescing to the zero value might help, but some people might want the nilable variant of the type instead, so some kind of generator switch must be added. A test code generator would also be handy; otherwise the codebase will lose a lot of the confidence that unit tests provide.
Conclusion
This little adventure managed to produce a proof of concept of a LINQ implementation on top of a Postgres database. I’m aware of other LINQ implementations that achieved a form much closer to the actual LINQ than mine, such as this excellent go-linq package, but at least I managed to reach the form I envisioned. I’m really curious about how .NET’s LINQ to SQL actually works. If you happen to know, please inform me.
The remaining challenge (arguably the hardest) is to put a consistent amount of time into pqprobe and pqlinq to ensure they mature properly as tools, so they can make a compelling case for use in production code.
So, what do you think about working with a Postgres database using LINQ in Go? Do you have any neat ideas? Leave a comment so we can talk about it.
This post was first published on Medium on 20 November 2017.