The interesting architecture of crt.sh

A while back I wrote myself a little dashboard for monitoring TLS certificates for my domains. Right now it works by talking to https://crt.sh/. Sometimes this works great, but sometimes crt.sh is really slow. Plus, it’s another thing that could be compromised.

So, I started looking at how crt.sh works. It’s kinda cool.

There are only 3 separate processes:

Cron
- ct_monitor is a program that uses libcurl to get CT log changes and libpq to put them into the database.
PostgreSQL
- certwatch_db is the core web application, written in PL/pgSQL. It even includes the HTML templating and query parameter handling. Of course, there are a couple of things not entirely done in pgSQL…
- libx509pq adds a set of x509_* functions callable from pgSQL for parsing X509 certificates.
- libcablintpq adds the cablint_embedded(bytea) function to pgSQL.
- libx509lintpq adds the x509lint_embedded(bytea,integer) function to pgSQL.
Apache HTTPD
- mod_certwatch is a pretty thin wrapper that turns every HTTP request into an SQL statement sent to PostgreSQL, via…
- mod_pgconn, which manages PostgreSQL connections.
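
To make the Apache-to-PostgreSQL hand-off a bit more concrete, here is a minimal Python sketch of the general idea: flatten the request's query parameters into two arrays and hand them to a single database function that returns the whole response. This is not the real mod_certwatch (which is an Apache module, not Python); `request_to_sql` and the SQL function name `handle_request` are hypothetical names made up for illustration, and the actual function crt.sh calls may look quite different.

```python
from urllib.parse import urlsplit, parse_qsl


def request_to_sql(url):
    """Turn a request URL into (sql, params) for a parameterized query.

    `handle_request` is a hypothetical PL/pgSQL function that would
    build the entire response body (HTML, Atom, or JSON) as text.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    names = [name for name, _ in pairs]
    values = [value for _, value in pairs]
    # Placeholders keep user-supplied input out of the SQL text itself;
    # the database driver is expected to bind the three parameters.
    sql = "SELECT handle_request(%s, %s::text[], %s::text[])"
    return sql, (parts.path, names, values)


sql, params = request_to_sql("https://crt.sh/?q=example.com&output=json")
print(sql)
print(params)
```

The point of the sketch is how little the web tier has to know: it never inspects the parameters, it just forwards them, and everything that decides what the page looks like lives in the database.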

The interface exposes HTML, Atom, and JSON, all generated by code written in SQL.
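
For a dashboard like mine, the JSON output is the useful one. Below is a rough Python sketch of consuming it: build the query URL, then reduce a response to the latest expiry date per name. It assumes the `q` and `output=json` query parameters and the response fields `name_value` and `not_after`; check those against a live response before relying on them. The helper names and the sample data are invented for the example, and no network request is made.

```python
import json
from urllib.parse import urlencode


def crtsh_url(domain):
    """Build a crt.sh search URL that asks for JSON output."""
    return "https://crt.sh/?" + urlencode({"q": domain, "output": "json"})


def expiry_by_name(body):
    """Map each certificate name to the latest not_after value seen."""
    latest = {}
    for entry in json.loads(body):
        # Assumption: name_value can hold several names, one per line.
        for name in entry["name_value"].split("\n"):
            # ISO 8601 timestamps in one format sort correctly as strings.
            if name not in latest or entry["not_after"] > latest[name]:
                latest[name] = entry["not_after"]
    return latest


# Made-up sample shaped like the assumed response format.
SAMPLE = json.dumps([
    {"name_value": "example.com\nwww.example.com",
     "not_after": "2024-01-01T00:00:00"},
    {"name_value": "example.com",
     "not_after": "2024-06-01T00:00:00"},
])

print(crtsh_url("example.com"))
print(expiry_by_name(SAMPLE))
```

In a real dashboard the body would come from fetching `crtsh_url(domain)` with a generous timeout, since slow responses are the whole reason I started poking at this.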

And then I guess it’s behind an nginx-based load-balancer or somesuch (based on the 504 Gateway Timeout messages it’s given me). But that’s not interesting.

The actual website is run from a read-only slave of the master DB that the ct_monitor cron job updates, which makes several security considerations go away and makes horizontal scaling easy.

Anyway, I thought it was neat that so much of it runs inside the database; you don’t see that terribly often. I also thought the little shims to make that possible were neat. I didn’t get deep enough into it to end up running my own instance or clone, but I thought my notes on it were worth sharing.