The interesting architecture of crt.sh
A while back I wrote myself a little dashboard for monitoring TLS certificates for my domains. Right now it works by talking to https://crt.sh/. Sometimes this works great, but sometimes crt.sh is really slow. Plus, it’s another thing that could be compromised.
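For context, here is a minimal sketch of the kind of check such a dashboard might do. It is not my actual dashboard code. It assumes crt.sh's JSON output (`output=json` on a search URL) and that each returned entry carries a `not_after` timestamp in ISO format; the helper names `crtsh_json_url` and `expiring_soon` are made up for illustration.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode


def crtsh_json_url(domain):
    """Build a crt.sh search URL that asks for JSON instead of HTML."""
    return "https://crt.sh/?" + urlencode({"q": domain, "output": "json"})


def expiring_soon(entries, now, days=30):
    """Return the entries whose not_after falls within `days` of `now`.

    `entries` is assumed to be the decoded JSON list from crt.sh, where
    each item has a "not_after" field such as "2024-01-15T00:00:00".
    """
    cutoff = now + timedelta(days=days)
    soon = []
    for entry in entries:
        not_after = datetime.fromisoformat(entry["not_after"])
        if now <= not_after <= cutoff:
            soon.append(entry)
    return soon
```

Fetching the URL is left out on purpose; that network round-trip is exactly the slow part.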
So, I started looking at how crt.sh works. It’s kinda cool.
There are only 3 separate processes:
- Cron
  - ct_monitor is a program that uses libcurl to get CT log changes and libpq to put them into the database.
- PostgreSQL
  - certwatch_db is the core web application, written in PL/pgSQL. It even includes the HTML templating and query parameter handling. Of course, there are a couple of things not entirely done in pgSQL…
    - libx509pq adds a set of x509_* functions callable from pgSQL for parsing X509 certificates.
    - libcablintpq adds the cablint_embedded(bytea) function to pgSQL.
    - libx509lintpq adds the x509lint_embedded(bytea,integer) function to pgSQL.
- Apache HTTPD
  - mod_certwatch is a pretty thin wrapper that turns every HTTP request into an SQL statement sent to PostgreSQL, via…
    - mod_pgconn, which manages PostgreSQL connections.
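To make "get CT log changes" a bit more concrete: CT logs expose the RFC 6962 HTTP endpoints `get-sth` (current tree size) and `get-entries` (a range of entries). The sketch below only shows how a poller could work out which ranges to fetch since its last run. I have not checked that ct_monitor does it this way; the function names and the batch size are my own assumptions.

```python
def get_sth_url(log_url):
    """URL of the log's signed tree head, which reports the current tree_size."""
    return log_url.rstrip("/") + "/ct/v1/get-sth"


def get_entries_urls(log_url, last_seen_size, tree_size, batch=256):
    """Yield get-entries URLs covering entries last_seen_size .. tree_size - 1.

    `last_seen_size` is the tree size recorded on the previous run, so it
    is also the index of the first entry not yet fetched.
    """
    base = log_url.rstrip("/") + "/ct/v1/get-entries"
    start = last_seen_size
    while start < tree_size:
        # RFC 6962 treats `end` as inclusive, hence the -1.
        end = min(start + batch, tree_size) - 1
        yield f"{base}?start={start}&end={end}"
        start = end + 1
```

A real client also has to cope with a log returning fewer entries than requested, and would then insert the parsed entries into the database (via libpq, in ct_monitor's case).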
The interface exposes HTML, ATOM, and JSON. All from code written in SQL.
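The "thin wrapper" idea can be sketched in a few lines. This is an illustration of the pattern, not a port of mod_certwatch: `web_handler` is a hypothetical stored-function name, and the `%s` placeholders assume a Python DB-API driver such as psycopg that adapts lists to PostgreSQL arrays.

```python
from urllib.parse import parse_qsl


def request_to_sql(query_string):
    """Turn an HTTP query string into one parameterized SQL call.

    All real work (routing, templating, choosing HTML vs. ATOM vs. JSON)
    is assumed to happen inside the database function.
    """
    pairs = parse_qsl(query_string, keep_blank_values=True)
    names = [name for name, _ in pairs]
    values = [value for _, value in pairs]
    # web_handler is a made-up name standing in for the PL/pgSQL entry point.
    sql = "SELECT web_handler(%s::text[], %s::text[])"
    return sql, (names, values)
```

Passing the parameter names and values as bound arrays, rather than splicing them into the SQL text, keeps the wrapper from becoming an injection vector.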
And then I guess it’s behind an nginx-based load-balancer or somesuch (based on the 504 Gateway Timeout messages it’s given me). But that’s not interesting.
The actual website is run from a read-only slave of the master DB that the ct_monitor cron-job updates, which makes several security considerations go away and makes horizontal scaling easy.
Anyway, I thought it was neat that so much of it runs inside the database; you don’t see that terribly often. I also thought the little shims to make that possible were neat. I didn’t get deep enough into it to end up running my own instance or clone, but I thought my notes on it were worth sharing.