The interesting architecture of crt.sh
======================================
---
date: "2018-02-09"
---

A while back I wrote myself a little dashboard for monitoring TLS
certificates for my domains.  Right now it works by talking to crt.sh.
Sometimes this works great, but sometimes crt.sh is really slow.  Plus,
it's another thing that could be compromised.

So, I started looking at how crt.sh works.  It's kinda cool.

There are only 3 separate processes:

 - Cron
   - [`ct_monitor`](https://github.com/crtsh/ct_monitor) is a program
     that uses libcurl to get CT log changes and libpq to put them
     into the database.
 - PostgreSQL
   - [`certwatch_db`](https://github.com/crtsh/certwatch_db) is the
     core web application, written in PL/pgSQL.  It even includes the
     HTML templating and query parameter handling.  Of course, there
     are a couple of things not entirely done in pgSQL...
     - [`libx509pq`](https://github.com/crtsh/libx509pq) adds a set of
       `x509_*` functions callable from pgSQL for parsing X509
       certificates.
     - [`libcablintpq`](https://github.com/crtsh/libcablintpq) adds
       the `cablint_embedded(bytea)` function to pgSQL.
     - [`libx509lintpq`](https://github.com/crtsh/libx509lintpq) adds
       the `x509lint_embedded(bytea,integer)` function to pgSQL.
 - Apache HTTPD
   - [`mod_certwatch`](https://github.com/crtsh/mod_certwatch) is a
     pretty thin wrapper that turns every HTTP request into an SQL
     statement sent to PostgreSQL, via...
     - [`mod_pgconn`](https://github.com/crtsh/mod_pgconn), which
       manages PostgreSQL connections.

The interface exposes HTML, ATOM, and JSON.  All from code written in
SQL.

And then I guess it's behind an nginx-based load-balancer or somesuch
(based on the 504 Gateway Timeout messages it's given me).  But that's
not interesting.

The actual website is [run from a read-only slave][slave-post] of the
master DB that the `ct_monitor` cron-job updates; which makes several
security considerations go away, and makes horizontal scaling easy.

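
To make the "thin wrapper" idea concrete, here is a rough Python sketch of
what turning an HTTP request into a single SQL statement could look like.
This is not the actual `mod_certwatch` code (which is an Apache module); the
SQL entry point `web_apis`, its two-array signature, and the helper name
`request_to_sql` are all assumptions of mine for illustration.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumed entry point: a single SQL function taking parallel arrays of
# parameter names and values.  The real name and signature in certwatch_db
# may differ.
SQL_ENTRY_POINT = "SELECT web_apis(%s, %s)"


def request_to_sql(url: str):
    """Map an HTTP request URL to one parameterized SQL statement.

    Returns (statement, (names, values)), where names and values are
    parallel lists of the query parameters, in request order.
    """
    query = urlsplit(url).query
    # keep_blank_values=True so that "?foo=" still reaches the database,
    # leaving all parameter handling to the SQL side.
    pairs = parse_qsl(query, keep_blank_values=True)
    names = [name for name, _ in pairs]
    values = [value for _, value in pairs]
    return SQL_ENTRY_POINT, (names, values)


if __name__ == "__main__":
    # Example URL only; no request is actually made.
    statement, params = request_to_sql(
        "https://crt.sh/?q=example.com&output=json"
    )
    print(statement)
    print(params)
```

The point is that the wrapper does no routing or templating of its own; it
just hands the raw parameters over, and whatever the database function
returns becomes the HTTP response body.  A PostgreSQL driver would then run
the statement with those two lists bound as array parameters.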

[slave-post]: https://groups.google.com/d/msg/mozilla.dev.security.policy/EPv_u9V06n0/gPJY5T7ILlQJ

Anyway, I thought it was neat that so much of it runs inside the
database; you don't see that terribly often.  I also thought the
little shims to make that possible were neat.  I didn't get deep
enough into it to end up running my own instance or clone, but I
thought my notes on it were worth sharing.