From 8c11dbdaacd8ec0b75d9110fe326955f01c45bee Mon Sep 17 00:00:00 2001
From: Luke Shumaker
Date: Fri, 9 Feb 2018 21:49:47 -0500
Subject: add crt-sh-architecture article

---
 public/crt-sh-architecture.md | 56 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)
 create mode 100644 public/crt-sh-architecture.md

diff --git a/public/crt-sh-architecture.md b/public/crt-sh-architecture.md
new file mode 100644
index 0000000..d518d2f
--- /dev/null
+++ b/public/crt-sh-architecture.md
@@ -0,0 +1,56 @@
+The interesting architecture of crt.sh
+======================================
+---
+date: "2018-02-09"
+---
+
+A while back I wrote myself a little dashboard for monitoring TLS
+certificates for my domains. Right now it works by talking to
+crt.sh. Sometimes this works great, but sometimes crt.sh
+is really slow. Plus, it's another thing that could be compromised.
+
+So, I started looking at how crt.sh works. It's kinda cool.
+
+There are only 3 separate processes:
+
+ - Cron
+   - [`ct_monitor`](https://github.com/crtsh/ct_monitor) is a program
+     that uses libcurl to get CT log changes and libpq to put them
+     into the database.
+ - PostgreSQL
+   - [`certwatch_db`](https://github.com/crtsh/certwatch_db) is the
+     core web application, written in PL/pgSQL. It even includes the
+     HTML templating and query parameter handling. Of course, there
+     are a couple of things not entirely done in pgSQL...
+   - [`libx509pq`](https://github.com/crtsh/libx509pq) adds a set of
+     `x509_*` functions callable from pgSQL for parsing X.509
+     certificates.
+   - [`libcablintpq`](https://github.com/crtsh/libcablintpq) adds the
+     `cablint_embedded(bytea)` function to pgSQL.
+   - [`libx509lintpq`](https://github.com/crtsh/libx509lintpq) adds the
+     `x509lint_embedded(bytea,integer)` function to pgSQL.
+ - Apache HTTPD
+   - [`mod_certwatch`](https://github.com/crtsh/mod_certwatch) is a
+     pretty thin wrapper that turns every HTTP request into an SQL
+     statement sent to PostgreSQL, via...
+   - [`mod_pgconn`](https://github.com/crtsh/mod_pgconn), which
+     manages PostgreSQL connections.
+
+The interface exposes HTML, Atom, and JSON. All from code written in
+SQL.
+
+And then I guess it's behind an nginx-based load-balancer or somesuch
+(based on the 504 Gateway Timeout messages it's given me). But that's
+not interesting.
+
+The actual website is [run from a read-only slave][slave-post] of the
+master DB that the `ct_monitor` cron-job updates, which makes several
+security considerations go away, and makes horizontal scaling easy.
+
+[slave-post]: https://groups.google.com/d/msg/mozilla.dev.security.policy/EPv_u9V06n0/gPJY5T7ILlQJ
+
+Anyway, I thought it was neat that so much of it runs inside the
+database; you don't see that terribly often. I also thought the
+little shims to make that possible were neat. I didn't get deep
+enough into it to end up running my own instance or clone, but I
+thought my notes on it were worth sharing.
-- 
cgit v1.2.3
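
To make the "thin wrapper turns every HTTP request into an SQL statement" idea from the article concrete, here is a minimal sketch of that shape. It is not crt.sh's code: it uses Python and an in-memory SQLite database in place of Apache, `mod_certwatch`, and PostgreSQL, and the `certs` table, the `PAGE_SQL` query, and the `handle_request` function are all hypothetical names made up for illustration. The point is only that the shim does nothing but pass a query parameter along, while the SQL itself does the filtering and the HTML templating.

```python
import sqlite3
from urllib.parse import parse_qs


def make_db():
    """Build a toy in-memory database (hypothetical schema, not crt.sh's)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE certs (id INTEGER PRIMARY KEY, name TEXT, issuer TEXT)"
    )
    db.executemany(
        "INSERT INTO certs (id, name, issuer) VALUES (?, ?, ?)",
        [
            (1, "example.com", "Example CA"),
            (2, "www.example.com", "Example CA"),
            (3, "example.org", "Other CA"),
        ],
    )
    return db


# The "web application": one SQL statement that filters rows and renders
# the HTML itself. A real application would also have to escape the values.
PAGE_SQL = """
SELECT '<ul>'
       || coalesce(
              group_concat('<li>' || name || ' (' || issuer || ')</li>', ''),
              '')
       || '</ul>'
FROM (SELECT name, issuer FROM certs WHERE name LIKE '%' || ? ORDER BY id)
"""


def handle_request(db, query_string):
    """The thin shim: pull out `q`, run one statement, return its result."""
    params = parse_qs(query_string)
    q = params.get("q", [""])[0]
    (html,) = db.execute(PAGE_SQL, (q,)).fetchone()
    return html


if __name__ == "__main__":
    print(handle_request(make_db(), "q=example.com"))
```

Because the shim holds no logic of its own, pointing it at a read-only replica instead of the primary database would be a change to the connection only, which is presumably part of why that setup scales easily.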