author Luke Shumaker <lukeshu@lukeshu.com> 2018-02-09 21:49:47 -0500
committer Luke Shumaker <lukeshu@lukeshu.com> 2018-02-09 22:05:19 -0500
commit 8c11dbdaacd8ec0b75d9110fe326955f01c45bee
tree bf1781b3f78ff9b8cd897b5d58ccab58f1a6e1e2
parent 031ceb664aa37672cfc0835d13f97a99f2451ea4
add crt-sh-architecture article
-rw-r--r-- public/crt-sh-architecture.md 56
1 files changed, 56 insertions, 0 deletions
diff --git a/public/crt-sh-architecture.md b/public/crt-sh-architecture.md
new file mode 100644
index 0000000..d518d2f
--- /dev/null
+++ b/public/crt-sh-architecture.md
@@ -0,0 +1,56 @@
+The interesting architecture of crt.sh
+======================================
+---
+date: "2018-02-09"
+---
+
+A while back I wrote myself a little dashboard for monitoring TLS
+certificates for my domains. Right now it works by talking to
+<https://crt.sh/>. Sometimes this works great, but sometimes crt.sh
+is really slow. Plus, it's another thing that could be compromised.
+
+So, I started looking at how crt.sh works. It's kinda cool.
+
+There are only 3 separate processes:
+
+ - Cron
+   - [`ct_monitor`](https://github.com/crtsh/ct_monitor) is a program
+     that uses libcurl to get CT log changes and libpq to put them
+     into the database.
+ - PostgreSQL
+   - [`certwatch_db`](https://github.com/crtsh/certwatch_db) is the
+     core web application, written in PL/pgSQL. It even includes the
+     HTML templating and query parameter handling. Of course, there
+     are a couple of things not entirely done in pgSQL...
+   - [`libx509pq`](https://github.com/crtsh/libx509pq) adds a set of
+     `x509_*` functions callable from pgSQL for parsing X.509
+     certificates.
+   - [`libcablintpq`](https://github.com/crtsh/libcablintpq) adds the
+     `cablint_embedded(bytea)` function to pgSQL.
+   - [`libx509lintpq`](https://github.com/crtsh/libx509lintpq) adds the
+     `x509lint_embedded(bytea,integer)` function to pgSQL.
+ - Apache HTTPD
+   - [`mod_certwatch`](https://github.com/crtsh/mod_certwatch) is a
+     pretty thin wrapper that turns every HTTP request into an SQL
+     statement sent to PostgreSQL, via...
+   - [`mod_pgconn`](https://github.com/crtsh/mod_pgconn), which
+     manages PostgreSQL connections.
+
+The interface exposes HTML, Atom, and JSON. All from code written in
+SQL.
+
+And then I guess it's behind an nginx-based load-balancer or somesuch
+(based on the 504 Gateway Timeout messages it's given me). But that's
+not interesting.
+
+The actual website is [run from a read-only slave][slave-post] of the
+master DB that the `ct_monitor` cron-job updates, which makes several
+security considerations go away, and makes horizontal scaling easy.
+
+[slave-post]: https://groups.google.com/d/msg/mozilla.dev.security.policy/EPv_u9V06n0/gPJY5T7ILlQJ
+
+Anyway, I thought it was neat that so much of it runs inside the
+database; you don't see that terribly often. I also thought the
+little shims to make that possible were neat. I didn't get deep
+enough into it to end up running my own instance or clone, but I
+thought my notes on it were worth sharing.
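
As a rough illustration of the JSON interface the article mentions, here is a minimal sketch of how a small dashboard might query crt.sh. It is not taken from the author's dashboard. The helper names (`build_query_url`, `parse_entries`, `fetch_entries`) are hypothetical, and the `name_value` and `not_after` response fields are assumptions about what crt.sh returns for a `q=...&output=json` search.

```python
import json
from datetime import datetime
from urllib.parse import urlencode
from urllib.request import urlopen

CRT_SH = "https://crt.sh/"


def build_query_url(domain):
    """Build a crt.sh search URL that asks for JSON instead of HTML."""
    return CRT_SH + "?" + urlencode({"q": domain, "output": "json"})


def parse_entries(body):
    """Reduce a JSON response body to sorted (name, not_after) pairs.

    Assumption: each entry carries `name_value` (one or more
    newline-separated DNS names) and `not_after` (an ISO 8601
    timestamp). Adjust the keys if the real response differs.
    """
    pairs = set()
    for entry in json.loads(body):
        not_after = datetime.fromisoformat(entry["not_after"])
        for name in entry["name_value"].splitlines():
            pairs.add((name, not_after))
    return sorted(pairs)


def fetch_entries(domain, timeout=30):
    """Query crt.sh over the network; may be slow or time out."""
    with urlopen(build_query_url(domain), timeout=timeout) as resp:
        return parse_entries(resp.read().decode("utf-8"))
```

Calling `fetch_entries("example.com")` needs network access, and given the occasional 504 responses described above, a caller would probably want to wrap it in a retry or cache the last good result.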