-rw-r--r-- | public/crt-sh-architecture.html | 45
-rw-r--r-- | public/crt-sh-architecture.md | 56
-rw-r--r-- | public/index.atom | 39
-rw-r--r-- | public/index.html | 1
-rw-r--r-- | public/index.md | 1
5 files changed, 142 insertions, 0 deletions
diff --git a/public/crt-sh-architecture.html b/public/crt-sh-architecture.html
new file mode 100644
index 0000000..3783a50
--- /dev/null
+++ b/public/crt-sh-architecture.html
@@ -0,0 +1,45 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+ <meta charset="utf-8">
+ <title>The interesting architecture of crt.sh — Luke Shumaker</title>
+ <link rel="stylesheet" href="assets/style.css">
+ <link rel="alternate" type="application/atom+xml" href="./index.atom" name="web log entries"/>
+</head>
+<body>
+<header><a href="/">Luke Shumaker</a> » <a href=/blog>blog</a> » crt-sh-architecture</header>
+<article>
+<h1 id="the-interesting-architecture-of-crt.sh">The interesting architecture of crt.sh</h1>
+<p>A while back I wrote myself a little dashboard for monitoring TLS certificates for my domains. Right now it works by talking to <a href="https://crt.sh/" class="uri">https://crt.sh/</a>. Sometimes this works great, but sometimes crt.sh is really slow. Plus, it’s another thing that could be compromised.</p>
+<p>So, I started looking at how crt.sh works. It’s kinda cool.</p>
+<p>There are only 3 separate processes:</p>
+<ul>
+<li>Cron
+<ul>
+<li><a href="https://github.com/crtsh/ct_monitor"><code>ct_monitor</code></a> is a program that uses libcurl to get CT log changes and libpq to put them into the database.</li>
+</ul></li>
+<li>PostgreSQL
+<ul>
+<li><a href="https://github.com/crtsh/certwatch_db"><code>certwatch_db</code></a> is the core web application, written in PL/pgSQL. It even includes the HTML templating and query parameter handling. Of course, there are a couple of things not entirely done in pgSQL…</li>
+<li><a href="https://github.com/crtsh/libx509pq"><code>libx509pq</code></a> adds a set of <code>x509_*</code> functions callable from pgSQL for parsing X509 certificates.</li>
+<li><a href="https://github.com/crtsh/libcablintpq"><code>libcablintpq</code></a> adds the <code>cablint_embedded(bytea)</code> function to pgSQL.</li>
+<li><a href="https://github.com/crtsh/libx509lintpq"><code>libx509lintpq</code></a> adds the <code>x509lint_embedded(bytea,integer)</code> function to pgSQL.</li>
+</ul></li>
+<li>Apache HTTPD
+<ul>
+<li><a href="https://github.com/crtsh/mod_certwatch"><code>mod_certwatch</code></a> is a pretty thin wrapper that turns every HTTP request into an SQL statement sent to PostgreSQL, via…</li>
+<li><a href="https://github.com/crtsh/mod_pgconn"><code>mod_pgconn</code></a>, which manages PostgreSQL connections.</li>
+</ul></li>
+</ul>
+<p>The interface exposes HTML, ATOM, and JSON. All from code written in SQL.</p>
+<p>And then I guess it’s behind an nginx-based load-balancer or somesuch (based on the 504 Gateway Timeout messages it’s given me). But that’s not interesting.</p>
+<p>The actual website is <a href="https://groups.google.com/d/msg/mozilla.dev.security.policy/EPv_u9V06n0/gPJY5T7ILlQJ">run from a read-only slave</a> of the master DB that the <code>ct_monitor</code> cron-job updates; which makes several security considerations go away, and makes horizontal scaling easy.</p>
+<p>Anyway, I thought it was neat that so much of it runs inside the database; you don’t see that terribly often. I also thought the little shims to make that possible were neat. I didn’t get deep enough into it to end up running my own instance or clone, but I thought my notes on it were worth sharing.</p>
+
+</article>
+<footer>
+<p>The content of this page is Copyright © 2018 <a href="mailto:lukeshu@sbcglobal.net">Luke Shumaker</a>.</p>
+<p>This page is licensed under the <a href="https://creativecommons.org/licenses/by-sa/3.0/">CC BY-SA-3.0</a> license.</p>
+</footer>
+</body>
+</html>
diff --git a/public/crt-sh-architecture.md b/public/crt-sh-architecture.md
new file mode 100644
index 0000000..d518d2f
--- /dev/null
+++ b/public/crt-sh-architecture.md
@@ -0,0 +1,56 @@
+The interesting architecture of crt.sh
+======================================
+---
+date: "2018-02-09"
+---
+
+A while back I wrote myself a little dashboard for monitoring TLS
+certificates for my domains. Right now it works by talking to
+<https://crt.sh/>. Sometimes this works great, but sometimes crt.sh
+is really slow. Plus, it's another thing that could be compromised.
+
+So, I started looking at how crt.sh works. It's kinda cool.
+
+There are only 3 separate processes:
+
+ - Cron
+   - [`ct_monitor`](https://github.com/crtsh/ct_monitor) is a program
+     that uses libcurl to get CT log changes and libpq to put them
+     into the database.
+ - PostgreSQL
+   - [`certwatch_db`](https://github.com/crtsh/certwatch_db) is the
+     core web application, written in PL/pgSQL. It even includes the
+     HTML templating and query parameter handling. Of course, there
+     are a couple of things not entirely done in pgSQL...
+   - [`libx509pq`](https://github.com/crtsh/libx509pq) adds a set of
+     `x509_*` functions callable from pgSQL for parsing X509
+     certificates.
+   - [`libcablintpq`](https://github.com/crtsh/libcablintpq) adds the
+     `cablint_embedded(bytea)` function to pgSQL.
+   - [`libx509lintpq`](https://github.com/crtsh/libx509lintpq) adds the
+     `x509lint_embedded(bytea,integer)` function to pgSQL.
+ - Apache HTTPD
+   - [`mod_certwatch`](https://github.com/crtsh/mod_certwatch) is a
+     pretty thin wrapper that turns every HTTP request into an SQL
+     statement sent to PostgreSQL, via...
+   - [`mod_pgconn`](https://github.com/crtsh/mod_pgconn), which
+     manages PostgreSQL connections.
+
+The interface exposes HTML, ATOM, and JSON. All from code written in
+SQL.
+
+And then I guess it's behind an nginx-based load-balancer or somesuch
+(based on the 504 Gateway Timeout messages it's given me). But that's
+not interesting.
+
+The actual website is [run from a read-only slave][slave-post] of the
+master DB that the `ct_monitor` cron-job updates; which makes several
+security considerations go away, and makes horizontal scaling easy.
+
+[slave-post]: https://groups.google.com/d/msg/mozilla.dev.security.policy/EPv_u9V06n0/gPJY5T7ILlQJ
+
+Anyway, I thought it was neat that so much of it runs inside the
+database; you don't see that terribly often. I also thought the
+little shims to make that possible were neat. I didn't get deep
+enough into it to end up running my own instance or clone, but I
+thought my notes on it were worth sharing.
diff --git a/public/index.atom b/public/index.atom
index 8f3fefa..3c5961a 100644
--- a/public/index.atom
+++ b/public/index.atom
@@ -140,6 +140,45 @@
  </entry>
 
  <entry xmlns="http://www.w3.org/2005/Atom">
+ <link rel="alternate" type="text/html" href="./crt-sh-architecture.html"/>
+ <link rel="alternate" type="text/markdown" href="./crt-sh-architecture.md"/>
+ <id>https://lukeshu.com/blog/crt-sh-architecture.html</id>
+ <updated>2018-02-09T00:00:00+00:00</updated>
+ <published>2018-02-09T00:00:00+00:00</published>
+ <title>The interesting architecture of crt.sh</title>
+ <content type="html"><h1 id="the-interesting-architecture-of-crt.sh">The interesting architecture of crt.sh</h1>
+<p>A while back I wrote myself a little dashboard for monitoring TLS certificates for my domains. Right now it works by talking to <a href="https://crt.sh/" class="uri">https://crt.sh/</a>. Sometimes this works great, but sometimes crt.sh is really slow. Plus, it’s another thing that could be compromised.</p>
+<p>So, I started looking at how crt.sh works. It’s kinda cool.</p>
+<p>There are only 3 separate processes:</p>
+<ul>
+<li>Cron
+<ul>
+<li><a href="https://github.com/crtsh/ct_monitor"><code>ct_monitor</code></a> is a program that uses libcurl to get CT log changes and libpq to put them into the database.</li>
+</ul></li>
+<li>PostgreSQL
+<ul>
+<li><a href="https://github.com/crtsh/certwatch_db"><code>certwatch_db</code></a> is the core web application, written in PL/pgSQL. It even includes the HTML templating and query parameter handling. Of course, there are a couple of things not entirely done in pgSQL…</li>
+<li><a href="https://github.com/crtsh/libx509pq"><code>libx509pq</code></a> adds a set of <code>x509_*</code> functions callable from pgSQL for parsing X509 certificates.</li>
+<li><a href="https://github.com/crtsh/libcablintpq"><code>libcablintpq</code></a> adds the <code>cablint_embedded(bytea)</code> function to pgSQL.</li>
+<li><a href="https://github.com/crtsh/libx509lintpq"><code>libx509lintpq</code></a> adds the <code>x509lint_embedded(bytea,integer)</code> function to pgSQL.</li>
+</ul></li>
+<li>Apache HTTPD
+<ul>
+<li><a href="https://github.com/crtsh/mod_certwatch"><code>mod_certwatch</code></a> is a pretty thin wrapper that turns every HTTP request into an SQL statement sent to PostgreSQL, via…</li>
+<li><a href="https://github.com/crtsh/mod_pgconn"><code>mod_pgconn</code></a>, which manages PostgreSQL connections.</li>
+</ul></li>
+</ul>
+<p>The interface exposes HTML, ATOM, and JSON. All from code written in SQL.</p>
+<p>And then I guess it’s behind an nginx-based load-balancer or somesuch (based on the 504 Gateway Timeout messages it’s given me). But that’s not interesting.</p>
+<p>The actual website is <a href="https://groups.google.com/d/msg/mozilla.dev.security.policy/EPv_u9V06n0/gPJY5T7ILlQJ">run from a read-only slave</a> of the master DB that the <code>ct_monitor</code> cron-job updates; which makes several security considerations go away, and makes horizontal scaling easy.</p>
+<p>Anyway, I thought it was neat that so much of it runs inside the database; you don’t see that terribly often. I also thought the little shims to make that possible were neat. I didn’t get deep enough into it to end up running my own instance or clone, but I thought my notes on it were worth sharing.</p>
+</content>
+ <author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
+ <rights type="html"><p>The content of this page is Copyright © 2018 <a href="mailto:lukeshu@sbcglobal.net">Luke Shumaker</a>.</p>
+<p>This page is licensed under the <a href="https://creativecommons.org/licenses/by-sa/3.0/">CC BY-SA-3.0</a> license.</p></rights>
+ </entry>
+
+ <entry xmlns="http://www.w3.org/2005/Atom">
  <link rel="alternate" type="text/html" href="./http-notes.html"/>
  <link rel="alternate" type="text/markdown" href="./http-notes.md"/>
  <id>https://lukeshu.com/blog/http-notes.html</id>
diff --git a/public/index.html b/public/index.html
index e967651..ac1163a 100644
--- a/public/index.html
+++ b/public/index.html
@@ -22,6 +22,7 @@ time {
 <ul>
 <li><time>2018-02-09</time> - <a href="./posix-pricing.html">POSIX pricing and availability; or: Do you really need the PDF?</a></li>
 <li><time>2018-02-09</time> - <a href="./kbd-xmodmap.html">GNU/Linux Keyboard Maps: xmodmap</a></li>
+<li><time>2018-02-09</time> - <a href="./crt-sh-architecture.html">The interesting architecture of crt.sh</a></li>
 <li><time>2016-09-30</time> - <a href="./http-notes.html">Notes on subtleties of HTTP implementation</a></li>
 <li><time>2016-02-28</time> - <a href="./x11-systemd.html">My X11 setup with systemd</a></li>
 <li><time>2016-02-28</time> - <a href="./java-segfault-redux.html">My favorite bug: segfaults in Java (redux)</a></li>
diff --git a/public/index.md b/public/index.md
index 20a0750..fdef2b8 100644
--- a/public/index.md
+++ b/public/index.md
@@ -12,6 +12,7 @@ time {
  * <time>2018-02-09</time> - [POSIX pricing and availability; or: Do you really need the PDF?](./posix-pricing.html)
  * <time>2018-02-09</time> - [GNU/Linux Keyboard Maps: xmodmap](./kbd-xmodmap.html)
+ * <time>2018-02-09</time> - [The interesting architecture of crt.sh](./crt-sh-architecture.html)
  * <time>2016-09-30</time> - [Notes on subtleties of HTTP implementation](./http-notes.html)
  * <time>2016-02-28</time> - [My X11 setup with systemd](./x11-systemd.html)
  * <time>2016-02-28</time> - [My favorite bug: segfaults in Java (redux)](./java-segfault-redux.html)