author Luke Shumaker <lukeshu@lukeshu.com> 2018-02-09 22:05:28 -0500
committer Luke Shumaker <lukeshu@lukeshu.com> 2018-02-09 22:06:43 -0500
commit 8c99fadac68cb05b4aaa08cab7a55c7fbfe5e364
tree 768c6b7d3a9e1cd4721d4c11c21c205101c189a6
parent c49a93bedf3e7cb7328aa8b354c0307199405480
parent 8c11dbdaacd8ec0b75d9110fe326955f01c45bee
make: add crt-sh-architecture article
-rw-r--r-- public/crt-sh-architecture.html 45
-rw-r--r-- public/crt-sh-architecture.md 56
-rw-r--r-- public/index.atom 39
-rw-r--r-- public/index.html 1
-rw-r--r-- public/index.md 1
5 files changed, 142 insertions, 0 deletions
diff --git a/public/crt-sh-architecture.html b/public/crt-sh-architecture.html
new file mode 100644
index 0000000..3783a50
--- /dev/null
+++ b/public/crt-sh-architecture.html
@@ -0,0 +1,45 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+ <meta charset="utf-8">
+ <title>The interesting architecture of crt.sh — Luke Shumaker</title>
+ <link rel="stylesheet" href="assets/style.css">
+ <link rel="alternate" type="application/atom+xml" href="./index.atom" name="web log entries"/>
+</head>
+<body>
+<header><a href="/">Luke Shumaker</a> » <a href=/blog>blog</a> » crt-sh-architecture</header>
+<article>
+<h1 id="the-interesting-architecture-of-crt.sh">The interesting architecture of crt.sh</h1>
+<p>A while back I wrote myself a little dashboard for monitoring TLS certificates for my domains. Right now it works by talking to <a href="https://crt.sh/" class="uri">https://crt.sh/</a>. Sometimes this works great, but sometimes crt.sh is really slow. Plus, it’s another thing that could be compromised.</p>
+<p>So, I started looking at how crt.sh works. It’s kinda cool.</p>
+<p>There are only 3 separate processes:</p>
+<ul>
+<li>Cron
+<ul>
+<li><a href="https://github.com/crtsh/ct_monitor"><code>ct_monitor</code></a> is a program that uses libcurl to get CT log changes and libpq to put them into the database.</li>
+</ul></li>
+<li>PostgreSQL
+<ul>
+<li><a href="https://github.com/crtsh/certwatch_db"><code>certwatch_db</code></a> is the core web application, written in PL/pgSQL. It even includes the HTML templating and query parameter handling. Of course, there are a couple of things not entirely done in pgSQL…</li>
+<li><a href="https://github.com/crtsh/libx509pq"><code>libx509pq</code></a> adds a set of <code>x509_*</code> functions callable from pgSQL for parsing X509 certificates.</li>
+<li><a href="https://github.com/crtsh/libcablintpq"><code>libcablintpq</code></a> adds the <code>cablint_embedded(bytea)</code> function to pgSQL.</li>
+<li><a href="https://github.com/crtsh/libx509lintpq"><code>libx509lintpq</code></a> adds the <code>x509lint_embedded(bytea,integer)</code> function to pgSQL.</li>
+</ul></li>
+<li>Apache HTTPD
+<ul>
+<li><a href="https://github.com/crtsh/mod_certwatch"><code>mod_certwatch</code></a> is a pretty thin wrapper that turns every HTTP request into an SQL statement sent to PostgreSQL, via…</li>
+<li><a href="https://github.com/crtsh/mod_pgconn"><code>mod_pgconn</code></a>, which manages PostgreSQL connections.</li>
+</ul></li>
+</ul>
+<p>The interface exposes HTML, ATOM, and JSON. All from code written in SQL.</p>
+<p>And then I guess it’s behind an nginx-based load-balancer or somesuch (based on the 504 Gateway Timeout messages it’s given me). But that’s not interesting.</p>
+<p>The actual website is <a href="https://groups.google.com/d/msg/mozilla.dev.security.policy/EPv_u9V06n0/gPJY5T7ILlQJ">run from a read-only slave</a> of the master DB that the <code>ct_monitor</code> cron-job updates, which makes several security considerations go away, and makes horizontal scaling easy.</p>
+<p>Anyway, I thought it was neat that so much of it runs inside the database; you don’t see that terribly often. I also thought the little shims to make that possible were neat. I didn’t get deep enough into it to end up running my own instance or clone, but I thought my notes on it were worth sharing.</p>
+
+</article>
+<footer>
+<p>The content of this page is Copyright © 2018 <a href="mailto:lukeshu@sbcglobal.net">Luke Shumaker</a>.</p>
+<p>This page is licensed under the <a href="https://creativecommons.org/licenses/by-sa/3.0/">CC BY-SA-3.0</a> license.</p>
+</footer>
+</body>
+</html>
diff --git a/public/crt-sh-architecture.md b/public/crt-sh-architecture.md
new file mode 100644
index 0000000..d518d2f
--- /dev/null
+++ b/public/crt-sh-architecture.md
@@ -0,0 +1,56 @@
+The interesting architecture of crt.sh
+======================================
+---
+date: "2018-02-09"
+---
+
+A while back I wrote myself a little dashboard for monitoring TLS
+certificates for my domains. Right now it works by talking to
+<https://crt.sh/>. Sometimes this works great, but sometimes crt.sh
+is really slow. Plus, it's another thing that could be compromised.
+
+So, I started looking at how crt.sh works. It's kinda cool.
+
+There are only 3 separate processes:
+
+ - Cron
+ - [`ct_monitor`](https://github.com/crtsh/ct_monitor) is a program
+ that uses libcurl to get CT log changes and libpq to put them
+ into the database.
+ - PostgreSQL
+ - [`certwatch_db`](https://github.com/crtsh/certwatch_db) is the
+ core web application, written in PL/pgSQL. It even includes the
+ HTML templating and query parameter handling. Of course, there
+ are a couple of things not entirely done in pgSQL...
+ - [`libx509pq`](https://github.com/crtsh/libx509pq) adds a set of
+ `x509_*` functions callable from pgSQL for parsing X509
+ certificates.
+ - [`libcablintpq`](https://github.com/crtsh/libcablintpq) adds the
+ `cablint_embedded(bytea)` function to pgSQL.
+ - [`libx509lintpq`](https://github.com/crtsh/libx509lintpq) adds the
+ `x509lint_embedded(bytea,integer)` function to pgSQL.
+ - Apache HTTPD
+ - [`mod_certwatch`](https://github.com/crtsh/mod_certwatch) is a
+ pretty thin wrapper that turns every HTTP request into an SQL
+ statement sent to PostgreSQL, via...
+ - [`mod_pgconn`](https://github.com/crtsh/mod_pgconn), which
+ manages PostgreSQL connections.
+
+The interface exposes HTML, ATOM, and JSON. All from code written in
+SQL.
+
+And then I guess it's behind an nginx-based load-balancer or somesuch
+(based on the 504 Gateway Timeout messages it's given me). But that's
+not interesting.
+
+The actual website is [run from a read-only slave][slave-post] of the
+master DB that the `ct_monitor` cron-job updates, which makes several
+security considerations go away, and makes horizontal scaling easy.
+
+[slave-post]: https://groups.google.com/d/msg/mozilla.dev.security.policy/EPv_u9V06n0/gPJY5T7ILlQJ
+
+Anyway, I thought it was neat that so much of it runs inside the
+database; you don't see that terribly often. I also thought the
+little shims to make that possible were neat. I didn't get deep
+enough into it to end up running my own instance or clone, but I
+thought my notes on it were worth sharing.
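
The "thin wrapper that turns every HTTP request into an SQL statement" idea from the article can be sketched roughly as follows. This is an illustration in Python rather than the C of the real Apache module, and it is not taken from the `mod_certwatch` source: the function name `web_apis` and the convention of passing parameter names and values as two parallel arrays are assumptions made for the example. It only builds the statement and its bind parameters; actually sending them to PostgreSQL is left out.

```python
from urllib.parse import parse_qsl

# Hypothetical name of the PL/pgSQL entry point; an assumption for this sketch.
SQL_ENTRY_POINT = "web_apis"


def request_to_sql(query_string):
    """Map a raw HTTP query string to one parameterized SQL call.

    Returns (sql, params), where params is a pair of parallel lists:
    the parameter names and the parameter values, in request order.
    """
    pairs = parse_qsl(query_string, keep_blank_values=True)
    names = [name for name, _ in pairs]
    values = [value for _, value in pairs]
    # The placeholders keep user input out of the SQL text itself;
    # a database driver would bind the two lists as text[] arguments.
    sql = "SELECT {}(%s, %s)".format(SQL_ENTRY_POINT)
    return sql, (names, values)


if __name__ == "__main__":
    print(request_to_sql("q=example.com&output=json"))
```

All of the routing, templating, and output-format decisions would then happen inside the database function, which is what makes the web-server side so thin.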
diff --git a/public/index.atom b/public/index.atom
index 8f3fefa..3c5961a 100644
--- a/public/index.atom
+++ b/public/index.atom
@@ -140,6 +140,45 @@
</entry>
<entry xmlns="http://www.w3.org/2005/Atom">
+ <link rel="alternate" type="text/html" href="./crt-sh-architecture.html"/>
+ <link rel="alternate" type="text/markdown" href="./crt-sh-architecture.md"/>
+ <id>https://lukeshu.com/blog/crt-sh-architecture.html</id>
+ <updated>2018-02-09T00:00:00+00:00</updated>
+ <published>2018-02-09T00:00:00+00:00</published>
+ <title>The interesting architecture of crt.sh</title>
+ <content type="html">&lt;h1 id="the-interesting-architecture-of-crt.sh"&gt;The interesting architecture of crt.sh&lt;/h1&gt;
+&lt;p&gt;A while back I wrote myself a little dashboard for monitoring TLS certificates for my domains. Right now it works by talking to &lt;a href="https://crt.sh/" class="uri"&gt;https://crt.sh/&lt;/a&gt;. Sometimes this works great, but sometimes crt.sh is really slow. Plus, it’s another thing that could be compromised.&lt;/p&gt;
+&lt;p&gt;So, I started looking at how crt.sh works. It’s kinda cool.&lt;/p&gt;
+&lt;p&gt;There are only 3 separate processes:&lt;/p&gt;
+&lt;ul&gt;
+&lt;li&gt;Cron
+&lt;ul&gt;
+&lt;li&gt;&lt;a href="https://github.com/crtsh/ct_monitor"&gt;&lt;code&gt;ct_monitor&lt;/code&gt;&lt;/a&gt; is a program that uses libcurl to get CT log changes and libpq to put them into the database.&lt;/li&gt;
+&lt;/ul&gt;&lt;/li&gt;
+&lt;li&gt;PostgreSQL
+&lt;ul&gt;
+&lt;li&gt;&lt;a href="https://github.com/crtsh/certwatch_db"&gt;&lt;code&gt;certwatch_db&lt;/code&gt;&lt;/a&gt; is the core web application, written in PL/pgSQL. It even includes the HTML templating and query parameter handling. Of course, there are a couple of things not entirely done in pgSQL…&lt;/li&gt;
+&lt;li&gt;&lt;a href="https://github.com/crtsh/libx509pq"&gt;&lt;code&gt;libx509pq&lt;/code&gt;&lt;/a&gt; adds a set of &lt;code&gt;x509_*&lt;/code&gt; functions callable from pgSQL for parsing X509 certificates.&lt;/li&gt;
+&lt;li&gt;&lt;a href="https://github.com/crtsh/libcablintpq"&gt;&lt;code&gt;libcablintpq&lt;/code&gt;&lt;/a&gt; adds the &lt;code&gt;cablint_embedded(bytea)&lt;/code&gt; function to pgSQL.&lt;/li&gt;
+&lt;li&gt;&lt;a href="https://github.com/crtsh/libx509lintpq"&gt;&lt;code&gt;libx509lintpq&lt;/code&gt;&lt;/a&gt; adds the &lt;code&gt;x509lint_embedded(bytea,integer)&lt;/code&gt; function to pgSQL.&lt;/li&gt;
+&lt;/ul&gt;&lt;/li&gt;
+&lt;li&gt;Apache HTTPD
+&lt;ul&gt;
+&lt;li&gt;&lt;a href="https://github.com/crtsh/mod_certwatch"&gt;&lt;code&gt;mod_certwatch&lt;/code&gt;&lt;/a&gt; is a pretty thin wrapper that turns every HTTP request into an SQL statement sent to PostgreSQL, via…&lt;/li&gt;
+&lt;li&gt;&lt;a href="https://github.com/crtsh/mod_pgconn"&gt;&lt;code&gt;mod_pgconn&lt;/code&gt;&lt;/a&gt;, which manages PostgreSQL connections.&lt;/li&gt;
+&lt;/ul&gt;&lt;/li&gt;
+&lt;/ul&gt;
+&lt;p&gt;The interface exposes HTML, ATOM, and JSON. All from code written in SQL.&lt;/p&gt;
+&lt;p&gt;And then I guess it’s behind an nginx-based load-balancer or somesuch (based on the 504 Gateway Timeout messages it’s given me). But that’s not interesting.&lt;/p&gt;
+&lt;p&gt;The actual website is &lt;a href="https://groups.google.com/d/msg/mozilla.dev.security.policy/EPv_u9V06n0/gPJY5T7ILlQJ"&gt;run from a read-only slave&lt;/a&gt; of the master DB that the &lt;code&gt;ct_monitor&lt;/code&gt; cron-job updates, which makes several security considerations go away, and makes horizontal scaling easy.&lt;/p&gt;
+&lt;p&gt;Anyway, I thought it was neat that so much of it runs inside the database; you don’t see that terribly often. I also thought the little shims to make that possible were neat. I didn’t get deep enough into it to end up running my own instance or clone, but I thought my notes on it were worth sharing.&lt;/p&gt;
+</content>
+ <author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
+ <rights type="html">&lt;p&gt;The content of this page is Copyright © 2018 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
+&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
+ </entry>
+
+ <entry xmlns="http://www.w3.org/2005/Atom">
<link rel="alternate" type="text/html" href="./http-notes.html"/>
<link rel="alternate" type="text/markdown" href="./http-notes.md"/>
<id>https://lukeshu.com/blog/http-notes.html</id>
diff --git a/public/index.html b/public/index.html
index e967651..ac1163a 100644
--- a/public/index.html
+++ b/public/index.html
@@ -22,6 +22,7 @@ time {
<ul>
<li><time>2018-02-09</time> - <a href="./posix-pricing.html">POSIX pricing and availability; or: Do you really need the PDF?</a></li>
<li><time>2018-02-09</time> - <a href="./kbd-xmodmap.html">GNU/Linux Keyboard Maps: xmodmap</a></li>
+<li><time>2018-02-09</time> - <a href="./crt-sh-architecture.html">The interesting architecture of crt.sh</a></li>
<li><time>2016-09-30</time> - <a href="./http-notes.html">Notes on subtleties of HTTP implementation</a></li>
<li><time>2016-02-28</time> - <a href="./x11-systemd.html">My X11 setup with systemd</a></li>
<li><time>2016-02-28</time> - <a href="./java-segfault-redux.html">My favorite bug: segfaults in Java (redux)</a></li>
diff --git a/public/index.md b/public/index.md
index 20a0750..fdef2b8 100644
--- a/public/index.md
+++ b/public/index.md
@@ -12,6 +12,7 @@ time {
* <time>2018-02-09</time> - [POSIX pricing and availability; or: Do you really need the PDF?](./posix-pricing.html)
* <time>2018-02-09</time> - [GNU/Linux Keyboard Maps: xmodmap](./kbd-xmodmap.html)
+ * <time>2018-02-09</time> - [The interesting architecture of crt.sh](./crt-sh-architecture.html)
* <time>2016-09-30</time> - [Notes on subtleties of HTTP implementation](./http-notes.html)
* <time>2016-02-28</time> - [My X11 setup with systemd](./x11-systemd.html)
* <time>2016-02-28</time> - [My favorite bug: segfaults in Java (redux)](./java-segfault-redux.html)