<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>The interesting architecture of crt.sh — Luke T. Shumaker</title>
  <link rel="stylesheet" href="assets/style.css">
  <link rel="alternate" type="application/atom+xml" href="./index.atom" name="web log entries"/>
</head>
<body>
<header><a href="/">Luke T. Shumaker</a> » <a href=/blog>blog</a> » crt-sh-architecture</header>
<article>
<h1 id="the-interesting-architecture-of-crt.sh">The interesting
architecture of crt.sh</h1>
<p>A while back I wrote myself a little dashboard for monitoring TLS
certificates for my domains. Right now it works by talking to <a
href="https://crt.sh/" class="uri">https://crt.sh/</a>. Sometimes this
works great, but sometimes crt.sh is really slow. Plus, it’s another
thing that could be compromised.</p>
<p>So, I started looking at how crt.sh works. It’s kinda cool.</p>
<p>There are only 3 separate processes:</p>
<ul>
<li>Cron
<ul>
<li><a
href="https://github.com/crtsh/ct_monitor"><code>ct_monitor</code></a>
is a program that uses libcurl to get CT log changes and libpq to put them
into the database.</li>
</ul></li>
<li>PostgreSQL
<ul>
<li><a
href="https://github.com/crtsh/certwatch_db"><code>certwatch_db</code></a>
is the core web application, written in PL/pgSQL. It even includes the
HTML templating and query parameter handling. Of course, there are a
couple of things not entirely done in pgSQL…</li>
<li><a
href="https://github.com/crtsh/libx509pq"><code>libx509pq</code></a>
adds a set of <code>x509_*</code> functions callable from pgSQL for
parsing X.509 certificates.</li>
<li><a
href="https://github.com/crtsh/libcablintpq"><code>libcablintpq</code></a>
adds the <code>cablint_embedded(bytea)</code> function to pgSQL.</li>
<li><a
href="https://github.com/crtsh/libx509lintpq"><code>libx509lintpq</code></a>
adds the <code>x509lint_embedded(bytea,integer)</code> function to
pgSQL.</li>
</ul></li>
<li>Apache HTTPD
<ul>
<li><a
href="https://github.com/crtsh/mod_certwatch"><code>mod_certwatch</code></a>
is a pretty thin wrapper that turns every HTTP request into an SQL
statement sent to PostgreSQL, via…</li>
<li><a
href="https://github.com/crtsh/mod_pgconn"><code>mod_pgconn</code></a>,
which manages PostgreSQL connections.</li>
</ul></li>
</ul>
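<p>To make the "thin wrapper" idea concrete, here is a minimal sketch (in
Python, not the C that the real Apache module is written in) of turning an
HTTP request into a single SQL statement. The entry-point name
<code>web_request</code> and its three-argument shape are assumptions made
up for illustration; the actual function in <code>certwatch_db</code> may
be named and shaped differently.</p>

```python
from urllib.parse import parse_qsl

# Hypothetical name for the single database-side entry point; the real
# function name and signature in certwatch_db may differ.
ENTRY_POINT = "web_request"


def request_to_sql(path, query_string):
    """Turn one HTTP request into a parameterized SQL statement.

    Returns (sql, params). The web server does no routing or templating;
    it only forwards the path and the query parameters to the database,
    which is expected to return the whole response body.
    """
    pairs = parse_qsl(query_string, keep_blank_values=True)
    names = [name for name, _ in pairs]
    values = [value for _, value in pairs]
    # Placeholders keep user input out of the SQL text itself.
    sql = f"SELECT {ENTRY_POINT}(%s, %s, %s)"
    return sql, (path, names, values)
```

<p>The returned pair could be handed to any PostgreSQL client library that
accepts <code>%s</code>-style placeholders; everything interesting then
happens inside the database.</p>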
<p>The interface exposes HTML, Atom, and JSON, all from code written in
SQL.</p>
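<p>For a dashboard like mine, the JSON output is the useful one. A rough
sketch of a client, assuming the <code>q</code> and <code>output</code>
query parameters and that the response is a JSON array; the helper names
<code>build_url</code> and <code>fetch_certs</code> are mine, not part of
crt.sh:</p>

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://crt.sh/"


def build_url(domain, output="json"):
    """Build a crt.sh search URL for a domain (assumed q/output parameters)."""
    return BASE_URL + "?" + urlencode({"q": domain, "output": output})


def fetch_certs(domain, timeout=60):
    """Fetch and decode the JSON results for a domain.

    crt.sh can be slow, so the timeout is generous; callers should still
    expect timeouts and HTTP errors.
    """
    with urlopen(build_url(domain), timeout=timeout) as response:
        return json.load(response)
```

<p>Only <code>build_url</code> is pure; <code>fetch_certs</code> hits the
network and will raise on the occasional gateway timeout.</p>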
<p>And then I guess it’s behind an nginx-based load-balancer or somesuch
(based on the 504 Gateway Timeout messages it’s given me). But that’s not
interesting.</p>
<p>The actual website is <a
href="https://groups.google.com/d/msg/mozilla.dev.security.policy/EPv_u9V06n0/gPJY5T7ILlQJ">run
from a read-only slave</a> of the master DB that the
<code>ct_monitor</code> cron job updates, which makes several security
considerations go away and makes horizontal scaling easy.</p>
<p>Anyway, I thought it was neat that so much of it runs inside the
database; you don’t see that terribly often. I also thought the little
shims to make that possible were neat. I didn’t get deep enough into it
to end up running my own instance or clone, but I thought my notes on it
were worth sharing.</p>

</article>
<footer>
  <aside class="sponsor"><p>I'd love it if you <a class="em"
      href="/sponsor/">sponsored me</a>.  It will allow me to continue
      my work on the GNU/Linux ecosystem.  Thanks!</p></aside>

<p>The content of this page is Copyright © 2018 <a href="mailto:lukeshu@lukeshu.com">Luke T. Shumaker</a>.</p>
<p>This page is licensed under the <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a> license.</p>
</footer>
</body>
</html>