New Domain Parking / Set-Up Notes

Steps to “park” a new domain with email and HTTP service. Total cost is ~$12/year assuming you already have a web server set up.

Domain Registration and DNS

  • Register domain with Amazon Route53 ($12/year for .com)
    • Delete the public “Hosted Zone” ($6/year) since CloudFlare will be used for hosting DNS
    • No Route53 Hosted Zone is necessary, unless you want to run a VPC with its own private view of the domain, in which case there needs to be a private Hosted Zone.
  • Create CloudFlare free-tier account for DNS hosting
    • Change Amazon Route53 DNS server settings over to CloudFlare
    • CloudFlare settings that you might want to adjust:
      • Crypto/SSL Policy: see below
      • Always Use HTTPS: On (unless you need fine-grained control over HTTP→HTTPS redirection)


Assume you have a web server that will respond to HTTP requests on the new domain.

  • Option 1: Direct Connection (CloudFlare ingress and SSL termination, but no SSL to the origin)
    • Use a single-host A/CNAME record in CloudFlare
    • CloudFlare will handle SSL termination, but must be used in “Flexible” crypto mode which reverts to HTTP when talking to the origin server.
  • Option 2a: Proper SSL Setup with AWS load balancer (~$240/year) and its built-in certificate
    • Create an EC2 load balancer with a certificate appropriate for the domain
    • Use a CNAME record in CloudFlare pointing to the load balancer’s DNS name
    • Now you can enable CloudFlare’s “Full” crypto mode
  • Option 2b: Proper SSL Setup with Let’s Encrypt (free)
    • TBD – needs some kind of containerized HTTP server that updates the certificate automatically

Email Forwarding

It is important to be able to receive email addressed to, for example to respond to verification emails for future domain transfers or SSL certificate issuance.

Email forwarding can be set up for free using Mailgun:

  • Create Mailgun free-tier account on the top-level domain
  • Add the necessary DNS records for Mailgun at CloudFlare (domainkey and MX servers)
  • In Mailgun’s “Routes” panel, create a rule that matches incoming email to and forwards it as necessary

Email Reception

If you actually want to receive (not just forward) incoming email, either use Gmail on the domain, or the following (nearly-free) AWS system:

  • In Amazon SES, add and verify the domain
    • This will require adding a few more records at CloudFlare, including MX records
  • Set up an SES rule to accept incoming email and store messages in S3
  • Use a script like this one to poll S3 for new messages and deliver them via procmail

Cloudflare Cookie Caching Confusion

Recently I discovered a surprising behavior of Cloudflare that affects sites that use HTTP cookies to hold user access tokens:

CloudFlare’s non-Enterprise service tiers do not respect the “Vary: Cookie” HTTP header.

This impacts sites that:

  • Are relying on CloudFlare’s “Cache Everything” option
  • Expose both “logged in” and “not logged in” versions of the same URL, where the page content differs for logged-in and anonymous visitors
  • Use cookies to differentiate between logged-in and anonymous visitors

Under these conditions, a logged-in user who visits a URL that is also visible to anonymous traffic will sometimes be served the “not logged in” version of a page.

This is confusing because the user will not see their normal posting controls, and they may be asked to log in again unnecessarily.

This problem won’t affect a site where logged-in users always see different URLs, for example a banking or webmail service where anonymous visitors will never see the same pages as logged-in users.

The impact happens on sites like discussion forums where anonymous visitors have read-only access to the same pages where logged-in users hang out.

Why this happens

For URLs that are visible (in different forms) to logged-in and anonymous visitors, it is important for your web server to return a Vary: Cookie HTTP header, to tell caches like CloudFlare that a different version of the page will be returned depending on what cookies the browser sends.

Of course, pages visible only to logged-in users should also send a Cache-Control: private header, to prevent their content from being exposed to anyone other than the user who requested them.

CloudFlare is usually very good at obeying the letter of the relevant web standards, but in this case, it will ignore Vary: Cookie. This means that anonymous hits will poison the cache for logged-in users. CloudFlare will serve up the anonymous version of the page even though the browser is sending a valid logged-in cookie.


One cannot rely on Cache-Control: public/private alone from fixing the problem, because both anonymous and logged-in hits will have the exact same cache key. CloudFlare properly declines to cache responses marked Cache-Control: private, but as soon as it sees a Cache-Control: public response, it will be used to fulfill subsequent hits on the same key.

Proper fixes include:

  • Use a Page Rule to disable caching on the affected URLs. Of course, this means you do not benefit from the cache anymore.
  • Find a way to ensure logged-in users always view different URLs than anonymous traffic.
  • Upgrade to CloudFlare’s Enterprise service tier.

Cost Disease

From the article (not my own words):

“If some government program found a way to give poor people good health insurance for a few hundred dollars a year, college tuition for about a thousand, and housing for only two-thirds what it costs now, that would be the greatest anti-poverty advance in history. That program is called ‘having things be as efficient as they were a few decades ago’.”

Read more at Philip Greenspun’s blog:

The original source is here:

Crude Linear Models Almost Always Outperform Human Judgment

Fascinating article – from 1979 – on how even a crudely-constructed linear model is almost always superior to human expert judgment on classification tasks:

Robyn M. Dawes, “The Robust Beauty of Improper Linear Models in Decision Making”

Is there still any role for people in decisionmaking?

But people are important. The statistical model may integrate the information in an optimal manner, but it is always the individual (judge, clinician, subjects) who chooses variables. Moreover, it is the human judge who knows the directional relationship between the predictor variables and the criterion of interest, or who can code the variables in such a way that they have clear directional relationships.

Now consider how good machine learning and AI algorithms have become at these parts of the task! Soon it might become optimal to delegate the whole process to machines.