
James & Gerhard build Jerod's Pipe Dream™️. Adam helps. #518

Open · wants to merge 6 commits into base: master

Conversation

@gerhard (Member) commented Jun 21, 2024

How it all started

We started on this crazy journey many years ago: Kaizen 1 - The day half the internet went down

You've heard us talk about CDNs a lot in recent Kaizen episodes.

Having tried all sorts of things, we finally did it. @jerodsanto's pipe dream is real:

[Screenshot 2024-06-21 at 19:27:04]

This pull request is the first concrete step towards building our simple, wet wipe, CDN. If you want the context behind this, you must listen to Kaizen 15 when it ships as 🎧 https://changelog.com/friends/50. In the meantime, you should probably watch this to catch up on what happened between Kaizen 13-14 & Kaizen 15 - @jamesarosen is awesome 🙌

🎬 Let's Build a CDN - Part 1

[Video thumbnail: lets-build-a-cdn-v6]

Where are we today?

Back to this PR, there are a few important elements that are worth calling out:

  1. fly.io/cdn-2024-01-26/default.vcl is where all the Varnish magic happens. The comments (especially those at the bottom) are super helpful for context.
  2. fly.io/cdn-2024-01-26/fly.toml is the Fly.io app config.
  3. fly.io/cdn-2024-01-26/run is a super duper™️ amazing script that made both @adamstac & @jerodsanto chuckle. You should definitely try running it and see what happens. Attaching a few screenshots that show what to expect:
[Screenshot of the ./run output]


Sub-commands like deploy, world-scale & small-scale are not going to work for you. It's OK to skip them. The most interesting command is ./run demo-2024-06-21. It has 7 steps & this is what each step looks like:

[Screenshots of each of the 7 demo steps]

While the above screenshots give you a good idea of what is going on in the background, for the best experience, you will want to listen to 🎧 https://changelog.com/friends/50. We captured all that fun, joy & excitement in a way that only a conversation between good friends is able to 😁 🤣 😆
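If you want to try it yourself, the invocation is roughly this (a hedged sketch; the path and sub-command names come from the list above, and I'm assuming the script prints its sub-commands when run without arguments):

# from the repository root
cd fly.io/cdn-2024-01-26

# list the available sub-commands (skip deploy, world-scale & small-scale)
./run

# walk through the 7-step demo described above
./run demo-2024-06-21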

Note

There is now a video available for the actual demo: 🎬 Gerhard shows Adam & Jerod The Pipe Dream™️

[Screenshot 2024-08-02 at 08:03:51]

What happens next?

We still have a bunch of challenges ahead of us. See fly.io/cdn-2024-01-26/default.vcl for more details. I intend to make progress with them as time permits.

If there are any tips that you can share, we look forward to them! Drop them as a comment below, or join us in Changelog's Slack if you prefer it cosier: https://changelog.slack.com/archives/C024Q4CER

@jamesarosen is exploring a different direction which I am super excited about. We intend to get together in a few weeks to record together and have some more fun with this. By then, https://changelog.com/friends/50 will have shipped, so that is definitely the next milestone.

Things that we want to do as part of this PR

Follow-ups to this PR

  • Split into a separate repository - thechangelog/pipedream sounds amazing @jerodsanto 💪

Kaizen!

gerhard and others added 5 commits June 21, 2024 19:15
Signed-off-by: Gerhard Lazu <[email protected]>
We don't want it to connect to the Fly.io Proxy, go to the edge of the
network, pay the HTTPS termination penalty, and then come back to the
app instance. Stay within the Fly.io internal network & connect to the
app port directly instead 🤦

Signed-off-by: Gerhard Lazu <[email protected]>
Run instances across 4 regions
Use up to 2GB of RAM (single shared CPU is enough)

👌

Signed-off-by: Gerhard Lazu <[email protected]>
Based on the metrics from the last 12h.

Also update the TODO to reflect current perspective.

Signed-off-by: Gerhard Lazu <[email protected]>
# - https://abhishekjakhotiya.medium.com/magento-internals-cache-purging-and-cache-tags-bf7772e60797

backend default {
.host = "top1.nearest.of.changelog-2024-01-12.internal";


You might want to play with DNS anycast by trying top2.nearest, especially if you have more than 3-4 changelog.internal origin servers.

Alternatively, do you think it's worth experimenting with Varnish-level shielding? Cache nodes in low-traffic areas will have low cache-hit ratios.

I wonder if there would be a way to use Fly's DNS to specify that the cache servers in the same datacenters as the origin servers are shields. 🤔

@gerhard (Member, Author) replied:

We started with <appname>.internal which resulted in Varnish errors since .host is not able to handle multiple values. FTR, this is what that returns:

host changelog-2024-01-12.internal
changelog-2024-01-12.internal has IPv6 address fdaa:0:4556:a7b:2f7:4d32:8516:2
changelog-2024-01-12.internal has IPv6 address fdaa:0:4556:a7b:2f7:89b9:c9b8:2

top2.nearest will behave the same since we are running 2 instances in a single region:

host top2.nearest.of.changelog-2024-01-12.internal
top2.nearest.of.changelog-2024-01-12.internal has IPv6 address fdaa:0:4556:a7b:2f7:4d32:8516:2
top2.nearest.of.changelog-2024-01-12.internal has IPv6 address fdaa:0:4556:a7b:2f7:89b9:c9b8:2

Do you think that we should use something other than .host in Varnish?

I really like your idea of having a shield Varnish node in the same region as the origins, and then pointing all other Varnish nodes to that shield. Provided that we can figure out how to template the Varnish config at instance boot time, we could do the following (a rough sketch follows the list):

  • if $FLY_REGION == $PRIMARY_REGION (both are environment variables available on every Fly.io machine), set the backend .hosts to the values from changelog-2024-01-12.internal (we would need to figure out how to handle more than 1 value)
  • if $FLY_REGION != $PRIMARY_REGION set the backend .hosts to the Varnish shield region, i.e. $PRIMARY_REGION.$FLY_APP_NAME.internal (this assumes that we want to run more than one shield instance in the same region for redundancy purposes)
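A minimal sketch of what that boot-time templating could look like (hedged: the template file, the __BACKEND_HOST__ placeholder and the varnishd flags are illustrative and not part of this PR; only FLY_REGION, PRIMARY_REGION and FLY_APP_NAME are real Fly.io machine variables):

#!/usr/bin/env sh
# Hypothetical entrypoint that renders default.vcl from a template at boot
set -e

if [ "$FLY_REGION" = "$PRIMARY_REGION" ]; then
  # Shield instances talk to the origin app directly
  BACKEND_HOST="top1.nearest.of.changelog-2024-01-12.internal"
else
  # Edge instances talk to the shield instance(s) in the primary region
  BACKEND_HOST="$PRIMARY_REGION.$FLY_APP_NAME.internal"
fi

sed "s/__BACKEND_HOST__/$BACKEND_HOST/" /etc/varnish/default.vcl.tmpl \
  > /etc/varnish/default.vcl

exec varnishd -F -f /etc/varnish/default.vcl -a :9000 -s malloc,"${VARNISH_SIZE:-500M}"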

WDYT?


Fly.io .internal DNS

@jamesarosen commented Jun 24, 2024:

Do you think that we should use something other than .host in Varnish?

Based on my (possibly outdated) understanding, yes. .host will limit you to a single IP address. If you want to use a CNAME to support anycast, you'll want a dynamic backend: vmod-dynamic or vmod-reqwest

The alternative, I suppose, would be to resolve the anycast address at build, then create a .host for each IP it returns. But then you'd have to recompile the VCL regularly. If dynamic isn't an option, I suppose we could put a daemon on the containers that polls top2.nearest.of and compiles the VCL every minute, then tells Varnish to reload the config.

Varnish resolves backend IP only at launch time (and how to fix it) is also worth reading
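For illustration, a minimal vmod-dynamic sketch, assuming the module is available in the image (the director name origin is made up, and the port is a placeholder for the app's internal port):

import dynamic;

sub vcl_init {
  # Re-resolve the backend hostname at runtime instead of once at VCL load
  new origin = dynamic.director(port = "4000", ttl = 10s);
}

sub vcl_recv {
  set req.backend_hint = origin.backend("top1.nearest.of.changelog-2024-01-12.internal");
}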

Comment on lines +73 to +90
set resp.http.x-ttl = obj.ttl;
# What is the max object staleness permitted?
set resp.http.x-grace = obj.grace;

# Did the response come from Varnish or from the backend?
if (obj.hits > 0) {
  set resp.http.x-cache = "HIT";
} else {
  set resp.http.x-cache = "MISS";
}

# Is this object stale?
if (obj.ttl < std.duration(integer=0)) {
  set resp.http.x-cache = "STALE";
}

# How many times has this response been served from Varnish?
set resp.http.x-cache-hits = obj.hits;


It's perfectly fine to use the x- prefix, but this practice was deprecated in 2012 (RFC 6648).

@gerhard (Member, Author) replied Jun 22, 2024:

That is good to know!

Having looked at the IANA Hypertext Transfer Protocol (HTTP) Field Name Registry, the header which seems closest to what we want is Cache-Status. I have noticed that it does not mention anything about hits or grace.

Does the following Cache-Status header example look good to you @jamesarosen?

Cache-Status: Shield; hit; hits=101; ttl=53.00; grace=86400.000, Edge; stale; hits=2; ttl=-123.00; grace=3600.000

For reference, this is how we are currently capturing the above info - we don't yet have Shield, only Edge:

x-cache: HIT
x-cache-hits: 1
x-ttl: 55.312
x-grace: 86400.000
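A possible vcl_deliver sketch for emitting that header (hedged: "Edge" is just a label, and hits=/stale/grace= are extension parameters rather than ones defined by RFC 9211, which only specifies hit, fwd, fwd-status, ttl, stored, collapsed, key and detail):

sub vcl_deliver {
  if (obj.hits > 0) {
    set resp.http.Cache-Status = "Edge; hit; hits=" + obj.hits + "; ttl=" + obj.ttl + "; grace=" + obj.grace;
  } else {
    set resp.http.Cache-Status = "Edge; fwd=miss";
  }
  # Mark stale objects, mirroring the x-cache STALE case above
  if (obj.ttl < std.duration(integer=0)) {
    set resp.http.Cache-Status = resp.http.Cache-Status + "; stale";
  }
}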

Comment on lines +103 to +105
# - If the backend gets restarted (e.g. new deploy), backend remains sick in Varnish
# - https://info.varnish-software.com/blog/two-minute-tech-tuesdays-backend-health
# - EXPLORE: varnishlog -g raw -i backend_health
@gerhard (Member, Author) commented Jun 22, 2024:

@jamesarosen this is by far the most problematic part of this config. If the app gets re-deployed, the backend remains sick. This is what I am seeing in the varnish logs:

[Screenshot of the Varnish logs, 2024-06-20 at 07:09:18]

My interpretation of the above is that Varnish does not try to re-establish the TCP connection to the backend once this goes away. WDYT?

The only fix that I have so far is to restart all Varnish instances, which forces the TCP connection to get re-established. It's basically this:

flyctl machines list | awk '/cpu/ { system("flyctl machines restart " $1) }'

@jamesarosen replied Jun 22, 2024:

I have two hypotheses:

Hypothesis One: It Takes Longer Than You Think

This VCL config in your probe:

.interval = 5s;
.window = 10;
.threshold = 5;

Means that the backend will only be marked healthy when 5 (threshold) of the last 10 (window) probes have returned a 200 OK. Once a backend becomes unhealthy, it will take up to 30s ((5 + 1) * 5s) to recover.

Try waiting longer. If that works, the solution is probably to do rolling restarts and use Varnish to load-balance. Given the way Fly's 6PN addressing works, I don't think you can be sure that top1.nearest.of wouldn't resolve to the same thing as any other given host. Thus, I think you need not 2 but 3 backends:

  1. top1.nearest.of -- the primary backend
  2. iad (or any other fixed value) -- failover 1
  3. sjc (or any other fixed value) -- failover 2

You would set up probes (healthchecks) for all three. Then you can check the health status in vcl_recv:

set req.backend_hint = primary;
if (!std.healthy(req.backend_hint)) {
  set req.backend_hint = failover1;
}
if (!std.healthy(req.backend_hint)) {
  set req.backend_hint = failover2;
}

Or instead of implementing this yourself, you could use a fallback director.
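For reference, the fallback-director version of the same idea might look roughly like this (hedged: primary, failover1 and failover2 are the hypothetical backends from the list above):

import directors;

sub vcl_init {
  new fb = directors.fallback();
  # Tried in order; the first healthy backend wins
  fb.add_backend(primary);
  fb.add_backend(failover1);
  fb.add_backend(failover2);
}

sub vcl_recv {
  set req.backend_hint = fb.backend();
}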

Hypothesis Two: You're Not Getting 200s

Is it possible that the /health endpoint is returning a 302 or 307? Perhaps it's redirecting from http:// to https://? What do the origin logs say?


If you have the Varnish container running and can exec into it to run varnishlog, you can see what kind of responses Varnish is getting from your backends, which may help with troubleshooting any sick backends.

I cloned the repo and built the container on my local machine to have a running Varnish container for demonstrating the varnishlog command. I had to adjust the VCL and set the backend to target pipedream.changelog.com, since it was complaining about not being able to resolve top1.nearest.of.changelog-2024-01-12.internal from my machine. Even though pipedream.changelog.com was only giving me 503 responses, it still demonstrates the use of varnishlog just fine.
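Roughly, the local setup looked like this (a hedged sketch: the Dockerfile location and image tag are assumptions; the port comes from the curl example below):

# build the Varnish image from the PR's directory and run it locally
docker build -t pipedream-varnish fly.io/cdn-2024-01-26
docker run --rm -p 9000:9000 pipedream-varnish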

Once I had it running, I fired off a request through it from a different shell (curl -sv http://127.0.0.1:9000/) to get some log details.

varnish@2a59337f72cd:/etc/varnish$ varnishlog
*   << BeReq    >> 3         
-   Begin          bereq 2 fetch
-   VCL_use        boot
-   Timestamp      Start: 1719172226.180749 0.000000 0.000000
-   BereqMethod    GET
-   BereqURL       /
-   BereqProtocol  HTTP/1.1
-   BereqHeader    Host: 127.0.0.1:9000
-   BereqHeader    User-Agent: curl/8.8.0
-   BereqHeader    Accept: */*
-   BereqHeader    X-Forwarded-For: 192.168.65.1
-   BereqHeader    Via: 1.1 2a59337f72cd (Varnish/7.4)
-   BereqHeader    Accept-Encoding: gzip
-   BereqHeader    X-Varnish: 3
-   VCL_call       BACKEND_FETCH
-   VCL_return     fetch
-   Timestamp      Fetch: 1719172226.180792 0.000042 0.000042
-   FetchError     backend default: unhealthy
-   Timestamp      Beresp: 1719172226.180795 0.000046 0.000003
-   Timestamp      Error: 1719172226.180796 0.000046 0.000000
-   BerespProtocol HTTP/1.1
-   BerespStatus   503
-   BerespReason   Backend fetch failed
-   BerespHeader   Date: Sun, 23 Jun 2024 19:50:26 GMT
-   BerespHeader   Server: Varnish
-   VCL_call       BACKEND_ERROR
-   BerespHeader   Content-Type: text/html; charset=utf-8
-   BerespHeader   Retry-After: 5
-   VCL_return     deliver
-   Storage        malloc Transient
-   Length         278
-   BereqAcct      0 0 0 0 0 0
-   End            

*   << Request  >> 2         
-   Begin          req 1 rxreq
-   Timestamp      Start: 1719172226.180555 0.000000 0.000000
-   Timestamp      Req: 1719172226.180555 0.000000 0.000000
-   VCL_use        boot
-   ReqStart       192.168.65.1 57563 http
-   ReqMethod      GET
-   ReqURL         /
-   ReqProtocol    HTTP/1.1
-   ReqHeader      Host: 127.0.0.1:9000
-   ReqHeader      User-Agent: curl/8.8.0
-   ReqHeader      Accept: */*
-   ReqHeader      X-Forwarded-For: 192.168.65.1
-   ReqHeader      Via: 1.1 2a59337f72cd (Varnish/7.4)
-   VCL_call       RECV
-   VCL_return     hash
-   VCL_call       HASH
-   VCL_return     lookup
-   VCL_call       MISS
-   VCL_return     fetch
-   Link           bereq 3 fetch
-   Timestamp      Fetch: 1719172226.181146 0.000590 0.000590
-   RespProtocol   HTTP/1.1
-   RespStatus     503
-   RespReason     Backend fetch failed
-   RespHeader     Date: Sun, 23 Jun 2024 19:50:26 GMT
-   RespHeader     Server: Varnish
-   RespHeader     Content-Type: text/html; charset=utf-8
-   RespHeader     Retry-After: 5
-   RespHeader     X-Varnish: 2
-   RespHeader     Age: 0
-   RespHeader     Via: 1.1 2a59337f72cd (Varnish/7.4)
-   VCL_call       DELIVER
-   RespHeader     x-ttl: -0.000
-   RespHeader     x-grace: 0.000
-   RespHeader     x-cache: MISS
-   RespUnset      x-cache: MISS
-   RespHeader     x-cache: STALE
-   RespHeader     x-cache-hits: 0
-   VCL_return     deliver
-   Timestamp      Process: 1719172226.181170 0.000614 0.000024
-   Filters        
-   RespHeader     Content-Length: 278
-   RespHeader     Connection: keep-alive
-   Timestamp      Resp: 1719172226.181219 0.000663 0.000048
-   ReqAcct        77 0 77 315 278 593
-   End            

*   << Session  >> 1         
-   Begin          sess 0 HTTP/1
-   SessOpen       192.168.65.1 57563 http 172.17.0.2 9000 1719172226.180510 24
-   Link           req 2 rxreq
-   SessClose      REM_CLOSE 0.001
-   End            

The way this log works is that you get multiple pieces: the Session, the BeReq, and the Request. They are listed in the order that things finish, so you have to hunt around a bit for where things started and piece together the chain of events.

  1. The Session has an identifier of 1 and is the starting point; from there you can see a Link reference creating req 2.
  2. This is where Request 2 begins, and Begin req 1 rxreq records that it was called by id 1. Working through a few lines of the Request you'll see another Link pointing to bereq 3 fetch, which is where it makes a backend request.
  3. This is where BeReq starts; looking up at BeReq, Begin bereq 2 fetch records that it was called from id 2. You can see the pieces of the request BeReq received, the call to the backend, timestamps for the various events and the details of the backend response, and then BeReq finishes, which is why it appears first in the log.
  4. Next comes the continuation of Request 2, which receives the response, lists all the details it got back from BeReq, and then finishes.
  5. Finally you are back at the Session, which finishes.

If you have a Varnish instance with a bunch of activity, it's useful to use some command-line options for varnishlog to filter the output down to just the log events you want. A nice reference for the command-line options is here: https://www.varnish-software.com/developers/tutorials/varnishlog/

I had to go reference an old gist I made back when I was doing this more regularly. The -q option is useful to filter the logs, for example to only show requests coming from your own IP:

varnishlog -q 'ReqHeader:X-Forwarded-For eq "192.168.65.1" or BereqHeader:X-Forwarded-For eq "192.168.65.1"'

...this would only end up showing the Request and BeReq events from the log, since the Session wouldn't have these headers.

varnishlog -q 'ReqURL eq "/friends/50" or BereqURL eq "/friends/50"'

...this would only show Request and BeReq events to a specific URL.


I have another hypothesis: Fly is assigning a new IP address during the deploy. If you give a DNS name for a .host, Varnish only resolves that once, at launch. If Fly assigns a new IP, you'll be hammering someone else's service with healthcheck requests 😓


I recommend upvoting this Fly.io feature request to support SRV records. libvmod works with getdns to support SRV records. That would mean you wouldn't have to specify the port here. It would pick the port up from DNS.

Comment on lines +3 to +5
# Closest to James
primary_region = "sjc"
# Secondary region will be "lhr", closest to Gerhard
@gerhard (Member, Author):

Note to self: change this to iad so that it's in the same region as the origin.

Stop cache insertion when a backend fetch returns a 5xx error.

Add new varnish_status synthetic response for the health-checks.

Make it:
- World Scale™
- Slow Scale™
- Adam Scale™
- Jerod Scale™
- Gerhard Scale™

Convert `demo` to `run` and teach it a few new tricks.

Signed-off-by: Gerhard Lazu <[email protected]>
@gerhard force-pushed the james-and-gerhard-build-jerods-pipedream-adam-helps branch from eff9bcb to a44e0cb on June 25, 2024 at 18:21
kill_timeout = 30

[env]
VARNISH_SIZE="500M"


I think this VARNISH_SIZE="500M" environment variable tells Varnish that it can use 500MB of RAM for its cache, but down below you have the VM set to only have 256MB of RAM, so this may result in Varnish trying to use more memory than the system has available.
https://www.varnish-software.com/developers/tutorials/running-varnish-docker/#setting-the-cache-size
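If that's the case, one way to keep the two settings in step might be something like this (a hedged sketch; section and key names follow standard fly.toml conventions, and the values are illustrative):

[env]
  # keep the cache allocation well below the VM's memory,
  # leaving headroom for the varnishd processes themselves
  VARNISH_SIZE = "150M"

[[vm]]
  memory = "256mb"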

@mttjohnson commented:

I forked the repo and picked up the branch (https://github.com/mttjohnson/changelog.com/commits/james-and-gerhard-build-jerods-pipedream-adam-helps/) to experiment with and test a few Varnish things in a local Docker container.

Aside from the commits where I experimented with VCL modifications I also tried to keep notes for commands I was running, some of the results, thoughts, and resources I found along the way.

https://github.com/mttjohnson/changelog.com/blob/james-and-gerhard-build-jerods-pipedream-adam-helps/fly.io/cdn-2024-01-26/VARNISH_TESTING.md

I'm hoping that what I've explored and documented so far will provide some insight that could resolve the sick-backends issue you've been seeing, and a starting point for purging/banning content from the Varnish cache.
