Additional Context: Serverless is a vast area. This post specifically evaluates whether frequently pinging one or more URLs is a reliable way to mitigate cold start delays for websites hosted on Node-based serverless platforms like Vercel or Netlify.
Our team wanted to evaluate whether our Next.js website could be served from one of the available serverless platforms. Vercel and Netlify are two popular platforms that offer great DX and could serve our Next.js frontend without requiring any platform-specific modifications to our code.
Our Next.js frontend serves a few statically generated routes and a few server-side generated routes. It also serves around a dozen API routes.
I was tasked with determining the speed and scalability impact of serving our website via a serverless platform like Vercel or Netlify.
Given the speed-sensitive nature of our website (eCommerce), one of my primary concerns was whether cold start delays could affect our site speed (specifically, the server-response time, or time to first byte).
I had read that cold starts could be prevented by frequently pinging one or more website urls to keep our serverless functions warm. So, I designed an experiment to evaluate if we could mitigate cold start delays when deploying our website on Vercel or Netlify.
I ran the original experiment with our actual website frontend code (my client team’s private repo). Since I cannot share that code, I re-ran the same experiment with a modified version of the Next.js commerce example for the purpose of this blog post.
The following are the relevant changes made to the forked version of the Next.js commerce example code:
Added formik to some server-side rendered routes (but not all of them).
Added axios to one of them but not to another.
The repo was deployed on both Vercel and Netlify (free tier plan).
I set up a mechanism so that the server response would send back a header whose value (true or false) tells us whether a given request was served by a cold-started runtime or an already running one.
Using the HTTP response header to identify cold starts
To determine the correct value for this response header, a global in-memory object is maintained whose value can be set and read across requests. By accessing this global object, a request can determine whether it is the first request to be served by the current runtime or not. Here and here are the relevant parts for this logic.
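The idea can be sketched as follows. This is a minimal illustration rather than the code used in the experiment: the header name `x-cold-start` and the function names are assumptions made for the example, and it relies on module-level state surviving for as long as a runtime instance stays alive.

```typescript
// In-memory state shared by all requests handled by this runtime instance.
// It is re-created only when the platform boots a fresh (cold) instance.
const runtimeState = { servedFirstRequest: false };

// Hypothetical header name; the real one is not stated in this post.
const COLD_START_HEADER = "x-cold-start";

// Returns true only for the first request handled by this runtime instance.
function isColdStart(): boolean {
  const cold = !runtimeState.servedFirstRequest;
  runtimeState.servedFirstRequest = true;
  return cold;
}

// Builds the header to attach to the response of the current request.
function coldStartHeader(): Record<string, string> {
  return { [COLD_START_HEADER]: String(isColdStart()) };
}
```

A route handler would call `coldStartHeader()` once per request and copy the result onto its response, so the first response from a fresh runtime carries `true` and every later one carries `false`.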
I created a simple script that fires axios.get() requests to the six distinct URLs (four server-side rendered routes and two API URLs) of the frontend hosted on the Vercel and Netlify platforms.
Pseudo code for test script that fires the requests
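A rough sketch of such a request loop is shown below. It is not the original script: the header name `x-cold-start`, the function names, and the idle time between iterations are assumptions, and the HTTP call is passed in as a `fetcher` argument (for example, a thin wrapper around axios.get() that returns the response headers) so the loop itself stays independent of any HTTP library.

```typescript
type Probe = { url: string; coldStart: boolean; elapsedMs: number };

// Minimal response shape this sketch relies on.
type HeaderResponse = { headers: Record<string, string | undefined> };
type Fetcher = (url: string) => Promise<HeaderResponse>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hits every URL once, in order, waiting gapMs between consecutive requests.
async function runIteration(
  urls: string[],
  fetcher: Fetcher,
  gapMs: number,
  headerName = "x-cold-start", // hypothetical header name
): Promise<Probe[]> {
  const probes: Probe[] = [];
  let isFirst = true;
  for (const url of urls) {
    if (!isFirst) await sleep(gapMs);
    isFirst = false;
    const startedAt = Date.now();
    const response = await fetcher(url);
    probes.push({
      url,
      coldStart: response.headers[headerName] === "true",
      elapsedMs: Date.now() - startedAt,
    });
  }
  return probes;
}

// Repeats the iteration, idling idleMs in between so runtimes can go cold.
async function runExperiment(
  urls: string[],
  fetcher: Fetcher,
  iterations: number,
  gapMs: number,
  idleMs: number,
): Promise<Probe[][]> {
  const results: Probe[][] = [];
  for (let n = 0; n < iterations; n++) {
    if (n > 0) await sleep(idleMs);
    results.push(await runIteration(urls, fetcher, gapMs));
  }
  return results;
}
```

Each `Probe` records whether the response reported a cold start and how long the request took, which is enough to reproduce the kind of per-request comparison discussed in the results.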
During the 24 iterations, the first request of an iteration was always served by a cold-started runtime. This was expected. The unexpected part was that more than 30% of the subsequent requests were also served by a cold-started runtime. This happened for the frontend code deployed on both serverless platforms, Netlify and Vercel. The two charts below detail this:
As expected, every time a request was served by a cold-started runtime, it slowed the server-response time on both the platforms.
When six distinct URLs of our website are hit sequentially at a gap of 1 minute, I expected the first request to result in a cold start, but not the later requests. To understand this behavior better, I looked into how Vercel and Netlify deploy our website code.
Both Vercel (source) and Netlify (source) leverage AWS Lambda for their serverless offering. From the details here and here, it appears that a single website output bundle is split and deployed across multiple AWS Lambda functions. So, two URLs of a website, https://xyz.com/route_1 and https://xyz.com/route_2, can potentially be served by different lambda functions, and pinging https://xyz.com/route_1 will not keep the lambda for https://xyz.com/route_2 warm. So, unless we track how our website routes are grouped and deployed across different lambdas, we cannot reliably keep all our routes in memory by pinging the URLs.
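To make the reasoning concrete, here is a deliberately simplified model of it. The route-to-function mapping and the names `lambda_a` and `lambda_b` are made up for illustration; the real grouping is decided by the platform at build time and is not something this sketch can discover.

```typescript
// Hypothetical mapping from a route to the function that serves it.
type RouteToFunction = Record<string, string>;

// Returns the routes whose function is not hit by any of the pinged routes.
function routesLeftCold(
  routeToFunction: RouteToFunction,
  pingedRoutes: string[],
): string[] {
  const warmFunctions = new Set<string>();
  for (const route of pingedRoutes) {
    const fn = routeToFunction[route];
    if (fn !== undefined) warmFunctions.add(fn);
  }
  return Object.keys(routeToFunction).filter(
    (route) => !warmFunctions.has(routeToFunction[route]),
  );
}

// Assumed grouping: route_1 shares a function with an API route,
// while route_2 is served by a different function.
const exampleGrouping: RouteToFunction = {
  "/route_1": "lambda_a",
  "/route_2": "lambda_b",
  "/api/cart": "lambda_a",
};

const stillCold = routesLeftCold(exampleGrouping, ["/route_1"]);
```

Under this assumed grouping, pinging only `/route_1` leaves `/route_2` uncovered, because no pinged route maps to `lambda_b`. Without knowing the real grouping, there is no way to tell which set of pings would cover every function.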
Serverless platforms split and deploy our single large output bundle across multiple lambdas because function size affects the cold start times and how long the functions are retained in memory. So, forcing all our routes into a single lambda may introduce other cold start delay issues.
From the experiment results, I concluded that frequently pinging one or more website URLs is not a reliable approach to mitigate cold start delays (also see this). As a result, deploying our website on Vercel, Netlify or a similar Node-based serverless platform may result in slow server-response times for those server-side rendered routes and API endpoints that aren’t requested frequently.
I also arrived at the following conclusions:
Just as we use the cache-hit ratio to determine the effectiveness of a caching solution, any approach to mitigate cold start delays should be tracked via a cold start to warm cache ratio.
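One way such a metric could be computed from logged responses is sketched below. The `Observation` shape and the function name are assumptions, and the sketch reports the share of requests that hit a cold start alongside the raw cold and warm counts, so a cold-to-warm ratio can be derived from the same numbers.

```typescript
type Observation = { platform: string; coldStart: boolean };
type ColdStartStats = { cold: number; warm: number; coldShare: number };

// Groups observations by platform and reports how many were cold or warm,
// plus the fraction of all requests that were served by a cold start.
function coldStartStats(
  observations: Observation[],
): Record<string, ColdStartStats> {
  const stats: Record<string, ColdStartStats> = {};
  for (const { platform, coldStart } of observations) {
    if (stats[platform] === undefined) {
      stats[platform] = { cold: 0, warm: 0, coldShare: 0 };
    }
    const entry = stats[platform];
    if (coldStart) entry.cold += 1;
    else entry.warm += 1;
    entry.coldShare = entry.cold / (entry.cold + entry.warm);
  }
  return stats;
}
```

Tracked over time, a rising cold share for a route would signal that whatever warming approach is in place has stopped covering it.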