What Are Cold Starts and Why Do They Happen?

A cold start occurs when a serverless function receives a request but no warm instance is available to handle it. The platform must provision a new execution environment from scratch: allocating a container, loading the runtime, parsing your code, and initializing your application before it can send the first byte of the response.

On Vercel, every API route and every server-rendered page runs as a serverless function under the hood; middleware runs as a function too, by default on the lighter Edge runtime. When traffic is steady, Vercel keeps these functions “warm”: the container stays alive and subsequent requests are fast. But when a function hasn't been invoked recently (typically after 5–15 minutes of inactivity, though the exact window is not guaranteed), Vercel tears down the container to save resources.

The next request hitting that function triggers a cold start. The delay depends on several factors: the size of your function bundle, the runtime (Node.js vs. Edge), the number of dependencies being imported, and whether you're connecting to external services like databases during initialization. In practice, cold starts on Vercel range from under 300ms for lightweight Edge functions to 3–5 seconds for heavy Node.js functions with large dependency trees.

How Cold Starts Affect User Experience

Cold starts don't just feel slow — they directly impact the performance metrics that Google uses for search rankings and that your users feel in their bones.

Time to First Byte (TTFB)

TTFB measures how long the browser waits before receiving the first byte from the server. A Node.js cold start can add 1–5 seconds to this metric. Google recommends a TTFB of 800ms or less for a “good” experience and rates anything above 1,800ms as “poor”. A cold-starting serverless function can easily exceed 2,000ms, which puts the page in the “poor” category.

Largest Contentful Paint (LCP)

LCP is a Core Web Vital that measures when the largest visible element finishes rendering. For server-rendered pages, a slow TTFB directly delays LCP. If your page's server response is delayed by a 3-second cold start, your LCP will be at least 3 seconds plus the time to download, parse, and render the HTML. Google's threshold for “good” LCP is 2.5 seconds — a cold start alone can exceed that.

Bounce Rate & Conversion

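Commonly cited industry figures suggest that each additional second of load time raises bounce rate by roughly 7–11%, and that for e-commerce a 1-second delay can reduce conversions by up to 7%. The exact numbers vary by site and audience, but the direction is consistent. Cold starts also tend to hit the first request after an idle period, which is often a first-time visitor: exactly the user you most need to impress.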

Measuring Cold Starts on Vercel

Before you can fix cold starts, you need to identify which functions are affected and how badly. Here are three practical approaches:

1. Vercel Function Logs

In your Vercel dashboard, navigate to your project's “Logs” tab and filter by serverless functions. Cold starts appear as requests with significantly higher duration than the median. Look for the Init Duration field — this shows the time spent bootstrapping the function. If a function's total duration is 2,500ms but the init duration is 2,100ms, you know the cold start is the bottleneck.
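As a rough sketch of that comparison, the helper below computes what share of a logged request was spent in initialization. The function names and the 50% threshold are illustrative assumptions, not Vercel fields or defaults.

```typescript
// Hypothetical helper: what fraction of a logged request was init time?
export function initShare(totalMs: number, initMs: number): number {
  if (totalMs <= 0) return 0;
  return initMs / totalMs;
}

// Assumed rule of thumb: treat the cold start as the bottleneck when
// initialization accounts for more than half of the total duration.
export function isColdStartBound(
  totalMs: number,
  initMs: number,
  threshold = 0.5,
): boolean {
  return initShare(totalMs, initMs) > threshold;
}

// The example from the text: 2,100ms of a 2,500ms request is init time.
console.log(`${Math.round(initShare(2500, 2100) * 100)}% init`); // prints "84% init"
```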

2. Custom Timing Headers

Add a module-level variable that tracks whether the function instance is new. Declare let isCold = true at the top of the module; module-level code runs once per instance, so the flag is true only for the first invocation. In your handler, check isCold, set an x-cold-start: true response header if it is still set, then flip it to false. This lets you filter cold-start requests in your monitoring tools.
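A minimal sketch of this pattern might look like the following. It assumes a Node.js-style response object with a setHeader method (as in Next.js API routes); the HeaderSink type and tagColdStart name are made up for illustration, and this variant also writes x-cold-start: false on warm requests so both cases are easy to filter.

```typescript
// Module scope runs once per function instance, so this flag is true
// only until the first request on this instance has been tagged.
let isCold = true;

// Minimal structural type (an assumption) so the sketch needs no framework import.
type HeaderSink = { setHeader(name: string, value: string): unknown };

// Hypothetical helper: tag the response and report whether this was a cold start.
export function tagColdStart(res: HeaderSink): boolean {
  const wasCold = isCold;
  isCold = false;
  res.setHeader("x-cold-start", String(wasCold));
  return wasCold;
}

// Usage inside a handler (sketch):
//   export default function handler(req, res) {
//     tagColdStart(res);
//     res.status(200).json({ ok: true });
//   }
```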

3. Real User Monitoring (RUM)

Tools like Vercel Analytics, Datadog, or the web-vitals library can capture TTFB distributions across real users. Look at the p95 and p99 TTFB values: on a serverless deployment, these outliers are often cold starts. If your median TTFB is 200ms but p99 is 3,500ms, you likely have a cold start problem.
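If you export raw TTFB samples from your monitoring tool, the percentiles can be computed with a few lines. The sketch below uses the nearest-rank method, which is one of several percentile definitions; the sample data is invented to mirror the numbers above.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
export function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index];
}

// Invented data: 98 warm requests at 200ms and 2 cold starts at 3,500ms.
const ttfbMs: number[] = [
  ...Array.from({ length: 98 }, () => 200),
  3500,
  3500,
];

console.log(percentile(ttfbMs, 50)); // 200
console.log(percentile(ttfbMs, 99)); // 3500
```

A large gap between p50 and p99, as in this invented data set, is the signature to look for.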

5 Ways to Reduce Cold Starts on Vercel

While you can't eliminate cold starts entirely on a serverless platform, these strategies can significantly reduce their frequency and duration:

1. Keep-alive pings (cron warming)

The simplest approach: hit your critical endpoints on a schedule to prevent them from going cold. Set up a cron job (using Vercel Cron, GitHub Actions, or an external service like UptimeRobot) to ping your most important API routes every 5–10 minutes.

Limitations: This only works for a small number of endpoints. If your app has 50 API routes, pinging each one every 5–10 minutes means 300–600 pings per hour (50 routes × 6–12 pings each), which eats into your serverless invocation quota and adds cost. It also only keeps one instance warm per region; concurrent requests can still trigger cold starts.
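A warming job can be as simple as the sketch below, which you could call from whatever scheduler you use. The function names, the FetchLike type, and the example URL are assumptions; the fetch implementation is passed in so the logic can be exercised without real network calls.

```typescript
// Minimal shape of the fetch function this sketch needs (an assumption).
type FetchLike = (
  url: string,
  init?: { method?: string },
) => Promise<{ status: number }>;

// Ping every URL once and record the HTTP status, or "error" on failure.
export async function warmEndpoints(
  urls: readonly string[],
  fetchImpl: FetchLike,
): Promise<Map<string, number | "error">> {
  const results = new Map<string, number | "error">();
  await Promise.all(
    urls.map(async (url) => {
      try {
        const res = await fetchImpl(url, { method: "GET" });
        results.set(url, res.status);
      } catch {
        results.set(url, "error");
      }
    }),
  );
  return results;
}

// Invocation cost of warming: routes × pings per hour per route.
export function pingsPerHour(routes: number, intervalMinutes: number): number {
  return routes * (60 / intervalMinutes);
}

console.log(pingsPerHour(50, 10)); // 300
console.log(pingsPerHour(50, 5)); // 600

// Usage (sketch, hypothetical URL):
//   await warmEndpoints(["https://example.com/api/checkout"], fetch);
```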

2. Use Edge Functions where possible

Vercel Edge Functions run on a lighter-weight runtime (based on V8 isolates rather than full Node.js containers). They cold-start in 50–300ms instead of 1–5 seconds. If your function doesn't need Node.js-specific APIs (like fs, native modules, or certain npm packages), switching to the Edge runtime can dramatically reduce cold start latency.

Limitations: Edge Functions have a smaller API surface. There are no native Node.js modules, the execution window is limited (the response must begin within roughly 25–30 seconds), and the bundle size limit is a few megabytes (up to 4MB, depending on your plan). Many database drivers and ORMs (like Prisma without the edge adapter) don't work on Edge.
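In Next.js, opting a route into the Edge runtime is a one-line configuration change. The sketch below uses the Pages Router form for an API route and relies only on Web-standard Request and Response; the route's behavior and the query parameter are invented for illustration.

```typescript
// Pages Router form: opt this API route into the Edge runtime.
export const config = { runtime: "edge" };

// Edge handlers receive a Web-standard Request and return a Response.
export default function handler(request: Request): Response {
  const name = new URL(request.url).searchParams.get("name") ?? "world";
  return new Response(JSON.stringify({ greeting: `Hello, ${name}` }), {
    status: 200,
    headers: { "content-type": "application/json" },
  });
}
```

In the App Router, the equivalent opt-in is `export const runtime = "edge"` in the route file. Either way, anything the handler imports must also be Edge-compatible.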

3. Reduce your function bundle size

Larger bundles take longer to load during cold starts. Every dependency you import gets loaded on initialization. Audit your imports: do you really need the entire lodash library for one utility function? Use tree-shakeable imports (import groupBy from 'lodash/groupBy'). Check your bundle with @vercel/nft to see exactly what gets included.

Impact: Reducing a function bundle from 10MB to 2MB can cut cold start time by 40–60%. This is often the highest-ROI optimization.
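When a dependency is only there for one small helper, a local implementation is sometimes enough to drop the package altogether. The sketch below is a simplified groupBy that covers the common case of string keys; it is not a full replacement for the lodash function, which accepts more input shapes.

```typescript
// Simplified, dependency-free groupBy for string keys.
export function groupBy<T>(
  items: readonly T[],
  keyOf: (item: T) => string,
): Record<string, T[]> {
  const groups: Record<string, T[]> = {};
  for (const item of items) {
    const key = keyOf(item);
    if (!groups[key]) groups[key] = [];
    groups[key].push(item);
  }
  return groups;
}

// Invented example data: group request logs by HTTP method.
const byMethod = groupBy(
  [
    { method: "GET", path: "/" },
    { method: "POST", path: "/api/orders" },
    { method: "GET", path: "/pricing" },
  ],
  (entry) => entry.method,
);

console.log(Object.keys(byMethod)); // [ 'GET', 'POST' ]
```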

4. Use Incremental Static Regeneration (ISR)

ISR lets you statically generate pages at build time and revalidate them in the background. A page using ISR serves from the CDN cache — no serverless function invocation at all for cached requests. The cold start only happens during background revalidation, which the user never sees.

Limitations: ISR only works for pages that can tolerate slightly stale data. Real-time dashboards, personalized content, or pages requiring authentication can't use ISR effectively. You're also still paying for the revalidation invocations.
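In the Next.js Pages Router, ISR is enabled by returning a revalidate value from getStaticProps. The sketch below assumes a hypothetical loadProducts data loader and a 60-second revalidation window; both are placeholders to adapt to your own data and staleness tolerance.

```typescript
type Product = { id: string; name: string };

// Hypothetical data loader standing in for a database or API call.
async function loadProducts(): Promise<Product[]> {
  return [{ id: "p1", name: "Example product" }];
}

// Next.js calls this at build time, then again in the background at most
// once per `revalidate` seconds when a request arrives for a stale page.
export async function getStaticProps() {
  const products = await loadProducts();
  return {
    props: { products },
    revalidate: 60, // seconds; an assumed value
  };
}
```

Visitors keep receiving the cached page while regeneration runs, so any cold start during revalidation stays off the request path.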

5. Go fully static where possible

If a page doesn't need server-side data, export it as a static page. Static pages are served from Vercel's CDN with zero serverless involvement — no cold starts, no function costs, sub-100ms response times worldwide. Marketing pages, documentation, and blog posts are all candidates for static generation.

Impact: Converting SSR pages to static or ISR is the single most effective way to eliminate cold starts for those pages. Audit your getServerSideProps usage — many pages using it could be static or ISR instead.

The Fundamental Limitation of Serverless

Here's the uncomfortable truth: cold starts are inherent to the serverless model, not a bug in Vercel's implementation. The entire value proposition of serverless — scale to zero when idle, pay only for what you use — requires tearing down idle containers. That teardown is what causes cold starts.

Vercel, AWS Lambda, Google Cloud Functions, and Cloudflare Workers all face the same fundamental tradeoff, though to varying degrees (isolate-based runtimes start much faster than container-based ones). You can mitigate cold starts with the strategies above, but you can't eliminate them without eliminating the scale-to-zero behavior that defines serverless. Even Vercel's “Fluid Compute” option reduces cold starts but doesn't remove them entirely. For a deeper dive into when serverless makes sense and when it doesn't, see our serverless vs VPS comparison.

For many applications, this tradeoff is perfectly acceptable. A blog that gets 100 visitors per day doesn't care about a 2-second cold start once per hour. But for applications where consistent sub-second response times matter — SaaS dashboards, API endpoints, e-commerce checkout flows — the serverless cold start problem becomes a real performance bottleneck.

The Alternative: Always-On Processes with PM2

If cold starts are a fundamental limitation of serverless, the logical alternative is to not use serverless. A VPS (Virtual Private Server) runs your application as an always-on process — the Node.js server starts once and stays running 24/7. There is no container to spin up, no runtime to initialize, no cold start. Every request hits a warm, ready process.

With a process manager like PM2, your Next.js application runs in cluster mode across all available CPU cores. PM2 handles automatic restarts if the process crashes, zero-downtime reloads during deployments, and log management. The result is consistent, predictable response times — typically 20–80ms TTFB regardless of whether the last request was 1 second ago or 1 hour ago.
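A PM2 setup along these lines is usually described in an ecosystem file. The sketch below shows the shape of that configuration as a typed object; PM2 itself reads it from ecosystem.config.js via module.exports, and the app name, memory limit, and port are assumed values.

```typescript
// Sketch of a PM2 ecosystem definition. In a real project this object
// would be the value of module.exports in ecosystem.config.js.
export const ecosystem = {
  apps: [
    {
      name: "next-app", // hypothetical app name
      script: "node_modules/next/dist/bin/next", // Next.js CLI entry point
      args: "start", // serve the production build
      exec_mode: "cluster", // run multiple workers behind one port
      instances: "max", // one worker per available CPU core
      max_memory_restart: "512M", // assumed limit; restart a worker above it
      env: { NODE_ENV: "production", PORT: "3000" },
    },
  ],
};
```

With a file like this in place, `pm2 start ecosystem.config.js` launches the workers and `pm2 reload next-app` restarts them one at a time for a zero-downtime deploy.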

The historical argument against this approach was complexity: you had to configure the server yourself, set up Nginx as a reverse proxy, manage SSL certificates, and build your own deployment pipeline. That's no longer the case. Tools like DeployWise automate the entire setup — connect your GitHub repo, add your VPS, and deploy. DeployWise configures PM2, Nginx, Let's Encrypt SSL, and git-webhook-based auto-deployments in under 2 minutes.

Vercel Serverless vs. VPS with PM2

Here's how the two approaches compare on the metrics that matter most for performance-sensitive applications:

Metric	Vercel Serverless	VPS + PM2
Cold start TTFB	1–5 seconds	N/A (always warm)
Warm TTFB	50–200ms	20–80ms
p99 TTFB	2–5 seconds	100–200ms
Consistency	Variable (cold vs warm)	Predictable
Scale-to-zero	Yes (causes cold starts)	No (always running)
Auto-scaling	Automatic	Manual (add servers)
Global edge	Built-in CDN	Add Cloudflare (free)
Cost (moderate traffic)	$20–100+/mo	$5–10/mo
Best for	Low-traffic, spiky workloads	Consistent traffic, latency-sensitive apps

Neither approach is universally better. Vercel excels at zero-config deployments and handles massive traffic spikes automatically. A VPS wins on consistent latency, cost predictability, and full infrastructure control. Choose based on what your application actually needs.

Vercel Cold Starts: Why They Happen & How to Fix Them