Serverless PDF Generation: Why Headless Browsers Break

Quick Summary

Serverless and headless Chrome conflict: large binary sizes, slow cold starts, and memory constraints cause timeouts and leaked processes.

Why does serverless PDF generation fail with headless Chrome?

Short answer: Serverless platforms are optimized for short, stateless functions; Chrome requires large binaries, memory, and time to start, causing instability and high costs.

Serverless PDF generation with headless Chrome (Puppeteer, Playwright) breaks because serverless environments impose strict constraints: cold start delays, memory limits (128MB-10GB), timeout limits (15 minutes at most), and deployment size limits (250MB unzipped). These conflict with Chrome's requirements: a 150-300MB binary, 2-5s startup time, 200-500MB of memory per instance, and child processes that can linger as zombies and leak memory. After 50-200 renders or a few timeout errors, your Lambda function crashes or your costs spike.

What are the key serverless constraints affecting PDF generation?

Short answer: Cold starts, memory limits, timeouts, and deployment size limits all make running Chrome in serverless fragile.

Serverless platforms (AWS Lambda, Vercel Functions, Cloudflare Workers, Google Cloud Functions) optimize for short-lived, stateless functions that start fast, use minimal memory, and handle one request at a time. Here are the hard limits:

Cold Start Delays

Cold start = time to initialize a new function instance (container) from scratch.

Platform	Cold Start Time	Warm Duration
AWS Lambda	200-800ms	5-15 minutes
Vercel Functions	300-1000ms	5 minutes
Cloudflare Workers	0ms (V8 isolates)	N/A
Google Cloud Functions	500-1200ms	15 minutes

Impact on Chrome:

Chrome launch adds 2-5s to cold start
With Chrome, total cold start = 3-8s
First request after idle = slow response

Memory Limits

Platform	Min Memory	Max Memory	Default
AWS Lambda	128MB	10GB	128MB
Vercel Functions	1GB	3GB	1GB
Cloudflare Workers	128MB	128MB	128MB
Google Cloud Functions	128MB	8GB	256MB

Impact on Chrome:

Chrome process alone: 150-250MB
PDF render memory: 50-200MB (depends on complexity)
Total per render: 200-500MB
Minimum viable: 512MB-1GB
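
As a rough sketch of the arithmetic above, the helper below sums per-component estimates and reports whether a configured memory size leaves headroom. The function name, the 20% headroom factor, and the sample figures are illustrative assumptions, not measurements.

```javascript
// Hypothetical helper: estimates whether a function's memory setting can hold
// one Chrome render. All figures are rough estimates in megabytes.
function memoryBudget(configuredMb, parts, headroom = 0.2) {
  const neededMb = Object.values(parts).reduce((sum, mb) => sum + mb, 0);
  // Reserve a fraction of the configured memory for spikes (assumed 20%).
  const usableMb = configuredMb * (1 - headroom);
  return { neededMb, usableMb, fits: neededMb <= usableMb };
}

// Upper-end estimates from the list above.
const worstCase = { chromeProcess: 250, pdfRender: 200 };

console.log(memoryBudget(512, worstCase));  // does a 512MB function fit?
console.log(memoryBudget(1024, worstCase)); // does a 1GB function fit?
```

With these upper-end estimates, 512MB does not leave the assumed headroom while 1GB does, which is why the 512MB-1GB range is a floor rather than a comfortable setting.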

Timeout Limits

Platform	Hobby/Free	Pro/Paid	Max
AWS Lambda	3s (default)	900s	15min
Vercel Functions	10s	60s	900s (enterprise)
Cloudflare Workers	10ms (CPU)	50ms (CPU)	30s (wallclock)
Google Cloud Functions	60s	540s	9min

Impact on Chrome:

Chrome launch: 2-5s
PDF render: 3-10s (complex docs)
Total: 5-15s minimum
Vercel Hobby tier (10s) = frequent timeouts
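
To make the timeout math concrete, here is a minimal sketch that adds up the phases of a request and compares the total with a platform limit. The helper name and the worst-case phase durations are assumptions chosen from the ranges above.

```javascript
// Hypothetical helper: checks a worst-case request against a platform timeout.
function timeoutBudget(timeoutMs, phases) {
  const totalMs = phases.initMs + phases.launchMs + phases.renderMs;
  return { totalMs, marginMs: timeoutMs - totalMs, fits: totalMs < timeoutMs };
}

// Worst-case phases on a cold start (assumed: 1s init, 5s launch, 10s render).
const coldWorstCase = { initMs: 1000, launchMs: 5000, renderMs: 10000 };

console.log(timeoutBudget(10_000, coldWorstCase));  // 10s limit
console.log(timeoutBudget(900_000, coldWorstCase)); // 15min limit
```

A negative marginMs means the request would be cut off before the PDF is returned, so the margin is worth tracking per platform rather than assuming the average case.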

Deployment Size Limits

Platform	Zipped	Unzipped	With Layers
AWS Lambda	50MB	250MB	250MB total, including up to 5 layers
Vercel Functions	Varies	~250MB	N/A
Cloudflare Workers	1MB	10MB	N/A
Google Cloud Functions	100MB	500MB	N/A

Impact on Chrome:

Chrome binary: 150-300MB (Linux), 200MB+ (with dependencies)
Requires Lambda Layers or custom Docker image
Deployment time: 30-60s (uploading large binary)
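
A quick pre-deploy check can catch size problems before an upload fails. The sketch below assumes the AWS rule for zip-based deployments that function code and all layers together must stay within 250MB unzipped, with at most 5 layers; the helper name and sample sizes are illustrative.

```javascript
// Limits for zip-based Lambda deployments (container images are separate).
const MAX_UNZIPPED_MB = 250; // function code plus all layers combined
const MAX_LAYERS = 5;

// Hypothetical helper: all sizes are unzipped megabytes.
function checkZipDeployment(functionMb, layerMbs = []) {
  const totalMb = layerMbs.reduce((sum, mb) => sum + mb, functionMb);
  const problems = [];
  if (layerMbs.length > MAX_LAYERS) {
    problems.push(`too many layers: ${layerMbs.length}`);
  }
  if (totalMb > MAX_UNZIPPED_MB) {
    problems.push(`unzipped size ${totalMb}MB exceeds ${MAX_UNZIPPED_MB}MB`);
  }
  return { totalMb, ok: problems.length === 0, problems };
}

console.log(checkZipDeployment(20, [170])); // bare Chrome binary in a layer
console.log(checkZipDeployment(20, [280])); // Chrome plus shared libraries
```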

Ephemeral Filesystem

Serverless functions have limited, temporary disk space:

AWS Lambda: 512MB-10GB of /tmp storage (kept while a container stays warm, gone on the next cold start)
Vercel/Cloudflare: No persistent disk

Impact on Chrome:

Chrome needs /tmp for profiles, caches
Can't persist state across invocations
Must write intermediate files to /tmp, clean up after
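
One way to handle the clean-up requirement is to give every invocation its own scratch directory and remove it in a finally block. This is a minimal sketch using only Node.js built-ins; the helper names and the 'pdf-' prefix are assumptions.

```javascript
const fs = require('node:fs/promises');
const os = require('node:os');
const path = require('node:path');

// Runs `work` with a private scratch directory and always removes it afterwards,
// so a warm container does not accumulate files from earlier invocations.
async function withScratchDir(work) {
  // os.tmpdir() usually resolves to /tmp on Lambda; the prefix is arbitrary.
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'pdf-'));
  try {
    return await work(dir);
  } finally {
    await fs.rm(dir, { recursive: true, force: true });
  }
}

// Usage sketch: write an intermediate file, return only what the caller needs.
async function example() {
  return withScratchDir(async (dir) => {
    const file = path.join(dir, 'page.html');
    await fs.writeFile(file, '<h1>Invoice</h1>');
    return (await fs.stat(file)).size;
  });
}
```

The same directory could also be handed to Puppeteer through its userDataDir launch option so that Chrome's profile and cache are removed together with the intermediate files.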

Why doesn't Chrome fit serverless environments?

Short answer: Chrome expects long-running processes and abundant resources; serverless enforces small, short-lived allocations and limits that conflict with Chrome's needs.

Headless Chrome was designed for long-running processes on servers with resources to spare. Serverless is the opposite.

Chrome Binary: 150-300MB

Uncompressed Chrome binary sizes:

chrome-linux: ~170MB (just the binary)
With dependencies (libX11, libgobject, etc.): ~250-300MB
AWS Lambda unzipped limit: 250MB

Solutions (none perfect):

Lambda Layers: Store Chrome in a Layer (counts toward the 250MB unzipped total)
- Pro: Separates Chrome from code
- Con: Still slow to download/mount on cold start
Docker container image: Package Chrome in custom runtime
- Pro: Up to 10GB image size
- Con: Even slower cold starts (1-3s extra)
Sparse Chrome (chrome-aws-lambda): Stripped-down Chrome (~50MB)
- Pro: Fits in Lambda easily
- Con: Missing features, can break on complex PDFs

Startup Time: 2-5 Seconds

Chrome launch time in Lambda (from cold start):

const puppeteer = require('puppeteer-core');
const chromium = require('chrome-aws-lambda');

exports.handler = async (event) => {
  console.time('chrome-launch');
  const browser = await puppeteer.launch({
    args: chromium.args,
    executablePath: await chromium.executablePath,
    headless: true
  });
  console.timeEnd('chrome-launch'); // 2-5 seconds

  try {
    const page = await browser.newPage();
    console.time('pdf-render');
    await page.setContent(event.html);
    const pdf = await page.pdf({ format: 'A4' });
    console.timeEnd('pdf-render'); // 3-10 seconds

    // Lambda responses are JSON-serialized, so return the PDF as base64 text
    return Buffer.from(pdf).toString('base64');
  } finally {
    // Close the browser even if rendering throws
    await browser.close();
  }
};

Timing breakdown (cold start):

Lambda init: 500ms
Chrome launch: 2-5s
PDF render: 3-10s
Total: 5-15s

Timing breakdown (warm start):

Lambda reuse: 0ms
Chrome launch: 2-5s (can't reuse browser across invocations reliably)
PDF render: 3-10s
Total: 5-15s (same as cold)

Why Chrome stays slow even when warm: the common Puppeteer practice is to close the browser after each request to avoid memory leaks. Lambda freezes the container between invocations, so a browser kept open across invocations can come back disconnected or crashed.

Memory Usage: 200-500MB Per Instance

Memory breakdown for PDF generation:

Component	Memory Used
Node.js runtime	50-80MB
Chrome process	150-250MB
PDF rendering buffer	50-200MB
Total	250-530MB

Lambda cost impact:

AWS Lambda charges by GB-seconds (memory × duration):

512MB instance for 10s = 5 GB-seconds
1GB instance for 10s = 10 GB-seconds
Pricing: ~$0.0000166667 per GB-second
Cost per PDF: $0.00017 (1GB, 10s)

At scale (10,000 PDFs/month):

Lambda compute: ~$1.67/month
Add API Gateway, CloudWatch: ~$10-15/month total

Compare to external API: $0.01-0.05 per PDF = $100-500/month, but zero infrastructure maintenance.
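
The GB-second arithmetic can be sketched as a small calculator. The price constant is the figure quoted above; the function name is hypothetical, and the result covers compute only (no per-request fees, API Gateway, or logging).

```javascript
// Assumed on-demand price per GB-second, taken from the figure above.
const PRICE_PER_GB_SECOND = 0.0000166667;

// Compute charge only: excludes per-request fees, API Gateway, and logging.
function lambdaComputeCost(memoryMb, durationSeconds, invocations = 1) {
  const gbSeconds = (memoryMb / 1024) * durationSeconds * invocations;
  return { gbSeconds, usd: gbSeconds * PRICE_PER_GB_SECOND };
}

console.log(lambdaComputeCost(512, 10));         // one PDF at 512MB
console.log(lambdaComputeCost(1024, 10));        // one PDF at 1GB
console.log(lambdaComputeCost(1024, 10, 10000)); // 10,000 PDFs at 1GB
```

Because memory and duration multiply, halving either one halves the compute bill, which is why trimming render time usually pays off more than trimming request counts.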

Process Management: Zombie Processes Accumulate

Even with browser.close(), Chrome subprocesses don't always terminate:

// This looks correct but can still leak processes
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.pdf({ path: '/tmp/output.pdf' }); // Lambda's filesystem is read-only outside /tmp
await browser.close(); // Doesn't always kill all Chrome processes

Why it happens:

Chrome spawns multiple processes (main, renderer, GPU)
SIGTERM signal doesn't propagate to all subprocesses
Lambda force-kills function at timeout, leaving processes
Next warm invocation reuses container with zombie processes

Result: Memory accumulates, Lambda eventually crashes or gets throttled.
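
A common defensive step is to signal the browser's whole process group after closing it. The sketch below is a generic POSIX helper, not a Puppeteer API; it assumes the child was started as a process-group leader (spawned with detached: true) and does nothing useful on Windows.

```javascript
// Sends SIGKILL to a child's whole process group (POSIX only).
// Assumes the child is a group leader, i.e. it was spawned with `detached: true`.
// Returns true if a signal was sent, false if there was nothing to kill.
function killProcessGroup(child) {
  if (!child || typeof child.pid !== 'number') return false;
  try {
    // A negative PID targets every process in that group, helpers included.
    process.kill(-child.pid, 'SIGKILL');
    return true;
  } catch (err) {
    if (err.code === 'ESRCH') return false; // group already exited
    throw err;
  }
}

// Usage sketch with Puppeteer (not executed here):
//   const chrome = browser.process(); // ChildProcess, or null for remote browsers
//   await browser.close().catch(() => {});
//   killProcessGroup(chrome);
```

This does not fix the underlying leak; it only narrows the window in which leftover renderer or GPU processes can survive into the next warm invocation.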

What Lambda-specific issues should you watch for?

Short answer: Layers, container reuse, and limited /tmp space create deployment and runtime problems for Chrome in Lambda.

Using Lambda Layers for Chrome

AWS Lambda Layers let you package Chrome separately from code:

Setup:

Create Layer with Chrome binary + dependencies
Attach Layer to Lambda function
Lambda mounts Layer at /opt

Example:

// Lambda function code (references Layer)
const chromium = require('chrome-aws-lambda');
const puppeteer = require('puppeteer-core');

exports.handler = async (event) => {
  // chrome-aws-lambda finds Chrome in /opt
  const browser = await puppeteer.launch({
    args: chromium.args,
    executablePath: await chromium.executablePath
  });
  // ... generate PDF
};

Limitations:

Size limit: function code plus all Layers must fit in 250MB unzipped (Chrome barely fits)
Cold start still slow (Layer must mount)
Can't update Chrome without redeploying Layer

Container Reuse Memory Leaks

Lambda reuses containers for 5-15 minutes to avoid cold starts. This causes memory leaks:

First invocation:

Start: 100MB
After PDF render: 300MB
After browser.close(): 250MB (150MB not released)

Second invocation (same container):

Start: 250MB
After PDF render: 500MB
After browser.close(): 450MB (350MB not released in total)

After 5-10 invocations:

Container uses 800MB-1GB
Lambda OOM (Out of Memory) kills container
Next request gets cold start

Mitigation (doesn't fully solve):

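// Exit when the Node.js process grows too large so Lambda starts a fresh container.
// Note: this measures Node's own memory (RSS), not Chrome's child processes.
if (process.memoryUsage().rss > 800 * 1024 * 1024) {
  process.exit(1); // The next invocation gets a cold start
}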
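
A count-based variant of the same idea avoids relying on a single memory reading. The sketch below keeps a counter in module scope, which persists across warm invocations of one container; the function names and both thresholds are assumed values to tune per workload.

```javascript
// Module scope survives between warm invocations of the same container.
let renderCount = 0;

// Hypothetical policy: recycle after a fixed number of renders or when the
// Node.js process RSS crosses a threshold. Both limits are assumed values.
function shouldRecycle(stats, limits = { maxRenders: 20, maxRssMb: 800 }) {
  return stats.renders >= limits.maxRenders || stats.rssMb >= limits.maxRssMb;
}

// Call at the end of a handler, after the response has been produced.
// Note: RSS covers the Node.js process only, not Chrome's child processes.
function recordRender() {
  renderCount += 1;
  const rssMb = process.memoryUsage().rss / (1024 * 1024);
  return shouldRecycle({ renders: renderCount, rssMb });
}
```

When recordRender() returns true, the handler can finish its response and then exit the process so the next request starts in a fresh container.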

Concurrent Execution Limits

Lambda limits concurrent executions (default 1000, can request increase). Each PDF render uses one execution slot:

Example: 100 concurrent PDF requests

Each Lambda instance: 10s duration
Concurrent executions used: 100
If burst exceeds limit: requests throttled (429 error)

Solution: Queue requests (SQS + Lambda) or use reserved concurrency. But now you're managing queue infrastructure.
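
The relationship between request rate, render time, and concurrency follows Little's law (average concurrency = arrival rate × average duration). A minimal sketch, with hypothetical function names:

```javascript
// Little's law: average concurrency = arrival rate x average duration.
function requiredConcurrency(requestsPerSecond, avgDurationSeconds) {
  return Math.ceil(requestsPerSecond * avgDurationSeconds);
}

// Highest sustained request rate a concurrency limit can absorb.
function maxSustainedRate(concurrencyLimit, avgDurationSeconds) {
  return concurrencyLimit / avgDurationSeconds;
}

console.log(requiredConcurrency(10, 10)); // 10 req/s at 10s per render
console.log(maxSustainedRate(1000, 10));  // default account limit, 10s renders
```

Long render times are what make PDF workloads hit concurrency limits early: the same limit supports ten times the request rate if each render takes 1s instead of 10s.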

Cost at Scale (GB-Seconds Pricing)

AWS Lambda cost formula: (Memory in GB) × (Duration in seconds) × $0.0000166667

Example: 10,000 PDFs/month

Memory: 1GB
Duration: 10s average
Cost: 10,000 × (1 × 10 × $0.0000166667) = ~$1.67
Add API Gateway: +$10
Add CloudWatch Logs: +$5
Total: ~$17/month

Compare to EC2 instance:

t3.medium (2 vCPU, 4GB): $30/month
Can handle 10,000 PDFs/month easily
More predictable cost

Compare to external API:

$0.01-0.05 per PDF
10,000 PDFs = $100-500/month
But: Zero infrastructure, zero maintenance time (worth 5-10 dev hours saved)

Architectural Alternatives

1. External Rendering Service (API-Based)

Architecture:

Your Lambda → API (https://api.hundreddocs.com/v1/pdf) → PDF Binary

Code:

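// Lambda handler (no Chrome, fast); global fetch requires Node.js 18+
exports.handler = async (event) => {
  const response = await fetch('https://api.hundreddocs.com/v1/pdf', {
    method: 'POST',
    headers: {
      'X-API-Key': process.env.API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      templateId: 'invoice-template',
      data: event.data
    })
  });

  if (!response.ok) {
    throw new Error(`API error: ${response.status}`);
  }

  // Lambda responses are JSON-serialized, so return the PDF as base64 text
  return Buffer.from(await response.arrayBuffer()).toString('base64');
};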

Pros:

No Chrome binary (0MB)
Fast (<1s response)
No memory leaks
No timeout issues

Cons:

External dependency (requires outbound network access)
Per-PDF cost
Less control

When to use: >100 PDFs/day, serverless deployment, no infrastructure maintenance
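
Because the external API becomes a runtime dependency, callers usually want a per-call timeout and a bounded retry. The sketch below is one way to do that on Node.js 18+; withRetry and renderPdf are hypothetical names, and the 5s cap, retry count, and backoff delays are assumptions rather than recommendations from the API provider.

```javascript
// Retries an async operation with exponential backoff.
// `retries` is the number of extra attempts after the first failure.
async function withRetry(operation, { retries = 2, baseDelayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Usage sketch: cap each call at 5s and retry on failure.
// The URL and header mirror the example above.
async function renderPdf(payload) {
  return withRetry(async () => {
    const response = await fetch('https://api.hundreddocs.com/v1/pdf', {
      method: 'POST',
      headers: { 'X-API-Key': process.env.API_KEY, 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
      signal: AbortSignal.timeout(5000),
    });
    if (!response.ok) throw new Error(`PDF API returned ${response.status}`);
    return Buffer.from(await response.arrayBuffer());
  });
}
```

This sketch retries every failure, including 4xx responses; a production version would likely retry only timeouts and 5xx errors.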

2. Pre-Warmed Pool (Keep Chrome Warm)

Architecture:

Run EC2 instances with Chrome already launched
Lambda sends requests to pool
Pool returns PDFs

Pros:

No cold starts (Chrome always warm)
Faster than Lambda+Puppeteer

Cons:

Complex to manage (health checks, auto-scaling)
Not serverless anymore (EC2 costs)
Still has memory leaks (need process restarts)

When to use: Rare. Only if you need full control + low latency + high volume.

3. Long-Running Container (ECS/Fargate)

Architecture:

Deploy Chrome in ECS container
Container runs 24/7
API Gateway → ECS service

Pros:

Chrome stays warm
More memory available
Better process management

Cons:

Not serverless (always-on cost)
Fargate: ~$30-100/month minimum
Must handle scaling, load balancing

When to use: >1000 PDFs/day, need full control, okay with always-on cost

4. Queue + Worker (Async Processing)

Architecture:

API Request → SQS Queue → Lambda Worker (Puppeteer) → S3 → Callback

Pros:

Decouples request from rendering
Can batch multiple PDFs
Retries on failure

Cons:

Adds latency (async = not real-time)
More infrastructure (SQS, S3, DLQ)
Still has Chrome memory/timeout issues

When to use: Batch generation, can tolerate 1-5 min delay
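
The worker side of this pipeline can be sketched as an SQS-triggered handler that reports failures per message. It assumes the event source mapping has ReportBatchItemFailures enabled, and render is a hypothetical function that produces the PDF and stores it (for example in S3).

```javascript
// Builds a Lambda handler for an SQS trigger. `render` is a hypothetical
// function that takes the parsed job and resolves when the PDF is stored.
function makeSqsHandler(render) {
  return async function handler(event) {
    const batchItemFailures = [];
    for (const record of event.Records) {
      try {
        await render(JSON.parse(record.body));
      } catch (err) {
        // Report only this message as failed so SQS redelivers it alone.
        console.error('render failed', record.messageId, err.message);
        batchItemFailures.push({ itemIdentifier: record.messageId });
      }
    }
    return { batchItemFailures };
  };
}

// Usage sketch: exports.handler = makeSqsHandler(renderAndUpload);
```

Processing records one at a time keeps only a single Chrome render in memory per container, at the cost of longer batch duration.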

Latency Comparison

Real-world latency measurements for generating a 5-page invoice PDF:

Architecture	p50	p95	p99	Cold Start
Lambda + Puppeteer (cold)	8s	15s	20s	Every time after 5-15min idle
Lambda + Puppeteer (warm)	7s	12s	18s	Rare (memory leaks kill container)
External API (Hundred Docs)	400ms	800ms	1.2s	N/A (stateless)
ECS Container	3s	5s	7s	None (always warm)
Queue + Worker	30s	60s	120s	N/A (async)

Key insight: Even warm Lambda with Puppeteer is roughly 15-18x slower than the external API (7s vs 400ms at p50), largely due to Chrome startup overhead.

Cost Comparison (10,000 PDFs/month)

Architecture	Infrastructure Cost	Maintenance Time	Total Cost (dev @ $100/hr)
Lambda + Puppeteer	$30-50/month	5-10 hrs/month	$530-1050/month
External API	$100-500/month	0 hrs/month	$100-500/month
ECS Container	$60-150/month	8-15 hrs/month	$860-1650/month

Maintenance includes: Debugging timeouts, handling memory leaks, monitoring, scaling configuration, Chrome updates.
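
The totals in the table follow from a simple formula: infrastructure cost plus maintenance hours times the hourly rate. A minimal sketch that reproduces the low and high ends of each row (the function name is hypothetical):

```javascript
// Total monthly cost = infrastructure + maintenance hours x hourly rate.
function totalMonthlyCost(infraUsd, maintenanceHours, hourlyRateUsd = 100) {
  return infraUsd + maintenanceHours * hourlyRateUsd;
}

// Low and high ends of each row in the table above.
const options = {
  'Lambda + Puppeteer': [totalMonthlyCost(30, 5), totalMonthlyCost(50, 10)],
  'External API': [totalMonthlyCost(100, 0), totalMonthlyCost(500, 0)],
  'ECS Container': [totalMonthlyCost(60, 8), totalMonthlyCost(150, 15)],
};

console.log(options);
```

The comparison is dominated by the maintenance-hours estimate, so it is worth substituting your own hours and rate before drawing conclusions.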

Code Comparison

Lambda with Puppeteer (Complex, Slow)

// lambda-pdf-generator.js
const chromium = require('chrome-aws-lambda');
const puppeteer = require('puppeteer-core');

exports.handler = async (event) => {
  let browser = null;
  
  try {
    console.log('Launching Chrome...');
    // Cold start: 2-5 seconds
    browser = await puppeteer.launch({
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      executablePath: await chromium.executablePath,
      headless: chromium.headless
    });
    
    const page = await browser.newPage();
    
    // Generate HTML from data
    const html = generateInvoiceHTML(event.data);
    
    console.log('Rendering PDF...');
    // 3-10 seconds depending on complexity
    await page.setContent(html, { waitUntil: 'networkidle0' });
    const pdf = await page.pdf({
      format: 'A4',
      printBackground: true,
      margin: { top: '1cm', bottom: '1cm' }
    });
    
    return {
      statusCode: 200,
      headers: { 'Content-Type': 'application/pdf' },
      body: Buffer.from(pdf).toString('base64'), // works for Buffer or Uint8Array
      isBase64Encoded: true
    };
    
  } catch (error) {
    console.error('PDF generation failed:', error);
    return {
      statusCode: 500,
      body: JSON.stringify({ error: error.message })
    };
    
  } finally {
    if (browser !== null) {
      await browser.close(); // Doesn't always work
    }
  }
};

function generateInvoiceHTML(data) {
  // 50-100 lines of HTML string concatenation...
  return `<html>...</html>`;
}

Deployment size: 250MB (Chrome binary + dependencies)
Cold start: 5-8s
Warm start: 5-7s (Chrome launch still needed)
Memory: 512MB minimum, 1GB recommended

Lambda with External API (Simple, Fast)

// lambda-pdf-generator.js
exports.handler = async (event) => {
  try {
    const response = await fetch('https://api.hundreddocs.com/v1/pdf', {
      method: 'POST',
      headers: {
        'X-API-Key': process.env.HUNDRED_DOCS_API_KEY,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        templateId: 'invoice-template',
        data: event.data
      })
    });
    
    if (!response.ok) {
      throw new Error(`API error: ${response.statusText}`);
    }
    
    const pdfBuffer = await response.arrayBuffer();
    
    return {
      statusCode: 200,
      headers: { 'Content-Type': 'application/pdf' },
      body: Buffer.from(pdfBuffer).toString('base64'),
      isBase64Encoded: true
    };
    
  } catch (error) {
    console.error('PDF generation failed:', error);
    return {
      statusCode: 500,
      body: JSON.stringify({ error: error.message })
    };
  }
};

Deployment size: <10MB (no Chrome)
Cold start: 200-500ms
Warm start: 50-200ms
Memory: 128MB sufficient, 256MB recommended

Lines of code: 30 vs 100+ (no HTML generation, no browser management)


Technical takeaway: Serverless environments impose strict limits (cold starts, memory, timeouts, deployment size) that conflict with Chrome's resource profile (150-300MB binary, 2-5s startup, 200-500MB memory) and its tendency to leave zombie processes behind. For <100 PDFs/day, Lambda + Puppeteer works with careful tuning. For >100 PDFs/day or Vercel/Cloudflare deployment, external rendering APIs eliminate Chrome entirely, achieving <1s response times with zero infrastructure maintenance.