Quick Summary
Serverless and headless Chrome conflict: large binary sizes, slow cold starts, and memory constraints cause timeouts and leaked processes.
Why does serverless PDF generation fail with headless Chrome?
Short answer: Serverless platforms are optimized for short, stateless functions; Chrome requires large binaries, memory, and time to start, causing instability and high costs.
Serverless PDF generation with headless Chrome (Puppeteer, Playwright) breaks because serverless environments impose strict constraints: cold start delays, memory limits (128MB-10GB), timeout limits (15 minutes at most), and deployment size limits (250MB unzipped on AWS Lambda). These conflict with Chrome's requirements: a 150-300MB binary, 2-5s startup time, 200-500MB of memory per instance, and subprocesses that can outlive the browser and leak memory. After 50-200 renders or a few timeout errors, your Lambda function crashes or your costs spike.
What are the key serverless constraints affecting PDF generation?
Short answer: Cold starts, memory limits, timeouts, and deployment size limits all make running Chrome in serverless fragile.
Serverless platforms (AWS Lambda, Vercel Functions, Cloudflare Workers, Google Cloud Functions) optimize for short-lived, stateless functions that start fast, use minimal memory, and handle one request at a time. Here are the hard limits:
Cold Start Delays
A cold start is the time needed to initialize a new function instance (container) from scratch.
| Platform | Cold Start Time | Warm Duration |
|---|---|---|
| AWS Lambda | 200-800ms | 5-15 minutes |
| Vercel Functions | 300-1000ms | 5 minutes |
| Cloudflare Workers | 0ms (V8 isolates) | N/A |
| Google Cloud Functions | 500-1200ms | 15 minutes |
Impact on Chrome:
- Chrome launch adds 2-5s to cold start
- With Chrome, total cold start = 3-8s
- First request after idle = slow response
Memory Limits
| Platform | Min Memory | Max Memory | Default |
|---|---|---|---|
| AWS Lambda | 128MB | 10GB | 128MB |
| Vercel Functions | 1GB | 3GB | 1GB |
| Cloudflare Workers | 128MB | 128MB | 128MB |
| Google Cloud Functions | 128MB | 8GB | 256MB |
Impact on Chrome:
- Chrome process alone: 150-250MB
- PDF render memory: 50-200MB (depends on complexity)
- Total per render: 200-500MB
- Minimum viable: 512MB-1GB
Timeout Limits
| Platform | Default or Free Tier | Paid or Configured | Max |
|---|---|---|---|
| AWS Lambda | 3s | 900s | 15min |
| Vercel Functions | 10s | 60s | 900s (enterprise) |
| Cloudflare Workers | 10ms (CPU) | 50ms (CPU) | 30s (wallclock) |
| Google Cloud Functions | 60s | 540s | 9min |
Impact on Chrome:
- Chrome launch: 2-5s
- PDF render: 3-10s (complex docs)
- Total: 5-15s per request
- Vercel Hobby tier (10s) = frequent timeouts
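One way to keep a slow render from running into the platform's hard timeout is to race it against a deadline derived from the time left in the invocation. The following is a minimal sketch, not a definitive implementation: `raceWithDeadline`, `renderBudgetMs`, and the 1-second safety margin are assumptions made for this example, while `context.getRemainingTimeInMillis()` is the method the AWS Lambda Node.js runtime exposes on the handler context.

```javascript
// Hypothetical helper: rejects if `work` does not settle within `ms` milliseconds.
function raceWithDeadline(work, ms) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Deadline of ${ms}ms exceeded`)), ms);
  });
  // Clear the timer either way so it does not keep the event loop alive.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}

// Budget = time left in this invocation minus a safety margin (assumed: 1s),
// leaving room to close the browser before the platform kills the function.
function renderBudgetMs(context, safetyMarginMs = 1000) {
  return Math.max(0, context.getRemainingTimeInMillis() - safetyMarginMs);
}

// Usage inside a handler (renderPdf is a placeholder for your own render function):
//   const pdf = await raceWithDeadline(renderPdf(event), renderBudgetMs(context));
```

Note that the race only stops waiting; it does not cancel the underlying render, so the browser still has to be closed in the handler's cleanup path.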
Deployment Size Limits
| Platform | Zipped | Unzipped | With Layers |
|---|---|---|---|
| AWS Lambda | 50MB | 250MB | 250MB total, including up to 5 layers |
| Vercel Functions | Varies | ~250MB | N/A |
| Cloudflare Workers | 1MB | 10MB | N/A |
| Google Cloud Functions | 100MB | 500MB | N/A |
Impact on Chrome:
- Chrome binary: 150-300MB (Linux), 200MB+ (with dependencies)
- Requires Lambda Layers or custom Docker image
- Deployment time: 30-60s (uploading large binary)
Ephemeral Filesystem
Serverless functions have limited, temporary disk space:
- AWS Lambda: 512MB-10GB /tmp (cleared between cold starts)
- Vercel/Cloudflare: No persistent disk
Impact on Chrome:
- Chrome needs /tmp for profiles, caches
- Can't persist state across invocations
- Must write intermediate files to /tmp, clean up after
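The write-then-clean-up pattern for /tmp can be sketched as a small wrapper. This is an illustration under stated assumptions: `withTempDir` and the `pdf-` prefix are names invented for this example; the `fs`, `os`, and `path` calls are standard Node.js APIs.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: runs `fn` with a fresh scratch directory and always removes it.
async function withTempDir(fn) {
  // os.tmpdir() typically resolves to /tmp on Linux-based runtimes.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'pdf-'));
  try {
    return await fn(dir);
  } finally {
    // recursive + force: delete everything inside, and do not throw if it is already gone.
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

// Usage (sketch): give Chrome a throwaway profile directory per render.
//   await withTempDir((dir) => renderWithProfile(dir));
// where renderWithProfile is your own function that passes `dir` as
// Puppeteer's `userDataDir` launch option.
```

Because the directory is removed in `finally`, a failed render does not leave files behind to eat into the space available to the next warm invocation.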
Why doesn't Chrome fit serverless environments?
Short answer: Chrome expects long-running processes and abundant resources; serverless enforces small, short-lived allocations and limits that conflict with Chrome's needs.
Headless Chrome was designed for long-running processes on servers with resources to spare. Serverless is the opposite.
Chrome Binary: 150-300MB
Uncompressed Chrome binary sizes:
- chrome-linux: ~170MB (just the binary)
- With dependencies (libX11, libgobject, etc.): ~250-300MB
- AWS Lambda unzipped limit: 250MB
Solutions (none perfect):
- Lambda Layers: Store Chrome in a Layer (function code plus all layers must fit in 250MB unzipped)
  - Pro: Separates Chrome from code
  - Con: Still slow to download/mount on cold start
- Docker container image: Package Chrome in custom runtime
  - Pro: Up to 10GB image size
  - Con: Even slower cold starts (1-3s extra)
- Sparse Chrome (chrome-aws-lambda): Stripped-down Chrome (~50MB)
  - Pro: Fits in Lambda easily
  - Con: Missing features, can break on complex PDFs
Startup Time: 2-5 Seconds
Chrome launch time in Lambda (from cold start):
const puppeteer = require('puppeteer-core');
const chromium = require('chrome-aws-lambda');

exports.handler = async (event) => {
  console.time('chrome-launch');
  const browser = await puppeteer.launch({
    args: chromium.args,
    executablePath: await chromium.executablePath,
    headless: true
  });
  console.timeEnd('chrome-launch'); // 2-5 seconds

  try {
    const page = await browser.newPage();
    console.time('pdf-render');
    await page.setContent(event.html);
    const pdf = await page.pdf({ format: 'A4' });
    console.timeEnd('pdf-render'); // 3-10 seconds
    // Return base64 text: Lambda serializes the return value as JSON,
    // so a raw Buffer would not survive as binary.
    return Buffer.from(pdf).toString('base64');
  } finally {
    // Close the browser even if rendering throws
    await browser.close();
  }
};
Timing breakdown (cold start):
- Lambda init: 500ms
- Chrome launch: 2-5s
- PDF render: 3-10s
- Total: 5-15s
Timing breakdown (warm start):
- Lambda reuse: 0ms
- Chrome launch: 2-5s (can't reuse browser across invocations reliably)
- PDF render: 3-10s
- Total: 5-15s (same as cold)
Why Chrome stays slow even when warm: a common Puppeteer practice is to close the browser after each request to limit memory leaks. Browser instances reused across Lambda invocations can crash or accumulate memory, so each request pays the launch cost again.
Memory Usage: 200-500MB Per Instance
Memory breakdown for PDF generation:
| Component | Memory Used |
|---|---|
| Node.js runtime | 50-80MB |
| Chrome process | 150-250MB |
| PDF rendering buffer | 50-200MB |
| Total | 250-530MB |
Lambda cost impact:
AWS Lambda charges by GB-seconds (memory × duration):
- 512MB instance for 10s = 5 GB-seconds
- 1GB instance for 10s = 10 GB-seconds
- Pricing: ~$0.0000166667 per GB-second
- Cost per PDF: $0.00017 (1GB, 10s)
At scale (10,000 PDFs/month):
- Lambda cost: ~$1.67/month (10,000 × $0.00017)
- Add API Gateway, CloudWatch (~$15/month combined): ~$17/month
Compare to external API: $0.01-0.05 per PDF = $100-500/month, but zero infrastructure maintenance.
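The GB-seconds formula is easy to check with a few lines of code. This is a minimal sketch: the function names are made up for the example, and the price constant is the figure quoted in this article, so verify it against current AWS pricing before relying on it.

```javascript
// Price per GB-second as quoted in this article (assumption: may be out of date).
const PRICE_PER_GB_SECOND = 0.0000166667;

// Cost of one invocation: memory (GB) x duration (s) x price per GB-second.
function costPerInvocation(memoryMb, durationSec, price = PRICE_PER_GB_SECOND) {
  return (memoryMb / 1024) * durationSec * price;
}

// Compute-only monthly cost; excludes API Gateway, CloudWatch, and request fees.
function monthlyCost(memoryMb, durationSec, invocationsPerMonth) {
  return costPerInvocation(memoryMb, durationSec) * invocationsPerMonth;
}

console.log(costPerInvocation(1024, 10).toFixed(5)); // "0.00017"
console.log(monthlyCost(1024, 10, 10000).toFixed(2));
```

Halving the memory setting halves the per-invocation cost only if the render does not get slower, which is worth measuring because Lambda allocates CPU in proportion to memory.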
Process Management: Zombie Processes Accumulate
Even with browser.close(), Chrome subprocesses don't always terminate:
// This looks correct but still leaks
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.pdf({ path: 'output.pdf' });
await browser.close(); // Doesn't always kill all Chrome processes
Why it happens:
- Chrome spawns multiple processes (main, renderer, GPU)
- SIGTERM doesn't always propagate to every subprocess
- Lambda force-kills the function at timeout, which can leave Chrome processes behind
- The next warm invocation reuses the container, leftover processes included
Result: Memory accumulates, Lambda eventually crashes or gets throttled.
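A common mitigation is to attempt a graceful close and fall back to killing the main Chrome process if that hangs. The sketch below is hedged accordingly: `closeBrowserHard` and the 2-second grace period are assumptions for this example, while `browser.close()` and `browser.process()` are existing Puppeteer methods. Whether killing the main process also ends every renderer and GPU child depends on the platform, so treat this as a reduction of the problem rather than a guarantee.

```javascript
// Hypothetical helper: try a graceful close first, then force-kill the main
// Chrome process if close() hangs or throws.
async function closeBrowserHard(browser, graceMs = 2000) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), graceMs);
  });
  let outcome;
  try {
    outcome = await Promise.race([browser.close().then(() => 'closed'), timeout]);
  } catch (err) {
    outcome = 'error';
  } finally {
    clearTimeout(timer);
  }
  if (outcome !== 'closed') {
    // browser.process() returns the spawned ChildProcess, or null when
    // Puppeteer connected to an existing browser instead of launching one.
    const proc = browser.process();
    if (proc) proc.kill('SIGKILL');
  }
  return outcome; // 'closed', 'timeout', or 'error'
}

// Usage in a handler's cleanup path:
//   finally { if (browser) await closeBrowserHard(browser); }
```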
What Lambda-specific issues should you watch for?
Short answer: Layers, container reuse, and limited /tmp space create deployment and runtime problems for Chrome in Lambda.
Using Lambda Layers for Chrome
AWS Lambda Layers let you package Chrome separately from code:
Setup:
- Create Layer with Chrome binary + dependencies
- Attach Layer to Lambda function
- Lambda mounts Layer at /opt
Example:
// Lambda function code (references Layer)
const puppeteer = require('puppeteer-core');
const chromium = require('chrome-aws-lambda');

exports.handler = async (event) => {
  // chrome-aws-lambda finds Chrome in /opt
  const browser = await puppeteer.launch({
    args: chromium.args,
    executablePath: await chromium.executablePath
  });
  // ... generate PDF
};
Limitations:
- Size limit: function code plus all layers must fit within 250MB unzipped (Chrome barely fits)
- Cold start still slow (Layer must mount)
- Can't update Chrome without redeploying Layer
Container Reuse Memory Leaks
Lambda reuses containers for 5-15 minutes to avoid cold starts. This causes memory leaks:
First invocation:
- Start: 100MB
- After PDF render: 300MB
- After browser.close(): 250MB (only 50MB released, 150MB retained)
Second invocation (same container):
- Start: 250MB
- After PDF render: 500MB
- After browser.close(): 450MB (350MB retained in total)
After 5-10 invocations:
- Container uses 800MB-1GB
- Lambda OOM (Out of Memory) kills container
- Next request gets cold start
Mitigation (doesn't fully solve):
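// Exit when the Node.js process grows too large so Lambda starts a fresh
// execution environment. Note: rss covers the Node.js process only,
// not Chrome's separate processes.
if (process.memoryUsage().rss > 800 * 1024 * 1024) {
  process.exit(1); // Next invocation gets a cold start
}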
Concurrent Execution Limits
Lambda limits concurrent executions (default 1000, can request increase). Each PDF render uses one execution slot:
Example: 100 concurrent PDF requests
- Each Lambda instance: 10s duration
- Concurrent executions used: 100
- If burst exceeds limit: requests throttled (429 error)
Solution: Queue requests (SQS + Lambda) or use reserved concurrency. But now you're managing queue infrastructure.
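To estimate how close a workload comes to the concurrency limit, Little's law is a reasonable first approximation: average concurrency equals arrival rate multiplied by average duration. The sketch below uses made-up function names and the default limit of 1000 mentioned above; real traffic is bursty, so treat the result as a lower bound.

```javascript
// Little's law: average concurrency = arrival rate x average time per request.
function requiredConcurrency(requestsPerSecond, avgDurationSec) {
  return Math.ceil(requestsPerSecond * avgDurationSec);
}

// True when the estimated concurrency exceeds the account limit (default assumed: 1000).
function wouldThrottle(requestsPerSecond, avgDurationSec, concurrencyLimit = 1000) {
  return requiredConcurrency(requestsPerSecond, avgDurationSec) > concurrencyLimit;
}

console.log(requiredConcurrency(10, 10)); // 100
console.log(wouldThrottle(10, 10));       // false
console.log(wouldThrottle(150, 10));      // true (1500 > 1000)
```

The same arithmetic shows why render duration matters so much: cutting a 10s render to 1s reduces the concurrency needed for the same traffic by a factor of ten.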
Cost at Scale (GB-Seconds Pricing)
AWS Lambda cost formula: (Memory in GB) × (Duration in seconds) × $0.0000166667
Example: 10,000 PDFs/month
- Memory: 1GB
- Duration: 10s average
- Cost: 10,000 × (1 × 10 × $0.0000166667) = ~$1.67
- Add API Gateway: +$10
- Add CloudWatch Logs: +$5
- Total: ~$17/month
Compare to EC2 instance:
- t3.medium (2 vCPU, 4GB): $30/month
- Can handle 10,000 PDFs/month easily
- More predictable cost
Compare to external API:
- $0.01-0.05 per PDF
- 10,000 PDFs = $100-500/month
- But: Zero infrastructure, zero maintenance time (worth 5-10 dev hours saved)
Architectural Alternatives
1. External Rendering Service (API-Based)
Architecture:
Your Lambda → API (https://api.hundreddocs.com/v1/pdf) → PDF Binary
Code:
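// Lambda handler (no Chrome, fast)
exports.handler = async (event) => {
  const response = await fetch('https://api.hundreddocs.com/v1/pdf', {
    method: 'POST',
    headers: {
      'X-API-Key': process.env.API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      templateId: 'invoice-template',
      data: event.data
    })
  });
  if (!response.ok) {
    throw new Error(`API error: ${response.status}`);
  }
  // Lambda serializes the return value as JSON, so encode the binary PDF as base64
  return Buffer.from(await response.arrayBuffer()).toString('base64');
};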
Pros:
- No Chrome binary (0MB)
- Fast (<1s response)
- No memory leaks
- No timeout issues
Cons:
- External dependency (requires internet)
- Per-PDF cost
- Less control
When to use: >100 PDFs/day, serverless deployment, no infrastructure maintenance
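Because the external dependency is the main drawback listed above, callers usually add a request timeout and a small number of retries. The following is one possible sketch: `withRetry`, the retry count, and the backoff delays are assumptions for this example, not part of any particular API's client library. `AbortSignal.timeout` is available in recent Node.js versions (18 and later).

```javascript
// Hypothetical wrapper: retries an async request function with exponential backoff.
async function withRetry(requestFn, { retries = 2, baseDelayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await requestFn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Wait 200ms, 400ms, 800ms, ... before the next attempt.
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}

// Usage (sketch) with the global fetch; each attempt is aborted after 5 seconds:
//   const pdf = await withRetry(async () => {
//     const res = await fetch(url, { ...options, signal: AbortSignal.timeout(5000) });
//     if (!res.ok) throw new Error(`API error: ${res.status}`);
//     return Buffer.from(await res.arrayBuffer());
//   });
```

Keep the total of timeouts and backoff delays below the function's own timeout, otherwise the retries themselves become a source of timeouts.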
2. Pre-Warmed Pool (Keep Chrome Warm)
Architecture:
- Run EC2 instances with Chrome already launched
- Lambda sends requests to pool
- Pool returns PDFs
Pros:
- No cold starts (Chrome always warm)
- Faster than Lambda+Puppeteer
Cons:
- Complex to manage (health checks, auto-scaling)
- Not serverless anymore (EC2 costs)
- Still has memory leaks (need process restarts)
When to use: Rare. Only if you need full control + low latency + high volume.
3. Long-Running Container (ECS/Fargate)
Architecture:
- Deploy Chrome in ECS container
- Container runs 24/7
- API Gateway → ECS service
Pros:
- Chrome stays warm
- More memory available
- Better process management
Cons:
- Not serverless (always-on cost)
- Fargate: ~$30-100/month minimum
- Must handle scaling, load balancing
When to use: >1000 PDFs/day, need full control, okay with always-on cost
4. Queue + Worker (Async Processing)
Architecture:
API Request → SQS Queue → Lambda Worker (Puppeteer) → S3 → Callback
Pros:
- Decouples request from rendering
- Can batch multiple PDFs
- Retries on failure
Cons:
- Adds latency (async = not real-time)
- More infrastructure (SQS, S3, DLQ)
- Still has Chrome memory/timeout issues
When to use: Batch generation, can tolerate 1-5 min delay
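The worker stage of this pipeline can be sketched as an SQS-triggered handler that reports failures per message, so one bad document does not force the whole batch to be retried. Assumptions are labeled: `makeSqsWorker` and `renderAndStore` are hypothetical names, and the `batchItemFailures` response shape only takes effect when partial batch responses (`ReportBatchItemFailures`) are enabled on the event source mapping.

```javascript
// Sketch of an SQS-triggered worker. `renderAndStore` is a hypothetical function
// you supply: it renders one PDF and uploads it (for example to S3).
function makeSqsWorker(renderAndStore) {
  return async function handler(event) {
    const batchItemFailures = [];
    for (const record of event.Records) {
      try {
        const job = JSON.parse(record.body);
        await renderAndStore(job);
      } catch (err) {
        console.error('Render failed for message', record.messageId, err.message);
        // Report only this message as failed so SQS redelivers it alone.
        batchItemFailures.push({ itemIdentifier: record.messageId });
      }
    }
    return { batchItemFailures };
  };
}

// Usage: exports.handler = makeSqsWorker(myRenderAndStore);
```

Processing records sequentially keeps only one render in memory at a time, which matters given the per-render memory figures discussed earlier.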
Latency Comparison
Real-world latency measurements for generating a 5-page invoice PDF:
| Architecture | p50 | p95 | p99 | Cold Start |
|---|---|---|---|---|
| Lambda + Puppeteer (cold) | 8s | 15s | 20s | Every time after 5-15min idle |
| Lambda + Puppeteer (warm) | 7s | 12s | 18s | Rare (memory leaks kill container) |
| External API (Hundred Docs) | 400ms | 800ms | 1.2s | N/A (stateless) |
| ECS Container | 3s | 5s | 7s | None (always warm) |
| Queue + Worker | 30s | 60s | 120s | N/A (async) |
Key insight: Even a warm Lambda with Puppeteer is roughly 15-18x slower than the external API in these measurements (7s vs 400ms at p50, 18s vs 1.2s at p99), due to Chrome startup overhead.
Cost Comparison (10,000 PDFs/month)
| Architecture | Infrastructure Cost | Maintenance Time | Total Cost (dev @ $100/hr) |
|---|---|---|---|
| Lambda + Puppeteer | $30-50/month | 5-10 hrs/month | $530-1050/month |
| External API | $100-500/month | 0 hrs/month | $100-500/month |
| ECS Container | $60-150/month | 8-15 hrs/month | $860-1650/month |
Maintenance includes: Debugging timeouts, handling memory leaks, monitoring, scaling configuration, Chrome updates.
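The "Total Cost" column is infrastructure cost plus maintenance hours multiplied by the hourly rate. A short worked example, using a hypothetical function name and the $100/hr rate from the table header:

```javascript
// Total monthly cost = infrastructure + maintenance hours x developer hourly rate.
function totalMonthlyCost(infraUsd, maintenanceHours, hourlyRateUsd = 100) {
  return infraUsd + maintenanceHours * hourlyRateUsd;
}

console.log(totalMonthlyCost(30, 5));   // 530  (Lambda + Puppeteer, low end)
console.log(totalMonthlyCost(50, 10));  // 1050 (Lambda + Puppeteer, high end)
console.log(totalMonthlyCost(100, 0));  // 100  (External API, low end)
```

The result is dominated by the maintenance term, so the comparison is most sensitive to how many hours per month your team actually spends on Chrome-related issues.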
Code Comparison
Lambda with Puppeteer (Complex, Slow)
// lambda-pdf-generator.js
const chromium = require('chrome-aws-lambda');
const puppeteer = require('puppeteer-core');

exports.handler = async (event) => {
  let browser = null;
  try {
    console.log('Launching Chrome...');
    // Cold start: 2-5 seconds
    browser = await puppeteer.launch({
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      executablePath: await chromium.executablePath,
      headless: chromium.headless
    });
    const page = await browser.newPage();
    // Generate HTML from data
    const html = generateInvoiceHTML(event.data);
    console.log('Rendering PDF...');
    // 3-10 seconds depending on complexity
    await page.setContent(html, { waitUntil: 'networkidle0' });
    const pdf = await page.pdf({
      format: 'A4',
      printBackground: true,
      margin: { top: '1cm', bottom: '1cm' }
    });
    return {
      statusCode: 200,
      headers: { 'Content-Type': 'application/pdf' },
      // Buffer.from handles both Buffer and Uint8Array results from page.pdf()
      body: Buffer.from(pdf).toString('base64'),
      isBase64Encoded: true
    };
  } catch (error) {
    console.error('PDF generation failed:', error);
    return {
      statusCode: 500,
      body: JSON.stringify({ error: error.message })
    };
  } finally {
    if (browser !== null) {
      await browser.close(); // Doesn't always work
    }
  }
};

function generateInvoiceHTML(data) {
  // 50-100 lines of HTML string concatenation...
  return `<html>...</html>`;
}
- Deployment size: 250MB (Chrome binary + dependencies)
- Cold start: 5-8s
- Warm start: 5-7s (Chrome launch still needed)
- Memory: 512MB minimum, 1GB recommended
Lambda with External API (Simple, Fast)
// lambda-pdf-generator.js
exports.handler = async (event) => {
  try {
    const response = await fetch('https://api.hundreddocs.com/v1/pdf', {
      method: 'POST',
      headers: {
        'X-API-Key': process.env.HUNDRED_DOCS_API_KEY,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        templateId: 'invoice-template',
        data: event.data
      })
    });
    if (!response.ok) {
      throw new Error(`API error: ${response.statusText}`);
    }
    const pdfBuffer = await response.arrayBuffer();
    return {
      statusCode: 200,
      headers: { 'Content-Type': 'application/pdf' },
      body: Buffer.from(pdfBuffer).toString('base64'),
      isBase64Encoded: true
    };
  } catch (error) {
    console.error('PDF generation failed:', error);
    return {
      statusCode: 500,
      body: JSON.stringify({ error: error.message })
    };
  }
};
- Deployment size: <10MB (no Chrome)
- Cold start: 200-500ms
- Warm start: 50-200ms
- Memory: 128MB sufficient, 256MB recommended
Lines of code: 30 vs 100+ (no HTML generation, no browser management)
Technical takeaway: Serverless environments impose strict limits (cold starts, memory, timeouts, deployment size) that conflict with Chrome's resource requirements (150-300MB binary, 2-5s startup, 200-500MB memory, zombie processes). For <100 PDFs/day, Lambda + Puppeteer works with careful tuning. For >100 PDFs/day or Vercel/Cloudflare deployment, external rendering APIs eliminate Chrome entirely, achieving <1s response times with zero infrastructure maintenance.