Puppeteer PDF Memory Leak in Production

Why does Puppeteer leak memory when generating PDFs?

Short answer: Puppeteer launches real Chrome instances that can leave behind async references (promises, sockets, timers), causing memory to accumulate across renders.

Puppeteer causes memory leaks when generating PDFs because Chrome instances don't always close cleanly: pending promises, timers, and WebSocket connections to the Chrome DevTools Protocol keep references alive in the Node.js event loop. Over 50-200 renders, a leak of a few MB per render can grow to gigabytes, eventually crashing your process or timing out your Lambda function. This can happen even when you call browser.close() in a try/finally block.

Why does this happen?

When you launch a Puppeteer browser, you're starting a full Chrome process that communicates with Node.js via WebSockets (Chrome DevTools Protocol). Each render creates references in the Node.js event loop:

Pending promises: Asynchronous operations that haven't fully resolved, even after the page closes.
WebSocket connections: The CDP channel stays open if not properly terminated.
Timers and callbacks: Timeouts (such as navigation and protocol timeouts) and callbacks you register may hold references to the browser or page object.
Event listeners: Listeners attached to the browser or page that aren't cleaned up.

When you call browser.close(), it sends a signal to Chrome to terminate, but if any of these references remain, the garbage collector can't free the memory. The Chrome process may exit, but Node.js still holds memory allocated during the render.
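One way to see which of these references are still alive is to ask Node.js directly. The sketch below is a minimal example, assuming Node.js 17.3 or later (the API was also backported to 16.14 and is marked experimental); countActiveResources is a hypothetical helper name, not part of Puppeteer.

```javascript
// Count the resources currently keeping the Node.js event loop alive, grouped by type.
// process.getActiveResourcesInfo() returns an array of type names such as 'Timeout'.
function countActiveResources() {
  const counts = {}
  for (const type of process.getActiveResourcesInfo()) {
    counts[type] = (counts[type] ?? 0) + 1
  }
  return counts
}

// Log a snapshot before and after a render and compare the two.
console.log('Active resources:', countActiveResources())
```

If the counts for timers or sockets keep rising after each render has finished and the browser has been closed, something is still holding a reference.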

What does the memory growth pattern look like?

Short answer: Each render leaks a small amount; over dozens of renders leaked memory compounds and eventually triggers OOMs or crashes.

Here's an illustrative trajectory, assuming each render uses about 200MB and leaves about 50MB behind:

Render 1: 200MB used → closes → 50MB remains (leaked references)
Render 10: 2GB allocated in total (200MB × 10) → about 500MB still retained (50MB × 10)
Render 50: Retained memory climbs to 2.5GB → process crashes or becomes unresponsive
Render 100: On a 3GB Lambda, you'll OOM before reaching this point (50MB per render exhausts 3GB at around render 60)

Each render adds a small amount of leaked memory. Initially, this is invisible. After dozens of renders, you see:

Increasing GC pause times
Slower response times
Process memory climbing steadily
Eventually: Out of Memory errors or crashes
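
The arithmetic behind this pattern can be sketched as a small projection. This is a simplified model, assuming a constant leak per render and a fixed working set during each render; firstFailingRender and its parameters are hypothetical names, and real leaks are rarely this regular.

```javascript
// Estimate which render first exceeds the memory limit.
// Render k peaks at (k - 1) * leakPerRenderMb + workingSetMb.
function firstFailingRender(limitMb, workingSetMb, leakPerRenderMb) {
  if (leakPerRenderMb <= 0) return Infinity // no leak, no projected failure
  if (workingSetMb > limitMb) return 1 // even the first render does not fit
  return Math.floor((limitMb - workingSetMb) / leakPerRenderMb) + 2
}

// 3GB limit, 200MB working set, 50MB leaked per render
console.log(firstFailingRender(3072, 200, 50)) // prints 59
```

Plugging in your own measured leak per render gives a rough estimate of how many renders a process can survive before it needs a restart.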

Why is this worse in serverless?

Short answer: Warm container reuse and limited memory make leaked allocations accumulate quickly across invocations.

Serverless environments amplify this problem:

Container reuse: Lambda containers often stay warm for roughly 5-15 minutes, although AWS does not guarantee a duration. Multiple invocations accumulate leaked memory in the same container.
No process restart: Unlike a traditional server that might restart daily, serverless containers only restart when they go cold.
Limited memory: Lambda functions typically run with 512MB-3GB. A few hundred renders can exhaust available memory.
Unpredictable timing: You don't know when a container will be reused, so leaks accumulate unpredictably.

Example: A Lambda with 1GB memory processes PDF requests while warm. Each leaks 20MB. After 10 invocations, you've leaked 200MB. After 50, you're at 1GB and the next request fails.
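
To confirm that a warm container is accumulating memory, you can keep a counter in module scope, which runs once per cold start and persists across warm invocations. The sketch below is a minimal example; recordInvocation and makeHandler are hypothetical names, and renderPdf stands in for your own render function.

```javascript
// Module scope runs once per cold start and survives while the container stays warm.
let invocationCount = 0
const coldStartRssMb = process.memoryUsage().rss / 1024 / 1024

// Return the invocation number and how far RSS has grown since the cold start.
function recordInvocation() {
  invocationCount += 1
  const rssMb = process.memoryUsage().rss / 1024 / 1024
  return {
    invocation: invocationCount,
    rssMb: Number(rssMb.toFixed(1)),
    growthSinceColdStartMb: Number((rssMb - coldStartRssMb).toFixed(1)),
  }
}

// Hypothetical wiring: wrap your render function and log one record per invocation.
function makeHandler(renderPdf) {
  return async function handler(event) {
    const pdf = await renderPdf(event.html)
    console.log(JSON.stringify(recordInvocation()))
    return pdf
  }
}
```

An invocation number above 1 means the container was reused, and a growth figure that rises with it is the accumulation described above. Note that rss here covers the Node.js process only, not Chrome's own processes.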

Why doesn't `browser.close()` fix leaks?

Short answer: browser.close() signals Chrome to exit but doesn't guarantee all async references (promises, sockets, timers) are cleaned up immediately.

This code looks correct but still leaks:

// looks-correct-but-leaks.js
import puppeteer from 'puppeteer'

async function renderPDF(html) {
  let browser
  try {
    browser = await puppeteer.launch({ args: ['--no-sandbox'] })
    const page = await browser.newPage()
    await page.setContent(html, { waitUntil: 'networkidle0' })
    const pdf = await page.pdf({ format: 'A4' })
    return pdf
  } finally {
    if (browser) await browser.close() // ⚠️ This doesn't guarantee cleanup
  }
}

Why this still leaks:

browser.close() is asynchronous: it asks Chrome to terminate, and Chrome's helper processes may still be shutting down when it returns.
If page.setContent() or page.pdf() created pending promises or timers, those references persist.
The DevTools Protocol WebSocket may not close immediately, keeping the connection alive.
Any uncaught exceptions during page operations leave the browser in an inconsistent state.

Even with perfect error handling, Node.js may hold references to Chrome's internal objects, preventing garbage collection.
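
A stricter shutdown can narrow some of these failure modes, though it does not remove them. The sketch below is one possible approach, assuming a Puppeteer Browser object (it uses browser.close() and browser.process(), which returns null for browsers attached with puppeteer.connect); closeBrowserHard and timeoutMs are hypothetical names.

```javascript
// Close the browser, but force-kill the Chrome process if close() hangs or throws.
async function closeBrowserHard(browser, timeoutMs = 5000) {
  let timer
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs)
  })
  try {
    const outcome = await Promise.race([
      browser.close().then(() => 'closed'),
      timeout,
    ])
    if (outcome === 'timeout') {
      // close() did not settle in time, so kill the child process directly
      browser.process()?.kill('SIGKILL')
    }
    return outcome
  } catch (error) {
    browser.process()?.kill('SIGKILL')
    return 'killed-after-error'
  } finally {
    // Always clear the timer so this helper does not leave one behind
    clearTimeout(timer)
  }
}
```

You would call this in the finally block in place of a bare browser.close(). It bounds how long shutdown can take, but it cannot release references that your own code or Puppeteer still holds.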

Why doesn't connection pooling fully solve the problem?

Short answer: Pooling reduces launches but keeps browser instances alive longer, so leaked memory still accumulates inside workers.

Tools like puppeteer-cluster reduce the frequency of browser launches by reusing instances:

// puppeteer-cluster-still-leaks.js
import { Cluster } from 'puppeteer-cluster'

const cluster = await Cluster.launch({
  concurrency: Cluster.CONCURRENCY_CONTEXT,
  maxConcurrency: 2,
})

await cluster.task(async ({ page, data: html }) => {
  await page.setContent(html)
  return await page.pdf({ format: 'A4' })
})

// This helps but doesn't eliminate leaks
for (let i = 0; i < 1000; i++) {
  await cluster.execute('<html>...</html>')
  // Memory still grows slowly over 100-1000 renders
}

await cluster.close()

Why pooling delays but doesn't fix the problem:

The pool keeps browser instances alive longer, which reduces startup overhead.
But leaked memory accumulates within each worker over time.
After 100-1000 renders, even pooled workers crash or become unresponsive.
Serverless functions eventually hit memory limits and fail.

Connection pooling is a good optimization for reducing browser startup overhead, but it doesn't address the root cause of memory leaks.
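
A common mitigation when you reuse a browser is to recycle it after a fixed number of renders, which caps how much can accumulate in any one instance. The sketch below is a minimal, sequential-only example (it does not handle concurrent calls); createRecyclingRenderer, launchBrowser, and maxRendersPerBrowser are hypothetical names, and launchBrowser would typically be () => puppeteer.launch().

```javascript
// Reuse one browser for up to maxRendersPerBrowser renders, then replace it.
function createRecyclingRenderer(launchBrowser, maxRendersPerBrowser = 50) {
  let browser = null
  let rendersOnBrowser = 0

  return async function renderPdf(html) {
    if (browser === null) {
      browser = await launchBrowser()
      rendersOnBrowser = 0
    }
    const page = await browser.newPage()
    try {
      await page.setContent(html)
      return await page.pdf({ format: 'A4' })
    } finally {
      await page.close()
      rendersOnBrowser += 1
      if (rendersOnBrowser >= maxRendersPerBrowser) {
        // Retire this browser; the next call launches a fresh one
        const retired = browser
        browser = null
        await retired.close()
      }
    }
  }
}
```

This bounds growth on the Chrome side, but memory retained inside the Node.js process itself still accumulates, so it delays the problem rather than removing it.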

How can you detect memory leaks in production?

Short answer: Monitor Node.js RSS/heap metrics, GC pause times, and Lambda/container memory over time to spot steady growth after renders.

To detect memory leaks in production, monitor these metrics:

// monitor-memory.js
import puppeteer from 'puppeteer'

async function renderWithMonitoring(html) {
  const beforeMemory = process.memoryUsage().heapUsed
  console.log(`Before render: ${(beforeMemory / 1024 / 1024).toFixed(2)} MB`)

  const browser = await puppeteer.launch({ args: ['--no-sandbox'] })
  let pdf
  try {
    const page = await browser.newPage()
    await page.setContent(html)
    pdf = await page.pdf()
  } finally {
    // Close the browser even if rendering throws
    await browser.close()
  }

  // heapUsed covers only the Node.js heap and still includes garbage that
  // has not been collected yet, so judge the trend over many renders
  const afterMemory = process.memoryUsage().heapUsed
  console.log(`After render: ${(afterMemory / 1024 / 1024).toFixed(2)} MB`)
  console.log(`Heap delta: ${((afterMemory - beforeMemory) / 1024 / 1024).toFixed(2)} MB`)

  return pdf
}

Metrics to watch:

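process.memoryUsage().heapUsed: Node.js JavaScript heap in use
process.memoryUsage().rss: Resident memory of the Node.js process only (Chrome runs as separate processes, so measure those at the OS or container level)
GC frequency and pause times (using the --trace-gc flag or a perf_hooks PerformanceObserver watching 'gc' entries)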
Lambda/container memory usage over time

If heapUsed or rss keeps climbing across many renders and does not fall back after garbage collection, you have a leak.
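
A single before/after reading is noisy, so it helps to fit a trend over many samples. The sketch below is one way to do that with a least-squares slope; growthPerRenderMb and sampleRssMb are hypothetical helper names, and the forced collection only happens when Node.js is started with --expose-gc.

```javascript
// Take one memory sample for the Node.js process (not Chrome), in MB.
function sampleRssMb() {
  globalThis.gc?.() // defined only when Node.js runs with --expose-gc
  return process.memoryUsage().rss / 1024 / 1024
}

// Least-squares slope of the samples: average MB gained per render.
function growthPerRenderMb(samplesMb) {
  const n = samplesMb.length
  if (n < 2) return 0
  const meanX = (n - 1) / 2
  const meanY = samplesMb.reduce((sum, y) => sum + y, 0) / n
  let numerator = 0
  let denominator = 0
  samplesMb.forEach((y, x) => {
    numerator += (x - meanX) * (y - meanY)
    denominator += (x - meanX) ** 2
  })
  return numerator / denominator
}

console.log(growthPerRenderMb([100, 120, 140, 160])) // prints 20
```

Call sampleRssMb() after each render, keep the last few dozen values, and alert when the slope stays clearly above zero.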

What's the architectural fix?

Short answer: Move to a stateless rendering architecture: delegate Chrome lifecycle to an external service that renders per-request and cleans up fully.

The fundamental issue is that Puppeteer ties your application to a stateful, long-running Chrome process. Every render creates side effects that accumulate over time.

Stateless rendering architecture:

Instead of managing Chrome in your application, delegate rendering to an external service that handles Chrome lifecycle completely separately:

Your app sends JSON (data) + template ID
External service handles Chrome launch, render, and cleanup
Your app receives a PDF buffer
No Chrome binary in your deployment
No memory accumulates in your application

Before (Puppeteer with proper cleanup - still leaks)

// puppeteer-proper-cleanup.js
import puppeteer from 'puppeteer'

async function generateInvoicePDF(invoiceData) {
  let browser
  try {
    browser = await puppeteer.launch({
      args: ['--no-sandbox', '--disable-setuid-sandbox'],
      headless: true
    })
    const page = await browser.newPage()

    // Generate HTML from data
    const html = `
      <html>
        <body>
          <h1>Invoice #${invoiceData.number}</h1>
          <p>Customer: ${invoiceData.customer}</p>
          <table>
            ${invoiceData.items.map(item => `
              <tr>
                <td>${item.description}</td>
                <td>$${item.amount}</td>
              </tr>
            `).join('')}
          </table>
          <p>Total: $${invoiceData.total}</p>
        </body>
      </html>
    `

    await page.setContent(html, { waitUntil: 'networkidle0' })
    const pdf = await page.pdf({ format: 'A4', printBackground: true })

    return pdf
  } catch (error) {
    console.error('PDF generation failed:', error)
    throw error
  } finally {
    if (browser) {
      await browser.close()
      // ⚠️ Even with this cleanup, memory leaks accumulate over 50-200 renders
    }
  }
}

// Usage
const invoice = {
  number: 'INV-2024-001',
  customer: 'Acme Corp',
  items: [
    { description: 'Consulting', amount: 5000 },
    { description: 'Development', amount: 15000 }
  ],
  total: 20000
}

const pdf = await generateInvoicePDF(invoice)
// After 100 invocations in the same Lambda container, you'll see OOM errors

After (Stateless API - no memory leaks)

// hundred-docs-stateless.js
async function generateInvoicePDF(invoiceData) {
  const response = await fetch('https://api.hundreddocs.com/v1/pdf', {
    method: 'POST',
    headers: {
      'X-API-Key': process.env.HUNDRED_DOCS_API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      templateId: 'invoice-template-uuid', // Created once in UI
      data: invoiceData
    })
  })

  if (!response.ok) {
    throw new Error(`PDF generation failed: ${response.statusText}`)
  }

  const pdfBuffer = await response.arrayBuffer()
  return Buffer.from(pdfBuffer)
  
  // ✅ No Chrome binary to manage
  // ✅ No memory leaks in your app
  // ✅ Consistent sub-second performance
  // ✅ No Lambda timeouts
}

// Usage - same data structure
const invoice = {
  number: 'INV-2024-001',
  customer: 'Acme Corp',
  items: [
    { description: 'Consulting', amount: 5000 },
    { description: 'Development', amount: 15000 }
  ],
  total: 20000
}

const pdf = await generateInvoicePDF(invoice)
// Works reliably for 1, 100, or 10,000 renders without memory growth

Why this fixes the memory leak:

Your application makes a stateless HTTP request (no long-lived connections)
Chrome lifecycle is managed externally and cleaned up completely after each render
No memory accumulates in your application's event loop
Each request is independent, so warm container reuse no longer accumulates state
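
Moving rendering behind HTTP trades memory management for network failure handling, so a timeout and a bounded retry are worth adding. The sketch below is a generic wrapper, assuming Node.js 18 or later (global fetch and AbortSignal.timeout) and assuming the render request is safe to repeat; postWithRetry and its options are hypothetical names, not part of any vendor SDK.

```javascript
// POST with a per-attempt timeout; retry on network errors and 5xx responses.
async function postWithRetry(url, init, options = {}) {
  const { attempts = 3, timeoutMs = 10000, baseDelayMs = 200, fetchImpl = fetch } = options
  let lastError
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const response = await fetchImpl(url, {
        ...init,
        signal: AbortSignal.timeout(timeoutMs),
      })
      if (response.ok) return response
      if (response.status < 500) {
        // Client errors will not succeed on retry, so fail immediately
        throw Object.assign(new Error(`Request rejected: ${response.status}`), { retryable: false })
      }
      lastError = new Error(`Server error: ${response.status}`)
    } catch (error) {
      if (error.retryable === false) throw error
      lastError = error
    }
    if (attempt < attempts) {
      // Exponential backoff between attempts
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** (attempt - 1)))
    }
  }
  throw lastError
}
```

In the stateless example, you would call postWithRetry with the same URL, headers, and body in place of the direct fetch call.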

When should you use Puppeteer vs a managed API?

Short answer: Use Puppeteer for low-volume or highly custom renders; use managed APIs for scale, reliability, and serverless deployments.

Use Puppeteer when:

Generating fewer than 10 PDFs per day
You need full control over Chrome flags and rendering behavior
You're okay managing infrastructure (EC2, Docker, monitoring)
Development/testing environments where leaks aren't critical

Use a managed API when:

Generating 100+ PDFs per day
Running in serverless environments (Lambda, Vercel, Cloud Functions)
You want predictable costs and performance
You don't want to manage Chrome infrastructure
Memory leaks or timeouts have caused production issues

Related guides: the guides below on Puppeteer alternatives, serverless timeouts, and invoice generation cover practical migration steps.

Puppeteer PDF Alternative - When managed APIs beat self-hosted Chrome
Vercel Puppeteer Timeout - Why Puppeteer exceeds serverless timeout limits
Serverless PDF Generation - Architecture patterns for stateless rendering
Invoice PDF Generation API - Practical example of stateless PDF generation

Tools like Hundred Docs solve this problem by handling Chrome instance management and memory cleanup externally, letting you send JSON and receive PDFs without managing infrastructure or debugging memory leaks.