Bulk PDF Report Generation: Architecture Patterns

Bulk PDF generation (hundreds to thousands of PDFs) breaks under serial processing for three reasons: memory accumulates with each render, CPU becomes the bottleneck at 3-15 seconds per PDF, and batch jobs hit timeouts of 15-60 minutes. If you're generating monthly reports for 1000 customers with Puppeteer in a loop, expect memory to grow from about 250MB to 1.5GB after 100 renders and the process to crash somewhere between 200 and 500 renders, if a job timeout doesn't kill it first.

Scale Challenges

Generating a single PDF is different from generating 1000 PDFs. Here's what breaks:

Memory Accumulation When Generating 1000+ PDFs

The problem: Even with proper browser.close(), memory leaks accumulate.

Memory growth pattern (Puppeteer):

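| PDFs Generated | Process Memory | Chrome Zombies | Status             |
|----------------|----------------|----------------|--------------------|
| 1              | 250 MB         | 0              | ✅ OK              |
| 10             | 400 MB         | 1-2            | ✅ OK              |
| 50             | 800 MB         | 5-8            | ⚠️ Slow            |
| 100            | 1.5 GB         | 12-15          | ⚠️ Very slow       |
| 200            | 2.5 GB         | 25-30          | ❌ Crashes soon    |
| 500            | N/A            | N/A            | ❌ Already crashed |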

Why memory grows:

  1. Chrome spawns subprocesses (renderer, GPU, network)
  2. browser.close() sends SIGTERM to main process
  3. Subprocesses don't always receive signal
  4. Zombie processes accumulate
  5. Node.js event loop holds references
  6. Garbage collection can't free memory

Real-world impact:

// This looks correct but crashes after ~200 PDFs
async function generateBatch(customers) {
  for (const customer of customers) {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.setContent(generateHTML(customer));
    await page.pdf({ path: `report-${customer.id}.pdf` });
    await browser.close(); // Doesn't prevent memory leak
  }
}

// Memory after 100 iterations: 1.5GB
// Memory after 200 iterations: Process killed by OS
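One way to reduce (not eliminate) the leftover processes described above is to always close the browser in a finally block and force-kill the Chrome process if a graceful close hangs. The sketch below assumes the object returned by puppeteer.launch() exposes close() and process(), which are part of Puppeteer's Browser API. The helper names closeBrowserHard and withBrowser and the 5-second timeout are illustrative choices, not part of the original code.

```javascript
// Close a browser gracefully, then force-kill the Chrome process if close() hangs or fails.
// `browser` is expected to look like a Puppeteer Browser: close() and process().
async function closeBrowserHard(browser, timeoutMs = 5000) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });
  try {
    const outcome = await Promise.race([
      browser.close().then(() => 'closed'),
      timeout,
    ]);
    if (outcome === 'closed') return 'closed';
  } catch (err) {
    // close() failed; fall through to the force kill below
  } finally {
    clearTimeout(timer);
  }
  const child = browser.process(); // null when there is no local child process
  if (child) child.kill('SIGKILL');
  return 'killed';
}

// Run `fn` with a fresh browser and guarantee cleanup, even when `fn` throws.
// `launch` is injected, e.g. () => puppeteer.launch().
async function withBrowser(launch, fn) {
  const browser = await launch();
  try {
    return await fn(browser);
  } finally {
    await closeBrowserHard(browser);
  }
}
```

In the loop above, each iteration would become a call such as withBrowser(() => puppeteer.launch(), async (browser) => { ... }). This narrows the window for orphaned processes but does not address memory held by the Node.js process itself.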

CPU Bottlenecks in Serial Processing

Single-threaded bottleneck:

  • Average PDF render: 5 seconds
  • 1000 PDFs × 5 seconds = 5000 seconds = 83 minutes
  • Plus overhead (HTML generation, DB queries): 100-120 minutes

Why CPU is the limit:

  • Chrome rendering is CPU-intensive
  • Node.js is single-threaded (one PDF at a time)
  • Each PDF blocks the next

Real numbers from production:

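| Report Complexity   | Render Time | 1000 PDFs (serial) | CPU Usage |
|---------------------|-------------|--------------------|-----------|
| Simple (1-2 pages)  | 3s          | 50 min             | 80-100%   |
| Medium (5-10 pages) | 7s          | 117 min            | 90-100%   |
| Complex (20+ pages) | 15s         | 250 min            | 95-100%   |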
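The serial totals in the table follow directly from count × render time. As a quick sanity check, a minimal sketch of that arithmetic is below; the function names are illustrative, and the second helper shows how many renders fit inside a fixed time limit such as a 60-minute job window.

```javascript
// Estimate wall-clock minutes for rendering PDFs one at a time.
function serialMinutes(pdfCount, secondsPerPdf) {
  return (pdfCount * secondsPerPdf) / 60;
}

// How many PDFs finish before a job timeout (in minutes) kills the process.
function pdfsBeforeTimeout(timeoutMinutes, secondsPerPdf) {
  return Math.floor((timeoutMinutes * 60) / secondsPerPdf);
}

console.log(serialMinutes(1000, 3));   // 50
console.log(serialMinutes(1000, 7));   // roughly 116.7
console.log(serialMinutes(1000, 15));  // 250
console.log(pdfsBeforeTimeout(60, 5)); // 720
```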

Timeout Issues in Batch Jobs

Common timeout scenarios:

1. Cron job timeout (60 min max):

// Runs every month to generate customer reports
// Times out after 60 minutes if >720 PDFs (3600s / 5s each)
async function monthlyReportJob() {
  const customers = await db.customers.findAll(); // 1000 customers
  
  for (const customer of customers) {
    await generatePDF(customer); // 5s each
  }
  // Total time: 5000s = 83 minutes
  // Job killed at 60 minutes, only 720 PDFs generated
}

2. HTTP request timeout (30s-60s):

// API endpoint: POST /api/reports/generate-all
// Client timeout: 60s
// Actual time for 100 PDFs: 500s (8+ minutes)
app.post('/api/reports/generate-all', async (req, res) => {
  const customers = await db.customers.findAll();
  const pdfs = [];
  for (const customer of customers) {
    pdfs.push(await generatePDF(customer));
  }
  res.json({ pdfs }); // Never reaches here, client already timed out
});

3. Lambda timeout (15 min max):

// AWS Lambda max timeout: 15 minutes
// After 900 seconds, function forcibly killed
// Any in-progress PDFs lost

Storage and Delivery at Scale

The problem: 1000 PDFs = 500MB-2GB total

Where to store:

  • Memory: 1000 × 2MB average = 2GB (Node.js crashes)
  • Disk: Lambda's /tmp defaults to 512MB (insufficient unless ephemeral storage is raised, up to 10GB)
  • S3: Need to upload 1000 files (adds time)

Delivery challenges:

  • Email 1000 PDFs: 1000 SMTP connections (rate limited)
  • Zip 1000 PDFs: 2GB zip file (too large for browser download)
  • Generate on-demand: User waits 83 minutes (unacceptable)

Architecture Patterns

Four approaches to handle bulk PDF generation:

Pattern 1: Synchronous Batch (Simple but Slow)

Architecture:

API Request → Generate PDF 1 → Generate PDF 2 → ... → Generate PDF 1000 → Response

Code:

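// POST /api/reports/batch
const puppeteer = require('puppeteer');

async function generateBatch(req, res) {
  const { customerIds } = req.body;
  const pdfs = [];
  const browser = await puppeteer.launch();

  try {
    for (const customerId of customerIds) {
      const customer = await db.customers.findById(customerId);
      const page = await browser.newPage();
      await page.setContent(generateReportHTML(customer));
      const pdf = await page.pdf({ format: 'A4' });
      await page.close();
      // Base64-encode so the binary PDF survives JSON serialization
      pdfs.push({ customerId, pdf: Buffer.from(pdf).toString('base64') });
    }
  } finally {
    await browser.close(); // Release Chrome even if a render throws
  }

  res.json({ pdfs });
}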

Pros:

  • Simple to implement
  • Easy to debug (sequential, predictable)
  • No additional infrastructure

Cons:

  • Slow: 100 PDFs = 8+ minutes (at 5s each)
  • Memory leaks: Crashes after 200-500 PDFs
  • Timeout: HTTP request times out
  • Blocking: Ties up server thread

When to use: <10 PDFs at a time, low volume

Pattern 2: Queue + Workers (Scalable, Complex)

Architecture:

API Request → Add jobs to queue → Return job IDs
                    ↓
         [Queue: BullMQ, SQS, etc.]
                    ↓
     Worker 1, Worker 2, Worker 3 (parallel)
                    ↓
         Generate PDFs → Store in S3 → Notify

Code (using BullMQ):

1. API endpoint (adds jobs to queue):

// POST /api/reports/batch
const { randomUUID } = require('crypto');
const { Queue } = require('bullmq');
const pdfQueue = new Queue('pdf-generation', {
  connection: { host: 'redis', port: 6379 }
});

async function generateBatch(req, res) {
  const { customerIds } = req.body;
  const requestId = randomUUID(); // Express has no built-in req.id

  // addBulk enqueues all jobs at once instead of one call per customer
  const jobs = await pdfQueue.addBulk(
    customerIds.map((customerId) => ({
      name: 'generate-report',
      data: { customerId, requestId }
    }))
  );

  res.json({
    message: 'PDFs queued for generation',
    jobIds: jobs.map((job) => job.id),
    statusUrl: `/api/reports/batch/${requestId}/status`
  });
}

2. Worker (processes jobs from queue):

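// worker.js (run separately, can scale to N instances)
const { Worker } = require('bullmq');
const puppeteer = require('puppeteer');
const s3 = require('./s3');
const db = require('./db'); // assumed local module, like ./s3
const { generateReportHTML } = require('./templates'); // assumed local module

const worker = new Worker('pdf-generation', async (job) => {
  const { customerId, requestId } = job.data;
  
  // Generate PDF (Puppeteer has no top-level pdf(); render through a page)
  const customer = await db.customers.findById(customerId);
  const html = generateReportHTML(customer);
  const browser = await puppeteer.launch();
  let pdf;
  try {
    const page = await browser.newPage();
    await page.setContent(html);
    pdf = await page.pdf({ format: 'A4' });
  } finally {
    await browser.close(); // Always release Chrome, even on failure
  }
  
  // Upload to S3
  const key = `reports/${requestId}/${customerId}.pdf`;
  await s3.upload({ Key: key, Body: pdf });
  
  // Update progress
  await db.batchJobs.updateProgress(requestId, customerId, 'completed');
  
  return { customerId, s3Key: key };
}, {
  connection: { host: 'redis', port: 6379 },
  concurrency: 5 // Process 5 PDFs in parallel per worker
});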

// Handle failures
worker.on('failed', async (job, err) => {
  console.error(`Job ${job.id} failed:`, err);
  await db.batchJobs.updateProgress(job.data.requestId, job.data.customerId, 'failed');
});

3. Status endpoint (check progress):

// GET /api/reports/batch/:requestId/status
async function getBatchStatus(req, res) {
  const { requestId } = req.params;
  const status = await db.batchJobs.getStatus(requestId);
  
  res.json({
    total: status.total,
    completed: status.completed,
    failed: status.failed,
    inProgress: status.inProgress,
    downloadUrl: status.completed === status.total
      ? `/api/reports/batch/${requestId}/download`
      : null
  });
}
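Note that the endpoint above only exposes a download URL when every job completes, so a batch with a single failed job would never report as done. A minimal sketch of an alternative summary is below. It assumes each job record looks like { customerId, status } with status 'completed' or 'failed' (anything else counts as pending); the function name summarizeBatch and the extra fields finished and failedCustomerIds are illustrative.

```javascript
// Summarize per-job records into the shape returned by the status endpoint.
// A batch counts as finished once nothing is pending, even if some jobs failed.
function summarizeBatch(requestId, jobs) {
  const total = jobs.length;
  const failedJobs = jobs.filter((job) => job.status === 'failed');
  const completed = jobs.filter((job) => job.status === 'completed').length;
  const failed = failedJobs.length;
  const inProgress = total - completed - failed;
  const finished = total > 0 && inProgress === 0;

  return {
    total,
    completed,
    failed,
    inProgress,
    finished,
    // Offer the download as soon as the batch is finished and has at least one PDF
    downloadUrl: finished && completed > 0
      ? `/api/reports/batch/${requestId}/download`
      : null,
    failedCustomerIds: failedJobs.map((job) => job.customerId),
  };
}
```

Returning failedCustomerIds lets the client retry only the failed reports instead of resubmitting the whole batch.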

Pros:

  • Scalable: Add more workers to increase throughput
  • Resilient: Failed jobs can retry
  • Non-blocking: API returns immediately
  • Parallel: 5 workers × 5 concurrency = 25 PDFs at once

Cons:

  • Complex: Redis/SQS, worker processes, job tracking
  • Infrastructure: Need queue service, worker servers
  • Debugging: Jobs fail silently, need monitoring
  • Cost: Always-on workers or Lambda invocations

When to use: >100 PDFs regularly, need reliability, okay with complexity

Pattern 3: Parallel API Calls (Fast, Requires Rate Limiting)

Architecture:

API Request → Split into batches → Parallel API calls (25-100 concurrent)
                                          ↓
                             External PDF API (handles rendering)
                                          ↓
                                     PDFs returned

Code:

// POST /api/reports/batch
async function generateBatch(req, res) {
  const { customerIds } = req.body;
  
  // Limit concurrency to avoid overwhelming API
  const batchSize = 50; // 50 concurrent requests
  const results = [];
  
  for (let i = 0; i < customerIds.length; i += batchSize) {
    const batch = customerIds.slice(i, i + batchSize);
    
    // Generate 50 PDFs in parallel
    const batchResults = await Promise.all(
      batch.map(async (customerId) => {
        const customer = await db.customers.findById(customerId);
        
        try {
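          const response = await fetch('https://api.hundreddocs.com/v1/pdf', {
            method: 'POST',
            headers: {
              'X-API-Key': process.env.API_KEY,
              'Content-Type': 'application/json'
            },
            body: JSON.stringify({
              templateId: 'monthly-report',
              data: {
                customerName: customer.name,
                reportMonth: '2025-12',
                metrics: customer.metrics
              }
            })
          });
          
          // fetch() only rejects on network errors, so surface 429/5xx responses explicitly
          if (!response.ok) {
            throw new Error(`PDF API responded with HTTP ${response.status}`);
          }
          
          const pdfBuffer = await response.arrayBuffer();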
          
          // Upload to S3 immediately
          await s3.upload({
            Key: `reports/${customerId}.pdf`,
            Body: Buffer.from(pdfBuffer)
          });
          
          return { customerId, status: 'success' };
          
        } catch (error) {
          console.error(`Failed to generate PDF for ${customerId}:`, error);
          return { customerId, status: 'failed', error: error.message };
        }
      })
    );
    
    results.push(...batchResults);
  }
  
  res.json({ results });
}

Performance:

  • 1000 PDFs, 50 concurrent
  • 1000 / 50 = 20 batches
  • Each batch: 1-2 seconds (API response time)
  • Total: 20-40 seconds (vs 83 minutes serial)
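Fixed batches of 50 wait for the slowest request in each batch before starting the next one. A sliding-window limiter keeps the configured number of requests in flight at all times instead. The following is a minimal sketch of that idea using only built-in promises; mapWithConcurrency is an illustrative name, and the result shape mirrors the { status: ... } objects used in the code above.

```javascript
// Run `fn` over `items` with at most `limit` calls in flight at any time.
// Results keep input order; a rejected call is recorded instead of aborting the batch.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until the list is exhausted.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { status: 'success', value: await fn(items[index], index) };
      } catch (error) {
        results[index] = { status: 'failed', error: error.message };
      }
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}
```

A call such as mapWithConcurrency(customerIds, 50, generateAndUpload) would replace the outer batching loop, where generateAndUpload stands for the per-customer fetch and upload logic.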

Pros:

  • Fast: 1000 PDFs in <1 minute
  • No Chrome: Zero memory leaks
  • Stateless: No worker infrastructure
  • Simple: Standard HTTP requests

Cons:

  • API dependency: Requires external service
  • Rate limiting: Must respect API limits (429 errors)
  • Cost: Per-PDF pricing (vs fixed server cost)
  • Network: Need reliable internet

When to use: >100 PDFs, fast turnaround needed, serverless deployment

Pattern 4: Streaming Generation (Memory Efficient)

Architecture:

API Request → Generate PDF 1 → Stream to S3 → Generate PDF 2 → Stream to S3 → ...

Code:

// POST /api/reports/batch (streams PDFs as they're generated)
async function generateBatch(req, res) {
  const { customerIds } = req.body;
  
  res.writeHead(200, {
    'Content-Type': 'application/x-ndjson', // Newline-delimited JSON
    'Transfer-Encoding': 'chunked'
  });
  
  for (const customerId of customerIds) {
    try {
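      const customer = await db.customers.findById(customerId);
      const html = generateReportHTML(customer);
      
      // Puppeteer has no top-level pdf(); render through a page instead
      const browser = await puppeteer.launch();
      let pdf;
      try {
        const page = await browser.newPage();
        await page.setContent(html);
        pdf = await page.pdf({ format: 'A4' });
      } finally {
        await browser.close();
      }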
      
      // Upload to S3
      const key = `reports/${customerId}.pdf`;
      await s3.upload({ Key: key, Body: pdf });
      
      // Stream result to client
      res.write(JSON.stringify({
        customerId,
        status: 'success',
        url: `https://s3.amazonaws.com/bucket/${key}`
      }) + '\n');
      
    } catch (error) {
      res.write(JSON.stringify({
        customerId,
        status: 'failed',
        error: error.message
      }) + '\n');
    }
  }
  
  res.end();
}

Client (receives streaming updates):

const response = await fetch('/api/reports/batch', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ customerIds }) // array of customer IDs, e.g. 1 through 1000
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  
  // A network chunk can end mid-line, so keep the trailing partial line for the next read
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop();
  
  const results = lines.filter(Boolean).map((line) => JSON.parse(line));
  
  results.forEach(result => {
    console.log(`PDF ${result.customerId}: ${result.status}`);
    updateProgressBar(result.customerId);
  });
}

Pros:

  • Memory efficient: PDFs streamed immediately, not accumulated
  • Real-time progress: Client sees updates as PDFs complete
  • No idle timeout: Each streamed result keeps the long-running request active

Cons:

  • Still slow: Serial processing (5s × 1000 = 83 min)
  • Connection stability: Long HTTP connection can drop
  • Puppeteer issues: Memory leaks still occur

When to use: Moderate volume (50-200 PDFs), need progress updates, can't use queue

Queue vs Batch vs Parallel (Comparison)

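| Dimension          | Synchronous Batch | Queue + Workers | Parallel API  | Streaming   |
|--------------------|-------------------|-----------------|---------------|-------------|
| Speed (1000 PDFs)  | 83 min            | 10-20 min       | 1-2 min       | 83 min      |
| Complexity         | ⭐ Low            | ⭐⭐⭐ High     | ⭐⭐ Medium   | ⭐⭐ Medium |
| Infrastructure     | None              | Redis + workers | None          | None        |
| Memory leaks       | ❌ Yes            | ❌ Yes          | ✅ No         | ❌ Yes      |
| Scalability        | ❌ Poor           | ✅ Excellent    | ✅ Excellent  | ⚠️ Limited  |
| Cost (1000 PDFs)   | $0-5              | $10-30          | $10-50        | $0-5        |
| Real-time progress | ❌ No             | ⚠️ Via polling  | ❌ No         | ✅ Yes      |
| Best for           | <10 PDFs          | 100-10k PDFs    | 100-10k PDFs  | 50-200 PDFs |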

Memory Management at Scale

Why Puppeteer Fails After 100-200 Renders

Root cause: Chrome doesn't clean up completely.

Memory profile (512MB RAM available):

Render 1:   [Chrome: 200MB] [Node: 50MB]  Available: 262MB ✅
Render 10:  [Chrome: 250MB] [Node: 70MB]  Available: 192MB ✅
Render 50:  [Chrome: 350MB] [Node: 120MB] Available: 42MB  ⚠️
Render 100: [Chrome: 480MB] [Node: 180MB] Available: -148MB ❌ Crash

Mitigation (doesn't fully solve):

// Restart process every N PDFs
let renderCount = 0;
const MAX_RENDERS = 50;

async function generatePDFWithRestart(data) {
  if (renderCount >= MAX_RENDERS) {
    console.log('Restarting process to prevent memory leak...');
    process.exit(0); // Process manager (PM2, Kubernetes) restarts
  }
  
  const pdf = await generatePDF(data);
  renderCount++;
  return pdf;
}
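A less disruptive variant of the same idea recycles only the browser rather than the whole Node.js process. The sketch below is an assumption-laden illustration, not the article's method: createRecyclingRenderer is a hypothetical helper, launch is injected (for example () => puppeteer.launch()), and it is meant for serial use only, since recycling while another render is in flight would close that render's browser.

```javascript
// Reuse one browser for up to `maxRenders` renders, then replace it with a fresh one.
// `launch` must return an object with newPage() and close(), like a Puppeteer Browser.
function createRecyclingRenderer(launch, maxRenders = 50) {
  let browser = null;
  let renders = 0;

  async function render(html, pdfOptions = { format: 'A4' }) {
    if (browser === null || renders >= maxRenders) {
      if (browser !== null) await browser.close(); // retire the old Chrome instance
      browser = await launch();
      renders = 0;
    }
    renders++;
    const page = await browser.newPage();
    try {
      await page.setContent(html);
      return await page.pdf(pdfOptions);
    } finally {
      await page.close(); // pages are closed per render, the browser only every maxRenders
    }
  }

  async function shutdown() {
    if (browser !== null) await browser.close();
    browser = null;
  }

  return { render, shutdown };
}
```

This avoids dropping the in-flight request that process.exit(0) would discard, but memory held by the Node.js process itself still only goes away with a full process restart.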

Restarting Workers Periodically

Pattern (with PM2 or Kubernetes):

// worker.js
const RESTART_AFTER_RENDERS = 100;
let renderCount = 0;

worker.on('completed', async () => {
  renderCount++;
  
  // === rather than >= so concurrent completions don't trigger a second shutdown
  if (renderCount === RESTART_AFTER_RENDERS) {
    console.log('Graceful restart after 100 renders');
    await worker.close(); // Waits for in-flight jobs to finish
    process.exit(0); // PM2/K8s will restart
  }
});

PM2 config (max_memory_restart restarts a worker once its memory exceeds 1GB):

{
  "apps": [{
    "name": "pdf-worker",
    "script": "worker.js",
    "instances": 4,
    "max_memory_restart": "1G",
    "autorestart": true
  }]
}

Stateless Rendering Advantages

Why external APIs don't have memory leaks:

Traditional (Puppeteer):
Request 1 → [Process A: Chrome launched, 200MB] → [Zombie process: 50MB leaked]
Request 2 → [Process A: Chrome launched, 250MB] → [Zombie process: 100MB leaked]
...memory grows...

Stateless API:
Request 1 → [API container, fresh] → [PDF generated] → [Container destroyed]
Request 2 → [API container, fresh] → [PDF generated] → [Container destroyed]
...no memory accumulation...

Result: No memory accumulates on your side, so batch size is limited by API rate limits and cost rather than by process memory.

Performance Considerations

Render Time Per PDF

Factors affecting render time:

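| Factor         | Simple PDF    | Complex PDF           |
|----------------|---------------|-----------------------|
| Page count     | 1-2 pages     | 20+ pages             |
| Images         | None          | 10+ images            |
| Tables         | 1 small table | Multiple large tables |
| Custom fonts   | System fonts  | 3+ custom fonts       |
| Puppeteer time | 3s            | 15s                   |
| API time       | 300ms         | 1.2s                  |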

Concurrency Limits

Puppeteer (local):

  • Safe concurrency: 2-5 per CPU core
  • 4-core server: 8-20 concurrent Chrome instances
  • Each Chrome: 200-500MB RAM
  • 8 concurrent × 400MB = 3.2GB RAM minimum
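The two limits above (instances per core and RAM per instance) can be combined by taking whichever is lower. A rough sketch of that calculation follows; the function name safeConcurrency is illustrative, and the defaults of 2 instances per core and 400MB per instance are taken from the low end of the ranges above and should be tuned against real measurements.

```javascript
const os = require('os');

// Pick a Chrome concurrency that respects both the CPU and the RAM ceiling.
function safeConcurrency({ cores, ramMb, perCore = 2, perInstanceMb = 400 }) {
  const cpuLimit = cores * perCore;
  const ramLimit = Math.floor(ramMb / perInstanceMb);
  return Math.max(1, Math.min(cpuLimit, ramLimit));
}

console.log(safeConcurrency({ cores: 4, ramMb: 4096 })); // 8 (CPU-bound: 4 × 2)
console.log(safeConcurrency({ cores: 4, ramMb: 2048 })); // 5 (RAM-bound: 2048 / 400)

// On the current host (result depends on the machine):
console.log(safeConcurrency({
  cores: os.cpus().length,
  ramMb: Math.floor(os.totalmem() / 1024 / 1024),
}));
```

Total RAM overstates what is actually free, so in practice the ramMb argument should leave headroom for Node.js and the operating system.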

API (external):

  • Rate limit: 100-1000 req/min (depends on plan)
  • Recommended batch size: 50-100 concurrent
  • No local RAM constraints

Rate Limiting External APIs

Exponential backoff pattern:

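const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function generatePDFWithRetry(data, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await generatePDF(data);
    } catch (error) {
      // Rethrow anything that isn't a rate limit, and give up after the last attempt
      // instead of silently returning undefined
      if (error.status !== 429 || attempt === maxRetries) {
        throw error;
      }
      const delay = Math.pow(2, attempt) * 1000; // 2s, then 4s
      console.log(`Rate limited, retrying in ${delay}ms...`);
      await sleep(delay);
    }
  }
}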
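When many requests are rate limited at the same moment, identical backoff delays make them all retry together. Two common refinements are adding random jitter and honoring the standard HTTP Retry-After header when the server sends one. The sketch below computes only the delay; retryDelayMs is an illustrative name, the base and cap values are arbitrary defaults, and whether a given PDF API actually returns Retry-After is an assumption to verify.

```javascript
// Compute how long to wait before retry number `attempt` (1-based).
// Honors a Retry-After value in seconds when supplied; otherwise uses exponential
// backoff with "full jitter" so parallel requests do not retry in lockstep.
function retryDelayMs(attempt, options = {}) {
  const { retryAfterSeconds, baseMs = 1000, capMs = 30000, random = Math.random } = options;

  if (Number.isFinite(retryAfterSeconds) && retryAfterSeconds >= 0) {
    return retryAfterSeconds * 1000;
  }

  const ceiling = Math.min(capMs, baseMs * 2 ** attempt); // 2s, 4s, 8s, ... up to capMs
  return Math.floor(random() * ceiling); // uniform in [0, ceiling)
}
```

In the retry loop above, the fixed delay could be replaced with retryDelayMs(attempt), passing retryAfterSeconds when the error object carries the parsed header.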

Storage I/O Bottlenecks

Uploading 1000 PDFs to S3:

// Each entry in pdfs is assumed to be { key, body }
// Slow: Sequential uploads (1000 × 200ms = 200 seconds)
for (const pdf of pdfs) {
  await s3.upload({ Key: pdf.key, Body: pdf.body });
}

// Fast: Parallel uploads in batches of 50 (20 batches × 500ms = 10 seconds)
const batchSize = 50;
for (let i = 0; i < pdfs.length; i += batchSize) {
  const batch = pdfs.slice(i, i + batchSize);
  await Promise.all(batch.map(pdf => s3.upload({ Key: pdf.key, Body: pdf.body })));
}

Monitoring Bulk Jobs

Track Success/Failure Rates

// Database schema for batch jobs
{
  batchId: 'batch-2025-12-29-001',
  totalJobs: 1000,
  completed: 985,
  failed: 15,
  inProgress: 0,
  startedAt: '2025-12-29T10:00:00Z',
  completedAt: '2025-12-29T10:05:00Z',
  duration: 300, // seconds
  failureReasons: {
    'timeout': 8,
    'invalid_data': 5,
    'api_error': 2
  }
}
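A record like the one above can be derived from the individual job results once a batch ends. The following is a minimal sketch under the assumption that each result looks like { status, reason }, where status is 'completed' or 'failed' and reason is an optional failure label; buildBatchRecord and the added failureRate field are illustrative.

```javascript
// Fold per-job results into a batch summary record.
// startedAt and completedAt are ISO 8601 strings, as in the schema above.
function buildBatchRecord(batchId, results, startedAt, completedAt) {
  const failureReasons = {};
  let completed = 0;
  let failed = 0;

  for (const result of results) {
    if (result.status === 'completed') {
      completed++;
    } else if (result.status === 'failed') {
      failed++;
      const reason = result.reason || 'unknown';
      failureReasons[reason] = (failureReasons[reason] || 0) + 1;
    }
  }

  return {
    batchId,
    totalJobs: results.length,
    completed,
    failed,
    inProgress: results.length - completed - failed,
    startedAt,
    completedAt,
    duration: Math.round((Date.parse(completedAt) - Date.parse(startedAt)) / 1000), // seconds
    failureRate: results.length ? failed / results.length : 0,
    failureReasons,
  };
}
```

A failureRate threshold (for example, alerting above a few percent) is usually easier to act on than raw counts, because it stays meaningful as batch sizes change.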

Monitor Memory Usage

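// Log memory every 10 PDFs
// Note: process.memoryUsage() covers only this Node.js process; Chrome runs in
// separate child processes, so its memory is not included in these numbers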
let count = 0;

async function generatePDFWithMonitoring(data) {
  const pdf = await generatePDF(data);
  count++;
  
  if (count % 10 === 0) {
    const mem = process.memoryUsage();
    console.log(`After ${count} PDFs:`, {
      heapUsed: `${Math.round(mem.heapUsed / 1024 / 1024)}MB`,
      external: `${Math.round(mem.external / 1024 / 1024)}MB`,
      rss: `${Math.round(mem.rss / 1024 / 1024)}MB`
    });
    
    if (mem.rss > 800 * 1024 * 1024) { // > 800MB
      console.warn('Memory usage high, consider restarting');
    }
  }
  
  return pdf;
}

Alert on Stuck Jobs

// Check for jobs stuck >10 minutes
setInterval(async () => {
  const stuckJobs = await db.batchJobs.findStuck({
    status: 'in_progress',
    startedAt: { $lt: Date.now() - 10 * 60 * 1000 }
  });
  
  if (stuckJobs.length > 0) {
    await alertOps(`${stuckJobs.length} jobs stuck for >10 minutes`, {
      jobIds: stuckJobs.map(j => j.id)
    });
  }
}, 60 * 1000); // Check every minute
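Because the check above runs every minute, the same stuck job would trigger a new alert on every pass. One possible refinement is to remember which jobs have already been reported. The sketch below assumes job records of the form { id, status, startedAt } with startedAt in epoch milliseconds; createStuckJobDetector is a hypothetical helper that works on an in-memory list rather than the db.batchJobs query.

```javascript
// Build a detector that reports each stuck job only once.
function createStuckJobDetector(thresholdMs = 10 * 60 * 1000) {
  const alerted = new Set(); // ids of jobs that have already been reported

  return function detect(jobs, nowMs) {
    const stuck = jobs.filter(
      (job) => job.status === 'in_progress' && nowMs - job.startedAt > thresholdMs
    );
    const fresh = stuck.filter((job) => !alerted.has(job.id));
    fresh.forEach((job) => alerted.add(job.id));
    return fresh;
  };
}
```

Inside the interval callback, the result of detect(jobs, Date.now()) would be passed to the alerting call only when it is non-empty.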

Technical takeaway: Bulk PDF generation (hundreds to thousands of documents) requires a different architecture from single-PDF rendering. Serial Puppeteer processing is limited by memory leaks (crashes after 200-500 PDFs), a CPU bottleneck (83 minutes for 1000 PDFs), and timeouts. The alternatives are Queue + Workers (scalable but complex, requires Redis and worker infrastructure, 10-20 min for 1000 PDFs), parallel API calls (fastest at 1-2 min, stateless, no local memory leaks), and streaming (memory-efficient with real-time progress, but still serial). For regular batches of more than 100 PDFs, parallel API calls offer the best balance of speed, simplicity, and reliability.