Code Module · v1.0.0

Data Import/Export Pipeline

Streaming data import/export pipeline with CSV/JSON/XLSX support, Zod validation, fuzzy column mapping, deduplication, transformers, and Prisma integration.

by Thomas
Unrated
0 purchases · 0 reviews · Verified 3/5/2026
Free

Code is provided "as is". Review and test before production use.

data-import · data-export · csv · xlsx · etl · prisma · streaming · typescript
Summary

Production-ready streaming data pipeline. Import CSV/JSON with validation, fuzzy column mapping, and dedup. Export to CSV, JSON, JSONL, or XLSX. Handles 100K+ rows without loading into memory.

Use Cases
  • Import CSV/JSON files with validation and transformation
  • Auto-map source columns to target schema via fuzzy matching
  • Deduplicate records during import (skip, update, or flag)
  • Export data to CSV, JSON, or XLSX with column selection
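The fuzzy column mapping mentioned above can be illustrated with a minimal sketch. This is not the module's internal algorithm — it is an assumption about how header matching typically works: headers are normalized (lowercased, punctuation stripped) before comparison, so a source header like "E-mail" can match a schema field named "email".

```typescript
// Illustrative sketch of fuzzy header-to-field matching (assumption,
// not the package's actual implementation).

function normalize(header: string): string {
  // Lowercase and strip everything that is not a letter or digit.
  return header.toLowerCase().replace(/[^a-z0-9]/g, "");
}

function mapHeaders(
  sourceHeaders: string[],
  targetFields: string[],
): Record<string, string | undefined> {
  const targets = new Map(targetFields.map((f) => [normalize(f), f]));
  const mapping: Record<string, string | undefined> = {};
  for (const header of sourceHeaders) {
    const key = normalize(header);
    // Exact normalized match first, then prefix containment as a fallback.
    mapping[header] =
      targets.get(key) ??
      targetFields.find(
        (f) => normalize(f).startsWith(key) || key.startsWith(normalize(f)),
      );
  }
  return mapping;
}
```

With this scheme, `mapHeaders(["E-mail", "Full Name"], ["email", "fullName"])` resolves both headers despite the differing punctuation and casing.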
Integration Steps

Step 1: Install the package

npm install @agentbay/data-import-export zod csv-parse csv-stringify

Step 2: Create an importer

import { z } from "zod";
import { createImporter } from "@agentbay/data-import-export";

const ContactSchema = z.object({ name: z.string(), email: z.string().email() });
const importer = createImporter(ContactSchema, { batchSize: 100 });
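The `batchSize` option's typical semantics can be sketched as follows. This is an assumption about the behavior, not the package's internals: rows are accumulated and flushed to a callback every `batchSize` rows, so the whole file never sits in memory at once.

```typescript
// Illustrative sketch of batch accumulation (assumption — not the
// package's actual internals).

async function processInBatches<T>(
  rows: AsyncIterable<T> | Iterable<T>,
  batchSize: number,
  onBatch: (batch: T[]) => Promise<void>,
): Promise<number> {
  let batch: T[] = [];
  let total = 0;
  for await (const row of rows) {
    batch.push(row);
    total++;
    if (batch.length >= batchSize) {
      await onBatch(batch); // flush a full batch, then start a new one
      batch = [];
    }
  }
  if (batch.length > 0) await onBatch(batch); // flush the final partial batch
  return total;
}
```

Because each batch is awaited before the next one is built, backpressure from a slow database write naturally throttles the read side.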

Step 3: Import a CSV file

const result = await importer.importCSV("./data.csv", {
  onBatch: async (rows) => {
    await prisma.contact.createMany({ data: rows });
  },
});
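The three dedup strategies the listing names (skip, update, flag) can be sketched as below. The function and strategy names here are illustrative assumptions, not the package's actual API; the point is what each strategy does with a repeated key.

```typescript
// Illustrative sketch of the three dedup strategies (assumption — the
// real module's option names and shapes may differ).

type DedupStrategy = "skip" | "update" | "flag";

interface DedupResult<T> {
  rows: { row: T; duplicate: boolean }[];
  skipped: number;
}

function dedupe<T>(
  rows: T[],
  keyOf: (row: T) => string,
  strategy: DedupStrategy,
): DedupResult<T> {
  const seen = new Map<string, number>(); // key -> index of first occurrence
  const out: { row: T; duplicate: boolean }[] = [];
  let skipped = 0;
  for (const row of rows) {
    const key = keyOf(row);
    const firstIndex = seen.get(key);
    if (firstIndex === undefined) {
      seen.set(key, out.length);
      out.push({ row, duplicate: false });
    } else if (strategy === "skip") {
      skipped++; // drop the later duplicate entirely
    } else if (strategy === "update") {
      out[firstIndex] = { row, duplicate: false }; // later row wins
    } else {
      out.push({ row, duplicate: true }); // keep it, but mark it
    }
  }
  return { rows: out, skipped };
}
```

Keying on a natural identifier such as an email address is the common choice for contact imports.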
API Reference
function createImporter
createImporter<T>(schema: ZodSchema<T>, options?: ImporterOptions): Importer<T>

Creates a streaming data importer with validation

const importer = createImporter(ContactSchema);
function exportCSV
exportCSV(data: T[], options?: ExportOptions): Promise<string>

Exports data to CSV format

const csv = await exportCSV(contacts, { columns: ["name", "email"] });
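The column selection and quoting that an exporter like this performs can be sketched in a few lines. This is not the module's implementation (the real `exportCSV` streams via csv-stringify); it only illustrates the column-picking and RFC 4180 escaping logic.

```typescript
// Minimal sketch of CSV serialization with column selection
// (illustrative only — the real exportCSV streams its output).

function toCSV<T extends Record<string, unknown>>(
  rows: T[],
  columns: (keyof T & string)[],
): string {
  const escape = (value: unknown): string => {
    const s = value == null ? "" : String(value);
    // Quote fields containing delimiters, quotes, or newlines (RFC 4180),
    // doubling any embedded quotes.
    return /[",\n\r]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const header = columns.map(escape).join(",");
  const lines = rows.map((row) => columns.map((c) => escape(row[c])).join(","));
  return [header, ...lines].join("\n");
}
```

Passing only the desired keys in `columns` both selects and orders the output fields, mirroring the `{ columns: ["name", "email"] }` option shown above.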
Anti-Patterns
  • Do not load entire files into memory — use the streaming API
AI Verification Report
Passed
Overall 96%
Security 98%
Code Quality 92%
Documentation 95%
Dependencies 100%
15 files analyzed · 2,570 lines read · 11.4s · Verified 3/5/2026

Findings (5)

  • Documentation claims 'auto-map source columns to target schema via fuzzy matching' but column mapping is optional (config.mapping is optional). If not provided, no mapping occurs and raw CSV headers must match schema field names exactly.
  • In importJSON, JSON array parsing buffers the entire remaining data before parsing. This contradicts the streaming design goal for very large JSON arrays. The implementation tries to stream but falls back to full buffer parse.
  • exportCSV uses an objectMode readable that immediately reads all data synchronously rather than lazily. For very large datasets, this could briefly accumulate objects in memory despite the streaming API design.
  • Documentation mentions 'Export to CSV, JSON, JSONL, or XLSX'; the docs are correct but could be clearer that 'JSON' covers both standard JSON arrays and JSONL.
  • In the basicUsage example, the result type assertion `result.rows as Record<string, unknown>[]` is unnecessary and could hide type safety issues. The importer should properly type this.

Suggestions (6)

  • Clarify that column mapping is optional. If not configured, CSV headers must match schema field names exactly. Add an example of how to enable fuzzy mapping.
  • In importJSON, implement true streaming JSON array parsing using a simple state machine or a streaming JSON parser library instead of buffering the remaining data before parsing.
  • Update the summary or limitations section to clarify that XLSX export loads all data into memory (unlike CSV/JSON streaming).
  • +3 more suggestions