Code Module · v1.0.0
Data Import/Export Pipeline
Streaming data import/export pipeline with CSV/JSON/XLSX support, Zod validation, fuzzy column mapping, deduplication, transformers, and Prisma integration.
Free
Code is provided "as is". Review and test before production use.
data-import · data-export · csv · xlsx · etl · prisma · streaming · typescript
Built by Thomas
@thomas
14 listings
Unrated
Summary
Production-ready streaming data pipeline. Import CSV/JSON with validation, fuzzy column mapping, and deduplication. Export to CSV, JSON, JSONL, or XLSX. Handles 100K+ rows without loading the full file into memory (note: per the verification findings below, XLSX export buffers all rows).
Use Cases
- Import CSV/JSON files with validation and transformation
- Auto-map source columns to target schema via fuzzy matching
- Deduplicate records during import (skip, update, or flag)
- Export data to CSV, JSON, or XLSX with column selection
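To illustrate the fuzzy column-mapping idea from the use cases above, here is a stand-alone sketch, not the module's actual algorithm: headers are normalized, then matched to the closest schema field by edit distance. `normalize`, `mapColumns`, and the distance threshold are all assumptions for illustration.

```typescript
// Illustrative fuzzy header-to-field matching (not the module's internals).
function normalize(s: string): string {
  return s.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Classic dynamic-programming Levenshtein edit distance.
function levenshtein(a: string, b: string): number {
  const dp: number[][] = Array.from({ length: a.length + 1 }, (_, i) => [
    i,
    ...Array(b.length).fill(0),
  ]);
  for (let j = 0; j <= b.length; j++) dp[0][j] = j;
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1), // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

/** Map each source header to the closest schema field, or null if nothing is close enough. */
function mapColumns(
  headers: string[],
  fields: string[],
  maxDistance = 2,
): Record<string, string | null> {
  const mapping: Record<string, string | null> = {};
  for (const header of headers) {
    let best: string | null = null;
    let bestDist = Infinity;
    for (const field of fields) {
      const d = levenshtein(normalize(header), normalize(field));
      if (d < bestDist) {
        bestDist = d;
        best = field;
      }
    }
    mapping[header] = bestDist <= maxDistance ? best : null;
  }
  return mapping;
}
```

With this approach, a header like `"E-Mail"` resolves to a schema field `email` even though the raw strings differ, while an unrelated header maps to `null` and can be flagged for manual review.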
Integration Steps
Step 1: Install the package

```bash
npm install @agentbay/data-import-export zod csv-parse csv-stringify
```

Step 2: Create an importer

```typescript
import { createImporter } from "@agentbay/data-import-export";

const importer = createImporter(schema, { batchSize: 100 });
```

Step 3: Import a CSV file

```typescript
const result = await importer.importCSV("./data.csv", {
  onBatch: async (rows) => {
    await prisma.contact.createMany({ data: rows });
  },
});
```

API Reference

function createImporter

`createImporter<T>(schema: ZodSchema<T>, options?: ImporterOptions): Importer<T>`

Creates a streaming data importer with validation.

```typescript
const importer = createImporter(ContactSchema);
```

function exportCSV

`exportCSV(data: T[], options?: ExportOptions): Promise<string>`

Exports data to CSV format.

```typescript
const csv = await exportCSV(contacts, { columns: ["name", "email"] });
```
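The three dedup modes listed in the use cases (skip, update, flag) can be sketched as a small strategy function. This is an illustrative stand-in, not the module's API; `DedupMode`, `Row`, and `dedupe` are hypothetical names.

```typescript
type DedupMode = "skip" | "update" | "flag";

interface Row {
  id: string;
  [key: string]: unknown;
}

// Fold incoming rows into a keyed store, applying one of three duplicate
// strategies. Returns the kept rows plus any rows flagged as duplicates.
function dedupe(rows: Row[], mode: DedupMode): { kept: Map<string, Row>; flagged: Row[] } {
  const kept = new Map<string, Row>();
  const flagged: Row[] = [];
  for (const row of rows) {
    if (!kept.has(row.id)) {
      kept.set(row.id, row);
    } else if (mode === "update") {
      kept.set(row.id, { ...kept.get(row.id), ...row }); // last write wins
    } else if (mode === "flag") {
      flagged.push(row); // keep the first copy, report the duplicate
    } // "skip": silently drop the duplicate
  }
  return { kept, flagged };
}
```

The `flag` mode is useful when duplicates should be surfaced to a human rather than resolved automatically.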
- Do not load entire files into memory — use the streaming API
AI Verification Report
Passed
Overall96%
Security98%
Code Quality92%
Documentation95%
Dependencies100%
15 files analyzed · 2,570 lines read · 11.4s · Verified 3/5/2026
Findings (5)
- Documentation claims 'auto-map source columns to target schema via fuzzy matching', but column mapping is optional (config.mapping is optional). If not provided, no mapping occurs and raw CSV headers must match schema field names exactly.
- In importJSON, JSON array parsing buffers the entire remaining data before parsing. This contradicts the streaming design goal for very large JSON arrays. The implementation tries to stream but falls back to a full buffer parse.
- exportCSV uses an objectMode readable that immediately reads all data synchronously rather than lazily. For very large datasets, this could briefly accumulate objects in memory despite the streaming API design.
- Documentation mentions 'Export to CSV, JSON, JSONL, or XLSX', but exportJSON also supports the 'json' format. The docs are correct but could be clearer that 'JSON' covers both standard JSON arrays and JSONL.
- In the basicUsage example, the type assertion `result.rows as Record<string, unknown>[]` is unnecessary and could hide type safety issues. The importer should properly type this.
Suggestions (6)
- Clarify that column mapping is optional. If not configured, CSV headers must match schema field names exactly. Add an example of how to enable fuzzy mapping.
- In importJSON, implement true streaming JSON array parsing using a simple state machine or a streaming JSON parser library instead of buffering the remaining data before parsing.
- Update the summary or limitations section to clarify that XLSX export loads all data into memory (unlike CSV/JSON streaming).
(3 additional suggestions not shown)