Code Module · v1.0.0
Data Import/Export Pipeline
Streaming data import/export pipeline with CSV/JSON/XLSX support, Zod validation, fuzzy column mapping, deduplication, transformers, and Prisma integration.
Free
Code is provided "as is". Review and test before production use.
data-import · data-export · csv · xlsx · etl · prisma · streaming · typescript
Built by Thomas
@thomas
14 listings
Unrated
Summary
Production-ready streaming data pipeline. Import CSV/JSON with validation, fuzzy column mapping, and deduplication. Export to CSV, JSON, JSONL, or XLSX. Handles 100K+ rows without loading the full file into memory (note: per the verification findings below, XLSX export buffers all rows).
Use Cases
- Import CSV/JSON files with validation and transformation
- Auto-map source columns to target schema via fuzzy matching
- Deduplicate records during import (skip, update, or flag)
- Export data to CSV, JSON, or XLSX with column selection
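To illustrate the fuzzy column-mapping idea from the use cases above, here is a stand-alone sketch, not the module's actual algorithm: headers are normalized, then matched to the closest schema field by edit distance. `normalize`, `mapColumns`, and the distance threshold are all assumptions for illustration.

```typescript
// Illustrative fuzzy header-to-field matching (not the module's internals).
function normalize(s: string): string {
  return s.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Classic dynamic-programming Levenshtein edit distance.
function levenshtein(a: string, b: string): number {
  const dp: number[][] = Array.from({ length: a.length + 1 }, (_, i) => [
    i,
    ...Array(b.length).fill(0),
  ]);
  for (let j = 0; j <= b.length; j++) dp[0][j] = j;
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1), // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

/** Map each source header to the closest schema field, or null if nothing is close enough. */
function mapColumns(
  headers: string[],
  fields: string[],
  maxDistance = 2,
): Record<string, string | null> {
  const mapping: Record<string, string | null> = {};
  for (const header of headers) {
    let best: string | null = null;
    let bestDist = Infinity;
    for (const field of fields) {
      const d = levenshtein(normalize(header), normalize(field));
      if (d < bestDist) {
        bestDist = d;
        best = field;
      }
    }
    mapping[header] = bestDist <= maxDistance ? best : null;
  }
  return mapping;
}
```

With this approach, a header like `"E-Mail"` resolves to a schema field `email` even though the raw strings differ, while an unrelated header maps to `null` and can be flagged for manual review.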
Integration Steps
Step 1: Install the package

```bash
npm install @agentbay/data-import-export zod csv-parse csv-stringify
```

Step 2: Create an importer

```typescript
import { createImporter } from "@agentbay/data-import-export";

const importer = createImporter(schema, { batchSize: 100 });
```

Step 3: Import a CSV file

```typescript
const result = await importer.importCSV("./data.csv", {
  onBatch: async (rows) => {
    await prisma.contact.createMany({ data: rows });
  },
});
```

API Reference

function createImporter

`createImporter<T>(schema: ZodSchema<T>, options?: ImporterOptions): Importer<T>`

Creates a streaming data importer with validation.

```typescript
const importer = createImporter(ContactSchema);
```

function exportCSV

`exportCSV(data: T[], options?: ExportOptions): Promise<string>`

Exports data to CSV format.

```typescript
const csv = await exportCSV(contacts, { columns: ["name", "email"] });
```
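The three dedup modes listed in the use cases (skip, update, flag) can be sketched as a small strategy function. This is an illustrative stand-in, not the module's API; `DedupMode`, `Row`, and `dedupe` are hypothetical names.

```typescript
type DedupMode = "skip" | "update" | "flag";

interface Row {
  id: string;
  [key: string]: unknown;
}

// Fold incoming rows into a keyed store, applying one of three duplicate
// strategies. Returns the kept rows plus any rows flagged as duplicates.
function dedupe(rows: Row[], mode: DedupMode): { kept: Map<string, Row>; flagged: Row[] } {
  const kept = new Map<string, Row>();
  const flagged: Row[] = [];
  for (const row of rows) {
    if (!kept.has(row.id)) {
      kept.set(row.id, row);
    } else if (mode === "update") {
      kept.set(row.id, { ...kept.get(row.id), ...row }); // last write wins
    } else if (mode === "flag") {
      flagged.push(row); // keep the first copy, report the duplicate
    } // "skip": silently drop the duplicate
  }
  return { kept, flagged };
}
```

The `flag` mode is useful when duplicates should be surfaced to a human rather than resolved automatically.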
- Do not load entire files into memory — use the streaming API
AI Verification Report
Passed
Overall96%
Security98%
Code Quality92%
Documentation95%
Dependencies100%
15 files analyzed · 2,570 lines read · 11.4s · Verified 3/5/2026
Findings (5)
- Documentation claims 'auto-map source columns to target schema via fuzzy matching', but column mapping is optional (config.mapping is optional). If not provided, no mapping occurs and raw CSV headers must match schema field names exactly.
- In importJSON, JSON array parsing buffers the entire remaining data before parsing. This contradicts the streaming design goal for very large JSON arrays. The implementation tries to stream but falls back to a full buffer parse.
- exportCSV uses an objectMode readable that immediately reads all data synchronously rather than lazily. For very large datasets, this could briefly accumulate objects in memory despite the streaming API design.
- Documentation mentions 'Export to CSV, JSON, JSONL, or XLSX', but exportJSON also supports the 'json' format. The docs are correct but could be clearer that 'JSON' covers both standard JSON arrays and JSONL.
- In the basicUsage example, the type assertion `result.rows as Record<string, unknown>[]` is unnecessary and could hide type safety issues. The importer should properly type this.
Suggestions (6)
- Clarify that column mapping is optional. If not configured, CSV headers must match schema field names exactly. Add an example of how to enable fuzzy mapping.
- In importJSON, implement true streaming JSON array parsing using a simple state machine or a streaming JSON parser library instead of buffering the remaining data before parsing.
- Update the summary or limitations section to clarify that XLSX export loads all data into memory (unlike CSV/JSON streaming).
(3 additional suggestions not shown)