lib/utils/file-readers
Utility functions for reading data from files in batches.
The module's main entry point is streamBatchesFromFile, a streaming batch iterator. For CSV files, it uses
Papa.parse’s step-based parser with pause/resume backpressure, so memory use stays at
one batch buffer (~3MB for 1000 rows). For Excel/ODS files, the selected sheet is
converted to a CSV sidecar on first access and then streamed the same way as any other CSV.
Functions
streamBatchesFromFile()
streamBatchesFromFile(filePath, options): AsyncGenerator<Record<string, unknown>[]>
Async generator that yields batches of rows from a file using streaming.
For CSV files, it uses Papa.parse’s step callback with pause/resume backpressure, so memory stays at one batch buffer regardless of file size.
For Excel/ODS files, it transparently converts the selected sheet to a CSV sidecar file on first access, then streams that CSV the same way.
Parameters
filePath
string
options
StreamBatchOptions
Returns
AsyncGenerator<Record<string, unknown>[]>
Yields
A batch of parsed rows.
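The batching behaviour described for this function can be illustrated without Papa.parse. The following is a minimal pull-based sketch, not the module's implementation: `batchRows`, `numberedRows`, and `collectBatchSizes` are hypothetical names, and the row source is a stand-in for a real CSV parser. Because the generator only reads more rows when the consumer asks for the next batch, at most one batch is buffered at a time.

```typescript
type Row = Record<string, unknown>;

// Hypothetical helper: groups rows from any async source into fixed-size batches.
async function* batchRows(
  rows: AsyncIterable<Row>,
  batchSize: number,
): AsyncGenerator<Row[]> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  let batch: Row[] = [];
  for await (const row of rows) {
    batch.push(row);
    if (batch.length === batchSize) {
      // Execution suspends here; the source is not read again
      // until the consumer requests the next batch.
      yield batch;
      batch = [];
    }
  }
  // Emit the final, possibly partial, batch.
  if (batch.length > 0) yield batch;
}

// Stand-in row source for the example; a real one would parse a file.
async function* numberedRows(count: number): AsyncGenerator<Row> {
  for (let i = 0; i < count; i++) yield { id: i };
}

// Consume batches one at a time and record how many rows each contained.
async function collectBatchSizes(
  total: number,
  batchSize: number,
): Promise<number[]> {
  const sizes: number[] = [];
  for await (const batch of batchRows(numberedRows(total), batchSize)) {
    sizes.push(batch.length);
  }
  return sizes;
}
```

With 7 rows and a batch size of 3, `collectBatchSizes(7, 3)` resolves to `[3, 3, 1]`. A consumer of streamBatchesFromFile would presumably use the same `for await` pattern over the batches it yields.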
getSidecarPath()
getSidecarPath(filePath, sheetIndex): string
Build the sidecar CSV path for an Excel/ODS file + sheet index.
Parameters
filePath
string
sheetIndex
number
Returns
string
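This page does not document the naming scheme for sidecar files, so the following is only a hypothetical sketch of how such a path could be derived. The `.sheet<N>.csv` suffix and the name `sidecarPathSketch` are assumptions made for illustration, not the module's actual convention.

```typescript
import * as path from "node:path";

// Hypothetical naming scheme (an assumption, not the module's own):
// "<dir>/<base name>.sheet<index>.csv", placed next to the original file.
function sidecarPathSketch(filePath: string, sheetIndex: number): string {
  if (!Number.isInteger(sheetIndex) || sheetIndex < 0) {
    throw new RangeError("sheetIndex must be a non-negative integer");
  }
  // path.parse splits the path into directory, base name, and extension.
  const { dir, name } = path.parse(filePath);
  return path.join(dir, `${name}.sheet${sheetIndex}.csv`);
}
```

Including the sheet index in the file name keeps sidecars for different sheets of the same workbook from overwriting each other.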
cleanupSidecarFiles()
cleanupSidecarFiles(filePath, sheetIndex?): void
Delete any CSV sidecar files generated for a given file path.
Parameters
filePath
string
sheetIndex?
number = 0
Returns
void
getFileRowCount()
getFileRowCount(filePath, sheetIndex?): Promise<number>
Get the total row count of a file.
For CSV files, it uses a streaming line count to avoid loading the entire file into memory. For Excel/ODS files, it loads the whole workbook, because the xlsx library requires this.
Parameters
filePath
string
sheetIndex?
number = 0
Returns
Promise<number>
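A streaming line count for CSV files can be sketched with Node's built-in modules. This is an illustration under stated assumptions, not the module's code: `countCsvDataRows` is a hypothetical name, the first line is assumed to be a header, blank lines are skipped, and newlines inside quoted fields are not handled (a real CSV parser would be needed for those).

```typescript
import * as fs from "node:fs";
import * as readline from "node:readline";

// Hypothetical helper: counts data rows by reading the file line by line,
// so only a small chunk of the file is held in memory at any time.
async function countCsvDataRows(filePath: string): Promise<number> {
  const input = fs.createReadStream(filePath, { encoding: "utf8" });
  // crlfDelay: Infinity makes "\r\n" count as a single line break.
  const lines = readline.createInterface({ input, crlfDelay: Infinity });

  let nonEmptyLines = 0;
  for await (const line of lines) {
    if (line.trim() !== "") nonEmptyLines++;
  }
  // Assumption: the first non-empty line is a header, not a data row.
  return Math.max(0, nonEmptyLines - 1);
}
```

Whether getFileRowCount itself excludes the header row or blank lines is not stated here, so treat those choices as placeholders.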