lib/services/schema-inference-service
Provides on-demand schema inference for datasets.
This service analyzes the existing events in a dataset and generates a schema by sampling those events in batches. It is designed for datasets that were not created through the import pipeline, such as seeded datasets or datasets created directly through the API.
The service reuses the existing ProgressiveSchemaBuilder for schema detection and SchemaVersioningService for creating schema versions.
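The batched sampling described above can be pictured with a short sketch. `planBatches` is a hypothetical helper and not part of this service; it only illustrates how a sample cap and a batch size might be split into per-batch fetch limits, using the defaults documented under SchemaInferenceOptions (500 and 100). The real service may page through events differently.

```typescript
// Hypothetical helper (not part of the service): split a sample cap into
// per-batch fetch limits. Defaults mirror the documented option defaults.
function planBatches(sampleSize: number = 500, batchSize: number = 100): number[] {
  if (!Number.isInteger(sampleSize) || sampleSize < 0) {
    throw new RangeError("sampleSize must be a non-negative integer");
  }
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const limits: number[] = [];
  for (let remaining = sampleSize; remaining > 0; remaining -= batchSize) {
    // The final batch may be smaller than batchSize.
    limits.push(Math.min(batchSize, remaining));
  }
  return limits;
}
```

With the defaults this yields five batches of 100 events; a cap of 250 with the same batch size yields batches of 100, 100, and 50.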
Classes
SchemaInferenceService
Service for inferring schemas from existing event data.
Constructors
Constructor
new SchemaInferenceService(): SchemaInferenceService
Returns
SchemaInferenceService
Methods
inferSchemaFromEvents()
static inferSchemaFromEvents(payload, datasetId, options?): Promise<SchemaInferenceResult>
Generate a schema from existing events in a dataset.
Parameters
payload
BasePayload
datasetId
number
options?
SchemaInferenceOptions
Returns
Promise<SchemaInferenceResult>
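A minimal usage sketch follows, assuming the call shape documented above. The interfaces here are local stand-ins that mirror the documented fields (with `req` omitted and `schema` typed loosely as `object`), and `regenerateSchema` and `InferFn` are hypothetical names. In an application, the `infer` argument would be a wrapper such as `(p, id, o) => SchemaInferenceService.inferSchemaFromEvents(p, id, o)`.

```typescript
// Local stand-ins for the documented shapes; the real types live in the service module.
interface SchemaInferenceOptions {
  sampleSize?: number;
  batchSize?: number;
  forceRegenerate?: boolean;
}
interface SchemaInferenceResult {
  generated: boolean;
  schema: object | null;
  message: string;
  eventsSampled?: number;
}

// Same parameter order as the documented static method; P stands in for BasePayload.
type InferFn<P> = (
  payload: P,
  datasetId: number,
  options?: SchemaInferenceOptions,
) => Promise<SchemaInferenceResult>;

// Hypothetical wrapper: force a fresh schema using a larger sample than the default.
async function regenerateSchema<P>(
  infer: InferFn<P>,
  payload: P,
  datasetId: number,
): Promise<SchemaInferenceResult> {
  return infer(payload, datasetId, {
    sampleSize: 1000,
    batchSize: 200,
    forceRegenerate: true,
  });
}
```

Passing the inference function in as a parameter keeps the sketch independent of the real module and makes it easy to substitute a stub in tests.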
getLatestSchema()
static getLatestSchema(payload, datasetId, req?): Promise<DatasetSchema | null>
Get the latest schema for a dataset, or null if none exists.
Parameters
payload
BasePayload
datasetId
number
req?
PayloadRequest
Returns
Promise<DatasetSchema | null>
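Because this method resolves to null when no schema exists, callers need a fallback. One possible pattern, sketched under assumptions, is to look up the latest schema first and run inference only when the lookup returns null. `getOrInferSchema`, `LatestSchemaLookup`, and `SchemaInference` are hypothetical names; the two function parameters stand in for `getLatestSchema` and `inferSchemaFromEvents`, with the optional arguments left out.

```typescript
// P stands in for BasePayload and S for DatasetSchema.
type LatestSchemaLookup<P, S> = (payload: P, datasetId: number) => Promise<S | null>;
type SchemaInference<P, S> = (
  payload: P,
  datasetId: number,
) => Promise<{ schema: S | null; message: string }>;

// Hypothetical helper: prefer the stored schema, fall back to inference.
async function getOrInferSchema<P, S>(
  getLatest: LatestSchemaLookup<P, S>,
  infer: SchemaInference<P, S>,
  payload: P,
  datasetId: number,
): Promise<S> {
  const existing = await getLatest(payload, datasetId);
  if (existing !== null) {
    return existing;
  }
  const result = await infer(payload, datasetId);
  if (result.schema === null) {
    // Surface the service's own message rather than inventing a reason.
    throw new Error(`No schema for dataset ${datasetId}: ${result.message}`);
  }
  return result.schema;
}
```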
Interfaces
SchemaInferenceOptions
Properties
sampleSize?
optional sampleSize: number
Maximum number of events to sample (default: 500)
batchSize?
optional batchSize: number
Number of events to process per batch (default: 100)
forceRegenerate?
optional forceRegenerate: boolean
Generate schema even if one already exists and is fresh (default: false)
req?
optional req: PayloadRequest
Payload request for context passing
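The documented defaults can be summarized in a small sketch. `InferenceTuning` is a hypothetical local stand-in for the three tunable properties above (`req` is omitted because it has no documented default), and `withDocumentedDefaults` is an illustrative helper, not the service's own resolution logic.

```typescript
// Local stand-in for the tunable SchemaInferenceOptions properties.
interface InferenceTuning {
  sampleSize?: number;
  batchSize?: number;
  forceRegenerate?: boolean;
}

// Hypothetical helper: fill in the defaults stated in the property descriptions.
function withDocumentedDefaults(options: InferenceTuning = {}): Required<InferenceTuning> {
  return {
    sampleSize: options.sampleSize ?? 500,
    batchSize: options.batchSize ?? 100,
    forceRegenerate: options.forceRegenerate ?? false,
  };
}
```

Using `??` rather than `||` keeps an explicit `false` or `0` from being replaced by a default.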
SchemaInferenceResult
Properties
generated
generated: boolean
Whether a schema was generated
schema
schema: DatasetSchema | null
The generated or existing schema, if any
message
message: string
Message describing the result
eventsSampled?
optional eventsSampled: number
Number of events sampled
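One way a caller might branch on these fields is sketched below. The interface is a local stand-in mirroring the properties above (with `schema` typed loosely as `object`), and `summarizeResult` is a hypothetical helper. The sketch assumes that `generated: false` combined with a non-null `schema` means an existing schema was returned; that reading follows from the property descriptions but is not confirmed by the source.

```typescript
// Local stand-in mirroring the documented result properties.
interface SchemaInferenceResult {
  generated: boolean;
  schema: object | null;
  message: string;
  eventsSampled?: number;
}

// Hypothetical helper: turn a result into a one-line log message.
function summarizeResult(result: SchemaInferenceResult): string {
  if (result.generated) {
    // eventsSampled is optional, so handle its absence explicitly.
    const count =
      result.eventsSampled === undefined ? "an unknown number of" : String(result.eventsSampled);
    return `Generated schema from ${count} events: ${result.message}`;
  }
  if (result.schema !== null) {
    // Assumption: not generated but present means an existing schema was reused.
    return `Kept existing schema: ${result.message}`;
  }
  return `No schema available: ${result.message}`;
}
```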