lib/services/schema-similarity
Schema similarity service for comparing uploaded file schemas with existing datasets.
Calculates weighted similarity scores to suggest matching datasets during import. Used by the import wizard to help users select appropriate target datasets.
Interfaces
UploadedSchema
Represents the schema of an uploaded sheet for comparison
Properties
headers: string[]
sampleData: Record<string, unknown>[]
rowCount: number
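As an illustration of how these three properties relate, the sketch below derives an UploadedSchema from rows that have already been parsed. The helper name toUploadedSchema and its sampleSize parameter are hypothetical and not part of this module; the interface is redeclared locally only so the example is self-contained.

```typescript
// Local copy of the documented interface so the sketch is self-contained.
interface UploadedSchema {
  headers: string[];
  sampleData: Record<string, unknown>[];
  rowCount: number;
}

// Hypothetical helper (not part of the documented module): derives an
// UploadedSchema from parsed rows, keeping only the first few rows as samples.
function toUploadedSchema(
  rows: Record<string, unknown>[],
  sampleSize = 5,
): UploadedSchema {
  // Collect headers in first-seen order across all rows.
  const headers: string[] = [];
  const seen = new Set<string>();
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (!seen.has(key)) {
        seen.add(key);
        headers.push(key);
      }
    }
  }
  return {
    headers,
    sampleData: rows.slice(0, sampleSize),
    rowCount: rows.length,
  };
}

const parsedRows: Record<string, unknown>[] = [
  { title: "Street festival", date: "2024-06-01", lat: 52.52, lng: 13.4 },
  { title: "Night market", date: "2024-06-08", lat: 48.14, lng: 11.58 },
];
const uploaded = toUploadedSchema(parsedRows, 1);
```

Here rowCount reflects the full sheet while sampleData holds only a small prefix of it.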
DatasetSchema
Represents an existing dataset’s schema for comparison
Properties
datasetId: number
datasetName: string
language: string
fields: string[]
fieldTypes?: Record<string, string> (optional)
hasGeoFields: boolean
hasDateFields: boolean
SimilarityResult
Result of a similarity comparison
Properties
datasetId: number
datasetName: string
score: number
breakdown: object
  fieldOverlap: number
  typeCompatibility: number
  structureSimilarity: number
  semanticHints: number
  languageMatch: number
matchingFields: string[]
missingFields: string[]
newFields: string[]
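The sketch below shows one way a weighted score could be assembled from the five breakdown components, and how the three field lists might be derived by set comparison. The weights are assumed values chosen for illustration, since this page does not state the real ones. The sketch also assumes that each component lies in the range 0 to 1, that missingFields means dataset fields absent from the upload, and that newFields means uploaded headers absent from the dataset. The names weightedScore and diffFields are hypothetical.

```typescript
// Local copy of the documented breakdown shape.
interface Breakdown {
  fieldOverlap: number;
  typeCompatibility: number;
  structureSimilarity: number;
  semanticHints: number;
  languageMatch: number;
}

// ASSUMED weights for illustration only; the service's real weights are not documented here.
const ASSUMED_WEIGHTS: Breakdown = {
  fieldOverlap: 0.4,
  typeCompatibility: 0.2,
  structureSimilarity: 0.2,
  semanticHints: 0.1,
  languageMatch: 0.1,
};

// Weighted sum of the components; with weights summing to 1 the result stays in 0..1.
function weightedScore(
  breakdown: Breakdown,
  weights: Breakdown = ASSUMED_WEIGHTS,
): number {
  const keys = Object.keys(weights) as (keyof Breakdown)[];
  return keys.reduce((sum, key) => sum + breakdown[key] * weights[key], 0);
}

// Field lists by exact, case-sensitive name comparison (an assumption).
function diffFields(uploadedHeaders: string[], datasetFields: string[]) {
  const uploadedSet = new Set(uploadedHeaders);
  const datasetSet = new Set(datasetFields);
  return {
    matchingFields: datasetFields.filter((f) => uploadedSet.has(f)),
    missingFields: datasetFields.filter((f) => !uploadedSet.has(f)),
    newFields: uploadedHeaders.filter((f) => !datasetSet.has(f)),
  };
}

const exampleScore = weightedScore({
  fieldOverlap: 1,
  typeCompatibility: 0.5,
  structureSimilarity: 0.5,
  semanticHints: 0,
  languageMatch: 1,
});
const exampleDiff = diffFields(["title", "date", "venue"], ["title", "date", "lat"]);
```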
Functions
calculateSchemaSimilarity()
calculateSchemaSimilarity(uploadedSchema, datasetSchema, detectedLanguage?): SimilarityResult
Calculate overall similarity between uploaded schema and dataset schema
Parameters
uploadedSchema: UploadedSchema
datasetSchema: DatasetSchema
detectedLanguage?: string
Returns
SimilarityResult
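This page does not describe how the individual components are computed. As a rough illustration of what a field-overlap component could look like, the sketch below uses the Jaccard index over normalised field names. Both the choice of Jaccard and the normalisation rules are assumptions, and the function names are hypothetical rather than taken from the service.

```typescript
// Hypothetical normalisation: trim, lower-case, and treat runs of
// whitespace or hyphens as a single underscore.
function normalizeFieldName(fieldName: string): string {
  return fieldName.trim().toLowerCase().replace(/[\s-]+/g, "_");
}

// Jaccard index: shared names divided by the size of the union, in the range 0..1.
function jaccardFieldOverlap(headers: string[], fields: string[]): number {
  const a = new Set(headers.map(normalizeFieldName));
  const b = new Set(fields.map(normalizeFieldName));
  let shared = 0;
  a.forEach((item) => {
    if (b.has(item)) shared += 1;
  });
  const unionSize = a.size + b.size - shared;
  // Two empty schemas have nothing in common to measure; report 0.
  return unionSize === 0 ? 0 : shared / unionSize;
}

const overlap = jaccardFieldOverlap(
  ["Title", "Event Date", "lat", "lng"],
  ["title", "event_date", "description"],
);
```

In this example two of the five distinct normalised names are shared, so the overlap is 2 / 5.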
findSimilarDatasets()
findSimilarDatasets(uploadedSchema, datasetSchemas, options): SimilarityResult[]
Find similar datasets for an uploaded schema
Parameters
uploadedSchema: UploadedSchema
datasetSchemas: DatasetSchema[]
options
  minScore?: number
  maxResults?: number
  detectedLanguage?: string
Returns
SimilarityResult[]
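The minScore and maxResults options suggest a filter, sort, and truncate step over the scored datasets. The sketch below shows that step in isolation. The name rankResults, the default values, and the assumption that scores lie between 0 and 1 are all illustrative and not taken from the service.

```typescript
// Minimal shape needed for ranking; mirrors part of the documented SimilarityResult.
interface ScoredDataset {
  datasetId: number;
  datasetName: string;
  score: number;
}

interface RankOptions {
  minScore?: number;
  maxResults?: number;
}

// Hypothetical ranking step: drop weak matches, sort best-first, cap the list.
// The default values are assumptions, not the service's documented defaults.
function rankResults(
  results: ScoredDataset[],
  options: RankOptions = {},
): ScoredDataset[] {
  const { minScore = 0.3, maxResults = 5 } = options;
  return results
    .filter((r) => r.score >= minScore) // filter() copies, so the input is not mutated by sort()
    .sort((x, y) => y.score - x.score)
    .slice(0, maxResults);
}

const ranked = rankResults(
  [
    { datasetId: 1, datasetName: "Events 2023", score: 0.82 },
    { datasetId: 2, datasetName: "Air quality", score: 0.12 },
    { datasetId: 3, datasetName: "Events archive", score: 0.64 },
    { datasetId: 4, datasetName: "Venues", score: 0.41 },
  ],
  { minScore: 0.4, maxResults: 2 },
);
```

With these inputs the ranked list contains datasets 1 and 3, in that order.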
datasetToSchema()
datasetToSchema(dataset): DatasetSchema
Convert a Payload Dataset to a DatasetSchema for comparison
Parameters
dataset: Dataset