Indexing Datasets
You can index your dataset to use specific fields as filters. To index your dataset:
Create a mapping for your index in OpenSearch/Elasticsearch.
The Searchium.ai OpenSearch/Elasticsearch Plugin recognizes datasets that were indexed with a vector type of dense_vector or knn_vector, and can filter on fields of type keyword.
Dataset documents can include:
Metadata fields of any OpenSearch/Elasticsearch type.
FP32 vectors.
Fields that you wish to filter on must:
Be non-nested, that is, not defined under another field's properties.
Be a string or a single array of strings.
Be included in all documents.
Have a non-empty value. Empty values such as "" are not supported.
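The requirements above can be checked on each document before it is sent for indexing. The following is a minimal sketch, not part of the plugin: the function name filter_field_problems is hypothetical, and treating an empty array or an array containing "" as an empty value is an assumption that goes slightly beyond what is stated above.

```python
from typing import Any, Dict, Iterable, List


def filter_field_problems(doc: Dict[str, Any], filter_fields: Iterable[str]) -> List[str]:
    """Return human-readable problems; an empty list means the document passes."""
    problems = []
    for name in filter_fields:
        # The field must be present at the top level of every document.
        if name not in doc:
            problems.append(f"{name}: missing from document")
            continue
        value = doc[name]
        if isinstance(value, str):
            if value == "":
                problems.append(f"{name}: empty string")
        elif isinstance(value, list):
            if not value:
                # Assumption: an empty array counts as an empty value.
                problems.append(f"{name}: empty array")
            elif not all(isinstance(item, str) for item in value):
                problems.append(f"{name}: array must contain only strings")
            elif any(item == "" for item in value):
                # Assumption: an empty string inside an array is also rejected.
                problems.append(f"{name}: array contains an empty string")
        else:
            # Objects, numbers, booleans and null are all rejected here.
            problems.append(f"{name}: must be a string or an array of strings")
    return problems
```

For example, filter_field_problems(doc, ["description", "sub_category"]) returns an empty list when both fields are present in doc as non-empty strings or arrays of non-empty strings.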
Examples:
Valid filter mappings for the description, sub_category and source fields:
{
  "mappings": {
    "properties": {
      "description_vector": {
        "type": "knn_vector",
        "dimension": 1280
      },
      "description": {
        "type": "keyword"
      },
      "sub_category": {
        "type": "keyword"
      },
      "source": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
Invalid filter mapping for sub_category, due to nesting of properties:
{
  "mappings": {
    "properties": {
      "description_vector": {
        "type": "knn_vector",
        "dimension": 1280
      },
      "description": {
        "type": "keyword"
      },
      "category": {
        "properties": {
          "sub_category": {
            "type": "keyword"
          }
        }
      },
      "source": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
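The difference between the two mappings can be illustrated with a small check that lists which top-level fields look usable as filters. This is a sketch under assumptions inferred only from the examples in this section: a field qualifies if its type is keyword, or if it has a keyword sub-field under "fields" (as source does), and anything nested under another field's "properties" is ignored. The function name filterable_fields is hypothetical and is not part of the plugin.

```python
from typing import Any, Dict, List


def filterable_fields(mapping: Dict[str, Any]) -> List[str]:
    """List top-level fields that appear usable as filters (assumed rules)."""
    properties = mapping.get("mappings", {}).get("properties", {})
    result = []
    for name, spec in properties.items():
        # A top-level keyword field qualifies directly.
        if spec.get("type") == "keyword":
            result.append(name)
            continue
        # Assumption: a keyword sub-field under "fields" also qualifies.
        subfields = spec.get("fields", {})
        if any(sub.get("type") == "keyword" for sub in subfields.values()):
            result.append(name)
        # Fields nested under spec["properties"] are deliberately not visited.
    return result
```

Applied to the valid mapping this returns description, sub_category and source. Applied to the invalid mapping it returns only description and source, because sub_category is nested under category.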
Indexing example:
PUT http://<serverHostName>:9200/<indexName>
{
  "settings": {
    "index": {
      "number_of_shards": "1",
      "knn": false,
      "number_of_replicas": "0"
    }
  },
  "mappings": {
    "properties": {
      "description_vector": {
        "type": "knn_vector",
        "dimension": 1280
      },
      "description": {
        "type": "keyword"
      },
      "sub_category": {
        "type": "keyword"
      },
      "source": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
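The same request can be prepared from a script. The sketch below uses only the Python standard library and builds the request without sending it. The helper name build_create_index_request and the host and index values in the usage note are hypothetical. The method defaults to PUT, which is the standard create-index method in OpenSearch/Elasticsearch, and authentication and TLS are omitted.

```python
import json
import urllib.request


def build_create_index_request(host: str, index_name: str, body: dict,
                               method: str = "PUT") -> urllib.request.Request:
    """Build (but do not send) an index-creation request for the given body."""
    # Port 9200 matches the example request above.
    url = f"http://{host}:9200/{index_name}"
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
```

To send it, pass the result to urllib.request.urlopen, for example urllib.request.urlopen(build_create_index_request("localhost", "products", body)), where body is the settings and mappings dictionary shown above.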