Indexing Datasets

You can index your dataset so that specific fields can be used as filters. To do so, create a mapping for your index in OpenSearch/Elasticsearch.

The Searchium.ai OpenSearch/Elasticsearch Plugin recognizes OpenSearch/Elasticsearch datasets whose vectors are indexed with the type dense_vector or knn_vector, and it can filter on fields of type keyword.

Dataset documents can include:

  • Metadata fields of any OpenSearch/Elasticsearch type.

  • FP32 vectors.

Fields that you wish to filter on must:

  • Be non-nested, that is, defined at the top level of the mapping and not inside another object

  • Contain a string or a single array of strings

  • Be included in all documents

  • Have a non-empty value. Empty values such as “” are not supported.
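The requirements above can be checked on the client side before documents are sent for indexing. The following is a minimal sketch, not part of the plugin: the function names `is_valid_filter_value` and `find_filter_violations` are hypothetical, and treating an empty array or an array containing an empty string as invalid is an assumption drawn from the "non-empty value" rule.

```python
from typing import Any, Dict, List, Tuple


def is_valid_filter_value(value: Any) -> bool:
    """True for a non-empty string or a flat, non-empty list of non-empty strings."""
    if isinstance(value, str):
        return value != ""
    if isinstance(value, list):
        # Assumption: empty lists and lists holding "" count as empty values.
        return len(value) > 0 and all(isinstance(v, str) and v != "" for v in value)
    return False


def find_filter_violations(
    docs: List[Dict[str, Any]], filter_fields: List[str]
) -> List[Tuple[int, str, str]]:
    """Return (document index, field name, reason) for every rule violation."""
    violations = []
    for i, doc in enumerate(docs):
        for field in filter_fields:
            if field not in doc:
                # Filter fields must be included in all documents.
                violations.append((i, field, "missing"))
            elif not is_valid_filter_value(doc[field]):
                violations.append((i, field, "invalid value"))
    return violations
```

An empty result means every document carries every filter field with a usable value; otherwise each tuple identifies the document and field to fix.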

Examples:

Valid filter mappings for description, sub_category and source fields:

{
  "mappings": {
    "properties": {
      "description_vector": {
        "type": "knn_vector",
        "dimension": 1280
      },
      "description": {
        "type": "keyword"
      },
      "sub_category": {
        "type": "keyword"
      },
      "source": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}

Invalid filter mapping for sub_category, because it is nested under the category object:

{
  "mappings": {
    "properties": {
      "description_vector": {
        "type": "knn_vector",
        "dimension": 1280
      },
      "description": {
        "type": "keyword"
      },
      "category": {
        "properties": {
          "sub_category": {
            "type": "keyword"
          }
        }
      },
      "source": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
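The difference between the two mappings can be sketched as a small check that lists which top-level fields qualify as filters. This is an illustration only: the function name `filterable_fields` is hypothetical, and accepting a text field that has a keyword sub-field (as with source) is inferred from the valid example rather than from a stated plugin rule.

```python
from typing import Any, Dict, List


def filterable_fields(index_body: Dict[str, Any]) -> List[str]:
    """List top-level properties that are keyword fields or carry a keyword sub-field."""
    properties = index_body.get("mappings", {}).get("properties", {})
    result = []
    for name, spec in properties.items():
        if spec.get("type") == "keyword":
            result.append(name)
            continue
        # Assumption: a multi-field with a keyword sub-field is also filterable.
        sub_fields = spec.get("fields", {})
        if any(sub.get("type") == "keyword" for sub in sub_fields.values()):
            result.append(name)
    # Properties nested under another object (e.g. category.sub_category)
    # are never visited, so they are never reported as filterable.
    return result
```

Applied to the valid mapping, the check reports description, sub_category and source; applied to the invalid one, sub_category is absent because it sits inside category.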

Indexing example:

PUT http://<serverHostName>:9200/<indexName>
{
  "settings": {
    "index": {
      "number_of_shards": "1",
      "knn": false,
      "number_of_replicas": "0"
    }
  },
  "mappings": {
    "properties": {
      "description_vector": {
        "type": "knn_vector",
        "dimension": 1280
      },
      "description": {
        "type": "keyword"
      },
      "sub_category": {
        "type": "keyword"
      },
      "source": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
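The same request can be prepared from a script. The sketch below uses only the Python standard library and assumes an unauthenticated HTTP endpoint on port 9200; the function name `build_create_index_request` and its parameters are hypothetical. It sends the body with PUT, which is the HTTP method the OpenSearch and Elasticsearch create-index API accepts for an index URL.

```python
import json
import urllib.request


def build_create_index_request(
    host: str, index_name: str, dimension: int = 1280
) -> urllib.request.Request:
    """Build (but do not send) a create-index request with the example mapping."""
    body = {
        "settings": {
            "index": {
                "number_of_shards": "1",
                "knn": False,
                "number_of_replicas": "0",
            }
        },
        "mappings": {
            "properties": {
                "description_vector": {"type": "knn_vector", "dimension": dimension},
                "description": {"type": "keyword"},
                "sub_category": {"type": "keyword"},
                "source": {
                    "type": "text",
                    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                },
            }
        },
    }
    return urllib.request.Request(
        url=f"http://{host}:9200/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

To send it, pass the returned object to `urllib.request.urlopen`. Clusters with security enabled would additionally need HTTPS and credentials, which this sketch leaves out.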