Search Configurations

How to Configure Search Features

The Advanced Search features in Blindata are configured from the Search Settings page. Administrators (USERS_ADMIN permission required) can use this page to enable and configure both Full-Text Search (FTS) and Semantic Search.

To access the Search Settings page:

  1. Navigate to Settings in the main navigation menu
  2. Click More (the three-dot menu)
  3. Select Search Settings from the dropdown menu

Search Settings page

The Search Settings page is divided into two main sections: FTS and Semantic Search.

Full-Text Search (FTS) Configuration

The FTS section provides configuration options for PostgreSQL-based full-text search capabilities. Here you can control how the system processes and indexes text for enhanced keyword matching. The configuration options are:

  • Enable FTS allows you to toggle the Full-Text Search functionality on or off.
  • Language Configuration specifies the language PostgreSQL uses for text processing and search optimization. Choose the primary language of your data assets and descriptions, because this setting directly affects how PostgreSQL processes and indexes text for search operations.

FTS settings
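To see why the Language Configuration matters, it can help to compare how PostgreSQL tokenizes the same text under different text search configurations. The sketch below only builds parameterized queries around PostgreSQL's standard `to_tsvector` function; it is not Blindata's internal query. The helper names, the sample text, and the `%s` placeholder style (as used by drivers such as psycopg) are assumptions made for illustration.

```python
from typing import List, Tuple

# A query is a SQL string plus its (language, text) parameters.
Query = Tuple[str, Tuple[str, str]]


def build_tsvector_query(language: str, text: str) -> Query:
    """Build a query that returns the lexemes PostgreSQL derives from `text`.

    `language` is the name of a PostgreSQL text search configuration,
    for example 'english', 'italian', or 'simple'.
    """
    # to_tsvector(config regconfig, document text) is a standard PostgreSQL
    # function; values are passed as parameters rather than interpolated.
    sql = "SELECT to_tsvector(%s::regconfig, %s) AS lexemes"
    return sql, (language, text)


def compare_languages(text: str, languages: List[str]) -> List[Query]:
    """Build one query per configuration so the results can be compared."""
    return [build_tsvector_query(language, text) for language in languages]


if __name__ == "__main__":
    sample = "Customers purchased running shoes"  # hypothetical sample text
    for sql, params in compare_languages(sample, ["english", "simple"]):
        # Each pair can be handed to a database cursor,
        # e.g. cursor.execute(sql, params).
        print(sql, params)
```

Run against a PostgreSQL database, the `english` configuration stems words and drops stop words, while `simple` only lowercases tokens. A language setting that does not match your content can therefore make keyword matches less reliable.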

Update FTS Statistics allows you to manually refresh the PostgreSQL lexeme statistics to improve search performance and accuracy. This operation updates the internal statistics that PostgreSQL uses for ranking search results. You should run this after enabling FTS for the first time, after changing the language setting, or if search results seem inconsistent.

The operation may take a few moments depending on the size of your data.

Note

Update FTS Statistics is only available after FTS has been enabled and a valid configuration has been saved.

Semantic Search Configuration

The Semantic Search section contains configuration options for AI-powered semantic search capabilities. The configuration options are:

  • Enable Semantic Search toggles the Semantic Search functionality on or off.
  • API Key Configuration provides authentication for external AI services, currently OpenAI. This key is used for embedding generation and vector-based similarity search operations. A valid key is required to use any Semantic Search feature.
  • Automatic Embedding Updates lets you configure which resource types have their embeddings updated automatically when entities are created or modified. This keeps semantic search current without manual intervention. However, automatic updates may incur significant costs and are best suited to resources with low cardinality, such as Business Glossary resources. Enable this only for frequently updated, low-volume resource types to keep costs under control.

Semantic Search settings
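The cost trade-off behind Automatic Embedding Updates can be made concrete with a rough estimate. The sketch below assumes that cost scales with the number of tokens sent to the embedding service; the function names, resource counts, token sizes, update frequency, and price are all hypothetical placeholders, not Blindata or OpenAI figures. Replace them with your own numbers and your provider's current pricing.

```python
def estimate_monthly_tokens(resource_count: int,
                            avg_tokens_per_resource: int,
                            updates_per_resource_per_month: float) -> float:
    """Estimate how many tokens are embedded per month for one resource type."""
    return resource_count * avg_tokens_per_resource * updates_per_resource_per_month


def estimate_monthly_cost(tokens: float, price_per_million_tokens: float) -> float:
    """Convert a token volume into a cost, given a price per one million tokens."""
    return tokens / 1_000_000 * price_per_million_tokens


if __name__ == "__main__":
    # Hypothetical price; check your provider's actual pricing.
    price = 1.0

    # Hypothetical volumes: a small glossary versus a large physical catalog.
    scenarios = {
        "Business Glossary": estimate_monthly_tokens(500, 200, 4),
        "Physical Entities": estimate_monthly_tokens(2_000_000, 200, 4),
    }
    for name, tokens in scenarios.items():
        print(name, tokens, estimate_monthly_cost(tokens, price))
```

Because the estimate is linear in the resource count, a resource type with thousands of times more entities costs thousands of times more to keep up to date automatically, which is why low-cardinality types are the safer choice.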

Update Embeddings lets you manually generate or refresh the vector embeddings of searchable resources so that semantic search results stay accurate. You should use this after enabling Semantic Search, after significant data changes, or if semantic search results seem outdated. The operation takes the following parameters:

  • Resource Types selects which resource types to update, such as Data Products or Physical Entities.
  • Limit sets the maximum number of resources to update per resource type.
  • Batch Size sets the number of resources processed simultaneously.

This operation can be resource-intensive and may take several minutes to complete.
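How the three parameters interact can be sketched as follows. This is an illustrative model only, not Blindata's implementation: it assumes that Limit is applied per resource type before the work is split into batches of at most Batch Size items. The function name, the resource type identifiers, and the in-memory data are hypothetical.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def plan_batches(resources_by_type: Dict[str, Sequence[str]],
                 resource_types: Sequence[str],
                 limit: int,
                 batch_size: int) -> Iterator[Tuple[str, List[str]]]:
    """Yield (resource_type, batch) pairs for an embedding update run."""
    if limit < 0:
        raise ValueError("limit must be zero or greater")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    for resource_type in resource_types:
        # Limit caps how many resources are considered for each selected type.
        selected = list(resources_by_type.get(resource_type, []))[:limit]
        # Batch Size controls how many resources are processed together.
        for start in range(0, len(selected), batch_size):
            yield resource_type, selected[start:start + batch_size]


if __name__ == "__main__":
    # Hypothetical catalog contents, keyed by hypothetical type names.
    catalog = {
        "DATA_PRODUCT": [f"dp-{i}" for i in range(7)],
        "PHYSICAL_ENTITY": [f"pe-{i}" for i in range(3)],
    }
    for resource_type, batch in plan_batches(catalog, ["DATA_PRODUCT"], limit=5, batch_size=2):
        print(resource_type, batch)
```

Under these assumptions, a smaller Batch Size means more round trips with less work in flight at once, while a lower Limit bounds the total work and cost of a single run.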

Note

Update Embeddings is only available after Semantic Search has been enabled and a valid configuration has been saved.

Semantic Search - update embeddings

Configuration Best Practices

For FTS:

  1. Enable FTS first, then select the appropriate language
  2. Choose a language that matches your primary data language
  3. Update FTS statistics after initial configuration

For Semantic Search:

  1. Ensure you have a valid OpenAI API key before enabling Semantic Search
  2. Start with automatic updates disabled for high-volume resource types
  3. Manually update embeddings after initial setup
  4. Monitor API usage and costs when using automatic updates

General Recommendations:

  • Test both search modes after configuration
  • Monitor search performance and user feedback
  • Periodically update lexeme statistics (for FTS) and embeddings (for Semantic Search)