CSV Import and Export

Introduction

CSV Import and Export functionality in Blindata enables you to efficiently manage large volumes of data through bulk operations. This feature allows you to download existing data in CSV format for analysis, modification, or backup purposes, and then import it back into the platform. CSV import and export is particularly useful for:

  • Bulk data operations: Import or update multiple records at once
  • Data migration: Transfer data between environments or systems
  • Data maintenance: Perform periodic updates or corrections on large datasets
  • Data analysis: Export data for external analysis in spreadsheet applications
  • Data backup: Create backups of your metadata and configurations

Blindata provides both resource-specific export/import features and a universal CSV import feature accessible through the Settings section, offering flexibility for different use cases and workflows.

Exporting Data to CSV

Many resources in Blindata support CSV export functionality, allowing you to download data in a structured format that can be easily edited and re-imported.

How to Export Data

Export functionality is typically available through:

  1. Download Icon: Look for a download icon (📥) in the top-right corner of resource list pages
  2. Export Button: Some pages have dedicated export buttons in the action bar
  3. Context Menus: Right-click or use context menus on resource lists

Tip

When exporting large datasets, consider applying filters first to reduce the export size. This can significantly speed up the download process and make the exported file more manageable.

Export File Format

Exported CSV files follow a structured format:

  • Header Row: Contains column names that match the resource properties
  • Data Rows: Each row represents a single resource or record
  • Nested Properties: Complex properties use dot notation (e.g., dataFlow.fromSystem.name)
  • Identifiers: Resources are identified by their unique uuid or name

The exported CSV files can be opened in spreadsheet applications like Microsoft Excel, Google Sheets, or any text editor for viewing and editing.

Example Export Format

The following example shows a typical CSV export structure for quality checks:

  qualityCheck.code,qualityCheck.name,qualityCheck.qualitySuite.name,qualityCheck.description
  DWH_005,Daily Revenues,DWH Checks,Verify the correct value of the daily turnover
  CUSTOMER-EMAIL-VALIDTY,Email Validity,Data Quality Suite,Correctness measure - Count of records whose email doesn't fit the regular expression
  FILM-COMPLETENESS,Film Completeness,Content Validation,Check that all required film metadata fields are populated

In this example:

  • The first row contains the header names (column identifiers)
  • Each subsequent row represents a single quality check resource
  • Nested properties use dot notation (e.g., qualityCheck.qualitySuite.name demonstrates accessing a nested property within the qualitySuite object)
  • Column order may vary between exports, but header names remain consistent
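When post-processing an export, dot-notation headers can be turned back into nested objects. A minimal Python sketch (the `unflatten` helper and the sample row are illustrative, not part of Blindata):

```python
import csv
import io

def unflatten(row):
    """Convert dot-notation keys (e.g. 'qualityCheck.qualitySuite.name')
    into a nested dictionary structure."""
    nested = {}
    for key, value in row.items():
        parts = key.split(".")
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested

# Sample row mirroring the export structure shown above
sample_csv = (
    "qualityCheck.code,qualityCheck.name,qualityCheck.qualitySuite.name\n"
    "DWH_005,Daily Revenues,DWH Checks\n"
)
rows = [unflatten(r) for r in csv.DictReader(io.StringIO(sample_csv))]
print(rows[0]["qualityCheck"]["qualitySuite"]["name"])  # DWH Checks
```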

Evolution and Retrocompatibility

Blindata’s CSV export format is designed to evolve over time while maintaining backward compatibility. When working with CSV files, it’s important to understand the following guarantees:

  • Header-Based Access: Always access columns by their header name, not by their position. Column order may change between exports, and new columns may be added at any position in the header row.

  • New Headers: New column headers may be introduced in future versions of Blindata. These new headers can appear anywhere in the header row (beginning, middle, or end). Your import scripts and tools should be flexible enough to handle additional columns without breaking.

  • Retrocompatibility: Existing column headers will continue to be supported in future versions. Columns that were present in previous exports will remain available, ensuring that your existing CSV files and import processes continue to work.

  • Best Practice: When writing scripts or tools to process CSV files:

    • Use header names to identify columns (e.g., row['dataFlow.name'] instead of row[2])
    • Ignore unknown columns rather than failing on them
    • Don’t assume a fixed column order
    • Validate that required columns exist by name before processing

This approach ensures that your CSV-based workflows remain functional as Blindata evolves and adds new features, without requiring constant updates to your import scripts or data processing tools.
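The best practices above can be sketched in a short Python snippet. The `dataFlow.*` column names and the `read_rows` helper are illustrative assumptions, not a Blindata API; the point is header-based access that tolerates new columns in any position:

```python
import csv
import io

# Hypothetical required headers for this example
REQUIRED = {"dataFlow.name", "dataFlow.fromSystem.name"}

def read_rows(csv_text):
    """Read rows by header name, validating required columns up front
    and silently ignoring any unknown columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = set(reader.fieldnames or [])
    missing = REQUIRED - headers
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    # Keep only the columns we know about; extras are ignored, not errors
    return [{h: row[h] for h in REQUIRED} for row in reader]

# Works even if a new column appears in an unexpected position
text = (
    "newColumn,dataFlow.name,dataFlow.fromSystem.name\n"
    "x,Orders Feed,CRM\n"
)
rows = read_rows(text)
print(rows[0]["dataFlow.name"])  # Orders Feed
```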

Importing Data from CSV

Blindata provides two main methods for importing CSV data:

Universal CSV Import

The universal CSV import feature is accessible through the Settings section and supports importing various resource types. This is the recommended method for importing business glossary items and other supported resources.

  1. Navigate to the Settings page in the side menu
  2. Select Import from the Settings navbar
  3. Upload your CSV file

Using Universal CSV Import

The CSV import process in Blindata follows a simple three-step workflow:

Step 1: Access the Import Section and Configure Settings

Navigate to Settings in the side menu and select Import to access the universal CSV import feature. This opens the import interface where you can upload your CSV file and configure import settings.

CSV Import Step 1 - Access Import Section

The import page provides several key components:

  • File Selection Area: A large drag-and-drop zone where you can either drag your CSV file or click to browse and select it from your computer. The interface clearly indicates “Drag ’n’ drop some files here, or click to select files” to guide you through the file selection process.

  • Configuration Options: Before uploading, you can configure how the CSV file will be processed:

    • Patch Mode: When enabled, this mode allows you to update existing resources by matching identifiers. When disabled, the import will create new resources or replace existing ones entirely.
    • Case Sensitive Match: Controls whether identifier matching is case-sensitive. When enabled, “ResourceName” and “resourcename” are treated as different identifiers.
    • Automatic delimiter guessing: When enabled (default), Blindata automatically detects the CSV delimiter (comma, semicolon, tab, etc.). When disabled, you can manually specify the delimiter in the “Delimiter” field.
    • Delimiter: Manually specify the character used to separate columns in your CSV file (e.g., comma, semicolon, tab). This field is only needed when automatic delimiter guessing is disabled.
    • Encoding: Specify the character encoding of your CSV file (e.g., UTF-8, ISO-8859-1). If left empty, the system will attempt to auto-detect the encoding.
  • Action Buttons:

    • CLEAR: Removes the selected file and resets the configuration to default values
    • LOAD: Processes and validates the uploaded CSV file (this button becomes enabled after a file is selected)

At this stage, you can select your CSV file and adjust the configuration settings according to your file’s format and your import requirements.
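As an aside, the idea behind "automatic delimiter guessing" can be approximated locally with Python's `csv.Sniffer`. This is only an illustration of the concept, not Blindata's actual detection logic:

```python
import csv

# A semicolon-delimited sample such as the auto-detection option might
# receive; the detection below is illustrative only
sample = "qualityCheck.code;qualityCheck.name\nDWH_005;Daily Revenues\n"

# Sniff the dialect, restricting candidates to common CSV delimiters
dialect = csv.Sniffer().sniff(sample, delimiters=",;\t")
print(dialect.delimiter)  # ;
```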

Step 2: Review Parsed Data and Validate

After uploading your CSV file and clicking LOAD, Blindata processes and parses the file, then displays a review page where you can verify that the data was parsed correctly before proceeding with the import.

CSV Import Step 2 - Review Parsed Data

The review page consists of two main sections:

  • Parsing Summary: This panel displays key statistics about the parsed file:

    • Total Rows: Shows the number of data rows found in your CSV file (excluding the header row)
    • Total Errors: Indicates the number of parsing errors detected. If errors are present, you’ll need to correct them in your CSV file and upload again
  • Content Table: A detailed preview table showing how your data was parsed. This is your opportunity to verify:

    • Column Separation: Check that all columns are correctly separated and aligned. Each column header should match the expected field names, and data should appear in the correct columns
    • Special Characters: Verify that special characters, accented letters, and non-ASCII characters are displayed correctly. This confirms that the encoding settings (UTF-8 or otherwise) were applied properly
    • Date Formats: Review date fields to ensure they were parsed correctly. Dates should appear in a readable format and match the format in your original CSV file
    • Data Integrity: Examine the actual data values to confirm they match what you intended to import. Check for any unexpected truncation, formatting issues, or missing values

Important Validation Checks:

Before clicking CONTINUE, carefully review:

  • All columns are properly separated (no merged or split columns)
  • Special characters and accented letters display correctly
  • Date values are formatted as expected
  • Numeric values are correctly parsed
  • Text fields contain complete values (no truncation)
  • Required fields are populated
  • No unexpected empty cells or data misalignment

If you notice any issues during the review, click BACK to return to the upload page, adjust your CSV file or configuration settings, and upload again. Only proceed with CONTINUE when you’re satisfied that all data has been parsed correctly.

Step 3: Review Objects and Analyze Import Operations

After confirming the parsed data in Step 2, you’ll reach the final review page where you can examine the individual objects that will be imported and preview the operations that will be performed.

CSV Import Step 3 - Review Objects and Analyze

This page consists of two main sections:

  • Import Summary: Displays real-time statistics about the import process:

    • Processed: Shows how many objects have been processed (e.g., “0/5” means 0 out of 5 objects processed)
    • Imported: Indicates the number of objects successfully imported
    • Errors: Shows the count of errors encountered during the import process
  • Objects List: Displays each individual object that will be imported, showing:

    • Object Index and Name: Each object is numbered and displays its name (e.g., “Daily Revenues”, “Email Validity”)
    • Status: Objects show their current status (typically “PENDING” before import)
    • Checkbox: Each object has a checkbox that allows you to select or deselect it for import. By default, all objects are selected
    • Dropdown Arrow: Click to expand and view detailed information about each object

Analyze Import Operations

Before executing the import, you can use the ANALYZE button to preview what operations will be performed without actually writing any changes to the database. The analysis will show you:

  • CREATE: New resources that will be created
  • UPDATE: Existing resources that will be modified (when using Patch Mode or when matching existing identifiers)
  • DELETE: Resources that will be deleted (if applicable to your import type)

This preview allows you to:

  • Verify that the correct operations will be performed
  • Identify any unexpected CREATE, UPDATE, or DELETE operations
  • Review the scope of changes before committing
  • Make adjustments to your CSV file if needed

Executing the Import

Once you’ve reviewed the objects and analyzed the operations:

  1. Select Objects: Use the checkboxes to include or exclude specific objects from the import
  2. Analyze: Click ANALYZE to preview the operations (CREATE, UPDATE, DELETE) that will be performed
  3. Review Analysis: Check the analysis output to ensure the operations match your expectations
  4. Start Import: Click START to execute the import and perform the actual write operations
  5. Monitor Progress: Watch the Import Summary counters update as objects are processed, imported, or encounter errors

You can click BACK at any time to return to the previous step and make adjustments to your CSV file or configuration settings.

Note

When importing resources using CSV files, always start by downloading the template or an existing export to use as a guide. This ensures your CSV file has the correct structure and column headers required by the import feature.

Deleting Resources via CSV Import

Blindata allows you to delete resources using the CSV import feature by adding a special _DELETE column to your CSV file. This enables you to perform bulk deletion operations alongside create and update operations in a single import.

How to Delete Resources:

To delete a resource, add a column with the pattern {resourceType}._DELETE (e.g., qualityCheck._DELETE) and set its value to true for the rows you want to delete.

Example:

The following example shows how to delete quality checks using the _DELETE column:

  qualityCheck.code,qualityCheck.name,qualityCheck.qualitySuite.name,qualityCheck._DELETE
  DWH_005,Daily Revenues,DWH Checks,false
  OLD_CHECK_001,Old Quality Check,Legacy Suite,true
  CUSTOMER-EMAIL-VALIDTY,Email Validity,Data Quality Suite,false

In this example:

  • Rows with qualityCheck._DELETE set to true will be deleted
  • Rows with qualityCheck._DELETE set to false or left empty will be processed normally (created or updated)
  • The resource is identified by its unique identifier (e.g., qualityCheck.code)

Important Notes:

  • Resource Identification: The resource to be deleted must be identified by its unique identifier (code, uuid, or name depending on the resource type). Ensure the identifier column contains the correct value for resources you want to delete.

  • Mixed Operations: You can combine CREATE, UPDATE, and DELETE operations in a single CSV file. The import process will:

    • Delete resources marked with _DELETE = true
    • Create or update resources that are not marked for deletion
  • Safety Check: Use the ANALYZE button in Step 3 to preview which resources will be deleted before executing the import. This allows you to verify the deletion operations without actually performing them.

  • Column Format: The _DELETE column accepts boolean values (true/false) or can be left empty (which is treated as false). The column name must follow the pattern {resourceType}._DELETE.
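A deletion CSV like the example above can be generated programmatically. A minimal sketch in Python (the quality check codes are placeholders):

```python
import csv
import io

# Codes of obsolete quality checks to mark for deletion (illustrative)
to_delete = ["OLD_CHECK_001", "OLD_CHECK_002"]

buf = io.StringIO()
writer = csv.DictWriter(
    buf, fieldnames=["qualityCheck.code", "qualityCheck._DELETE"]
)
writer.writeheader()
for code in to_delete:
    # Identify each resource by its code and flag it for deletion
    writer.writerow({"qualityCheck.code": code, "qualityCheck._DELETE": "true"})

print(buf.getvalue())
```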

Warning

Deletion operations are permanent. Always use the ANALYZE feature to preview deletions before executing the import, and ensure you have backups of important data.

Best Practices

Preparing CSV Files

  1. Use Templates: Always download and use existing exports as templates to ensure correct formatting
  2. Validate Data: Check that required fields are filled and data formats are correct
  3. Handle Special Characters: Ensure proper encoding (UTF-8) for special characters
  4. Check Identifiers: Verify that referenced resources (systems, entities, etc.) exist before importing
  5. Test with Small Files: Start with small test imports to verify the format before bulk operations

Import Workflow

  1. Export First: Download existing data to understand the structure
  2. Edit Carefully: Make changes in a spreadsheet application with validation
  3. Review Before Import: Double-check your data for errors
  4. Backup: Keep backups of original exports before importing
  5. Verify Results: After import, verify that data was imported correctly

Common Issues and Solutions

  • Missing Required Fields: Ensure all required columns are present and filled
  • Invalid References: Verify that referenced resources (systems, entities, concepts) exist
  • Format Errors: Check date formats (ISO 8601), number formats, and text encoding
  • Duplicate Identifiers: Ensure unique identifiers for resources that require them
  • Large File Sizes: Break large imports into smaller batches if needed
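For the last point, splitting a large CSV into header-preserving batches is straightforward to script. The `split_batches` helper below is an illustrative sketch, not a Blindata tool; it repeats the header row so each batch can be imported independently:

```python
def split_batches(csv_text, batch_size):
    """Split one CSV into several smaller CSVs, repeating the header
    row so each batch remains a valid standalone import file."""
    lines = csv_text.splitlines()
    header, data = lines[0], lines[1:]
    return [
        "\n".join([header] + data[i:i + batch_size])
        for i in range(0, len(data), batch_size)
    ]

# Five data rows split into batches of two
text = "code,name\n" + "\n".join(f"C{i},Check {i}" for i in range(5))
batches = split_batches(text, batch_size=2)
print(len(batches))  # 3
```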