Configure data quality metrics

After you set up a datasource and prepare a table, you can create and configure metrics to measure data quality. A metric continuously measures some dimension of data quality, aggregating and recording results at a chosen aggregation interval.
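As a rough illustration of what "aggregating at a chosen aggregation interval" means, the following Python sketch counts rows per interval from a list of event timestamps. It is not Lightup's implementation; the function names `bucket_start` and `volume_per_interval` and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Hypothetical helper: floor a timestamp to the start of its aggregation interval.
def bucket_start(ts: datetime, interval: timedelta) -> datetime:
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + ((ts - epoch) // interval) * interval

# Hypothetical helper: one metric value (a row count) per aggregation interval.
def volume_per_interval(timestamps, interval):
    counts = Counter(bucket_start(ts, interval) for ts in timestamps)
    return dict(sorted(counts.items()))

# Illustrative event timestamps for three rows.
rows = [
    datetime(2024, 1, 1, 0, 5, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 0, 40, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 1, 15, tzinfo=timezone.utc),
]

# With an hourly interval, two rows fall in the 00:00 bucket and one in 01:00.
hourly = volume_per_interval(rows, timedelta(hours=1))
for start, count in hourly.items():
    print(start.isoformat(), count)
```

A daily interval would use `timedelta(days=1)` instead; the same rows would then collapse into a single data point.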

Examples of metrics

  • The volume of data being brought into a table

  • The lag of data being loaded into a table

  • The percent of null values in a column

  • The percent of values in a string-type column that have 10 digits
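To make the last two examples concrete, here is a minimal Python sketch that computes a null percent and a 10-digit percent over one column's values. It only illustrates the arithmetic. The function names are hypothetical, and two choices are assumptions: "10 digits" is read as "exactly ten ASCII digits and nothing else", and null values count in the denominator.

```python
import re

# Assumption: a value conforms when it is exactly ten ASCII digits.
TEN_DIGITS = re.compile(r"[0-9]{10}")

def null_percent(values):
    """Percent of values that are null (None)."""
    if not values:
        return 0.0
    return 100.0 * sum(v is None for v in values) / len(values)

def ten_digit_percent(values):
    """Percent of all values (nulls included) made up of exactly ten digits."""
    if not values:
        return 0.0
    matches = sum(1 for v in values if v is not None and TEN_DIGITS.fullmatch(v))
    return 100.0 * matches / len(values)

# Illustrative values from a hypothetical string column.
phone = ["5551234567", None, "555-1234", "0123456789"]

print(null_percent(phone))       # 25.0
print(ten_digit_percent(phone))  # 50.0
```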

Lightup offers autometrics and custom metrics:

  • Autometrics: Ready-made metrics that you can just turn on or off for an active data asset, such as Data Delay. You can adjust some of the parameters of an autometric.

  • Custom metrics: Metrics that you create yourself by using built-in aggregates, writing your own SQL, comparing data assets or metrics, or customizing an autometric.

Dimensions

Each data quality metric measures a dimension of data quality: accuracy, completeness, timeliness, or (if none of these applies) custom. These dimensions let you work with similar metrics across datasources, such as on the Dashboard. You can also use them to sort and filter lists of metrics and monitors.

Manage autometrics

You manage autometrics at the parent data asset level; for example, you enable column autometrics by managing the table that contains those columns. The following procedure assumes that you are setting every level of data asset to Active: first the schema, then the table, and finally the columns.

  1. In the Explorer tree, select a datasource to manage metrics for.

  2. On the Actions menu, select Manage Metrics.

  3. In the Manage metrics dialog, toggle the Active slider to the right for each schema where you want to manage metrics. When you toggle the slider to activate a schema, the schema name becomes an active link. Select a link to open the Manage metrics dialog for the linked schema's tables. For example, you might see something like this:

The "Manage metrics" modal for a schema, with ten tables listed, five of them set "Active", and the Autometrics toggles for the five Active tables highlighted
Manage metrics, schema dialog

  4. When a table is set to Active, you can toggle its autometrics on or off (one at a time or in bulk). Like a schema, an active table is listed as a link that opens the table's Manage metrics dialog, where you can work with the table's columns and their metrics.

Types of autometrics

  • Activity autometrics measure structural changes to data assets (a measure of accuracy). Specifically, Table Activity measures when tables are added to or dropped from a schema, Column Activity measures when columns are added to or dropped from a table, and Category Activity measures when categories are added to or dropped from a column.

  • Data delay autometrics measure delays in the arrival of expected data into data assets (a measure of timeliness). Available for use with tables or schemas.

  • Data volume autometrics measure deviations from expected data volumes (a measure of completeness). Available for use with tables or schemas.

  • Distribution autometrics measure changes in the distribution of values in a column.

  • Null percent autometrics measure the percent of values in a column that are null.

All of these autometrics are calculated over a specified period of time: the aggregation interval for the metric.
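As an informal sketch of the kind of change a Column Activity autometric reports, the following Python snippet compares two snapshots of a table's column names and lists what was added or dropped between them. The function name `column_activity` and the snapshot data are hypothetical; how Lightup actually collects and compares snapshots is not described here.

```python
# Hypothetical helper: diff two snapshots of a table's column names.
def column_activity(previous, current):
    prev, curr = set(previous), set(current)
    return {
        "added": sorted(curr - prev),    # present now, absent before
        "dropped": sorted(prev - curr),  # present before, absent now
    }

# Illustrative snapshots taken at the start and end of one aggregation interval.
before = ["id", "email", "phone"]
after = ["id", "email", "signup_date"]

changes = column_activity(before, after)
print(changes)  # {'added': ['signup_date'], 'dropped': ['phone']}
```

The same set difference applies to Table Activity (table names within a schema) and Category Activity (distinct values within a column).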

Create a custom metric

You can create your own metrics tailored to your specific data quality concerns.

Step 1 (Metric Info)

  1. In the left pane, select the workspace where you want to create the new metric.

  2. On the top bar, select the Metrics tab, and then on the Metrics > Custom tab, select +.

  3. Under Custom Metric Info, select a metric type. The rest of the process depends on which metric type you select. For the remaining steps, see the subpage that corresponds to that metric type. For summary information about your options, see the following section, Types of custom metrics.

Step 1 (Metric Info) of metric configuration

Types of custom metrics

Before you create a metric that compares aggregate metrics, consider reviewing the source and target data assets and the metrics for both, to ensure that the metrics you plan to compare are comparable and have the same aggregation settings.

To proceed, select the subpage for your metric type:

  • Conformity check metrics measure the validity of data by checking whether values meet a condition you specify.

  • Data delay metrics measure delay in the arrival of expected data into data assets. Available for use with tables and schemas.

  • Data volume metrics measure deviations from expected data volumes. Available for use with tables and schemas.

  • Distribution metrics measure the distribution of values in a specified column.

  • Null percent metrics measure the percent of a column's values that are null.

  • Row by row metrics measure differences in data values between a source table and a target table. You pick the key fields that match the rows and the columns whose values you want to compare.

  • SQL metrics measure whatever you model with valid SQL. The SQL must include a SELECT statement that meets the basic metric query requirements (details available on the subpage).
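The row by row comparison described above can be sketched in Python as follows. This is an illustration of the idea only, not Lightup's implementation: the function name `row_by_row_diff`, the sample tables, and the choice to skip source rows that have no matching target row are all assumptions.

```python
# Hypothetical helper: match rows on key fields, then compare chosen columns.
def row_by_row_diff(source, target, key_fields, compare_columns):
    def key(row):
        return tuple(row[f] for f in key_fields)

    target_by_key = {key(r): r for r in target}
    mismatches = []
    for src in source:
        tgt = target_by_key.get(key(src))
        if tgt is None:
            continue  # assumption: unmatched source rows are skipped in this sketch
        for col in compare_columns:
            if src[col] != tgt[col]:
                mismatches.append((key(src), col, src[col], tgt[col]))
    return mismatches

# Illustrative source and target tables as lists of dictionaries.
source = [
    {"id": 1, "amount": 10, "status": "paid"},
    {"id": 2, "amount": 20, "status": "open"},
    {"id": 3, "amount": 30, "status": "open"},
]
target = [
    {"id": 1, "amount": 10, "status": "paid"},
    {"id": 2, "amount": 25, "status": "open"},
]

diffs = row_by_row_diff(source, target, ["id"], ["amount", "status"])
print(diffs)  # [((2,), 'amount', 20, 25)]
```

Each mismatch records the row key, the column, and the source and target values, so a count of mismatches per aggregation interval could serve as the metric value.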

Edit a metric

  1. In the Explorer tree, select the data asset that the metric is based on.

  2. On the right, find the chart for the metric you want to edit. In the chart's top-right corner, select the three vertical dots, and then select Edit.

  3. The metric configuration opens and displays Step 1 (Metric Info) in the main pane. Select the step you want to edit by (1) using the tabs at the top left or (2) selecting the corresponding pencil icon in the left nav.

Annotated image of metric configuration, depicting 1) the steps across the top, and 2) the steps' settings on the left, highlighting how to select a step to edit the settings

  4. Make any changes in the main pane, then select the Preview step. For more help with the options, see the subpage that corresponds to the metric type.

  5. If desired, preview the metric. When you're done, select Save in the top-right corner.
