Configure data quality metrics
Last updated
You need the Workspace Editor role to complete the procedures on this page, unless otherwise noted at the top of the procedure. However, if you have the Workspace Viewer role, you can open a metric configuration and run a preview of the metric.
After you set up a datasource and prepare a table, you can create and configure metrics to measure data quality. A metric continuously measures some dimension of data quality, aggregating and recording results at a chosen aggregation interval. For example, a metric can measure:
The volume of data being brought into a table
The lag of data being loaded into a table
The percent of null values in a column
The percent of values that have 10 digits in a string-type column
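To make the idea of an aggregation interval concrete, here is a minimal Python sketch that computes three of the example measurements above (volume, null percent, and percent of 10-digit values) once per day. It is an illustration only, not how Lightup computes metrics: the sample rows, the daily interval, and the `daily_metrics` function name are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: (event timestamp, phone value). None stands for NULL.
ROWS = [
    (datetime(2024, 1, 1, 9, 0), "4155550100"),
    (datetime(2024, 1, 1, 13, 30), None),
    (datetime(2024, 1, 1, 18, 45), "555-0100"),
    (datetime(2024, 1, 1, 22, 10), "6505550199"),
    (datetime(2024, 1, 2, 8, 15), "2125550123"),
    (datetime(2024, 1, 2, 11, 0), "2125550177"),
]


def is_ten_digits(value):
    """True when the value is a string of exactly 10 ASCII digits."""
    return value is not None and len(value) == 10 and value.isascii() and value.isdigit()


def daily_metrics(rows):
    """Aggregate rows into one data point per day (the assumed aggregation interval)."""
    buckets = defaultdict(list)
    for timestamp, value in rows:
        buckets[timestamp.date()].append(value)

    results = {}
    for day, values in sorted(buckets.items()):
        total = len(values)
        nulls = sum(1 for v in values if v is None)
        ten_digit = sum(1 for v in values if is_ten_digits(v))
        results[day.isoformat()] = {
            "volume": total,
            "null_percent": 100.0 * nulls / total,
            "ten_digit_percent": 100.0 * ten_digit / total,
        }
    return results


if __name__ == "__main__":
    for day, point in daily_metrics(ROWS).items():
        print(day, point)
```

Each key in the result corresponds to one aggregation interval, and each value is the data point a metric would record for that interval.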
Lightup offers autometrics and custom metrics:
Autometrics: Ready-made metrics, such as Data Delay, that you can turn on or off for an active data asset. You can adjust some of the parameters of an autometric.
Custom metrics: Metrics that you create yourself, based on built-in aggregates, your own custom SQL, a comparison of data assets or metrics, or an autometric that you customize.
Each data quality metric measures a dimension of data quality: accuracy, completeness, timeliness, or, if none of these applies, custom. These dimensions provide a way to work with similar metrics across datasources, such as on the . They are also available for sorting and filtering lists of metrics and monitors.
You manage autometrics at the parent data asset level; for example, you enable column autometrics by managing the table that contains those columns. The following procedure assumes that you are making every level of data asset Active: starting with the schema, moving on to the table, and finishing with the columns.
1. In the Explorer tree, select a datasource to manage metrics for.
2. On the Actions menu, select Manage Metrics.
3. In the Manage Metrics dialog, toggle the Active slider to the right for each schema where you want to manage metrics. When you toggle the slider to activate a schema, the schema name becomes an active link. Select a link to open the Manage Metrics dialog for the linked schema's tables.
4. When a table is set to Active, you can toggle its autometrics on or off (one at a time or in bulk). Like a schema, an active table is listed as a link that opens the table's Manage Metrics dialog, where you can work with the table's columns and their metrics.
Activity autometrics measure structural changes to data assets (a measure of accuracy). Specifically, Table Activity measures when tables are added to or dropped from a schema, Column Activity measures when columns are added to or dropped from a table, and Category Activity measures when categories are added to or dropped from a column.
Data delay autometrics measure delays in the arrival of expected data into data assets (a measure of timeliness). Available for use with tables or schemas.
Data volume autometrics measure deviations from expected data volumes (a measure of completeness). Available for use with tables or schemas.
Distribution autometrics measure changes in the distribution of values in a column.
Null percent autometrics measure the percent of values in a column that are null.
All of these autometrics are calculated over a specified period of time: the aggregation interval for the metric.
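As a rough illustration of what two of these autometrics measure, the following Python sketch detects columns added to or dropped from a table between two snapshots (in the spirit of Column Activity) and computes how far the newest row lags behind the measurement time (in the spirit of Data Delay). The function names, inputs, and calculations are assumptions made for this example; the page does not describe how Lightup computes these values internally.

```python
from datetime import datetime, timedelta


def column_activity(previous_columns, current_columns):
    """Return the columns added and dropped between two snapshots of a table."""
    previous, current = set(previous_columns), set(current_columns)
    return {
        "added": sorted(current - previous),
        "dropped": sorted(previous - current),
    }


def data_delay(latest_event_time, measured_at):
    """Return how far the newest row lags behind the time of measurement."""
    return measured_at - latest_event_time


if __name__ == "__main__":
    # Hypothetical snapshots of one table's column list.
    print(column_activity(["id", "email", "phone"], ["id", "email", "signup_date"]))

    # Hypothetical newest event timestamp versus the measurement time.
    print(data_delay(datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 12, 30)))
```

In this sketch, each call would run once per aggregation interval, producing one data point per interval.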
You can create your own metrics tailored to your specific data quality concerns.
In the left pane, select the workspace where you want to create the new metric.
On the top bar, select the Metrics tab, and then on the Metrics > Custom tab, select +.
To proceed, select the subpage for your metric type:
In the Explorer tree, select the data asset that the metric is based on.
The metric configuration opens and displays Step 1 (Metric Info) in the main pane. Select the step you want to edit by (1) using the tabs at the top left or (2) selecting the corresponding pencil icon in the left nav.
4. Make any changes in the main pane, then select the Preview step. For more help about the options
5. If desired, preview the metric. Select Save at the top right corner when you're done.
Under Custom Metric Info, select a metric type. The rest of the process depends on which metric type you select. For the remaining steps, see the subpage that corresponds to that metric type. For summary information about your options, see the following section, .
measure changes in the result of an aggregate function that you define.
let you compare existing metrics for a source table and a target table.
measure the validity of data by checking whether values meet a condition you specify.
Data delay metrics measure delay in the arrival of expected data into data assets. Available for use with tables and schemas.
Data volume metrics measure deviations from expected data volumes. Available for use with tables and schemas.
Distribution metrics measure the distribution of values in a specified column.
Null percent metrics measure the percent of a column's values that are null.
let you measure data value differences between a source table and a target table, by picking key fields (to match the rows) and which columns' values you want to compare.
Custom SQL metrics measure whatever you model with valid SQL, but the SQL must include a SELECT statement that meets basic metric query requirements (details available on the subpage).
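The source-to-target comparison described in this list (pick key fields to match rows, then pick the columns whose values you want to compare) can be sketched in Python as follows. This is a conceptual illustration under stated assumptions, not Lightup's implementation: the `compare_tables` function, the dictionary-based rows, and the sample data are all hypothetical.

```python
def compare_tables(source_rows, target_rows, key_fields, compare_columns):
    """Match rows by key fields, then report differences in the chosen columns."""

    def key_of(row):
        return tuple(row[field] for field in key_fields)

    source = {key_of(row): row for row in source_rows}
    target = {key_of(row): row for row in target_rows}

    shared_keys = source.keys() & target.keys()
    mismatched = sorted(
        key
        for key in shared_keys
        if any(source[key][col] != target[key][col] for col in compare_columns)
    )
    return {
        "missing_in_target": sorted(source.keys() - target.keys()),
        "missing_in_source": sorted(target.keys() - source.keys()),
        "mismatched": mismatched,
    }


# Hypothetical sample data for a source table and a target table.
SOURCE = [
    {"id": 1, "amount": 10, "status": "paid"},
    {"id": 2, "amount": 20, "status": "open"},
    {"id": 3, "amount": 30, "status": "open"},
]
TARGET = [
    {"id": 1, "amount": 10, "status": "paid"},
    {"id": 2, "amount": 25, "status": "open"},
    {"id": 4, "amount": 40, "status": "paid"},
]

if __name__ == "__main__":
    print(compare_tables(SOURCE, TARGET, key_fields=["id"], compare_columns=["amount", "status"]))
```

Keys are returned as tuples so that the same function works with composite keys made of several fields.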
On the right, find the chart for the metric you want to edit. Then, in its top-right corner select the three vertical dots, and then select Edit.