What Are Measures in Data Analysis?

A measure is a mathematical concept that generalizes and formalizes notions like length, area, and volume. It can even be extended to assume negative values, as with electrical charge.

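In data analysis tools such as Excel Power Pivot and Power BI, the term has a more specific meaning: a measure is a calculation whose result depends on context. When you add a measure to a PivotTable, PivotChart, or report, its value is recalculated whenever the context changes, for example when you apply a filter or place a different field on rows or columns. This makes measures well suited to interactive analysis.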
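
The idea of a value that is re-evaluated per context can be sketched in Python. This is only an illustration, not how any PivotTable engine is implemented; the `SALES` rows, the column names, and the `evaluate` helper are all hypothetical.

```python
# Hypothetical sales rows; the column names are illustrative only.
SALES = [
    {"region": "East", "year": 2023, "amount": 100.0},
    {"region": "East", "year": 2024, "amount": 150.0},
    {"region": "West", "year": 2023, "amount": 80.0},
    {"region": "West", "year": 2024, "amount": 120.0},
]


def total_sales(rows):
    """The 'measure': a single aggregation, defined once."""
    return sum(row["amount"] for row in rows)


def evaluate(measure, rows, **context):
    """Keep only the rows matching the filter context, then evaluate the measure."""
    visible = [r for r in rows if all(r[k] == v for k, v in context.items())]
    return measure(visible)


# The same measure gives a different value in each context.
grand_total = evaluate(total_sales, SALES)
east_total = evaluate(total_sales, SALES, region="East")
east_2024 = evaluate(total_sales, SALES, region="East", year=2024)
print(grand_total, east_total, east_2024)
```

The measure itself never changes; only the set of rows it sees does.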

Types of Measures

Measures are calculations you create using Data Analysis Expressions (DAX), which provides a flexible and powerful formula language. You can use built-in quick measures to get started, or create custom ones to support any kind of aggregation or calculation you need for your data.

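In quality improvement, there are three basic types of measures that you can build: outcome, process, and balancing. Outcome measures track and report on the success of your improvement initiatives. Process measures track whether the steps intended to produce that outcome are being carried out as planned. Balancing measures help you ensure that an improvement in one part of the system does not create new problems elsewhere, such as widening existing inequities.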

Post-SQL measures are special measure types in Looker that perform additional calculations after Looker has generated the query SQL, such as running totals and percent of total. These measures cannot reference dimensions; their sql parameter must reference another numeric measure. They also cannot use the filters parameter. Because they are computed from the rows the query returns, their values depend on which rows appear in the result. Display formatting is controlled separately, with the value_format or value_format_name parameter.
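
To illustrate what "calculated after the query" means, here is a minimal Python sketch that derives a running total and a percent of total from rows that have already been returned. It is not Looker's implementation or LookML; `result_rows` is made-up data standing in for a query result.

```python
from itertools import accumulate

# Hypothetical query result: one aggregated value per row, already returned.
result_rows = [("Jan", 200.0), ("Feb", 300.0), ("Mar", 500.0)]
values = [value for _, value in result_rows]

# Running total: each row's value plus all values on the rows above it.
running_total = list(accumulate(values))

# Percent of total: each row's share of the sum over the returned rows.
grand_total = sum(values)
percent_of_total = [100.0 * value / grand_total for value in values]

for (label, value), run, pct in zip(result_rows, running_total, percent_of_total):
    print(label, value, run, pct)
```

Both results depend on which rows are present and on their order, which is why calculations of this kind are applied to the result set rather than expressed per row in SQL.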

Measures of Uncertainty

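In the metrological context, measurement uncertainty is a “parameter, associated with the result of a measurement, characterising the dispersion of values that could reasonably be attributed to the measurand, based on the information used” (VIM). The GUM additionally defines an expanded uncertainty, which is the standard uncertainty multiplied by a coverage factor to give an interval expected to contain a large fraction of those values.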

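Uncertainty arises from both random and systematic effects. Random error varies unpredictably between repeated measurements; it cannot be corrected, only reduced by averaging, so it must be included in the uncertainty evaluation. A known systematic error can be corrected, but the uncertainty of the correction itself still contributes to the overall uncertainty.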

When many independent components each make a small contribution to the overall uncertainty, the distribution of measurement results tends toward normal. For this reason, uncertainty is typically characterized using a Gaussian distribution, or a Student’s t distribution when only a small number of observations is available. Real measurement data do not always follow these distributions, so the assumption should be checked. Both distributions are symmetric about their center, meaning that values are equally likely to fall a given distance to the left or right of the central value.
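
As a rough sketch of how the random component of uncertainty might be evaluated from repeated readings, the Python below computes the mean, the standard uncertainty of the mean, and an expanded uncertainty. The readings are invented, and the coverage factor k = 2 is an assumption (commonly associated with roughly 95 % coverage under a normal distribution); with this few readings, a factor taken from the Student’s t distribution would be larger. Systematic effects are not covered here.

```python
import math
import statistics

# Hypothetical repeated readings of the same quantity (arbitrary units).
readings = [10.03, 9.98, 10.01, 10.00, 9.97, 10.02]

n = len(readings)
mean = statistics.mean(readings)

# Sample standard deviation (divides by n - 1).
s = statistics.stdev(readings)

# Standard uncertainty of the mean from repeated observations.
u = s / math.sqrt(n)

# Expanded uncertainty with an assumed coverage factor.
k = 2.0
U = k * u

print(f"result = {mean:.4f} +/- {U:.4f} (k = {k})")
```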

Measures of Variability

Whereas measures of central tendency describe where the center of a distribution lies, measures of variability summarize how far apart your data points are from each other and from the center of the distribution. Variability is important because it influences how much you can generalize information about a population based on sample data. If a distribution has high variability, its values are more dissimilar from each other and extreme values are more likely.

There are several measures of variability, including the range, the variance, and the standard deviation. The standard deviation is often the preferred measure because it takes all of the scores into account and is expressed in the same units as the data. It is, however, sensitive to outliers, which are values that lie unusually far from the rest of the dataset.

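To calculate the variance, start by computing each score's deviation from the mean, square the deviations, add them up, and divide that sum by the number of scores (or by one less than the number of scores when working with a sample). The standard deviation is the square root of the variance. The raw deviations cannot simply be summed, because they always add up to zero. For example, Figure 1 presents two histograms of the scores from two quizzes that have equal means, 7.0. The scores from Quiz 1 are more densely packed while the scores from Quiz 2 are more spread out.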
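
A minimal Python sketch of these measures, using the standard definition of the sample variance (sum of squared deviations from the mean, divided by n - 1). The two score lists are invented to mimic two quizzes with the same mean of 7.0; they are not the data behind Figure 1.

```python
import math

# Hypothetical quiz scores, both with a mean of 7.0.
quiz1 = [5, 6, 6, 7, 7, 7, 7, 8, 8, 9]
quiz2 = [3, 4, 5, 6, 7, 8, 8, 9, 10, 10]


def spread(scores):
    """Return the mean, range, sample variance, and standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]
    # Raw deviations always sum to zero, so they are squared before summing.
    sum_of_squares = sum(d * d for d in deviations)
    variance = sum_of_squares / (n - 1)
    return {
        "mean": mean,
        "range": max(scores) - min(scores),
        "variance": variance,
        "sd": math.sqrt(variance),
        "sum_of_deviations": sum(deviations),
    }


stats1 = spread(quiz1)
stats2 = spread(quiz2)
print(stats1)
print(stats2)
```

The means are identical, but every measure of variability is larger for the second list, which is the pattern the two histograms describe.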

Measures of Reliability

Reliability is the consistency with which a measurement method produces the same results. It is a necessary but not sufficient condition for validity, which describes whether a measure measures what it aims to measure in a truthful and systematic manner.

To determine reliability, you can use a number of methods. One is the test-retest method, in which you administer a measure to the same participants on two separate occasions, with some elapsed time between the assessments. If the scores on the two assessments are similar, for example if they correlate highly, then the measure has high test-retest reliability.
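
One common way to quantify how similar the two sets of scores are is a correlation coefficient. The sketch below computes Pearson's r by hand for invented scores from six participants; the choice of Pearson's r is an assumption made for illustration, and other statistics are also used for this purpose.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for the same six participants on two occasions.
time1 = [12, 15, 9, 20, 17, 11]
time2 = [13, 14, 10, 19, 18, 10]

r = pearson_r(time1, time2)
print(f"test-retest correlation: {r:.3f}")
```

Note that a correlation only captures whether participants keep their relative ordering; it does not detect a uniform shift in scores between the two occasions.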

Another way to determine reliability is to perform formal psychometric analysis on the measure. For test-retest data, this can involve calculating the typical error and the limits of agreement. For multi-item instruments, item analysis involves computing item difficulty and discrimination indices. Reliability coefficients typically range from 0 to 1, with 1 indicating perfect reliability. The higher the coefficient, the more reliable the instrument is.
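
As a rough illustration of item analysis, the Python sketch below computes a difficulty index (proportion of test-takers answering each item correctly) and a simple discrimination index (proportion correct in the higher-scoring half minus proportion correct in the lower-scoring half). The response matrix is invented, and splitting into halves is an assumption made to keep the example small; other group sizes are used in practice.

```python
# Hypothetical responses: one row per test-taker, one column per item
# (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]


def proportion_correct(group, item):
    """Share of test-takers in the group who answered the item correctly."""
    return sum(row[item] for row in group) / len(group)


def item_analysis(rows):
    """Return (difficulty, discrimination) lists, one value per item."""
    n_items = len(rows[0])
    # Rank test-takers by total score, highest first, and split into halves.
    ranked = sorted(rows, key=sum, reverse=True)
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    difficulty = [proportion_correct(rows, j) for j in range(n_items)]
    discrimination = [
        proportion_correct(upper, j) - proportion_correct(lower, j)
        for j in range(n_items)
    ]
    return difficulty, discrimination


difficulty, discrimination = item_analysis(responses)
print(difficulty)
print(discrimination)
```

An item that nearly everyone answers correctly has a difficulty index close to 1, and an item that high scorers get right far more often than low scorers has a discrimination index close to 1.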