## Use Case: Working with Mixed Frequency Data

Mixed frequency data, meaning a panel that combines time series observed at daily, weekly, monthly, or other frequencies, is unfamiliar to most data scientists. The reason is simple: prior to OttoQuant, tools for handling mixed frequency data were not widely available. Instead, practitioners tended to use data with a uniform frequency, aggregate high frequency data, or impute missing low frequency observations. However, none of these workarounds adequately captures the real time information in mixed frequency data. The table to the right provides an example of a mixed frequency data set typical of what OttoQuant uses as input for its modeling algorithms.

## Why use mixed frequency data?

In an ideal world, all our data would be available in real time. In practice, that is not the case. We could restrict ourselves to daily data, but lower frequency indexes often contain useful information. Purchasing managers' indexes (PMIs), for example, are a good metric of current economic conditions but are published only monthly. At a more micro level, a resale outlet may receive inventory only at fixed intervals yet need to balance stocks with demand in real time. For these reasons, we will often want to work with mixed frequency data.

## What are the challenges of using mixed frequency data?

Mixed frequency data typically involves many missing observations. Missing observations may be due directly to frequencies (monthly data is observed once for every 30 or so daily observations), or due to a “ragged edge” at the beginning or end of the sample. At the end of the sample, a ragged edge (a higher incidence of missing observations) exists because different series have different publication lags. PMIs, for example, are published before the end of the month, while industrial production in the US is published with a lag of around 15 days after the end of the reference month. At the beginning of the sample, the edge is ragged because series start at different dates. For the US, consumer confidence data goes back to 1970 while the Redbook Index goes back to 2005.
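To see how quickly missing observations accumulate, here is a minimal sketch that lays out a toy three series panel on a daily calendar. Everything in it is an assumption made for illustration: the function names `build_panel` and `missing_share`, the placeholder value `1.0`, the late start date for the weekly series, and the 15 day publication lag for the monthly series. It is not how OttoQuant stores its data.

```python
from datetime import date, timedelta


def is_month_end(day):
    """True if `day` is the last calendar day of its month."""
    return (day + timedelta(days=1)).month != day.month


def build_panel(start, as_of, weekly_start, monthly_lag_days):
    """Lay out a toy mixed frequency panel on a daily calendar.

    Each row maps a series name to a value, or to None when nothing is
    observed on that day. Values are placeholders (1.0); only the pattern
    of missing observations matters here.
    """
    panel = []
    day = start
    while day <= as_of:
        # A monthly value is visible only once its publication lag has passed.
        released = day + timedelta(days=monthly_lag_days) <= as_of
        row = {
            "daily": 1.0,
            # Observed on Fridays, and only after the series begins.
            "weekly": 1.0 if day.weekday() == 4 and day >= weekly_start else None,
            # Dated at month end, missing if not yet published.
            "monthly": 1.0 if is_month_end(day) and released else None,
        }
        panel.append((day, row))
        day += timedelta(days=1)
    return panel


def missing_share(panel, name):
    """Fraction of days on which series `name` has no observation."""
    missing = sum(1 for _, row in panel if row[name] is None)
    return missing / len(panel)


panel = build_panel(
    start=date(2024, 1, 1),
    as_of=date(2024, 3, 31),
    weekly_start=date(2024, 2, 1),  # ragged edge at the beginning
    monthly_lag_days=15,            # ragged edge at the end
)
for name in ("daily", "weekly", "monthly"):
    print(name, round(missing_share(panel, name), 3))
```

In this toy panel the weekly and monthly columns are mostly empty even though no data has been lost: the gaps come from the observation frequencies, the late start of the weekly series, and the unpublished final month.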

## Handling mixed frequency data correctly

Aggregating high frequency observations or imputing low frequency observations will result in misleading analysis. Instead, OttoQuant uses signal extraction to synthesize mixed frequency data sets into standardized, square, uniform frequency, stationary data series that capture the real time information in the underlying mixed frequency panel. One can then use these extracted signals for any standard analysis including traditional econometrics and machine learning.
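To make the idea of signal extraction concrete, here is a minimal sketch of one standard technique: a Kalman filter for a one factor model that skips any observation that is missing. This is not OttoQuant's Bayesian dynamic factor model. The function name `filter_one_factor`, the random walk assumption for the factor, and the fixed loadings and variances are all illustrative assumptions; in a real application these parameters would be estimated rather than supplied by hand.

```python
def filter_one_factor(panel, loadings, obs_var, state_var, f0=0.0, p0=1e6):
    """Kalman filter for a scalar factor with missing observations.

    Assumed model (illustrative only):
        f[t]    = f[t-1] + w[t],            w ~ N(0, state_var)
        y[i, t] = loadings[i] * f[t] + e,   e ~ N(0, obs_var[i])

    `panel` is a list of rows; each row holds one value per series, with
    None marking a missing observation. Measurement errors are assumed to
    be uncorrelated across series, so each observed value can be processed
    one at a time.

    Returns a list of (factor estimate, variance) pairs, one per period.
    """
    f, p = f0, p0  # a large p0 acts as a diffuse prior
    filtered = []
    for row in panel:
        # Prediction step: the random walk leaves the estimate unchanged
        # and adds state noise to its variance.
        p = p + state_var
        # Update step: fold in whatever is observed this period.
        for y, lam, r in zip(row, loadings, obs_var):
            if y is None:
                continue  # nothing observed, so nothing to update
            innovation_var = lam * lam * p + r
            gain = p * lam / innovation_var
            f = f + gain * (y - lam * f)
            p = (1.0 - gain * lam) * p
        filtered.append((f, p))
    return filtered


# Two series: one observed in periods 1 and 3, the other only in period 3.
panel = [
    [1.0, None],
    [None, None],
    [1.2, 2.0],
]
filtered = filter_one_factor(
    panel, loadings=[1.0, 2.0], obs_var=[0.5, 0.5], state_var=0.1
)
for t, (f, p) in enumerate(filtered, start=1):
    print(f"t={t} factor={f:.3f} variance={p:.3f}")
```

The output has one estimate per period regardless of how many series were observed. In periods with no data the estimate carries forward and its variance grows; when new data arrives the variance shrinks again. That is the sense in which the extracted signal is square and of uniform frequency.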

## An Example

As an example, we’ll extract real time signals from a mixed frequency panel of daily, weekly, and monthly data for the US, part of which is depicted in the above table. The input data are:

#### Daily

AAA – T bill spread

AAA – BAA spread

Interbank Rate

DJIA

Lithium spot price

Forward Inflation Expectations 5y

Inflation Expectations 10y

TED Spread

USD exchange rate

10Y T bill

#### Weekly

Banks Balance Sheet

Redbook Index

Mortgage Applications

30y – 15y Mortgage Spread

Continuing Jobless Claims

Initial Jobless Claims

#### Monthly

Manufacturing PMI

Services PMI

Non-Manufacturing PMI

Consumer Confidence

Business Confidence

NY Manufacturing Index

Philadelphia Fed Manufacturing Index

Richmond Fed Manufacturing Index

Dallas Fed Manufacturing Index

In the OttoQuant interface, we can estimate this model by selecting New Project >> Signal Extraction >> Bayesian Dynamic Factor Model and then selecting the above variables. Users can also select the number of factors (signals) they would like to estimate, choose the desired frequency of these signals, and use our Bayesian methodology to smooth or filter the resulting estimates. A sample of results for a three factor model is illustrated to the right. In addition to the graphical output, users can download the output data and the input data in CSV format and, for technical users, the parameter estimates. We’ve published a one factor version here.