Create a trading volume profile curve with a time series model factory

Access this AI accelerator on GitHub

In securities trading, it’s often useful to have an idea of how trading volume for a particular instrument will be distributed over the market session. This is done by building a volume curve: essentially, a prediction of how much of the volume will fall within each of the time intervals (“time slices”) in a trading day. Volume curves help traders anticipate how to time and pace their orders, and they are used as inputs into algorithmic execution strategies such as VWAP (volume-weighted average price) and IS (implementation shortfall).

Historically, volume curves have been built by taking the average share of volume for a particular time slice over the last N trading days (for instance, the average share of AAPL's daily volume that traded between 10:35 and 10:40 a.m. over the last 20 trading days), with manual adjustments to account for scheduled events and anticipated differences. Machine learning allows you to do this in a structured, systematic way.
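The historical averaging approach described above can be sketched in a few lines of pandas. This is a minimal illustration rather than code from the accelerator: the function name `baseline_volume_curve` and the column names `date`, `time_slice`, and `volume` are assumptions made for the example.

```python
import pandas as pd


def baseline_volume_curve(bars: pd.DataFrame, n_days: int = 20) -> pd.Series:
    """Average share of daily volume per time slice over the last n_days sessions.

    Assumes `bars` has one row per (date, time_slice) with columns
    `date`, `time_slice`, and `volume`. These names are illustrative.
    """
    # Keep only the most recent n_days trading dates.
    recent_days = sorted(bars["date"].unique())[-n_days:]
    recent = bars[bars["date"].isin(recent_days)].copy()

    # Share of each day's total volume that traded in each slice.
    daily_total = recent.groupby("date")["volume"].transform("sum")
    recent["share"] = recent["volume"] / daily_total

    # Average the share across days: one value per time slice.
    return recent.groupby("time_slice")["share"].mean().sort_index()
```

When every slice is present on every day, the resulting shares sum to 1, so the output can be read directly as a curve over the session.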

The goal of this AI accelerator is to provide a framework for building models that predict how much of the next day's trading volume will happen in each time interval. The granularity can vary from minute by minute (or even finer) to hourly or daily. If you are working at a fine granularity, such as minute-by-minute intervals, using a single time series model to predict the next 1,440 minutes (or 480 for an eight-hour session, depending on how long the market is open) becomes problematic, because that one model has to forecast hundreds of steps ahead.

Instead, consider a time series model per interval (minute, half hour, hour, etc.) so that each model only forecasts one step ahead. You can then bring together the predictions of all the models to create the full curve for the next day. Although each model predicts a single time interval, it isn't restricted to data from that interval and can draw on a wider window.
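One possible reading of the "wider window" idea is that the training data for a given interval also includes a few neighboring intervals on either side. The sketch below illustrates that reading only; the helper `training_rows_for_slice`, the `widen` parameter, and the `time_slice` column name are hypothetical and not taken from the accelerator.

```python
import pandas as pd


def training_rows_for_slice(
    bars: pd.DataFrame, target_slice: str, slices: list, widen: int = 2
) -> pd.DataFrame:
    """Select rows for the model of `target_slice`, plus `widen` neighbors per side.

    `slices` is the ordered list of all time slices in the session.
    The window is clipped at the start and end of the session.
    """
    i = slices.index(target_slice)
    lo = max(0, i - widen)
    hi = min(len(slices), i + widen + 1)
    window = slices[lo:hi]
    return bars[bars["time_slice"].isin(window)]
```

For example, with five-minute slices and `widen=1`, the model for 09:40 would train on rows from 09:35, 09:40, and 09:45, while the model for the first slice of the day would see only itself and the slice that follows.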

While the motivation for this repository is a financial markets use case, it should be useful in other scenarios where predictions are required at a high resolution, such as predictive maintenance.

Challenges

  • The number of models or deployments can explode, and you need to keep track of all of them.
  • Each model needs slightly different data.
  • Even if you are creating a model per minute, you want to use data from earlier and later on in the day.
  • You want to see a unified result (a single curve for the whole trading day).

Approach

  • Train a model per interval, but leverage data outside of the interval by "widening" the time window on which it is trained.
  • Use a data frame to track the projects, models, and deployments corresponding to each interval. This makes it easy to stitch all the predictions together to build the curve for the next day or days.
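The tracking data frame in the second bullet might look like the sketch below. Everything here is illustrative: the ID values are placeholders, `predict_one_step` is a hypothetical stand-in for scoring a real deployment, and normalizing the predictions into shares assumes each model predicts a volume rather than a share.

```python
import pandas as pd

# One row per interval. The IDs are placeholders, not real DataRobot identifiers.
tracker = pd.DataFrame(
    {
        "time_slice": ["09:30", "09:35", "09:40"],
        "project_id": ["proj-a", "proj-b", "proj-c"],
        "model_id": ["model-a", "model-b", "model-c"],
        "deployment_id": ["dep-a", "dep-b", "dep-c"],
    }
)


def predict_one_step(deployment_id: str) -> float:
    """Hypothetical stand-in for requesting a one-step-ahead prediction."""
    fake_predictions = {"dep-a": 1200.0, "dep-b": 800.0, "dep-c": 500.0}
    return fake_predictions[deployment_id]


def stitch_curve(tracker: pd.DataFrame, predict) -> pd.DataFrame:
    """Score each interval's deployment once and normalize the results into shares."""
    out = tracker[["time_slice"]].copy()
    out["predicted_volume"] = tracker["deployment_id"].map(predict)
    out["share"] = out["predicted_volume"] / out["predicted_volume"].sum()
    return out.sort_values("time_slice").reset_index(drop=True)


curve = stitch_curve(tracker, predict_one_step)
```

Keeping the interval and its identifiers in the same row means the stitched curve is a single pass over the frame, and a failed or retrained model can be swapped by updating one row.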

Updated September 28, 2023