Skip to main content

Forecasting timeseries with PyTorch - dataloaders, normalizers, metrics and models

Project description

PyTorch Forecasting

PyPI Version Conda Version Docs Status Linter Status Build Status Code Coverage

Documentation | Tutorials | Release Notes

PyTorch Forecasting is a PyTorch-based package for forecasting time series with state-of-the-art network architectures. It provides a high-level API for training networks on pandas data frames and leverages PyTorch Lightning for scalable training on (multiple) GPUs, CPUs and for automatic logging.


Our article on Towards Data Science introduces the package and provides background information.

PyTorch Forecasting aims to ease state-of-the-art timeseries forecasting with neural networks for real-world cases and research alike. The goal is to provide a high-level API with maximum flexibility for professionals and reasonable defaults for beginners. Specifically, the package provides

  • A timeseries dataset class which abstracts handling variable transformations, missing values, randomized subsampling, multiple history lengths, etc.
  • A base model class which provides basic training of timeseries models along with logging in tensorboard and generic visualizations such actual vs predictions and dependency plots
  • Multiple neural network architectures for timeseries forecasting that have been enhanced for real-world deployment and come with in-built interpretation capabilities
  • Multi-horizon timeseries metrics
  • Ranger optimizer for faster model training
  • Hyperparameter tuning with optuna

The package is built on pytorch-lightning to allow training on CPUs, single and multiple GPUs out-of-the-box.

Installation

If you are working on windows, you need to first install PyTorch with

pip install torch -f https://download.pytorch.org/whl/torch_stable.html.

Otherwise, you can proceed with

pip install pytorch-forecasting

Alternatively, you can install the package via conda

conda install pytorch-forecasting pytorch -c pytorch>=1.7 -c conda-forge

PyTorch Forecasting is now installed from the conda-forge channel while PyTorch is install from the pytorch channel.

To use the MQF2 loss (multivariate quantile loss), also install pip install git+https://github.com/KelvinKan/CP-Flow.git@package-specific-version --no-deps

Documentation

Visit https://pytorch-forecasting.readthedocs.io to read the documentation with detailed tutorials.

Available models

The documentation provides a comparison of available models.

To implement new models or other custom components, see the How to implement new models tutorial. It covers basic as well as advanced architectures.

Usage example

Networks can be trained with the PyTorch Lighning Trainer on pandas Dataframes which are first converted to a TimeSeriesDataSet.

# imports for training
import pytorch_lightning as pl
from pytorch_lightning.loggers import TensorBoardLogger
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
# import dataset, network to train and metric to optimize
from pytorch_forecasting import TimeSeriesDataSet, TemporalFusionTransformer, QuantileLoss

# load data: this is pandas dataframe with at least a column for
# * the target (what you want to predict)
# * the timeseries ID (which should be a unique string to identify each timeseries)
# * the time of the observation (which should be a monotonically increasing integer)
data = ...

# define the dataset, i.e. add metadata to pandas dataframe for the model to understand it
max_encoder_length = 36
max_prediction_length = 6
training_cutoff = "YYYY-MM-DD"  # day for cutoff

training = TimeSeriesDataSet(
    data[lambda x: x.date <= training_cutoff],
    time_idx= ...,  # column name of time of observation
    target= ...,  # column name of target to predict
    group_ids=[ ... ],  # column name(s) for timeseries IDs
    max_encoder_length=max_encoder_length,  # how much history to use
    max_prediction_length=max_prediction_length,  # how far to predict into future
    # covariates static for a timeseries ID
    static_categoricals=[ ... ],
    static_reals=[ ... ],
    # covariates known and unknown in the future to inform prediction
    time_varying_known_categoricals=[ ... ],
    time_varying_known_reals=[ ... ],
    time_varying_unknown_categoricals=[ ... ],
    time_varying_unknown_reals=[ ... ],
)

# create validation dataset using the same normalization techniques as for the training dataset
validation = TimeSeriesDataSet.from_dataset(training, data, min_prediction_idx=training.index.time.max() + 1, stop_randomization=True)

# convert datasets to dataloaders for training
batch_size = 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=2)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size, num_workers=2)

# create PyTorch Lighning Trainer with early stopping
early_stop_callback = EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=1, verbose=False, mode="min")
lr_logger = LearningRateMonitor()
trainer = pl.Trainer(
    max_epochs=100,
    gpus=0,  # run on CPU, if on multiple GPUs, use accelerator="ddp"
    gradient_clip_val=0.1,
    limit_train_batches=30,  # 30 batches per epoch
    callbacks=[lr_logger, early_stop_callback],
    logger=TensorBoardLogger("lightning_logs")
)

# define network to train - the architecture is mostly inferred from the dataset, so that only a few hyperparameters have to be set by the user
tft = TemporalFusionTransformer.from_dataset(
    # dataset
    training,
    # architecture hyperparameters
    hidden_size=32,
    attention_head_size=1,
    dropout=0.1,
    hidden_continuous_size=16,
    # loss metric to optimize
    loss=QuantileLoss(),
    # logging frequency
    log_interval=2,
    # optimizer parameters
    learning_rate=0.03,
    reduce_on_plateau_patience=4
)
print(f"Number of parameters in network: {tft.size()/1e3:.1f}k")

# find the optimal learning rate
res = trainer.lr_find(
    tft, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader, early_stop_threshold=1000.0, max_lr=0.3,
)
# and plot the result - always visually confirm that the suggested learning rate makes sense
print(f"suggested learning rate: {res.suggestion()}")
fig = res.plot(show=True, suggest=True)
fig.show()

# fit the model on the data - redefine the model with the correct learning rate if necessary
trainer.fit(
    tft, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader,
)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pytorch_forecasting-0.10.2.tar.gz (125.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pytorch_forecasting-0.10.2-py3-none-any.whl (140.7 kB view details)

Uploaded Python 3

File details

Details for the file pytorch_forecasting-0.10.2.tar.gz.

File metadata

  • Download URL: pytorch_forecasting-0.10.2.tar.gz
  • Upload date:
  • Size: 125.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.8.12 Linux/5.13.0-1023-azure

File hashes

Hashes for pytorch_forecasting-0.10.2.tar.gz
Algorithm Hash digest
SHA256 069af34c17eea2edf5c11c79931097427b88cef8a862457818a03464260e5534
MD5 5eceda4a5372a7bd98db4d0247948064
BLAKE2b-256 5ae6a8c342e299abe24ff47df0ad4f809c5dd8057f4775691b6334e6972818c3

See more details on using hashes here.

File details

Details for the file pytorch_forecasting-0.10.2-py3-none-any.whl.

File metadata

  • Download URL: pytorch_forecasting-0.10.2-py3-none-any.whl
  • Upload date:
  • Size: 140.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.8.12 Linux/5.13.0-1023-azure

File hashes

Hashes for pytorch_forecasting-0.10.2-py3-none-any.whl
Algorithm Hash digest
SHA256 fb5dfb3e4c8cee7e706de7932536145985a93f965ae5256607c3b87181df4a43
MD5 89cf8221e6c4543f787fd77feb1ca794
BLAKE2b-256 289f0ae16cc808f9b8bf6d4e6df3d32791ff37ef45e46fc72981e9ea5200b866

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page