Add model weights, config and README


Files changed (7)

.gitattributes +2 -0
README.md +111 -0
config.json +41 -0
images/mr_model_architecture.png +3 -0
images/multi_resolution_time_series_example.png +3 -0
model.safetensors +3 -0
torch_model.pt +3 -0

.gitattributes CHANGED

@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

+images/mr_model_architecture.png filter=lfs diff=lfs merge=lfs -text
+images/multi_resolution_time_series_example.png filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,3 +1,114 @@
 ---
 license: apache-2.0
 ---

+# Cisco Time Series Model
+The Cisco Time Series Model is a foundation model trained to perform univariate zero-shot forecasting. Its core is a sequence of decoder-only transformer layers. It is heavily based on the [TimesFM2.0 model](https://huggingface.co/google/timesfm-2.0-500m-pytorch), with multiresolution modifications aimed at efficient use of long context. It expects a multiresolution context (x<sub>c</sub>, x<sub>f</sub>), where the spacing between consecutive data points in the coarse context x<sub>c</sub> is 60 times the spacing in the fine context x<sub>f</sub>. Both x<sub>c</sub> and x<sub>f</sub> can have length up to 512. The input contexts should be aligned “on the right,” e.g., if x<sub>f</sub> consists of the 512 minutes terminating at 11:00 AM on November 11, then x<sub>c</sub> should consist of the 512 hours terminating at the same time. The output is a forecast of 128 points, interpreted at the finer resolution, together with the corresponding quantiles for those points.
+For convenience, we provide utilities for preparing a multiresolution context directly from a single-resolution context (with length up to 512 × 60 = 30,720).
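The right-aligned construction described above can be sketched in NumPy. This is a minimal illustration, not the repository's utility: the function name `build_multiresolution_context` is hypothetical, and aggregating each block of 60 fine points by its mean is an assumption (the actual utilities may aggregate differently or pad short contexts).

```python
import numpy as np

def build_multiresolution_context(series, agg_factor=60, max_len=512):
    """Return (x_c, x_f) from a single-resolution series, aligned on the right.

    Hypothetical helper: mean aggregation is an assumption, not the
    documented behavior of the repository's utilities.
    """
    series = np.asarray(series, dtype=np.float32)
    # Fine context: the most recent `max_len` points.
    x_f = series[-max_len:]
    # Coarse context: complete blocks of `agg_factor` points, counted
    # backwards from the final point so both contexts end at the same time.
    n_blocks = min(len(series) // agg_factor, max_len)
    if n_blocks == 0:
        return np.empty(0, dtype=np.float32), x_f
    tail = series[-n_blocks * agg_factor:]
    x_c = tail.reshape(n_blocks, agg_factor).mean(axis=1)
    return x_c, x_f

# Example: 512 x 60 = 30,720 one-minute points yield two contexts of length 512.
series = np.arange(512 * 60, dtype=np.float32)
x_c, x_f = build_multiresolution_context(series)
print(x_c.shape, x_f.shape)
```

Counting blocks from the end, rather than from the start, is what keeps the last coarse point and the last fine point terminating at the same timestamp.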
+## Model Architecture and Training Details
+<figure>
+  <img src="images/mr_model_architecture.png" alt="Multiresolution model architecture">
+  <figcaption><em>Architecture diagram illustrating our novel additions of Resolution Embeddings and Special Token.</em></figcaption>
+</figure>
+Despite not conforming to the TimesFM architecture, the pre-training of the Cisco Time Series Model began from the weights of TimesFM. The dataset used for the additional training contains over 300B unique datapoints. Slightly more than 50% of the data is derived from metric time series from internal deployments of the Splunk Observability Cloud: about 35% of the training set is at (1-hour, 1-minute) resolution, and the remaining 15% is at (5-hour, 5-minute) resolution. Additional multiresolution data, comprising about 30% of the training set, was derived from the [GIFT-Eval](https://huggingface.co/datasets/Salesforce/GiftEvalPretrain) pretraining corpus. Another 5% was derived from the [Chronos](https://huggingface.co/datasets/autogluon/chronos_datasets) dataset collection (excluding data that overlaps with the GIFT-Eval test set). The final 15% is synthetic multiresolution data.
+**Note:** A PyTorch implementation of the model architecture can be found in our [GitHub repository](https://github.com/splunk/cisco-time-series-model). A more detailed technical report will be released on arXiv soon; in the meantime, it is available [here](https://github.com/splunk/cisco-time-series-model/blob/main/1.0-preview/technical_report/Cisco-Time-Series-Model-Techincal-Report.pdf).
+### Example Visualization of Multiresolution Time Series Input to the Model
+<figure>
+  <img src="images/multi_resolution_time_series_example.png" alt="Multiresolution time series example with padded 1-hour context">
+  <figcaption><em>Multiresolution time series example with padded 1-hour context.</em></figcaption>
+</figure>
+## Usage notes
+- If the input time series is missing some values, imputation via last value is recommended; if the time series is naturally sparse and this leads to excessive imputation (e.g., more than 30% of values are imputed), the model forecasts will deteriorate.
+- The model generally works better when more coarse-resolution history is provided. Its performance may suffer on very short inputs.
+- The quantiles have not been calibrated or rigorously evaluated, e.g., we currently do not have evidence to support a claim along the lines of “the range from q=0.1 to q=0.9 contains the true value 80% of the time (under some mild conditions).”
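The last-value imputation recommended in the first note can be sketched as follows. The helper name `impute_last_value` is hypothetical, missing values are assumed to be encoded as NaN, and the 30% figure is only the example threshold mentioned in the note, not a documented hard limit.

```python
import numpy as np

def impute_last_value(series):
    """Forward-fill NaNs with the last observed value.

    Hypothetical helper. Returns the filled series and the fraction of
    points that were missing. Leading NaNs stay NaN because there is
    no earlier observation to copy.
    """
    values = np.asarray(series, dtype=np.float64)
    missing = np.isnan(values)
    # For each position, the index of the latest observed point so far.
    idx = np.where(~missing, np.arange(len(values)), 0)
    np.maximum.accumulate(idx, out=idx)
    filled = values[idx]
    imputed_fraction = float(missing.mean()) if len(values) else 0.0
    return filled, imputed_fraction

raw = np.array([1.0, np.nan, np.nan, 4.0, np.nan])
filled, imputed_fraction = impute_last_value(raw)
if imputed_fraction > 0.3:
    print(f"{imputed_fraction:.0%} of values imputed; forecasts may deteriorate")
```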
+## Checkpoint
+We currently provide one open checkpoint, [cisco-time-series-model-1.0-preview](https://huggingface.co/cisco-ai/cisco-time-series-model-1.0-preview).
+## Minimal Installation Instructions
+Clone the repository and install the dependencies:
+```shell
+git clone https://github.com/splunk/cisco-time-series-model.git
+cd cisco-time-series-model
+pip install -r requirements.txt
+```
+For more detailed instructions and virtual environment setup, please refer to the [GitHub repository](https://github.com/splunk/cisco-time-series-model).
+## Example Usage
+```python
+import torch
+import numpy as np
+from modeling import CiscoTsmMR, TimesFmHparams, TimesFmCheckpoint
+rng = np.random.default_rng(42)
+# Sample data
+T = 512 * 60
+hours = (T + 59) // 60
+k = np.arange(hours, dtype=np.float32)
+h = (80 + 0.1 * k) * (1 + 0.25 * np.sin(2 * np.pi * k / 24))
+t = np.arange(T, dtype=np.float32)
+input_series = h[(t // 60).astype(int)] * (1 + 0.05 * np.sin(2 * np.pi * t / 30)) + rng.normal(0, 0.4, size=T)
+# Hyperparameters
+hparams = TimesFmHparams(
+    num_layers=50,
+    use_positional_embedding=False,
+    backend="gpu" if torch.cuda.is_available() else "cpu",
+)
+ckpt = TimesFmCheckpoint(huggingface_repo_id="cisco-ai/cisco-time-series-model-1.0-preview")
+model = CiscoTsmMR(
+    hparams=hparams,
+    checkpoint=ckpt,
+    use_resolution_embeddings=True,
+    use_special_token=True,
+)
+# Model Inference
+forecast_preds = model.forecast(input_series, horizon_len=128)
+# Access forecast mean and quantiles of each series
+mean_forecast = forecast_preds[0]['mean'] # (128,)
+quantiles = forecast_preds[0]['quantiles'] # dict with keys as quantile levels (0.1, 0.2, ...., 0.9) and values as (128,) numpy arrays
+# You can also forecast multiple series at once
+T = 25_000
+hours = (T + 59) // 60
+k = np.arange(hours, dtype=np.float32)
+h = 120 / (1 + np.exp(-0.01 * (k - 300))) + 10 * np.cos(2 * np.pi * k / (24*7))
+t = np.arange(T, dtype=np.float32)
+input_series_2 = h[(t // 60).astype(int)] + 2 * np.sin(2 * np.pi * t / 60) + rng.normal(0, 0.5, size=T)
+multi_series_forecasts = model.forecast([input_series, input_series_2], horizon_len=128)
+# Long horizon forecasting is also supported and can be invoked as follows
+long_horizon_forecasts = model.forecast(input_series, horizon_len=240)
+```
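Because the quantiles are not calibrated, it can be worth measuring their empirical coverage on held-out data before relying on them. The sketch below assumes the quantile dictionary is keyed by float levels such as `0.1` and `0.9`, as the comment in the example suggests; the function name `empirical_coverage` and the toy numbers are illustrative only.

```python
import numpy as np

def empirical_coverage(actuals, quantiles, lower=0.1, upper=0.9):
    """Fraction of actual values inside the [lower, upper] quantile band.

    Hypothetical helper: `quantiles` is assumed to map quantile levels to
    arrays with the same length as `actuals`.
    """
    actuals = np.asarray(actuals, dtype=np.float64)
    lo = np.asarray(quantiles[lower], dtype=np.float64)
    hi = np.asarray(quantiles[upper], dtype=np.float64)
    inside = (actuals >= lo) & (actuals <= hi)
    return float(inside.mean())

# Toy data standing in for a forecast and the values observed afterwards.
toy_quantiles = {0.1: np.zeros(4), 0.9: np.full(4, 2.0)}
toy_actuals = np.array([0.0, 1.0, 2.0, 3.0])
coverage = empirical_coverage(toy_actuals, toy_quantiles)
print(f"coverage of the 0.1-0.9 band: {coverage:.2f}")
```

A well-calibrated 0.1 to 0.9 band would cover roughly 80% of held-out values when averaged over many series and forecast origins; a single short window is not enough to judge.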
+<b>Authored by:</b>
+- Liang Gou \*
+- Archit Khare \*
+- Praneet Pabolu \*
+- Prachi Patel \*
+- Joseph Ross \*
+- Hercy Shen \*‡
+- Yuhan (Ellen) Song \*
+- Jingze Sun \*
+- Kristal Curtis †
+- Vedant Dharnidharka †
+- Abhinav Mathur †
+- Hao Yang †
+\* These authors contributed equally to the core development of this work, listed alphabetically by last name. <br>
+† These authors contributed equally to supporting and extending this work, listed alphabetically by last name. <br>
+‡ Hercy Shen contributed to this work while an intern at Splunk.<br>

config.json ADDED

	@@ -0,0 +1,41 @@

+{
+  "_name_or_path": "cisco-time-series-model",
+  "architectures": [
+    "PatchedTSMultiResolutionDecoder"
+  ],
+  "model_type": "cisco-time-series-model",
+  "context_length_fine": 512,
+  "context_length_coarse": 512,
+  "horizon_length": 128,
+  "patch_length": 32,
+  "freq_size": 3,
+  "num_hidden_layers": 50,
+  "num_attention_heads": 16,
+  "num_kv_heads": 16,
+  "hidden_size": 1280,
+  "intermediate_size": 1280,
+  "head_dim": 80,
+  "rms_norm_eps": 1e-6,
+  "pad_val": 1123581321.0,
+  "tolerance": 1e-6,
+  "quantiles": [
+    0.1,
+    0.2,
+    0.3,
+    0.4,
+    0.5,
+    0.6,
+    0.7,
+    0.8,
+    0.9
+  ],
+  "use_positional_embedding": false,
+  "use_resolution_embeddings": true,
+  "use_special_token": true,
+  "min_timescale": 1,
+  "max_timescale": 10000,
+  "agg_factor_default": 60,
+  "torch_dtype": "float32",
+  "transformers_version": "4.52.0"
+}
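A few of these values can be cross-checked against each other and against the README. The sketch below copies a subset of the fields from the config above; reading `patch_length` as the input patch size is an assumption based on the `PatchedTSMultiResolutionDecoder` architecture name.

```python
import json

# Subset of the fields from config.json above.
config = json.loads("""
{
  "context_length_fine": 512,
  "context_length_coarse": 512,
  "horizon_length": 128,
  "patch_length": 32,
  "num_attention_heads": 16,
  "head_dim": 80,
  "hidden_size": 1280,
  "agg_factor_default": 60
}
""")

# The attention heads together span the full hidden width.
attention_width = config["num_attention_heads"] * config["head_dim"]
assert attention_width == config["hidden_size"]

# Longest single-resolution input that fills the coarse context,
# matching the 512 x 60 = 30,720 figure in the README.
max_single_resolution_length = (
    config["context_length_coarse"] * config["agg_factor_default"]
)

# Assumption: each context is split into non-overlapping input patches.
patches_per_context = config["context_length_fine"] // config["patch_length"]

print(attention_width, max_single_resolution_length, patches_per_context)
```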

images/mr_model_architecture.png ADDED

Git LFS Details

SHA256: 7d763bb9ef8b6291aeba53471f5402745a9dc08ccb8c38051a21936d850a8e8b
Pointer size: 132 Bytes
Size of remote file: 1.03 MB

images/multi_resolution_time_series_example.png ADDED

Git LFS Details

SHA256: d0481daa2423e390b4ae699a81b0753c04fa7db6a910ad518f07970e3622de06
Pointer size: 131 Bytes
Size of remote file: 541 kB

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f29111cf1f0e94660f0b6b1edfb0778d6df36406e536aeff9ec9b01d5679fd31
+size 1995407184
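The pointer above records the SHA-256 digest and byte size of the real weights file, so a downloaded copy can be checked against it. The following is a minimal sketch using only the standard library; the helper names `parse_lfs_pointer` and `verify_download` are hypothetical, and the parameter estimate assumes all weights are stored as 4-byte float32 values (per `torch_dtype` in config.json) and ignores the safetensors header.

```python
import hashlib
import os

POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f29111cf1f0e94660f0b6b1edfb0778d6df36406e536aeff9ec9b01d5679fd31
size 1995407184
"""

def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }

def verify_download(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and hash."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algorithm"])
    with open(path, "rb") as handle:
        # Hash in chunks so the ~2 GB file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]

pointer = parse_lfs_pointer(POINTER_TEXT)
# Rough upper bound on the parameter count, assuming 4 bytes per weight.
approx_params = pointer["size"] // 4
print(f"{pointer['size'] / 2**30:.2f} GiB, at most ~{approx_params / 1e6:.0f}M float32 values")
```

Once the file has been fetched, `verify_download("model.safetensors", pointer)` would confirm that the local copy matches this commit.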

torch_model.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4a7c2c52fb13038573a0407e784f074760aef84e0d5af5cd6f77a21a0ff176d8
+size 1995580564