DeathReaper0965 committed · verified
Commit 301162b · 1 Parent(s): 4e021d4

Add model weights, config and README

.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ images/mr_model_architecture.png filter=lfs diff=lfs merge=lfs -text
+ images/multi_resolution_time_series_example.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,114 @@
---
license: apache-2.0
---
# Cisco Time Series Model
The Cisco Time Series Model is a foundation model trained to perform univariate zero-shot forecasting. Its core is a sequence of decoder-only transformer layers. It is heavily based on the [TimesFM 2.0 model](https://huggingface.co/google/timesfm-2.0-500m-pytorch), with multiresolution modifications aimed at efficient use of long context. It expects a multiresolution context (x<sub>c</sub>, x<sub>f</sub>), where the resolution (i.e., the spacing between data points) of x<sub>c</sub> is 60 times that of x<sub>f</sub>. Both x<sub>c</sub> and x<sub>f</sub> can have length up to 512. The input contexts should be aligned on the right: for example, if x<sub>f</sub> consists of the 512 minutes ending at 11:00 AM on November 11, then x<sub>c</sub> should consist of the 512 hours ending at the same time. The output is a forecast of 128 points, to be interpreted at the finer resolution, together with corresponding quantiles for those points.

For convenience, we provide utilities for preparing a multiresolution context directly from a single-resolution context (with length up to 512 × 60 = 30,720).
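
The prose above fully specifies this preparation; as an illustration only, here is a minimal NumPy sketch of it. The helper name `make_multiresolution_context` is hypothetical (not the repository's API), and it assumes the coarse series is a mean-aggregation of the fine series by the default factor of 60, with both series right-aligned and truncated to their last 512 points.
```python
import numpy as np

def make_multiresolution_context(series, agg_factor: int = 60, max_len: int = 512):
    """Illustrative sketch: build a right-aligned (coarse, fine) pair from one fine-resolution series."""
    series = np.asarray(series, dtype=np.float32)

    # Fine context: the most recent `max_len` fine-resolution points.
    x_f = series[-max_len:]

    # Coarse context: mean over full blocks of `agg_factor` points, dropping any
    # leftover points at the start so the last block ends where the series ends.
    n_blocks = len(series) // agg_factor
    trimmed = series[len(series) - n_blocks * agg_factor:]
    x_c = trimmed.reshape(n_blocks, agg_factor).mean(axis=1)[-max_len:]

    return x_c, x_f

# 30,720 minutes of history -> 512 hourly points and the last 512 minutes, aligned on the right.
minutes = np.random.default_rng(0).normal(100.0, 5.0, size=512 * 60)
x_c, x_f = make_multiresolution_context(minutes)
print(x_c.shape, x_f.shape)  # (512,) (512,)
```
In practice, prefer the utilities shipped in the repository; the Example Usage below simply passes a single long series to `model.forecast`.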

## Model Architecture and Training Details
<figure>
<img src="images/mr_model_architecture.png" alt="Multiresolution model architecture">
<figcaption><em>Architecture diagram illustrating our novel additions of Resolution Embeddings and Special Token.</em></figcaption>
</figure>

Although the Cisco Time Series Model does not conform to the TimesFM architecture, its pre-training began from the TimesFM weights. The dataset used for the additional training contains over 300B unique data points. Slightly more than 50% of the data is derived from metric time series data from internal deployments of the Splunk Observability Cloud, with about 35% at (1-hour, 1-minute) resolution and the remaining 15% at (5-hour, 5-minute) resolution. Additional multiresolution data, comprising about 30% of the training set, was derived from the [GIFT-Eval](https://huggingface.co/datasets/Salesforce/GiftEvalPretrain) pretraining corpus. Another 5% was derived from the [Chronos](https://huggingface.co/datasets/autogluon/chronos_datasets) dataset collection (excluding overlap with the GIFT-Eval test set). The final 15% is synthetic multiresolution data.

**Note:** A PyTorch implementation of the model architecture can be found in our [GitHub repository](https://github.com/splunk/cisco-time-series-model). A more detailed technical report will be released on arXiv soon; in the meantime, it is available [here](https://github.com/splunk/cisco-time-series-model/blob/main/1.0-preview/technical_report/Cisco-Time-Series-Model-Techincal-Report.pdf).

### Example Visualization of Multiresolution Time Series Input to the Model
<figure>
<img src="images/multi_resolution_time_series_example.png" alt="Multiresolution time series example with padded 1-hour context">
<figcaption><em>Multiresolution time series example with padded 1-hour context.</em></figcaption>
</figure>

## Usage Notes
- If the input time series is missing some values, imputation via the last observed value is recommended (a minimal sketch follows this list); if the time series is naturally sparse and this leads to excessive imputation (e.g., more than 30% of values are imputed), the model's forecasts will deteriorate.
- The model generally works better when more coarse-resolution history is provided. Its performance may suffer on very short inputs.
- The quantiles have not been calibrated or rigorously evaluated; e.g., we currently do not have evidence to support a claim along the lines of “the range from q=0.1 to q=0.9 contains the true value 80% of the time (under some mild conditions).”
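
To make the first note concrete, here is a minimal sketch of last-value imputation. It assumes missing values are encoded as NaN and uses pandas for the forward fill; the 30% figure is the rough threshold mentioned above, not a hard limit enforced by the model.
```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical minute-level series with ~10% of values missing, encoded as NaN.
raw = rng.normal(100.0, 5.0, size=512 * 60)
raw[rng.random(raw.size) < 0.10] = np.nan

imputed_fraction = float(np.isnan(raw).mean())

# Last-value (forward-fill) imputation; the backfill only covers a potential leading gap.
filled = pd.Series(raw).ffill().bfill().to_numpy()

# Per the note above, treat forecasts with caution once the imputed fraction grows large (e.g., > 30%).
print(f"imputed fraction: {imputed_fraction:.1%}")
```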

## Checkpoint
We currently provide one open checkpoint, [cisco-time-series-model-1.0-preview](https://huggingface.co/cisco-ai/cisco-time-series-model-1.0-preview).
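
The Example Usage below references this checkpoint by its Hugging Face repo id. If you prefer to pre-download the files (for example for offline use), a minimal sketch with the standard `huggingface_hub` client is shown here; this is a convenience, not a required step.
```python
from huggingface_hub import snapshot_download

# Download the full repository (config, README, images, and model weights) to the local HF cache.
local_dir = snapshot_download(repo_id="cisco-ai/cisco-time-series-model-1.0-preview")
print(local_dir)
```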

## Minimal Installation Instructions
Clone the repository and install the requirements:
```shell
git clone https://github.com/splunk/cisco-time-series-model.git
cd cisco-time-series-model
pip install -r requirements.txt
```

For more detailed instructions and virtual environment setup, please refer to the [GitHub repository](https://github.com/splunk/cisco-time-series-model).

## Example Usage
```python
import torch
import numpy as np
from modeling import CiscoTsmMR, TimesFmHparams, TimesFmCheckpoint

rng = np.random.default_rng(42)

# Sample data: a trending, daily-seasonal signal at 1-minute resolution
T = 512 * 60
hours = (T + 59) // 60
k = np.arange(hours, dtype=np.float32)
h = (80 + 0.1 * k) * (1 + 0.25 * np.sin(2 * np.pi * k / 24))
t = np.arange(T, dtype=np.float32)

input_series_1 = h[(t // 60).astype(int)] * (1 + 0.05 * np.sin(2 * np.pi * t / 30)) + rng.normal(0, 0.4, size=T)

# Hyperparameters
hparams = TimesFmHparams(
    num_layers=50,
    use_positional_embedding=False,
    backend="gpu" if torch.cuda.is_available() else "cpu",
)

ckpt = TimesFmCheckpoint(huggingface_repo_id="cisco-ai/cisco-time-series-model-1.0-preview")

model = CiscoTsmMR(
    hparams=hparams,
    checkpoint=ckpt,
    use_resolution_embeddings=True,
    use_special_token=True,
)

# Model inference
forecast_preds = model.forecast(input_series_1, horizon_len=128)

# Access the forecast mean and quantiles of each series
mean_forecast = forecast_preds[0]['mean']   # (128,)
quantiles = forecast_preds[0]['quantiles']  # dict with quantile levels (0.1, 0.2, ..., 0.9) as keys and (128,) numpy arrays as values

# You can also forecast multiple series at once
T = 25_000
hours = (T + 59) // 60
k = np.arange(hours, dtype=np.float32)
h = 120 / (1 + np.exp(-0.01 * (k - 300))) + 10 * np.cos(2 * np.pi * k / (24 * 7))
t = np.arange(T, dtype=np.float32)
input_series_2 = h[(t // 60).astype(int)] + 2 * np.sin(2 * np.pi * t / 60) + rng.normal(0, 0.5, size=T)

multi_series_forecasts = model.forecast([input_series_1, input_series_2], horizon_len=128)

# Long-horizon forecasting is also supported and can be invoked as follows
long_horizon_forecasts = model.forecast(input_series_1, horizon_len=240)
```
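
As a quick way to inspect the outputs above, the following sketch (continuing from the Example Usage) plots the last day of fine-resolution context alongside the mean forecast and the q=0.1 to q=0.9 band. It assumes matplotlib is installed and that the quantile dict is keyed by float levels as described in the comment above; if the keys are strings, index with `quantiles["0.1"]` instead.
```python
import matplotlib.pyplot as plt

# Plot the final 24 hours of the fine-resolution context next to the 128-point forecast.
history = input_series_1[-24 * 60:]
future_x = np.arange(len(history), len(history) + len(mean_forecast))

plt.figure(figsize=(10, 4))
plt.plot(np.arange(len(history)), history, label="context (last 24h)")
plt.plot(future_x, mean_forecast, label="mean forecast")
plt.fill_between(future_x, quantiles[0.1], quantiles[0.9], alpha=0.3, label="q0.1 to q0.9 band")
plt.legend()
plt.tight_layout()
plt.show()
```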

<b>Authored by:</b>
- Liang Gou \*
- Archit Khare \*
- Praneet Pabolu \*
- Prachi Patel \*
- Joseph Ross \*
- Hercy Shen \*‡
- Yuhan (Ellen) Song \*
- Jingze Sun \*
- Kristal Curtis †
- Vedant Dharnidharka †
- Abhinav Mathur †
- Hao Yang †

\* These authors contributed equally to the core development of this work, listed alphabetically by last name. <br>
† These authors contributed equally to supporting and extending this work, listed alphabetically by last name. <br>
‡ Hercy Shen contributed to this work while an intern at Splunk.<br>

config.json ADDED
@@ -0,0 +1,41 @@
{
  "_name_or_path": "cisco-time-series-model",
  "architectures": [
    "PatchedTSMultiResolutionDecoder"
  ],
  "model_type": "cisco-time-series-model",
  "context_length_fine": 512,
  "context_length_coarse": 512,
  "horizon_length": 128,
  "patch_length": 32,
  "freq_size": 3,
  "num_hidden_layers": 50,
  "num_attention_heads": 16,
  "num_kv_heads": 16,
  "hidden_size": 1280,
  "intermediate_size": 1280,
  "head_dim": 80,
  "rms_norm_eps": 1e-6,
  "pad_val": 1123581321.0,
  "tolerance": 1e-6,
  "quantiles": [
    0.1,
    0.2,
    0.3,
    0.4,
    0.5,
    0.6,
    0.7,
    0.8,
    0.9
  ],
  "use_positional_embedding": false,
  "use_resolution_embeddings": true,
  "use_special_token": true,
  "min_timescale": 1,
  "max_timescale": 10000,
  "agg_factor_default": 60,
  "torch_dtype": "float32",
  "transformers_version": "4.52.0"
}
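
For reference, the sketch below pulls this config from the Hub with the standard `huggingface_hub` client and cross-checks the fields the README relies on (512-point fine and coarse contexts, a native 128-point horizon, and a default aggregation factor of 60). It is a convenience for inspection only, not part of the model's API.
```python
import json
from huggingface_hub import hf_hub_download

# Fetch config.json from the checkpoint repository and sanity-check a few fields.
cfg_path = hf_hub_download(
    repo_id="cisco-ai/cisco-time-series-model-1.0-preview",
    filename="config.json",
)
with open(cfg_path) as f:
    cfg = json.load(f)

assert cfg["context_length_fine"] == 512 and cfg["context_length_coarse"] == 512
assert cfg["horizon_length"] == 128
assert cfg["agg_factor_default"] == 60
print(cfg["quantiles"])  # [0.1, 0.2, ..., 0.9]
```
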
images/mr_model_architecture.png ADDED

Git LFS Details

  • SHA256: 7d763bb9ef8b6291aeba53471f5402745a9dc08ccb8c38051a21936d850a8e8b
  • Pointer size: 132 Bytes
  • Size of remote file: 1.03 MB
images/multi_resolution_time_series_example.png ADDED

Git LFS Details

  • SHA256: d0481daa2423e390b4ae699a81b0753c04fa7db6a910ad518f07970e3622de06
  • Pointer size: 131 Bytes
  • Size of remote file: 541 kB
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f29111cf1f0e94660f0b6b1edfb0778d6df36406e536aeff9ec9b01d5679fd31
size 1995407184
torch_model.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4a7c2c52fb13038573a0407e784f074760aef84e0d5af5cd6f77a21a0ff176d8
size 1995580564