Shreya Goyal committed on
Commit 8a73979 · 1 Parent(s): 842c1e0

update readme for NJTs

Files changed (1)
  1. README.md +17 -0
README.md CHANGED
@@ -166,3 +166,20 @@ If you find our paper or models helpful, please consider cite as follows:
  year={2025}
  }
  ```
+
+ ## Efficient DRAMA
+ ### Nested Tensors
+ [Nested Tensors](https://docs.pytorch.org/docs/stable/nested.html) provide a way to handle ragged-shaped data within a single tensor, allowing for efficient operations on such data.
+ They store data in a compact packed representation while offering a standard PyTorch tensor interface, making it easy to apply various
+ operations.
+ Nested Tensors are particularly advantageous for model deployments that perform inference on large batches of sequences with varying
+ lengths. Traditional tensors require padding all sequences in a batch to the same length, which can be inefficient, especially when
+ the batch includes many short sequences and a single long sequence. Nested Tensors eliminate the need for padding, thus avoiding
+ unnecessary computation on extra pad tokens. This results in more efficient processing of batches with varying sequence lengths.
+
+ ### Performance
+ Experiments have demonstrated a 1.7x to 2.3x improvement in queries per second (QPS) for batch inference with sequences of varied lengths across the base, large, and 1B models.
+
+ ### Usage
+ To enable Nested Tensors, set the `use_nested` variable to true. This activates nested jagged tensors (NJTs) and lets you
+ take advantage of efficient inference.
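
The padding overhead described in the Nested Tensors section can be estimated with simple arithmetic. The sketch below is illustrative only and is not part of the DRAMA code: the helper names (`padded_tokens`, `packed_tokens`, `jagged_offsets`) and the example lengths are hypothetical, and it uses plain Python rather than PyTorch. It compares the number of token slots in a padded batch with a packed layout, and builds the kind of cumulative offsets a jagged layout could use to locate each sequence in the packed buffer.

```python
from itertools import accumulate


def padded_tokens(lengths):
    """Token slots when every sequence is padded to the longest one."""
    return len(lengths) * max(lengths)


def packed_tokens(lengths):
    """Token slots when sequences are stored back to back, without padding."""
    return sum(lengths)


def jagged_offsets(lengths):
    """Start index of each sequence in the packed buffer, plus the end index."""
    return [0, *accumulate(lengths)]


# Hypothetical batch: three short queries and one long one.
lengths = [3, 5, 4, 128]

padded = padded_tokens(lengths)   # 4 * 128 = 512
packed = packed_tokens(lengths)   # 3 + 5 + 4 + 128 = 140
wasted = 1 - packed / padded      # fraction of slots that hold only padding

print(f"padded={padded} packed={packed} wasted={wasted:.1%}")
print("offsets:", jagged_offsets(lengths))
```

In this hypothetical batch most of the padded slots hold pad tokens, which is the situation the README describes as a good fit for Nested Tensors. The sketch only counts token slots; actual throughput gains depend on the model and hardware.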