Shreya Goyal committed · Commit 8a73979 · Parent(s): 842c1e0

update readme for NJTs
README.md
CHANGED
@@ -166,3 +166,20 @@ If you find our paper or models helpful, please consider cite as follows:
  year={2025}
 }
 ```
+
+## Efficient DRAMA
+### Nested Tensors
+[Nested Tensors](https://docs.pytorch.org/docs/stable/nested.html) provide a way to handle ragged-shaped data within a single tensor, allowing for efficient operations on such data.
+They store data in a compact packed representation while offering a standard PyTorch tensor interface, making it easy to apply various
+operations.
+Nested Tensors are particularly advantageous for model deployments that perform inference on large batches of sequences with varying
+lengths. Traditional tensors require padding all sequences in a batch to the same length, which can be inefficient, especially when
+the batch includes many short sequences and a single long sequence. Nested Tensors eliminate the need for padding, thus avoiding
+unnecessary computation on extra pad tokens. This results in more efficient processing of batches with varying sequence lengths.
+
+### Performance
+Experiments have demonstrated a 1.7x to 2.3x improvement in queries per second (QPS) for the base, large, and 1B models on batch inference with sequences of varied lengths.
+
+### Usage
+To enable Nested Tensors, simply set the `use_nested` variable to true. This will activate the nested jagged tensors and allow you to
+take advantage of efficient inference.