Input text size

by mrsjane - opened Jan 9, 2023

Jan 9, 2023

I noticed that example 4 has more than 2000 words, but when I tried the summarization pipeline myself, I got the error below:

"Token indices sequence length is longer than the specified maximum sequence length for this model (2114 > 1024). Running this sequence through the model will result in indexing errors."

Is there a parameter I need to pass in specifically?

knkarthick

Owner Jan 9, 2023

Try giving the argument truncation=True while calling the model.
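For anyone landing here, a minimal sketch of what that call might look like with the transformers summarization pipeline. The helper names `build_summarizer` and `summarize_truncated` are made up for illustration, and the model id is whatever checkpoint you are using (it is not stated in this thread).

```python
def build_summarizer(model_id):
    """Create a summarization pipeline for the given model id."""
    # Imported lazily so the helper below can be used without transformers.
    from transformers import pipeline
    return pipeline("summarization", model=model_id)


def summarize_truncated(summarizer, text):
    """Summarize text, letting the tokenizer cut the input at the model limit."""
    # truncation=True is forwarded to the tokenizer; tokens past the
    # model's maximum input length are dropped before generation.
    result = summarizer(text, truncation=True)
    return result[0]["summary_text"]
```

Usage would be along the lines of `summarize_truncated(build_summarizer(model_id), long_text)`. Note that this only avoids the indexing error; it does not make the model read the part of the text that was cut off.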

FatiAI

Feb 23, 2023

When we enable truncation, does it summarize the whole text, or does it stop at the maximum length?

tensiondriven

Jun 24, 2023

From what I can understand, truncation=True will truncate the input to the model's context length. Based on the error above, I suspect this is 1024 tokens.
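To see how much of an input would be cut, one option is to count tokens before calling the model. This is a rough sketch: `tokens_dropped` is a hypothetical helper, and the default of 1024 is taken from the error message quoted above rather than from the model's documentation. `tokenizer` is assumed to be the tokenizer that belongs to the model (for example `summarizer.tokenizer` on a pipeline).

```python
def tokens_dropped(tokenizer, text, max_length=1024):
    """Return how many tokens truncation would discard for this text."""
    # encode() includes the special tokens the model adds, so the count
    # is comparable to the number shown in the warning message.
    n_tokens = len(tokenizer.encode(text))
    return max(0, n_tokens - max_length)
```

With the numbers from the error message, 2114 - 1024 = 1090 tokens would be discarded, so roughly the second half of the text never reaches the model.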

knkarthick

Owner Sep 13, 2023

You need to split the text into chunks of at most 1024 tokens (this model's limit), summarize each chunk, and then append the results to get the overall summary. If you want, you can pass the final concatenated result back to the model to get a summary of summaries.
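A rough sketch of the chunking approach described above, not an official recipe. `chunk_by_tokens` and `summarize_long` are hypothetical names; `summarizer` is assumed to be a transformers summarization pipeline and `tokenizer` its tokenizer. The default chunk size of 1000 is an assumption that leaves some headroom under 1024 for special tokens, and `truncation=True` is kept as a safety net in case a decoded chunk re-tokenizes slightly longer.

```python
def chunk_by_tokens(tokenizer, text, max_tokens=1000):
    """Split text into pieces of at most max_tokens tokens each."""
    # Tokenize once without special tokens, then slice the id list.
    ids = tokenizer.encode(text, add_special_tokens=False)
    return [
        tokenizer.decode(ids[start:start + max_tokens], skip_special_tokens=True)
        for start in range(0, len(ids), max_tokens)
    ]


def summarize_long(summarizer, tokenizer, text, max_tokens=1000, second_pass=False):
    """Summarize each chunk, join the results, optionally summarize again."""
    chunks = chunk_by_tokens(tokenizer, text, max_tokens)
    partial = [summarizer(c, truncation=True)[0]["summary_text"] for c in chunks]
    combined = " ".join(partial)
    # Optional "summary of summaries" over the concatenated chunk summaries.
    if second_pass and len(partial) > 1:
        combined = summarizer(combined, truncation=True)[0]["summary_text"]
    return combined
```

Slicing on raw token counts can cut a chunk in the middle of a sentence; splitting on sentence or speaker-turn boundaries first and then packing those into chunks would likely give cleaner summaries, at the cost of a bit more code.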

knkarthick changed discussion status to closed Sep 13, 2023
