Does Llama 4 use chunked attention in the generation phase?
#64, opened by vanshils
Same as title.
I know the chunked attention mask is applied in the context phase. But does Llama 4 also apply a chunked attention mask in the generation phase?
yes
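To illustrate what that means for a single decode step, here is a minimal sketch in plain Python. It is not taken from the Llama 4 code: the function names and the tiny `chunk_size` are hypothetical, and it only assumes the usual definition of chunked causal attention, where a token attends to earlier tokens in its own chunk.

```python
def chunked_causal_mask(seq_len, chunk_size):
    """Full mask for the context phase (illustrative, not the model's code).

    mask[q][k] is True when query position q may attend to key position k:
    k must not be in the future and must fall in the same chunk as q.
    """
    return [
        [k <= q and k // chunk_size == q // chunk_size for k in range(seq_len)]
        for q in range(seq_len)
    ]


def decode_step_visible_keys(position, chunk_size):
    """Key positions visible to the single new token during generation.

    Under the same rule, the new token at `position` sees only the cached
    keys from the start of its own chunk up to and including itself.
    """
    chunk_start = (position // chunk_size) * chunk_size
    return list(range(chunk_start, position + 1))


# Hypothetical toy sizes, chosen only to keep the example readable.
mask = chunked_causal_mask(seq_len=6, chunk_size=4)
visible = decode_step_visible_keys(position=5, chunk_size=4)
```

Under these assumptions, the decode step for a position is just the corresponding row of the context-phase mask, so the two phases stay consistent.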