Thanks for your great work! I'd like to understand how the embeddings of the Special Learned Tokens (Section 3.1 of the paper) are trained or initialized. I ask because I found that you mask the question during both pretraining and SFT.
Thx