I can't execute this

#3
by SmallPS - opened

I installed transformers==4.49.0 and flash-attn==2.6.3 and launched the sample code.
Then this error occurs (a rough sketch of what I am running is included after the traceback):

Traceback (most recent call last):
query_embeddings = model.forward_queries(queries, batch_size=8)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users.cache\huggingface\modules\transformers_modules\nvidia\llama-nemoretriever-colembed-3b-v1\50c36f4d5271c6851aa08bd26d69f6e7ca8b870c\modeling_llama_nemoretrievercolembed.py", line 458, in forward_queries
return self._extract_embeddings(dataloader=dataloader, is_query=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users.cache\huggingface\modules\transformers_modules\nvidia\llama-nemoretriever-colembed-3b-v1\50c36f4d5271c6851aa08bd26d69f6e7ca8b870c\modeling_llama_nemoretrievercolembed.py", line 430, in _extract_embeddings
assert torch.sum(embeddings).float().item() not in [float(0.), float("inf")]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
[W CudaIPCTypes.cpp:16] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
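
For reference, here is roughly what I am running. This is only a sketch: the repo id and the `forward_queries(queries, batch_size=8)` call are taken from the traceback above, while the loading arguments (dtype, attention implementation) are my best recollection of the sample code and may differ from it.

```python
import torch
from transformers import AutoModel

# Rough sketch of my setup; torch_dtype / attn_implementation are assumptions
# based on the sample code and may not match it exactly.
model = AutoModel.from_pretrained(
    "nvidia/llama-nemoretriever-colembed-3b-v1",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
).eval().to("cuda")

queries = ["example query text"]  # placeholder; my real queries are plain strings

# This is the call that raises the AssertionError in _extract_embeddings.
query_embeddings = model.forward_queries(queries, batch_size=8)
```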

In modeling_llama_nemoretrievercolembed.py, at line 424 `batch` contains values,
but at line 425 the result of self(**batch, output_hidden_states=True).hidden_states[-1] is
tensor([[[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]]], device='cuda:0',
dtype=torch.bfloat16)

What did I do wrong?
My OS is Windows 11.
I'm sorry that I couldn't get it working even though you handed it to me ready to use.

Hello,

Unfortunately, I cannot test in a Windows environment.

The assertion `torch.sum(embeddings).float().item() not in [float(0.), float("inf")]` is failing, which means the embeddings are either all zeros or contain inf; either way, that is an error.

What is the input for queries? Is it valid text?
Does it run on the CPU for a simple query?
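
For example, something along these lines would help narrow it down (just a sketch; the loading arguments are assumptions and may need adjusting to match the model card):

```python
import torch
from transformers import AutoModel

# Load on CPU in float32 to take CUDA and flash-attn out of the picture
# (a sketch; adjust the from_pretrained arguments to match the model card).
model = AutoModel.from_pretrained(
    "nvidia/llama-nemoretriever-colembed-3b-v1",
    trust_remote_code=True,
    torch_dtype=torch.float32,
).eval()

queries = ["what is the revenue reported in 2023?"]  # any simple, valid text query
with torch.no_grad():
    query_embeddings = model.forward_queries(queries, batch_size=1)

# If this sum is non-zero and finite, the inputs are fine and the problem is
# specific to the GPU / flash-attn path on Windows.
print(torch.sum(query_embeddings[0]))
```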
