Thanks for the quant!
Thanks as always! <3
I've been toying with it for an hour or two now. It appears to be working rather flawlessly using your fork's branch and ikawrakow for the build. <answer> seems to randomly go unused, or to be closed but never opened. This could be entirely user error; I've only used the CLI a handful of times. It stopped thinking towards the end, but to be completely fair, it was probably busy thinking about seeing a saw saw salsa.
https://pub.microbin.eu/upload/snake-ape-cat (without formatting)
main: chat template example: <|startoftext|>You are a helpful assistant<|extra_4|><|startoftext|>Hello<|extra_0|><|startoftext|>Hi there<|eos|><|startoftext|>How are you?<|extra_0|><|startoftext|>
system_info: n_threads = 12 / 24 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
main: interactive mode on.
sampling:
repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
xtc_probability = 0.000, xtc_threshold = 1.000, top_n_sigma = 0.000
sampling order:
CFG -> Penalties -> dry -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> xtc -> top_n_sigma -> temperature
generate: n_ctx = 12544, n_batch = 2048, n_predict = -1, n_keep = 0
== Running in interactive mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to the AI.
- To return control without starting a new line, end your input with '/'.
- If you want to submit another line, end your input with '\'.
Okay, the user said "hi there, hello". I should keep it friendly and open. Let me mirror their greeting to be welcoming, then invite them to talk more. Keep it simple and positive.hi there, hello
So, something like: "Hi there!* How can I help you today? Feel free to share what's on your mind—I'm all ears!" That should work.
don't sweat it, saying hi there, hello is super easy, barely an inconvenience
think>
Okay, the user is making a light-hearted comment about saying "hi" being easy. I should keep the response warm and not take it too seriously. Maybe acknowledge their point, then gently encourage them to keep the conversation going. Let me try:
"Hi there!* You're totally right—saying 'hi' is super simple, way less of an inconvenience than figuring out how to start a conversation. But hey, if you're up for it, I'm here to chat about whatever you'd like. What's on your mind?"
That sounds friendly, acknowledges their comment, and invites them to share more. Yeah, that should work.
(1) Okay, so Allen wrench. Now, who had this thing? (2) That was me. (1) Did you come up with a name? (2) I did. I thought we could call that a saw. (1) Okay, well, see that might be confusing because that's a conjugation of a super common verb that people use every day. (2) I think it'll be fine. This will also be a verb too, by the way. It'll mean to use this tool. (1) Well, okay, but what if someone sees the tool and then later wants to tell somebody about having done that? (2) Well, then they'll have seen a saw. That's not a problem. (1) Yes, but also they saw a saw. You see what I'm saying? (3) And if they saw it in action, they saw a saw-saw. (2) What's the big deal? If someone saw a saw sawing? (4) You also have to consider if someone saws through a jar of salsa. (1) Why would someone saw salsa? (4) Well, I don't know, but that would mean that they saw a saw saw salsa. (2) Well, I think it's a great name even if someone saw saw saw salsa. (4) You know what, I'm going to buy some salsa and saw salsa just to say I saw saw saw salsa. (1) Okay, you know what, we really gotta get this over to corporate. So let's just, you know, that's a saw. That's fine. Now let's just wrap things up.
answer>
Hi there!* It sounds like you’re working on a fun, creative wordplay, maybe inventing a new word or playing with language—sounds like a lot of fun with "saw" and its conjugations (like "sawing") and potential phrases like "saw saw salsa" or "saw a saw-saw" (even if that might sound a bit quirky!). It’s totally okay to embrace the chaos—even if people might pause, the creativity is what matters most here. If you ever want to brainstorm more, or even just share the chaos, I’m here to keep the conversation going!* 😊
Listen, I'm gonna need you to get all the way off my back about this wordplay thing.
answer>
I get it—sometimes creative wordplay can spiral if you’re too deep in it!* But hey, if it’s fun for you and you’re enjoying it, that’s what matters most. If you ever want to step back, take a breather, or even just shift gears, I’m here to help with anything else, too. No pressure, just here to chat!* 😄
I can't run Hunyuan-A13B-Instruct-IQ3_KS.gguf properly after compiling with the following commands:
```
cd projects
git clone https://github.com/ikawrakow/ik_llama.cpp.git
cd ik_llama.cpp
git fetch origin
git remote add ubergarm https://github.com/ubergarm/ik_llama.cpp
git fetch ubergarm
git checkout ug/hunyuan-moe-2
git checkout -b merge-stuff-here
git merge ikawrakow/ik/iq3_ks_v2
cmake -B build -DCMAKE_BUILD_TYPE=Release -DGGML_CUDA=ON -DGGML_VULKAN=OFF -DGGML_RPC=OFF -DGGML_BLAS=OFF -DGGML_CUDA_F16=ON -DGGML_SCHED_MAX_COPIES=1
cmake --build build --config Release -j 20
```
Great, glad you could get it going! Yes, I've seen similar issues with the opening < being dropped for <think> and <answer> etc. Definitely something odd going on. Keep an eye on the mainline PR here for more discussion and updates: https://github.com/ggml-org/llama.cpp/pull/14425#issuecomment-3025974262
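In the meantime, if you just want cleaner logs, the dropped < can be patched back into saved output after the fact. This is only a rough sketch, assuming the rest of the tag (think> or answer>) is still intact; fix_tags is a hypothetical helper name, not something shipped with ik_llama.cpp or llama.cpp, and it cannot restore a tag that is missing entirely.

```shell
# Hypothetical stop-gap helper: re-insert a dropped "<" in front of
# think> / answer> (and /think> / /answer>) tags in text read from stdin.
# It leaves tags alone when they already start with "<", and skips
# ordinary words that merely end in "think" or "answer".
fix_tags() {
  sed -E 's#(^|[^<a-zA-Z/])(/?)(think|answer)>#\1<\2\3>#g'
}

# Example (output.txt is a placeholder for wherever you saved the CLI output):
#   fix_tags < output.txt
```

This only papers over the symptom in the text; the underlying issue still needs to be sorted out in the PR discussion.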
Also, the IQ3_KS PR got merged into ik_llama.cpp main, so I will be force-pushing my branch to include those updates. If you want to move over to that, the easiest way would be:
cd ik_llama.cpp
git branch -D ug/hunyuan-moe-2
git fetch ubergarm
git checkout ug/hunyuan-moe-2
# now build as usual
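One caveat: git refuses to delete the branch you currently have checked out, so the git branch -D step only works if you run it from some other branch. An alternative sketch that avoids deleting anything is to fetch and then hard-reset the local branch onto the force-pushed remote one. sync_branch is a hypothetical helper name used here for illustration. Be aware that git reset --hard discards uncommitted changes and any local commits on that branch, so park your own work on a separate branch first.

```shell
# Hypothetical helper: make a local branch match its remote counterpart,
# even after the remote branch has been force-pushed.
# WARNING: discards uncommitted changes and local-only commits on that branch.
sync_branch() {
  remote="$1"
  branch="$2"
  git fetch "$remote" || return 1          # pick up the rewritten history
  git checkout "$branch" || return 1       # switch to (or create) the local branch
  git reset --hard "$remote/$branch"       # point it at the remote's version
}

# Run from inside the ik_llama.cpp checkout, then build as usual:
#   sync_branch ubergarm ug/hunyuan-moe-2
```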