zai-org/GLM-4.7-Flash
Text Generation • Updated • 1.77M • 1.56k
Interesting models in 2026 that can run decently with 24GB of VRAM (or a lot of patience)
Note Damn, it's fast and clever. I'm generally not a fan of MoE models, but this one is really, really good. It will, however, eat a bazillion tokens in its thinking block.
Note First RP model of the year in this list. Similar to previous iterations, but grounded by normal assistant prompts. It's a really solid model, fun to use, and adaptable beyond RP too. It uses L7-Tekken as its instruct format.