Instructions to use mlx-community/MiniMax-M2.1-8bit-gs32 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use mlx-community/MiniMax-M2.1-8bit-gs32 with MLX:
    # Make sure mlx-lm is installed
    # pip install --upgrade mlx-lm

    # Generate text with mlx-lm
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/MiniMax-M2.1-8bit-gs32")

    prompt = "Write a story about Einstein"
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

    text = generate(model, tokenizer, prompt=prompt, verbose=True)
- Transformers
How to use mlx-community/MiniMax-M2.1-8bit-gs32 with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="mlx-community/MiniMax-M2.1-8bit-gs32", trust_remote_code=True)
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("mlx-community/MiniMax-M2.1-8bit-gs32", trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained("mlx-community/MiniMax-M2.1-8bit-gs32", trust_remote_code=True)
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- vLLM
How to use mlx-community/MiniMax-M2.1-8bit-gs32 with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "mlx-community/MiniMax-M2.1-8bit-gs32"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "mlx-community/MiniMax-M2.1-8bit-gs32",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
    docker model run hf.co/mlx-community/MiniMax-M2.1-8bit-gs32
- SGLang
How to use mlx-community/MiniMax-M2.1-8bit-gs32 with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "mlx-community/MiniMax-M2.1-8bit-gs32" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "mlx-community/MiniMax-M2.1-8bit-gs32",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "mlx-community/MiniMax-M2.1-8bit-gs32" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "mlx-community/MiniMax-M2.1-8bit-gs32",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Pi
How to use mlx-community/MiniMax-M2.1-8bit-gs32 with Pi:
Start the MLX server
    # Install MLX LM:
    uv tool install mlx-lm

    # Start a local OpenAI-compatible server:
    mlx_lm.server --model "mlx-community/MiniMax-M2.1-8bit-gs32"
Configure the model in Pi
    # Install Pi:
    npm install -g @mariozechner/pi-coding-agent

    # Add to ~/.pi/agent/models.json:
    {
      "providers": {
        "mlx-lm": {
          "baseUrl": "http://localhost:8080/v1",
          "api": "openai-completions",
          "apiKey": "none",
          "models": [
            {
              "id": "mlx-community/MiniMax-M2.1-8bit-gs32"
            }
          ]
        }
      }
    }
Run Pi
    # Start Pi in your project directory:
    pi
- Hermes Agent
How to use mlx-community/MiniMax-M2.1-8bit-gs32 with Hermes Agent:
Start the MLX server
    # Install MLX LM:
    uv tool install mlx-lm

    # Start a local OpenAI-compatible server:
    mlx_lm.server --model "mlx-community/MiniMax-M2.1-8bit-gs32"
Configure Hermes
    # Install Hermes:
    curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
    hermes setup

    # Point Hermes at the local server:
    hermes config set model.provider custom
    hermes config set model.base_url http://127.0.0.1:8080/v1
    hermes config set model.default mlx-community/MiniMax-M2.1-8bit-gs32
Run Hermes
    hermes
- MLX LM
How to use mlx-community/MiniMax-M2.1-8bit-gs32 with MLX LM:
Generate or start a chat session
    # Install MLX LM
    uv tool install mlx-lm

    # Interactive chat REPL
    mlx_lm.chat --model "mlx-community/MiniMax-M2.1-8bit-gs32"
Run an OpenAI-compatible server
    # Install MLX LM
    uv tool install mlx-lm

    # Start the server (listens on port 8080 by default)
    mlx_lm.server --model "mlx-community/MiniMax-M2.1-8bit-gs32"

    # Call the OpenAI-compatible server with curl
    curl -X POST "http://localhost:8080/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "mlx-community/MiniMax-M2.1-8bit-gs32",
            "messages": [
                {"role": "user", "content": "Hello"}
            ]
        }'
- Docker Model Runner
How to use mlx-community/MiniMax-M2.1-8bit-gs32 with Docker Model Runner:
    docker model run hf.co/mlx-community/MiniMax-M2.1-8bit-gs32
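The curl calls shown for vLLM, SGLang, and MLX LM all send the same OpenAI-style chat-completions request, differing only in host and port. As a rough sketch, the same request could be built from Python with the standard library alone. The helper names `build_chat_request` and `chat` are hypothetical, and the base URL is an assumption to be matched to whichever server is actually running (for example `http://localhost:8080/v1`, as in the Pi configuration).

```python
import json
import urllib.request

MODEL_ID = "mlx-community/MiniMax-M2.1-8bit-gs32"


def build_chat_request(base_url, content, model=MODEL_ID):
    """Build (but do not send) a POST request for <base_url>/chat/completions.

    Hypothetical helper: it mirrors the JSON body used in the curl examples.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, content, timeout=60.0):
    """Send the request and return the text of the first choice.

    Assumes the server replies in the OpenAI chat-completions shape.
    """
    request = build_chat_request(base_url, content)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat("http://localhost:8080/v1", "Hello")` would return the model's reply as a string; the port should be changed to 8000 or 30000 for the vLLM and SGLang examples.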
Upload
- model-00036-of-00054.safetensors +3 -0
- model-00037-of-00054.safetensors +3 -0
- model-00038-of-00054.safetensors +3 -0
- model-00039-of-00054.safetensors +3 -0
- model-00040-of-00054.safetensors +3 -0
- model-00041-of-00054.safetensors +3 -0
- model-00042-of-00054.safetensors +3 -0
- model-00043-of-00054.safetensors +3 -0
- model-00044-of-00054.safetensors +3 -0
- model-00045-of-00054.safetensors +3 -0
- model-00046-of-00054.safetensors +3 -0
- model-00047-of-00054.safetensors +3 -0
- model-00048-of-00054.safetensors +3 -0
- model-00049-of-00054.safetensors +3 -0
- model-00050-of-00054.safetensors +3 -0
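This commit adds shards 36 through 50 of 54, so other shards presumably arrive in separate commits. A small sketch for checking which shards are still absent from a local directory listing follows. It assumes shard indices run from 1 to the total encoded in the file name, which this page does not confirm; `missing_shards` is a hypothetical helper.

```python
import re

# Matches names such as "model-00036-of-00054.safetensors".
SHARD_RE = re.compile(r"^model-(\d{5})-of-(\d{5})\.safetensors$")


def missing_shards(present):
    """Return the sorted shard indices that do not appear in `present`.

    Assumes indices are 1-based and that every name agrees on the total.
    Names that do not match the shard pattern are ignored.
    """
    indices = set()
    totals = set()
    for name in present:
        match = SHARD_RE.match(name)
        if match is None:
            continue
        indices.add(int(match.group(1)))
        totals.add(int(match.group(2)))
    if len(totals) != 1:
        raise ValueError("expected exactly one shard total, got %r" % sorted(totals))
    total = totals.pop()
    return [i for i in range(1, total + 1) if i not in indices]
```

Passing the fifteen file names listed above would report every index outside 36 to 50 as missing.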
model-00036-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bb122a6064d38f8327a427ce84454554683cf9e5e55cb3cbe1c988459da93383
+size 5335284408
model-00037-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:200cf337f350659134c350365f98fc8cbd53063cac2ddcd2bee37de5fa28685e
+size 4328779821
model-00038-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f91fb09e5da94b48dc41853ca726fe344d543a5a7f7c8948fffb095dfa853c33
+size 5335284348
model-00039-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:88d963f7a7efb1dc161bd51f08dcae23446eaef89741f45bcf41d2277bb72c42
+size 4278319949
model-00040-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d956655b083a5f97241074c591cf5e71ef8b146898e27f10c4c637e3abb8117d
+size 5335284414
model-00041-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ad7938c5f718736b1e7e7389d1875a0a98429e399814e819452cb2d2cfe45776
+size 4278319941
model-00042-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:222e557c62aad8ac2548417eb479f32bfa07002bd2f5a8cc68bda0b4b6909ca8
+size 5335284416
model-00043-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e550da568182f2655f0ef19e761556a41260d91f540b4b4d38ac75ccb7fe020b
+size 4328779817
model-00044-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:02ac060686a52a1a5366c47e44a3aea15b7781aad1d207a51abfca5ac9903214
+size 5335284400
model-00045-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b26fe00b19481879bdc57b3216dca558cd34e91e7adc74f6b3bafda638297107
+size 4278319945
model-00046-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:61d2ae2012b8742b7ac7313546827645e8f6a6a034cb477cd8bbad5968cf4c45
+size 5335284408
model-00047-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a235cc25e569dad6a8e1eb14242c5d3c588b41f4c5a41ee264d2d35af3f43dc2
+size 4278319939
model-00048-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:382e114a56d8a96b9bf9bc2197901ce2b959e029679a67970e4f3910a3d7d6f6
+size 5335284412
model-00049-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:18a31d54d2794711194bfd572a434cbf4c73d121060b62b7a4ad8f4ec4bcc289
+size 4328779837
model-00050-of-00054.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c8f72f7e1d82d57c49c39f6f48939040ed09db3bb499a4e5553ec17eb0a3b118
+size 5335284400
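Each added file above is a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 of the real shard, and its size in bytes. As a minimal sketch, a downloaded shard could be checked against its pointer as follows. The function names `parse_lfs_pointer` and `verify_file` are hypothetical, and the sketch assumes the pointer contains only the three keys shown in this diff.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse Git LFS pointer text into a dict with version, sha256 and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError("unsupported oid algorithm: %r" % algorithm)
    return {
        "version": fields["version"],
        "sha256": digest,
        "size": int(fields["size"]),
    }


def verify_file(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and hash."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing a multi-gigabyte file.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["sha256"]
```

Reading in 1 MiB chunks keeps memory use flat, which matters here because the shards in this commit are between roughly 4.3 and 5.3 GB each.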