aws-neuron/ChessLM_Qwen3_Trainium_2_AWS_Format

	---
	language:
	- en
	license: apache-2.0
	pipeline_tag: text-generation
	tags:
	- chess
	- neuron
	- aws-trainium
	- vllm
	- optimum-neuron
	- continuous-batching
	base_model: karanps/ChessLM_Qwen3
	---

	# ChessLM Qwen3 - Neuron Traced (AWS Format Structure)

	This is a Neuron-traced version of [karanps/ChessLM_Qwen3](https://huggingface.co/karanps/ChessLM_Qwen3), compiled for AWS Trainium2 (trn2) instances and intended to be served with vLLM.

	The repository follows the AWS Neuron layout, with separate directories for the compiled artifacts.

	It is intended for use within the [Neuron Workshop](https://github.com/aws-neuron/neuron-workshops).
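As a rough illustration of querying the model once it is served, the sketch below builds a request for vLLM's OpenAI-compatible `/v1/completions` endpoint using only the standard library. The server address, the served model name, and the prompt text are assumptions made for this example; the workshop material defines the actual serving setup and the base model card defines the expected prompt format.

```python
import json
import urllib.request

# Assumptions (not stated in this model card): a vLLM OpenAI-compatible server
# is already running at BASE_URL and serves the model under MODEL_NAME.
BASE_URL = "http://localhost:8000"
MODEL_NAME = "aws-neuron/ChessLM_Qwen3_Trainium_2_AWS_Format"


def build_completion_request(prompt: str, max_tokens: int = 16) -> dict:
    """Build the JSON body for a POST to /v1/completions."""
    return {
        "model": MODEL_NAME,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # greedy decoding, chosen here for repeatability
    }


def complete(prompt: str, max_tokens: int = 16) -> str:
    """Send one completion request and return the generated text."""
    body = json.dumps(build_completion_request(prompt, max_tokens)).encode("utf-8")
    request = urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["choices"][0]["text"]


# Example call (needs a running server; the prompt is only a placeholder):
# print(complete("1. e4 e5 2. Nf3"))
```

Keeping the payload builder separate from the network call makes it easy to inspect or log requests before any server is available.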

	## Model Details

	- Base Model: Qwen3-8B fine-tuned for chess
	- Compilation: optimum-neuron[vllm]==0.3.0
	- Compiler Version: neuronxcc 2.21.33363.0
	- Target Hardware: AWS Trainium2 (trn2)
	- Precision: BF16
	- Tensor Parallelism: 2 cores
	- Batch Size: 4 (continuous batching enabled)
	- Max Sequence Length: 2048
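Because the model is compiled for fixed shapes, a request has to fit inside the limits listed above. The sketch below checks a request against those limits before it is sent. It assumes that the 2048-token limit covers prompt and generated tokens together, which is how a maximum sequence length usually behaves but is not spelled out in this card; the helper names are hypothetical.

```python
MAX_SEQUENCE_LENGTH = 2048  # from Model Details
BATCH_SIZE = 4              # from Model Details: sequences processed together


def fits_compiled_shape(prompt_tokens: int, max_new_tokens: int) -> bool:
    """Return True if prompt plus generation stays within the compiled length.

    Assumption: the limit applies to prompt and generated tokens combined.
    """
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= MAX_SEQUENCE_LENGTH


def remaining_token_budget(prompt_tokens: int) -> int:
    """Return how many tokens can still be generated after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return max(0, MAX_SEQUENCE_LENGTH - prompt_tokens)


def free_slots(active_sequences: int) -> int:
    """Return how many more sequences fit in the compiled batch."""
    if active_sequences < 0:
        raise ValueError("active_sequences must be non-negative")
    return max(0, BATCH_SIZE - active_sequences)
```

For example, a 2000-token prompt leaves a budget of 48 generated tokens under this assumption.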


	## Compilation Instructions

	```
	optimum-cli export neuron \
	--model karanps/ChessLM_Qwen3 \
	--task text-generation \
	--sequence_length 2048 \
	--batch_size 4 \
	/home/ubuntu/environment/ml/qwen-chess/karanps/ChessLM_Qwen3_compiled
	```
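For scripted re-compilation, the same command can be assembled in Python. This is a minimal sketch that uses only the flags shown above; the function name and the output directory in the usage comment are hypothetical, and actually running the command requires a Neuron instance with `optimum-neuron` installed.

```python
import shlex


def build_export_command(
    model_id: str,
    output_dir: str,
    sequence_length: int = 2048,
    batch_size: int = 4,
) -> list:
    """Assemble the optimum-cli export command as an argument list."""
    return [
        "optimum-cli", "export", "neuron",
        "--model", model_id,
        "--task", "text-generation",
        "--sequence_length", str(sequence_length),
        "--batch_size", str(batch_size),
        output_dir,
    ]


# Hypothetical usage; the output directory is a placeholder:
# import subprocess
# cmd = build_export_command("karanps/ChessLM_Qwen3", "./ChessLM_Qwen3_compiled")
# print(shlex.join(cmd))          # show the command before running it
# subprocess.run(cmd, check=True)
```

Passing an argument list to `subprocess.run` avoids shell quoting issues if the output path contains spaces.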

	### Key Files

	- context_encoding_model/: Compiled NEFF files for processing initial prompt sequences (up to 2048 tokens)
	- token_generation_model/: Compiled NEFF files for autoregressive token generation
	- layout_opt/: Layout optimization artifacts from compilation
	- model.pt: Main model file containing compiled graphs and embedded weights (17GB)
	- neuron_config.json: Neuron compilation configuration
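After downloading or compiling the model, a quick check that these artifacts are present can catch an incomplete copy early. The sketch below assumes all of them sit directly under one model directory, which this card does not state explicitly; the function name is hypothetical.

```python
from pathlib import Path

# Names taken from the Key Files list; their location at the top level of the
# model directory is an assumption.
EXPECTED_DIRS = ("context_encoding_model", "token_generation_model", "layout_opt")
EXPECTED_FILES = ("model.pt", "neuron_config.json")


def missing_artifacts(model_dir) -> list:
    """Return the expected directories and files not found under model_dir."""
    root = Path(model_dir)
    missing = [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing


# Hypothetical usage; the path is a placeholder:
# problems = missing_artifacts("./ChessLM_Qwen3_compiled")
# if problems:
#     raise SystemExit(f"Incomplete model directory, missing: {problems}")
```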

	## Model Files

	| File | Purpose |
	|------|---------|
	| model.pt | Main model with embedded weights (17GB) |
	| config.json | Base model configuration |
	| neuron_config.json | Neuron compilation settings |
	| tokenizer* | Tokenizer files for text processing |
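The size of `model.pt` can be sanity-checked with simple arithmetic: BF16 stores each parameter in 2 bytes. The sketch below takes the "8B" in the base model name literally as 8 billion parameters, which is an assumption made only for this estimate; the exact parameter count and the compiled graphs stored alongside the weights are not accounted for.

```python
BYTES_PER_BF16_PARAM = 2          # BF16 is a 16-bit format
NOMINAL_PARAMS = 8_000_000_000    # assumption: "8B" taken literally


def weight_bytes(num_params: int, bytes_per_param: int = BYTES_PER_BF16_PARAM) -> int:
    """Return the raw storage needed for the weights alone."""
    return num_params * bytes_per_param


def to_gb(num_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 10**9 bytes)."""
    return num_bytes / 1e9


estimated_gb = to_gb(weight_bytes(NOMINAL_PARAMS))
print(f"Estimated BF16 weights: {estimated_gb:.1f} GB")
```

An estimate of about 16 GB for the weights alone is of the same order as the 17GB listed for `model.pt`.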

	## License

	This model inherits the license of the base model [karanps/ChessLM_Qwen3](https://huggingface.co/karanps/ChessLM_Qwen3), listed as `apache-2.0` in the metadata above.