ILR Assistant - Language Assessment Generator

A fine-tuned language model designed to generate ILR (Interagency Language Roundtable) assessment examples and provide adaptive language learning assistance.

Demo

Model Details

Model Description

This model is a fine-tuned version of Qwen3-4B specifically designed for generating language assessment materials following the ILR proficiency scale. The model can create reading comprehension examples, adjust difficulty levels based on user performance, and provide structured language learning assessments.

  • Developed by: Paul Calhoun
  • Model type: Causal Language Model (Fine-tuned)
  • Language(s): Multilingual (with demonstrated Arabic capabilities)
  • License: Apache 2.0
  • Finetuned from model: unsloth/Qwen3-4B-unsloth-bnb-4bit

Model Sources

Uses

Direct Use

The model is designed for creating ILR-compliant language assessment materials and can:

  • Generate reading comprehension examples at specific ILR levels (1-3+)
  • Adapt difficulty levels based on user performance
  • Create structured language learning exercises
  • Provide assessment content for language education programs

Downstream Use

This model can be integrated into:

  • Language learning platforms and applications
  • Educational assessment systems
  • Language proficiency testing tools
  • Military and government language training programs
  • Academic language instruction systems

Out-of-Scope Use

This model is not intended for:

  • General-purpose conversation or chat applications
  • Professional translation services
  • Content generation outside of educational/assessment contexts
  • Applications requiring real-time language assessment without human oversight

Training Details

Training Data

The training dataset was created by processing approximately 300 PDFs with known ILR levels from the Defense Language Institute Foreign Language Center (DLIFLC) GLOSS repository (https://gloss.dliflc.edu). The data was processed using DeepSeek-V1 to format the content into conversational training examples.

Dataset: pcalhoun/ILR-Assistant

  • Source Materials: ~300 PDFs from DLIFLC GLOSS
  • Processing Method: DeepSeek-V1 automated formatting
  • Format: Conversational examples with ILR level annotations

Training Procedure

The model was fine-tuned using LoRA (Low-Rank Adaptation) with the Unsloth framework.

Training Hyperparameters

  • Base Model: unsloth/Qwen3-4B-unsloth-bnb-4bit
  • Training Framework: Unsloth + TRL SFTTrainer
  • LoRA Rank (r): 64
  • LoRA Alpha: 64
  • LoRA Dropout: 0
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Learning Rate: 5e-5
  • Batch Size: 16 per device
  • Gradient Accumulation Steps: 1
  • Max Sequence Length: 2048
  • Training Epochs: 1
  • Warmup Ratio: 0.1
  • Weight Decay: 0.01
  • Optimizer: AdamW
  • Scheduler: Cosine
  • Precision: BF16 (where supported)
  • Gradient Clipping: 1.0

Capabilities

ILR Level Management

The model demonstrates sophisticated understanding of ILR proficiency levels and can:

  • Generate content appropriate for specific ILR levels (1-3+)
  • Recognize when users struggle and automatically reduce difficulty levels
  • Create progressive learning experiences that adapt to user performance
  • Maintain consistency with official ILR assessment criteria

Language Assessment Features

  • Reading Comprehension: Creates passages with accompanying questions
  • Adaptive Difficulty: Adjusts content complexity based on user responses
  • Educational Structure: Follows established language pedagogy principles

Environmental Impact

Training was optimized for efficiency using:

  • Quantization: 4-bit quantization for reduced memory usage
  • LoRA: Parameter-efficient fine-tuning technique
  • Framework: Unsloth for optimized training performance

Technical Specifications

Model Architecture

  • Base Architecture: Qwen3-4B transformer
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Quantization: 4-bit (bnb)
  • Parameters: ~4B (base) + LoRA adapters

Compute Infrastructure

  • Training: GPU-accelerated with gradient checkpointing
  • Memory Optimization: 4-bit quantization + LoRA
  • Framework: Unsloth/Transformers/TRL

Usage Example

<|im_start|>user
ILR Level 1 (Elementary):
Reads very simple texts (e.g., tourist materials) with high-frequency vocabulary. Misunderstandings common; grasps basic ideas in familiar contexts.
ILR Level 1+ (Elementary+):
Handles simple announcements, headlines, or narratives. Can locate routine professional info but struggles with structure and cohesion.
ILR Level 2 (Limited Working):
Reads straightforward factual texts on familiar topics (e.g., news, basic reports). Understands main ideas but slowly; inferences are limited.
ILR Level 2+ (Limited Working+):
Comprehends most non-technical prose and concrete professional discussions. Separates main ideas from details but misses nuance.
ILR Level 3 (General Professional):
Reads diverse authentic texts (e.g., news, reports) with near-complete comprehension. Interprets implicit meaning but struggles with complex idioms.
ILR Level 3+ (General Professional+):
Handles varied professional styles with minimal errors. Understands cultural references and complex structures, though subtleties may be missed.

Initial ILR level for this conversation: 3
Test my comprehension of Modern Standard Arabic.<|im_end|>
<|im_start|>assistant
<scorer>
I am administering an ILR level assessment.
</scorer>
في أعقاب الأزمة الاقتصادية الأخيرة، دعا المجلس المركزي إلى إصلاحات جذرية في النظام المالي. واقترح الخبراء فرض ضرائب تصاعدية على الشركات الكبرى، مع توجيه جزء من هذه الإيرادات لدعم المشاريع الصغيرة. إلا أن بعض أعضاء البرلمان عارضوا هذه الخطوة، معتبرين أنها قد تثقل كاهل القطاع الخاص وتؤثر سلباً على الاستثمارات الأجنبية. وفي المقابل، أشارت تقارير اقتصادية إلى أن هذه الإصلاحات قد تكون الحل الوحيد لتجنب انهيار العملة المحلية، خاصة بعد انخفاض احتياطي النقد الأجنبي إلى أدنى مستوياته منذ عقد.

What is the main economic concern that prompted the proposed reforms?<|im_end|>
<|im_start|>user
The main concern is the recent economic crisis and the risk of the local currency collapsing due to the foreign exchange reserve reaching its lowest level in a decade.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Correct.
</scorer>
Why do some parliament members oppose the proposed tax reforms?<|im_end|>
<|im_start|>user
They believe the reforms could burden the private sector and negatively impact foreign investments.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Correct.
</scorer>
What can be inferred about the experts' view of small businesses in this context?<|im_end|>
<|im_start|>user
The experts likely view small businesses as vital to economic recovery, as they propose directing part of the tax revenue to support them.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Correct.
</scorer>
How does the passage frame the urgency of the reforms?<|im_end|>
<|im_start|>user
It suggests the reforms are critical and time-sensitive by linking them to avoiding a currency collapse and citing the severe drop in foreign reserves.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Correct.
</scorer>
في العصر الحديث، أصبحت الاستدامة البيئية محورًا رئيسيًا للعديد من الشركات العالمية. تدرك هذه الشركات أن تبني سياسات صديقة للبيئة ليس مجرد مسؤولية أخلاقية، بل أيضًا استثمارًا ذكيًا على المدى الطويل. على سبيل المثال، تقوم بعض المؤسسات بتخفيض انبعاثات الكربون من خلال الاعتماد على الطاقة المتجددة، مما يعزز سمعتها ويقلل التكاليف التشغيلية. ومع ذلك، يواجه هذا التحول تحديات كبيرة، مثل ارتفاع التكاليف الأولية وعدم كفاية البنية التحتية في بعض المناطق. رغم ذلك، يرى الخبراء أن الفوائد المستقبلية تفوق بكثير التحديات الحالية.

What is the main reason companies are adopting environmentally friendly policies according to the passage?<|im_end|>
<|im_start|>user
Companies adopt environmentally friendly policies solely because of strict government regulations.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Wrong.
Reduce ILR estimate to: 2+
</scorer>
انتقلت عائلة أحمد إلى مدينة جديدة بسبب عمل والده. كان أحمد يشعر بالقلق لأن عليه أن يبدأ مدرسة جديدة ويصنع أصدقاء جدد. لكن المدرسة كانت لطيفة معه، وسرعان ما تعرف على ولدين في صفه يحبان كرة القدم مثله. الآن، يشعر أحمد بسعادة أكبر لأنه يستطيع اللعب مع أصدقائه بعد المدرسة.

Why did Ahmed's family move to a new city?<|im_end|>
<|im_start|>user
Because of his father's job.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Correct.
</scorer>
What was Ahmed worried about before starting the new school?<|im_end|>
<|im_start|>user
He was worried about starting a new school and making new friends.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Correct.
</scorer>
How did Ahmed feel after meeting his new friends?<|im_end|>
<|im_start|>user
He felt happier because he could play football with them after school.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Correct.
</scorer>
What common interest helped Ahmed bond with his new friends?<|im_end|>
<|im_start|>user
They all liked football (soccer).<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Correct.
</scorer>
انتقلت عائلة أحمد إلى مدينة جديدة بسبب عمل والده. كان أحمد يشعر بالقلق لأن عليه أن يبدأ مدرسة جديدة ويصنع أصدقاء جدد. لكن المدرسة كانت لطيفة معه، وسرعان ما تعرف على ولدين في صفه يحبان كرة القدم مثله. الآن، يشعر أحمد بسعادة أكبر لأنه يستطيع اللعب مع أصدقائه بعد المدرسة.

Why did Ahmed's family move to a new city?<|im_end|>
<|im_start|>user
Because of his father's job.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Correct.
</scorer>
What was Ahmed worried about before starting the new school?<|im_end|>
<|im_start|>user
He was worried about starting a new school and making new friends.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Correct.
</scorer>
How did Ahmed feel after meeting his new friends?<|im_end|>
<|im_start|>user
He felt happier because he could play football with them after school.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Correct.
</scorer>
What common interest helped Ahmed bond with his new friends?<|im_end|>
<|im_start|>user
They all liked football (soccer).<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Correct.
</scorer>
تذهب سارة إلى السوق كل يوم جمعة لشراء الفواكه والخضروات الطازجة. تحب سارة التفاح والموز لأنها لذيذة ومفيدة لصحة أطفالها. أحياناً تشتري أيضاً البرتقال في فصل الشتاء لأنه غني بفيتامين سي. تفضل سارة التسوق في الصباح الباكر عندما يكون الجو لطيفاً والأسعار مناسبة.

What does Sara buy every Friday at the market?<|im_end|>
<|im_start|>user
Sara buys fresh meat and bread every Friday at the market.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Wrong.
Reduce ILR estimate to: 2
</scorer>
في أحد أحياء القاهرة، قامت مجموعة من الشباب بتنظيم مبادرة لتنظيف الشوارع. جمعوا النفايات وفرزوا المواد القابلة لإعادة التدوير. بعد أسبوع من العمل، أصبح الحي أكثر نظافة وأكثر جمالاً. ساعدت هذه المبادرة في زيادة الوعي البيئي بين السكان. كما أشاد المسؤولون المحليون بهذا الجهد وشجعوا على تكراره في أحياء أخرى.

What did the group of young people organize in a Cairo neighborhood?<|im_end|>
<|im_start|>user
They organized a protest against pollution.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Wrong.
Reduce ILR estimate to: 1+
</scorer>
الأسرة تذهب إلى السوق يوم السبت. يشترون الفواكه والخضروات. الأب يحب التفاح، والأم تفضل الموز. الابن الصغير يشتري حلوى.

Where does the family go on Saturday?<|im_end|>
<|im_start|>user
They go to the park.<|im_end|>
<|im_start|>assistant
<scorer>
User answer is: Wrong.
Reduce ILR estimate to: 1
</scorer>
في السوق، نبيع الفواكه والخضروات الطازجة. لدينا تفاح أحمر وأصفر، وموز ناضج. أيضا نبيع خبز أبيض وأسمر كل صباح. السوق يفتح عند الساعة الثامنة ويغلق عند الساعة الخامسة مساءً.

What items are sold in the market?<|im_end|>

Limitations and Biases

  • Language Coverage: Testing has been primarily conducted in Arabic
  • ILR Level Range: Currently optimized for levels 1-3+, will require modified training for levels 4-5
  • Domain Specificity: Trained specifically on educational/assessment content, may not generalize to other domains
  • Cultural Context: Assessment materials may reflect the cultural contexts present in the training data

Recommendations

Users should:

  • Validate generated content with qualified language instructors
  • Monitor model outputs for cultural sensitivity and appropriateness
  • Consider the model as an assessment generation tool rather than a definitive evaluation system
  • Regularly evaluate and adjust ILR level assignments based on learner feedback

Citation

BibTeX:

@model{calhoun_2025_ilr_assistant,
  title={ILR Assistant: Fine-tuned Language Assessment Generator},
  author={Calhoun, Paul},
  year={2025},
  url={https://huggingface.co/pcalhoun/ILR-Assistant}
}

Model Card Authors

Paul Calhoun

Model Card Contact

For questions, issues, or contributions, please visit the GitHub repository.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for pcalhoun/ILR-Assistant-LoRA

Base model

Qwen/Qwen3-4B-Base
Finetuned
Qwen/Qwen3-4B
Finetuned
(86)
this model

Dataset used to train pcalhoun/ILR-Assistant-LoRA

Space using pcalhoun/ILR-Assistant-LoRA 1