tags:
- game
- reinforcement-learning
---

# Mini RL Game — DQN (Vector + Pixels)

A simple **Pygame** environment with a **DQN** agent that learns two scenarios. It is an **educational RL example** for quick experimentation with DQN on a minimal game.

- **Eat**: Catch falling objects.
- **Avoid**: Dodge falling objects for as long as possible.

See the project on [GitHub](https://github.com/turhancan97/mini-game-with-RL). You can download the trained models from this repository.
## Observation Types

- **Vector (MLP)**: Compact state per enemy with normalized deltas.
- **Pixels (CNN)**: Raw frames (84×84 grayscale) stacked over 4 frames.
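
A minimal sketch of that pixel pipeline (an assumed implementation; the function names are illustrative, not the repo's API), using the scikit-image dependency listed below:

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.transform import resize

def preprocess_frame(frame_rgb):
    """RGB game frame -> 84x84 grayscale, values in [0, 1]."""
    gray = rgb2gray(frame_rgb)                       # (H, W) floats in [0, 1]
    return resize(gray, (84, 84), anti_aliasing=True).astype(np.float32)

def stack_frames(last_four_frames):
    """Stack the 4 most recent preprocessed frames into the (84, 84, 4) CNN input."""
    return np.stack(last_four_frames, axis=-1)
```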
---

## ✅ Checkpoints

### Vector (MLP)

| Scenario | Episodes | Enemies | File                    |
|----------|----------|---------|-------------------------|
| Eat      | 1000     | 4       | `model_vector_eat.h5`   |
| Avoid    | 3000     | 8       | `model_vector_avoid.h5` |

### Pixels (CNN)

| Scenario | Episodes | Enemies | File                    |
|----------|----------|---------|-------------------------|
| Eat      | 1000     | 4       | `model_pixels_eat.h5`   |
| Avoid    | 3000     | 8       | `model_pixels_avoid.h5` |
---

## 🧠 Model Architecture

### Vector (MLP) DQN

- **Input**: `2 * N_enemies` features (per enemy: Δx/width, Δy/height).
- **Network**:
  `Dense(128, relu) → Dense(128, relu) → Dense(3, linear)`
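
A minimal Keras sketch of this head (an assumed implementation, shown with `N_ENEMIES = 4` as in the "Eat" checkpoint; not necessarily the repo's exact code):

```python
import tensorflow as tf

N_ENEMIES = 4  # assumption: the "Eat" setting from the checkpoint table

mlp_dqn = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2 * N_ENEMIES,)),  # per enemy: Δx/width, Δy/height
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(3, activation="linear"),  # one Q-value per action
])
mlp_dqn.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                loss=tf.keras.losses.Huber())
```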
### Pixels (CNN) DQN

- **Input**: `(84, 84, 4)` stacked grayscale frames.
- **Network**:
  `Conv(32, 8×8, s=4, relu) → Conv(64, 4×4, s=2, relu) → Conv(64, 3×3, s=1, relu) → Dense(512, relu) → Dense(3, linear)`
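
And the corresponding sketch for the CNN variant, under the same assumptions:

```python
import tensorflow as tf

cnn_dqn = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(84, 84, 4)),       # 4 stacked grayscale frames
    tf.keras.layers.Conv2D(32, 8, strides=4, activation="relu"),
    tf.keras.layers.Conv2D(64, 4, strides=2, activation="relu"),
    tf.keras.layers.Conv2D(64, 3, strides=1, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dense(3, activation="linear"),  # one Q-value per action
])
cnn_dqn.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2.5e-4),
                loss=tf.keras.losses.Huber())
```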
---

## ⚙️ Training Setup

- **Algorithm**: DQN with a target network
- **Loss**: Huber
- **Optimizer**: Adam (`lr=1e-3` for MLP, `lr=2.5e-4` for CNN)
- **Target updates**: Soft update with `τ = 0.005`
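
One way to implement the `τ = 0.005` soft update in Keras (a sketch, not the repo's code):

```python
TAU = 0.005

def soft_update(online_model, target_model, tau=TAU):
    """Blend online weights into the target network: θ' ← τ·θ + (1 − τ)·θ'."""
    mixed = [tau * w + (1.0 - tau) * tw
             for w, tw in zip(online_model.get_weights(),
                              target_model.get_weights())]
    target_model.set_weights(mixed)
```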
### Replay

- Buffer size: `50k` (MLP) / `100k` (CNN)
- Warm-up (`train_start`): `2000` (MLP) / `5000` (CNN)
- Updates per env step: `2`
- Batch size: `64–128` (typical)
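
A minimal replay buffer consistent with these numbers (illustrative names, not the repo's API); training would begin once the buffer holds at least `train_start` transitions:

```python
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=50_000):   # 50k for the MLP agent, 100k for the CNN
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```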
### Exploration

- Linear epsilon decay per episode: `1.0 → 0.05` over ~750 episodes.
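
A per-episode schedule matching that bullet (a sketch; the constant names are illustrative):

```python
EPS_START, EPS_END, DECAY_EPISODES = 1.0, 0.05, 750

def epsilon_for(episode):
    """Linearly anneal epsilon from 1.0 to 0.05 over ~750 episodes, then hold."""
    frac = min(episode / DECAY_EPISODES, 1.0)
    return EPS_START + frac * (EPS_END - EPS_START)
```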
### Rewards *(scaled small for stability)*

- **Eat**: `step −0.01`, `catch +1.0`, `miss −1.0`
- **Avoid**: `survival +0.001` per step, `near-miss up to −0.25`, `collision −5.0`

### Environment

- **Pygame**: the player moves along the bottom edge while multiple enemies fall from the top.
- Dependencies:
  * Python 3.8
  * TensorFlow 2.x (e.g., 2.9)
  * NumPy
  * scikit-image (for pixel preprocessing)
  * Pygame