tags:
- game
- reinforcement-learning
---

# Mini RL Game: DQN (Vector + Pixels)

A simple **Pygame** environment with a **DQN** agent that learns two scenarios. It is an **educational RL example** for quick experimentation with DQN on a minimal game.

- **Eat**: Catch falling objects.
- **Avoid**: Dodge falling objects for as long as possible.

Check out the project at the [GitHub repository](https://github.com/turhancan97/mini-game-with-RL). You can download the trained models here.

## Observation Types
- **Vector (MLP)**: Compact state per enemy with normalized deltas (see the sketch below).
- **Pixels (CNN)**: Raw frames (84×84 grayscale) stacked over 4 frames.

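A minimal sketch of how the vector observation could be assembled, assuming hypothetical `player`/`enemy` objects with `x`/`y` attributes; the actual feature order in the repo may differ:

```python
import numpy as np

def vector_observation(player, enemies, width, height):
    """Per enemy: normalized (Δx, Δy) from the player. Shape: (2 * N_enemies,)."""
    features = []
    for enemy in enemies:
        features.append((enemy.x - player.x) / width)   # Δx / width
        features.append((enemy.y - player.y) / height)  # Δy / height
    return np.asarray(features, dtype=np.float32)
```
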
---

## ✅ Checkpoints

### Vector (MLP)
| Scenario | Episodes | Enemies | File                    |
|----------|----------|---------|-------------------------|
| Eat      | 1000     | 4       | `model_vector_eat.h5`   |
| Avoid    | 3000     | 8       | `model_vector_avoid.h5` |

### Pixels (CNN)
| Scenario | Episodes | Enemies | File                    |
|----------|----------|---------|-------------------------|
| Eat      | 1000     | 4       | `model_pixels_eat.h5`   |
| Avoid    | 3000     | 8       | `model_pixels_avoid.h5` |

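Since the checkpoints are Keras `.h5` files, loading one for greedy inference should look roughly like this (`compile=False` skips restoring the training-time loss/optimizer; the input shape shown assumes the Eat vector model with 4 enemies):

```python
import numpy as np
import tensorflow as tf

# Load a checkpoint listed above; inference only, so skip compiling.
model = tf.keras.models.load_model("model_vector_eat.h5", compile=False)

obs = np.zeros((1, 8), dtype=np.float32)   # 2 features × 4 enemies (Eat)
q_values = model.predict(obs, verbose=0)[0]
action = int(np.argmax(q_values))          # greedy choice among the 3 actions
```
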
---

## 🧠 Model Architecture

### Vector (MLP) DQN
- **Input**: `2 * N_enemies` features (per enemy: Δx/width, Δy/height).
- **Network**:
  `Dense(128, relu) → Dense(128, relu) → Dense(3, linear)`

### Pixels (CNN) DQN
- **Input**: `(84, 84, 4)` stacked grayscale frames.
- **Network**:
  `Conv(32, 8×8, s=4, relu) → Conv(64, 4×4, s=2, relu) → Conv(64, 3×3, s=1, relu) → Dense(512, relu) → Dense(3, linear)`

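Both networks are small enough to state directly in Keras. This is a sketch of the layer specs above, not necessarily the repo's exact construction:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_mlp_dqn(n_enemies):
    # Vector DQN: two 128-unit hidden layers, 3 Q-values out.
    return tf.keras.Sequential([
        layers.InputLayer(input_shape=(2 * n_enemies,)),
        layers.Dense(128, activation="relu"),
        layers.Dense(128, activation="relu"),
        layers.Dense(3, activation="linear"),
    ])

def build_cnn_dqn():
    # Pixel DQN: Atari-style conv stack over 4 stacked 84×84 frames.
    return tf.keras.Sequential([
        layers.InputLayer(input_shape=(84, 84, 4)),
        layers.Conv2D(32, 8, strides=4, activation="relu"),
        layers.Conv2D(64, 4, strides=2, activation="relu"),
        layers.Conv2D(64, 3, strides=1, activation="relu"),
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dense(3, activation="linear"),
    ])
```
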
---

## ⚙️ Training Setup

**Algorithm**: DQN with a target network
**Loss**: Huber
**Optimizer**: Adam (`lr=1e-3` for MLP, `lr=2.5e-4` for CNN)
**Target Updates**: Soft update with τ=0.005

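The soft target update with τ=0.005 is Polyak averaging, blending a small fraction of the online weights into the target each step. A sketch, assuming two Keras models `online` and `target`:

```python
def soft_update(target, online, tau=0.005):
    # target ← τ · online + (1 − τ) · target, applied weight-by-weight.
    blended = [
        tau * w_online + (1.0 - tau) * w_target
        for w_online, w_target in zip(online.get_weights(), target.get_weights())
    ]
    target.set_weights(blended)
```
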
### Replay
- Buffer size: `50k` (MLP) / `100k` (CNN)
- Warm-up (`train_start`): `2000` (MLP) / `5000` (CNN)
- Updates per env step: `2`
- Batch size: `64–128` (typical)

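A uniform replay buffer at these sizes is only a few lines; a minimal sketch (the repo's own implementation may differ):

```python
import random
from collections import deque

class ReplayBuffer:
    """Uniform-sampling FIFO buffer; maxlen evicts the oldest transitions."""

    def __init__(self, capacity=50_000):  # 100_000 for the CNN setup
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```
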
### Exploration
- Linear epsilon decay per episode: `1.0 → 0.05` over ~750 episodes.

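As a formula, this schedule is a straight line clamped at the floor value; a hypothetical helper:

```python
def epsilon_for(episode, start=1.0, end=0.05, decay_episodes=750):
    # Linearly interpolate from start to end, then hold at end.
    frac = min(episode / decay_episodes, 1.0)
    return start + frac * (end - start)
```
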
### Rewards *(scaled small for stability)*
- **Eat**: `step −0.01`, `catch +1.0`, `miss −1.0`
- **Avoid**: `survival +0.001` per step, `near-miss up to −0.25`, `collision −5.0`

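A sketch of the Eat-scenario reward using the values above (the function name and flags are hypothetical, not taken from the repo):

```python
def eat_reward(caught, missed):
    reward = -0.01       # per-step penalty: encourages catching quickly
    if caught:
        reward += 1.0    # caught a falling object
    if missed:
        reward -= 1.0    # object reached the ground
    return reward
```
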
### Environment
- **Pygame**: the player moves along the bottom edge; multiple enemies fall from above.
- Dependencies:
  * Python 3.8
  * TensorFlow 2.x (e.g., 2.9)
  * NumPy
  * scikit-image (for pixel preprocessing)
  * Pygame
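
Since scikit-image is listed for pixel preprocessing, the 84×84 grayscale conversion presumably looks something like this (a sketch, not the repo's exact code):

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.transform import resize

def preprocess_frame(frame):
    """RGB screen capture → 84×84 grayscale float32 in [0, 1]."""
    gray = rgb2gray(frame)                              # (H, W) float in [0, 1]
    small = resize(gray, (84, 84), anti_aliasing=True)
    return small.astype(np.float32)                     # stack 4 → (84, 84, 4)
```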