vimmoos@Thor committed on
Commit · bcb5af3
1 Parent(s): b3103d6
generated basic home page
udrl/app/home.py +105 -1
udrl/app/home.py CHANGED
@@ -3,4 +3,108 @@ import streamlit as st
 
 st.image("logo.jpg")
 
-st.
+st.html(
+    """<div>
+    <h1>Upside-Down Reinforcement Learning for More Interpretable Optimal Control</h1>
+
+    <div class="abstract">
+        <h2>Abstract</h2>
+        <p>This research introduces a novel approach to reinforcement learning that emphasizes interpretability and explainability. By leveraging tree-based methods within the Upside-Down Reinforcement Learning (UDRL) framework, we demonstrate that it's possible to achieve performance comparable to neural networks while gaining significant advantages in terms of interpretability.</p>
+    </div>
+
+    <h2>What is Upside-Down Reinforcement Learning?</h2>
+    <p>UDRL is an innovative paradigm that transforms reinforcement learning problems into supervised learning tasks. Unlike traditional approaches that focus on predicting rewards or learning environment models, UDRL learns to predict actions based on:</p>
+    <ul>
+        <li>Current state (s<sub>t</sub>)</li>
+        <li>Desired reward (d<sub>r</sub>)</li>
+        <li>Time horizon (d<sub>t</sub>)</li>
+    </ul>
+
+    <h2>Motivation</h2>
+    <p>While neural networks have been the go-to choice for implementing UDRL, they lack interpretability. Our research explores whether other supervised learning algorithms, particularly tree-based methods, can:</p>
+    <ul>
+        <li>Match the performance of neural networks</li>
+        <li>Provide more interpretable policies</li>
+        <li>Enhance the explainability of reinforcement learning systems</li>
+    </ul>
+
+    <div class="results">
+        <h2>Results</h2>
+        <p>We tested three different implementations of the Behaviour Function:</p>
+        <ul>
+            <li>Neural Networks (NN)</li>
+            <li>Random Forests (RF)</li>
+            <li>Extremely Randomized Trees (ET)</li>
+        </ul>
+        <p>Tests were conducted on three popular OpenAI Gym environments:</p>
+        <ul>
+            <li>CartPole</li>
+            <li>Acrobot</li>
+            <li>Lunar-Lander</li>
+        </ul>
+
+    </div>
+
+    <h2>Key Findings</h2>
+    <ul>
+        <li>Tree-based methods performed comparably to neural networks</li>
+        <li>Random Forests and Extremely Randomized Trees provided fully interpretable policies</li>
+        <li>Feature importance analysis revealed insights into decision-making processes</li>
+    </ul>
+
+    <h2>Implications</h2>
+    <p>This research opens new avenues for:</p>
+    <ul>
+        <li>More explainable reinforcement learning systems</li>
+        <li>Enhanced safety in AI decision-making</li>
+        <li>Better understanding of agent behavior in complex environments</li>
+    </ul>
+    </div>
+    """
+)
+st.html(
+    """
+    <style>
+        body {
+            font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
+            line-height: 1.6;
+            color: #e0e0e0;
+            background-color: #1a1a1a;
+            max-width: 800px;
+            margin: 0 auto;
+            padding: 20px;
+        }
+        h1, h2 {
+            color: #81a1c1;
+        }
+        .abstract {
+            background-color: #2e3440;
+            padding: 20px;
+            border-left: 4px solid #88c0d0;
+            margin: 20px 0;
+            border-radius: 5px;
+        }
+        .results {
+            margin-top: 30px;
+        }
+        .chart-container {
+            margin-top: 20px;
+            height: 400px;
+            background-color: #2e3440;
+            padding: 20px;
+            border-radius: 5px;
+        }
+        .highlight {
+            background-color: #4c566a;
+            padding: 2px 5px;
+            border-radius: 3px;
+        }
+        code {
+            background-color: #3b4252;
+            padding: 2px 5px;
+            border-radius: 3px;
+            font-family: 'Courier New', Courier, monospace;
+        }
+    </style>
+    """
+)
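The home page text says UDRL turns reinforcement learning into supervised learning by predicting an action from the current state, a desired reward, and a time horizon. The sketch below illustrates one common way such training pairs can be built from a recorded episode. It is not taken from this commit: the names `Step` and `build_training_pairs` are hypothetical, and using the end of the episode as the horizon is an assumption made for illustration.

```python
from typing import List, Sequence, Tuple

# One recorded step: (state, action, reward). Hypothetical layout for illustration.
Step = Tuple[Sequence[float], int, float]


def build_training_pairs(episode: List[Step]) -> Tuple[List[List[float]], List[int]]:
    """Turn one episode into supervised (features, target) pairs.

    Each feature row is [*state, desired_reward, time_horizon], where
    desired_reward is the reward actually collected from this step to the
    end of the episode and time_horizon is the number of steps remaining.
    The target is the action that was taken at this step.
    """
    features: List[List[float]] = []
    targets: List[int] = []
    remaining_reward = sum(reward for _, _, reward in episode)
    horizon = len(episode)
    for t, (state, action, reward) in enumerate(episode):
        features.append([*state, remaining_reward, float(horizon - t)])
        targets.append(action)
        # The reward earned at this step is no longer "to go" for later steps.
        remaining_reward -= reward
    return features, targets


# Usage: a toy three-step episode with a two-dimensional state.
episode = [
    ([0.0, 1.0], 1, 1.0),
    ([0.5, 0.5], 0, 1.0),
    ([1.0, 0.0], 1, 1.0),
]
X, y = build_training_pairs(episode)
```

Rows of `X` and labels in `y` built this way could then be passed to any supervised classifier, for example a neural network or one of the tree-based models named on the page.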