- Frechet Differentiability in Besov Spaces in the Optimal Control of Parabolic Free Boundary Problems We consider the inverse Stefan type free boundary problem, where information on the boundary heat flux and density of the sources are missing and must be found along with the temperature and the free boundary. We pursue optimal control framework where boundary heat flux, density of sources, and free boundary are components of the control vector. The optimality criteria consists of the minimization of the L_2-norm declinations of the temperature measurements at the final moment, phase transition temperature, and final position of the free boundary. We prove the Frechet differentiability in Besov spaces, and derive the formula for the Frechet differential under minimal regularity assumptions on the data. The result implies a necessary condition for optimal control and opens the way to the application of projective gradient methods in Besov spaces for the numerical solution of the inverse Stefan problem. 2 authors · Mar 31, 2016
- Goal-Conditioned Imitation Learning using Score-based Diffusion Policies We propose a new policy representation based on score-based diffusion models (SDMs). We apply our new policy representation in the domain of Goal-Conditioned Imitation Learning (GCIL) to learn general-purpose goal-specified policies from large uncurated datasets without rewards. Our new goal-conditioned policy architecture "BEhavior generation with ScOre-based Diffusion Policies" (BESO) leverages a generative, score-based diffusion model as its policy. BESO decouples the learning of the score model from the inference sampling process, and, hence allows for fast sampling strategies to generate goal-specified behavior in just 3 denoising steps, compared to 30+ steps of other diffusion based policies. Furthermore, BESO is highly expressive and can effectively capture multi-modality present in the solution space of the play data. Unlike previous methods such as Latent Plans or C-Bet, BESO does not rely on complex hierarchical policies or additional clustering for effective goal-conditioned behavior learning. Finally, we show how BESO can even be used to learn a goal-independent policy from play-data using classifier-free guidance. To the best of our knowledge this is the first work that a) represents a behavior policy based on such a decoupled SDM b) learns an SDM based policy in the domain of GCIL and c) provides a way to simultaneously learn a goal-dependent and a goal-independent policy from play-data. We evaluate BESO through detailed simulation and show that it consistently outperforms several state-of-the-art goal-conditioned imitation learning methods on challenging benchmarks. We additionally provide extensive ablation studies and experiments to demonstrate the effectiveness of our method for goal-conditioned behavior generation. Demonstrations and Code are available at https://intuitive-robots.github.io/beso-website/ 4 authors · Apr 5, 2023