@inproceedings{HernandezCano:2025:NWMSDM,
author = "Leonardo Hernandez Cano and Maxine Perroni-Scharf and Neil Dhir and Arun Ramamurthy and Armando Solar-Lezama",
title = "Neurosymbolic World Models for Sequential Decision Making",
booktitle = "Proceedings of the 42nd International Conference on Machine Learning",
series = "PMLR 267",
year = "2025",
month = "July",
address = "Vancouver, Canada",
url = "https://icml.cc/virtual/2025/poster/43925",
abstract = "We present Structured World Modeling for Policy Optimization (SWMPO), a framework for the unsupervised learning of neurosymbolic finite state machines (FSMs) that capture environmental structure for policy optimization. SWMPO models the environment as an FSM in which each state corresponds to a region of the state space with distinct dynamics (e.g., water versus land); this structured representation can then be exploited for downstream tasks such as policy optimization. Our FSM synthesis algorithm operates in an unsupervised manner, using low-level features from unprocessed, non-visual data to learn non-linear models, making it applicable across diverse domains. The synthesized FSM models are expressive enough to support a model-based reinforcement learning scheme that uses offline data to efficiently synthesize environment-specific world models. We demonstrate the advantages of SWMPO by benchmarking its environment modeling capabilities on a range of simulation tasks."
}