Publications

Selected papers and preprints. For a complete list, see Google Scholar.

2025

  1. TraCeS: Trajectory Based Credit Assignment From Sparse Safety Feedback
    Under review · earlier: ICLR 2025 Workshop on Bidirectional Human-AI Alignment (non-archival) · Siow Meng Low, Akshat Kumar

    Turns rollout-level safety labels into per-step safety signals, enabling agents to learn safer behavior without hand-designed safety costs.

    Extended version (major additions: theory + experiments) planned for arXiv.

    BibTeX
    @inproceedings{low2025traces,
      title={TraCeS: Trajectory Based Credit Assignment From Sparse Safety Feedback},
      author={Low, Siow Meng and Kumar, Akshat},
      booktitle={ICLR 2025 Workshop on Bidirectional Human-AI Alignment},
      year={2025}
    }

2024

  1. Safe Reinforcement Learning with Learned Non-Markovian Safety Constraints
    arXiv:2405.03005 · Siow Meng Low, Akshat Kumar

    Learns non-Markovian safety constraints for RL, capturing temporal safety requirements beyond stepwise costs.

    BibTeX
    @article{low2024safe,
      title={Safe Reinforcement Learning with Learned Non-Markovian Safety Constraints},
      author={Low, Siow Meng and Kumar, Akshat},
      journal={arXiv preprint arXiv:2405.03005},
      year={2024}
    }

2023

  1. Safe MDP Planning by Learning Temporal Patterns of Undesirable Trajectories and Averting Negative Side Effects
    ICAPS 2023 · Siow Meng Low, Akshat Kumar, Scott Sanner

    Planning with learned temporal patterns of undesirable behavior to reduce negative side effects without requiring an explicit cost specification.

    BibTeX
    @inproceedings{low2023safe,
      title={Safe MDP planning by learning temporal patterns of undesirable trajectories and averting negative side effects},
      author={Low, Siow Meng and Kumar, Akshat and Sanner, Scott},
      booktitle={Proceedings of the International Conference on Automated Planning and Scheduling},
      volume={33},
      pages={596--604},
      year={2023}
    }

2022

  1. Sample-efficient Iterative Lower Bound Optimization of Deep Reactive Policies for Planning in Continuous MDPs
    AAAI 2022 · Siow Meng Low, Akshat Kumar, Scott Sanner

    A sample-efficient optimization approach for deep reactive policies in continuous MDP planning, using iterative lower-bound optimization.

    BibTeX
    @inproceedings{low2022sample,
      title={Sample-efficient iterative lower bound optimization of deep reactive policies for planning in continuous MDPs},
      author={Low, Siow Meng and Kumar, Akshat and Sanner, Scott},
      booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
      volume={36},
      number={9},
      pages={9840--9848},
      year={2022}
    }

2009

  1. Prediction Based Energy-efficient Task Allocation for Delay-constrained Wireless Sensor Networks
    IEEE SECON Workshops 2009 · Wendong Xiao, Siow Meng Low, Chen Khong Tham, Sajal Das

    Prediction-based task allocation to reduce energy usage while meeting delay constraints in wireless sensor networks.

    BibTeX
    @inproceedings{xiao2009prediction,
      title={Prediction based energy-efficient task allocation for delay-constrained wireless sensor networks},
      author={Xiao, Wendong and Low, Siow Meng and Tham, Chen Khong and Das, Sajal},
      booktitle={2009 6th IEEE Annual Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks Workshops},
      pages={1--3},
      year={2009},
      organization={IEEE}
    }