Luke Raus, Declan Freeman-Gleason, Krishna Suresh
December 17, 2022
Have you ever wondered how robots make decisions? Science fiction is rife with depictions of automatons carefully weighing the possible consequences of their decisions to choose the optimal plan of action. Remember the scene from Star Wars: The Empire Strikes Back where C-3PO recommends against Han Solo's attempt to navigate through an asteroid field because the odds of doing so successfully are "approximately 3720 to 1"? We're all used to the idea that the world is inherently uncertain and unpredictable, but we humans often deal with this uncertainty in intuitive, fuzzy ways when making planning decisions: consider Han Solo's bold choice to trust his gut and fly on through in the absence of more appealing options (hence "Never tell me the odds!"). Since robots and other computerized systems notably lack intuitions, how could we go about programming a robot to analyze uncertain situations and make plans when outcomes aren't guaranteed?

Han Solo may not be a big fan of probability theory, but we are!
As it turns out, this question is incredibly important not only in robotics but whenever one wants to make plans in nondeterministic scenarios (and basically all scenarios are nondeterministic!). In this blog, we'll build up a key framework for formulating this problem - Markov Decision Processes (MDPs) - and present an important algorithm for making decisions under this framework known as Value Iteration.
Let’s dive in and make some decisions!
We’re students at Olin College taking a project-based microbiology class where we want to analyze how the soil microbiome changes over the winter. This will involve frequently taking small soil samples from various sites in our 40-acre forest (called Parcel B) to bring back to the lab for analysis. Being engineers, we can’t help but try to automate this process so we can venture out into the snow only when we feel like it. The mechanical and electrical engineers have already gone ahead and built what they claim is a robust robotic platform capable of driving around Parcel B and collecting soil samples from under the snow. As the math-focused software engineers, we have been tasked with programming the robot to go out and collect the samples. The robot has a very accurate GPS and we have detailed maps of Parcel B so we’re feeling set.
However, there are two big caveats. First, it's wintertime in Boston, and the mechanical engineers have informed us that the tires they chose appear to sometimes lose traction and slip on the snow and ice blanketing Parcel B. In addition, there are always wind gusts and unexpected obstacles that might knock the robot off course. That means there's a significant chance the robot doesn't do quite what we command it to do, so we will want to make our control algorithm robust to these mishaps lest the robot lose its way.
Second, the electrical engineers have informed us that they accidentally chose a battery that is only barely big enough to complete a full sampling mission before needing to be recharged, and there's no budget for a larger one. Since we don't want to have to rescue the stranded robot if its battery dies, we will need to prioritize completing the mission as quickly and efficiently as possible, and thus might need to consider taking a few shortcuts here and there. We might not be able to strictly drive on the well-trodden paths through the forest as originally planned.
During a scouting expedition, we encounter a salient example where this framing will matter: the robot will need to travel to the opposite end of the large frozen drainage pond in Parcel B.

The path around this pond is quite long and really goes out of the way, so the electrical engineers suggest taking a shortcut across the frozen pond; it could significantly reduce the mission time and alleviate our battery limitations. However, the mechanical engineers point out the pond’s icy surface is really slippery, so the robot’s wheels will lose traction more frequently; it’s possible that this “shortcut” might actually take more time than the long way around! It’s not immediately obvious how we should go about weighing this complex tradeoff surrounding accepting versus avoiding uncertainty in the pursuit of our goal. We would like a systematic method for the robot to autonomously decide where to drive, given some knowledge of the uncertainty in its environment.
Let’s get theoretical and make some definitions that should prove useful for approaching this probabilistic robotics problem.
Now that we have a challenging problem we would like to solve, let’s establish a consistent framework for thinking about a robot and the different elements that impact its behavior and circumstance. While these definitions come from the probabilistic robotics literature, it’s worth noting that they apply to a wide range of other systems (you can even think about yourself under these definitions!).
The state includes all information relevant to describing the robot's current situation. While we usually think of the state as belonging to the robot, it also includes information about the environment in which the robot operates (as this absolutely impacts how the future will unfold for the robot), in addition to variables such as the robot's position, orientation, and velocities. The division between "robot" and "environment" is thus more a notational nicety than a hard boundary. We denote the state with the symbol $x$. [1]
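To make this concrete, here is a minimal sketch (in Python, with purely illustrative field names) of what a state for our soil-sampling robot might contain. Note how it bundles environment information (whether the robot is on the icy pond) right alongside the robot's own pose:

```python
from dataclasses import dataclass

@dataclass
class State:
    """One possible state representation for the sampling robot.

    All fields here are hypothetical choices for illustration; a real
    system would pick whichever variables matter for its dynamics.
    """
    x: float        # position east of a map origin, meters
    y: float        # position north of a map origin, meters
    heading: float  # orientation, radians
    on_ice: bool    # environment info: is the robot on the frozen pond?

# A sample state: mid-forest, facing roughly north, on solid ground.
s = State(x=12.0, y=-3.5, heading=1.57, on_ice=False)
print(s)
```

The key design point is that "state" is whatever description of the present suffices to reason about the future, which is why the environment variable `on_ice` belongs in it just as much as the robot's position does.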