In this article, I’m going to showcase some of the incredible ways we’ve been utilizing computer vision for Mars exploration. Human Mars exploration is definitely the next giant leap for humanity. With our work here at JPL, we’re paving the way for the future and for future generations!

Computer vision for exploring Mars
Ryan Alimo, Lead Machine Learning Scientist at NASA, showcases new innovations in computer vision technology for Mars exploration.

A brief history of space exploration and tech

Over the past several decades, we’ve seen an evolution in terms of how we explore space. Below you can see some of the vessels that we’ve sent to other planets. 👇

From the 70s onwards, we’ve witnessed the arrival of fly-bys, orbiters, and landers. We’re now sending helicopters to Mars, and we’ve also progressed to using rovers, which are way bigger than the helicopters. Check it out below. 👇

The rover

The Perseverance Rover

If you want to visualize the scale of it, it’s about the size of an SUV. And guess what, it’s driving on Mars right now! This current rover moves at a speed of about 5 centimeters per second.

In contrast, the next rover we’re going to have, which is a collaboration with the European Space Agency, is going to move at a top speed of one kilometer per sol (a sol is one Martian day). This is significantly faster than the existing rovers that we have.

Great, right? The issue is that it brings a lot of complexity to the table: machine learning and computer vision problems that we need to solve.

Challenges for Mars exploration

Of course, the challenges of Mars exploration are nothing new. These are complex technologies and with that comes complex problems:

The communication challenge

Ultimately, Mars is a long way away, and we have to be able to communicate with these machines when they’re there. There’s an urgent need to have more advanced onboard edge computing on the rovers so that they can have online decision-making capabilities.

When data is sent back to Earth, we typically see about 60 megabytes per sol transferred.

That’s due to the physical challenges that we have. NASA’s Deep Space Network has ground stations in three places around the world: Australia, California, and Madrid. We collect a huge amount of data, but because there’s so much of it, it’s going to take a while to send it back.

But these new processes represent new autonomy capabilities that are going to help us tackle self-driving for the rovers.
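To get a feel for the 60 megabytes per sol figure above, here is a back-of-the-envelope sketch. The per-image size and the function name are assumptions made up for illustration, not mission values.

```python
def images_per_sol(budget_megabytes: float, image_megabytes: float) -> int:
    """Return how many whole images fit into one sol's downlink budget."""
    if image_megabytes <= 0:
        raise ValueError("image size must be positive")
    return int(budget_megabytes // image_megabytes)


# 60 MB per sol comes from the article; 1.5 MB per image is a hypothetical size.
DOWNLINK_MB_PER_SOL = 60.0
ASSUMED_IMAGE_MB = 1.5

print(images_per_sol(DOWNLINK_MB_PER_SOL, ASSUMED_IMAGE_MB))  # 40
```

With a budget that small, it makes sense to decide on board which images are worth sending.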

Hardware challenges

Hardware limitations are a major issue that we have to deal with. The onboard processor available on the rover is called the RAD750. Its power is equivalent to an iMac from 1998. That’s significantly less powerful than the cell phone that you have in your pocket right now.


We’re working on different projects to overcome this issue. For example, one of the active projects that we have is a collaboration between NASA and the Air Force: High-Performance Spaceflight Computing (HPSC). This technology is radiation-hardened, which means that it’s built to still function reliably despite the harsh conditions on Mars.

Snapdragon (Qualcomm)

The other trend that we observe is the use of commercial off-the-shelf (COTS) hardware. This year, the Ingenuity helicopter flew with a Snapdragon processor. Snapdragon is so commercially available that you can actually get it in your cell phone. The Snapdragon 820 has the capability to run deep neural networks in real time with the support of GPUs and digital signal processing.

It's going to have a lot more capability in terms of what we can do with the computer vision algorithms and onboard processing units.

Part 1: New era onboard autonomy for Mars exploration with COTS and Edge processors

This leveraging of existing commercial off-the-shelf products is opening a new era for space exploration. Look at the image below: 👇

On the left-hand side, you can see the Mars helicopter that runs on a Snapdragon, and on the right-hand side, you see a CubeSat. These spacecraft are much cheaper to build, and they usually leverage the hardware that’s available to you. This creates a new capability for us to test high-risk/high-reward projects.

Vision guided manipulation for in-space assembly

One of the projects that I worked on with my colleagues at JPL was building computer vision capabilities for in-orbit assembly. In this work, we’ve developed a monocular pose estimation algorithm based on a convolutional neural network.

This takes an image in a simulated environment and predicts the relative distance and orientation of a known object in a real-world situation. After that, you can grasp the objects, put them next to each other, and build a much larger structure in orbit.

You can take different small parts and assemble them in orbit to create a larger structure.
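A pose estimate like this is usually expressed as a translation plus an orientation. As an illustration of how such a prediction could be scored against a reference pose, here is a minimal sketch. The unit-quaternion representation and the function names are assumptions, not details from the JPL work.

```python
import math


def translation_error(t_pred, t_true):
    """Euclidean distance between predicted and reference positions."""
    return math.dist(t_pred, t_true)


def rotation_error_deg(q_pred, q_true):
    """Angle in degrees between two unit quaternions given as (w, x, y, z)."""
    # abs() treats q and -q as the same rotation; min() guards against rounding.
    dot = min(1.0, abs(sum(a * b for a, b in zip(q_pred, q_true))))
    return math.degrees(2.0 * math.acos(dot))


# Hypothetical example: the prediction is 0.1 m off along x and rotated
# 90 degrees about the z axis relative to the reference pose.
identity = (1.0, 0.0, 0.0, 0.0)
quarter_turn_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))

print(translation_error((1.1, 0.0, 2.0), (1.0, 0.0, 2.0)))
print(rotation_error_deg(quarter_turn_z, identity))
```

A grasp is only worth attempting when both errors are small, which is why distance and orientation are evaluated separately.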

Autonomous navigation and terrain reconstruction

When it comes to the Mars rovers, we’ve been doing this kind of autonomous driving for decades. My colleagues at Jet Propulsion Laboratory have worked on a number of different autonomous navigation applications that enable us to drive smoothly on Mars. These create a terrain reconstruction online so that you can observe where you’re driving.

As with anything, though, this does come with its challenges. There was an incident where one of our rovers, Opportunity, got stuck in the sand. We had assumed from the images we took that the area was going to be safe.

That's why computer vision and semantic understanding of the terrain are critical elements in Mars exploration for the rovers.

Terrain awareness and semantic image understanding

In this project, one of the robots that we have at JPL, called Athena, navigates Mars-like terrain. It drives through different areas and builds semantic information regarding safe and unsafe areas.

Also, there’s another project called Mars that allows us to classify around 300,000 images. That way you can be clear on whether something is bedrock, sand, soil, etc. Once that information has been processed, we’re able to predict different terrains.
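Once every patch of terrain has a class label, turning that into a safe/unsafe map can be very simple. The sketch below assumes a grid of labels has already been produced by a classifier. The cost values and the threshold are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical traversal cost per terrain class (higher means riskier).
TERRAIN_COST = {"bedrock": 1.0, "soil": 2.0, "sand": 10.0}
UNSAFE_COST = 5.0  # assumed threshold above which a cell is treated as unsafe


def safety_mask(label_grid):
    """Return a grid of booleans: True where the terrain is considered safe."""
    return [
        # Unknown labels get infinite cost, so they are treated as unsafe.
        [TERRAIN_COST.get(label, float("inf")) < UNSAFE_COST for label in row]
        for row in label_grid
    ]


labels = [
    ["bedrock", "soil", "sand"],
    ["soil", "sand", "unknown"],
]
for row in safety_mask(labels):
    print(row)
```

Treating unknown terrain as unsafe is a deliberately conservative choice: a rover that can’t be rescued should avoid ground it can’t identify.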

Vision-based estimation of driving energy

After you have all of this information, you want to be able to establish whether the terrain is safe. Consider the rover for Mars Sample Return.

This rover isn’t nuclear-powered; it relies on solar panels to get from one location to another. It’s not usually the best idea to take the minimum-distance path, as that may end up being unsafe.

It’s often better to take another path that is safer. Here, you can wait for a day and recharge, then you can move on and eventually reach your destination. With this work, you can predict how much energy you're going to have and take that into consideration when planning out your path.
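The trade-off between the shortest path and the cheapest path in energy terms can be sketched with a standard shortest-path search. The example below uses Dijkstra's algorithm on a tiny hand-made graph. The waypoint names, distances, and energy figures are all invented for illustration and are not output from the actual energy-prediction work.

```python
import heapq


def cheapest_path(graph, start, goal, weight):
    """Dijkstra's algorithm; `weight` names the edge attribute to minimize."""
    queue = [(0.0, start, [start])]
    settled = {}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled and settled[node] <= cost:
            continue  # already reached this node at least as cheaply
        settled[node] = cost
        for neighbor, attrs in graph.get(node, {}).items():
            heapq.heappush(queue, (cost + attrs[weight], neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical terrain graph: the dune route is shorter but drains more energy.
graph = {
    "start": {
        "dune": {"distance_m": 40, "energy_wh": 90},
        "ridge": {"distance_m": 60, "energy_wh": 30},
    },
    "dune": {"goal": {"distance_m": 40, "energy_wh": 90}},
    "ridge": {"goal": {"distance_m": 60, "energy_wh": 35}},
}

print(cheapest_path(graph, "start", "goal", "distance_m"))  # (80.0, ['start', 'dune', 'goal'])
print(cheapest_path(graph, "start", "goal", "energy_wh"))   # (65.0, ['start', 'ridge', 'goal'])
```

Changing only the edge weight flips the chosen route, which is the point made above: the minimum-distance path is not necessarily the one a solar-powered rover should take.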

Part 2: Image classification and prioritization

We can gather the images taken by satellites and classify them into different types. This enables us to do a more autonomous classification of the images that we collect.

The main aim of this information is to be able to accurately assess the kind of terrain that you’re going to be dealing with. Is there anything novel that’s worth examining? What can it tell us?

Computer-vision for discovery and drive-by science

One of the problems that we encountered in the past is that we were taking a lot of images of Mars and other planets, but we had a limited number of scientists around the world who could look at them and determine whether we’d found something revolutionary. There was no AI to help us.

With novelty-detection algorithms, we can highlight areas of terrain that are more important to look at, whilst disregarding a vast array of other images. This way, we have a greater likelihood of uncovering great things.

The end objective is to collect images of these different objects and send them back to Earth from Mars. But we have to be able to prioritize so that we’re not getting distracted and wasting time looking at a whole heap of useless images.
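As a rough illustration of prioritization, the sketch below ranks images by how far their feature vectors sit from the average and keeps only the top few for downlink. Distance from the mean is a deliberately simple stand-in for a novelty detector, not the method used at JPL. The image names and feature values are made up.

```python
import math


def novelty_scores(features):
    """Score each feature vector by its Euclidean distance from the mean vector."""
    count = len(features)
    dims = len(features[0])
    mean = [sum(vec[d] for vec in features) / count for d in range(dims)]
    return [math.dist(vec, mean) for vec in features]


def select_for_downlink(names, features, max_images):
    """Return the names of the `max_images` most novel images, most novel first."""
    ranked = sorted(zip(names, novelty_scores(features)),
                    key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:max_images]]


# Hypothetical 2-D features: three similar images and one outlier.
names = ["img_001", "img_002", "img_003", "img_004"]
features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]

print(select_for_downlink(names, features, max_images=2))  # ['img_004', 'img_001']
```

The `max_images` cap plays the role of the downlink budget: everything below the cut stays on the rover unless someone asks for it.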

With a project called SCOTI, geologists are captioning different images that we take from Mars. An attention-based image captioning architecture is going to help us describe the images in English sentences.

Then, it becomes possible to interact with a web app where you can search for different types of trends or different scientific facts.
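For readers unfamiliar with attention, the core idea is that the captioner weighs image regions differently for each word it generates. The sketch below shows generic dot-product soft attention in plain Python. It is not SCOTI's actual architecture, and the query and region vectors are invented.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, region_features):
    """Return attention weights and the weighted average of the region features."""
    scores = [sum(q * f for q, f in zip(query, region)) for region in region_features]
    weights = softmax(scores)
    dims = len(region_features[0])
    context = [
        sum(w * region[d] for w, region in zip(weights, region_features))
        for d in range(dims)
    ]
    return weights, context


# Hypothetical decoder state (query) and three image-region feature vectors.
weights, context = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
print(weights)
print(context)
```

The region whose features align best with the query gets the largest weight, so it contributes most to the context vector the decoder sees at that step.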

Computer vision for Automated Mars Rover System

There are many different elements that keep this computer vision technology working. One part is the rover itself. The rover can take an image and then send the data to the stereo vision module in order to determine distances, and it can also perform terrain reconstruction. Global localization sends data to the convolutional neural network, where we use backpropagation to train the encoder part.
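The stereo step relies on a standard relationship: for a rectified camera pair, depth equals focal length times baseline divided by disparity. Here is a minimal sketch of that formula. The camera parameters are hypothetical, not those of any rover.

```python
def stereo_depth_m(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point from a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 1000 px focal length, cameras mounted 0.5 m apart.
print(stereo_depth_m(1000.0, 0.5, 25.0))  # 20.0
print(stereo_depth_m(1000.0, 0.5, 5.0))   # 100.0
```

Because depth is inversely proportional to disparity, distant terrain produces tiny disparities, and small matching errors there turn into large distance errors.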

We can use an encoder/decoder network that uses convolutional neural networks to extract different features in the form of a vector. After that, we can use the decoder to send the data to the planning team and find out about different elements and segmentations.

Then, we can use a long short-term memory (LSTM) network that produces a caption for the data.

All of this can be summarized in the form of a database, and scientists can look at the captioned images to determine whether there’s anything novel to look at.

To wrap up: pioneering the way to the future

In summary, computer vision plays a critical role in space exploration. There are obviously limitations in terms of the onboard autonomy for the processors due to the radiation-hardening constraints that we’re dealing with.

We’re working in a harsh environment, and we’re always trying to find ways to innovate so that we can optimize our technology in that environment.

Our ultimate goal is always to give our technology more autonomy in these environments. We believe that computer vision has the potential to optimize our technology, as well as increase scientific discoveries.