Sequence-Agnostic Multi-ON: a task in which a robot must find a set of objects in an environment in a sequence-agnostic manner (no ordering constraint).

For example: assume you are in a supermarket with a 15-item grocery list, and you don’t know where any of the items are located. How would you go about putting items in your cart?

  1. One way is to follow the order of the list, giving higher priority to items placed higher up. For example: if “Egg” is written above “Bread”, then even if I come across Bread while looking for “Egg”, I’ll ignore it until I begin my search for “Bread”. This won’t be optimal, but you can be sure you’ll eventually get every item.
  2. Another approach is to start with the intention: “Let me begin exploring from this area, and whatever item I see, I’ll check it against the list and put it in my cart if it’s on there.”
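To make the difference between the two strategies concrete, here is a toy sketch. It assumes the supermarket is a single aisle that the shopper walks along one shelf at a time, wrapping back to the start when the end is reached. The function names `ordered_search` and `agnostic_search` and the aisle layout are made up for illustration and are not part of our work.

```python
from typing import List, Tuple


def ordered_search(aisle: List[str], shopping_list: List[str]) -> Tuple[List[str], int]:
    """Strategy 1: pick items strictly in list order, ignoring everything else."""
    missing = [item for item in shopping_list if item not in aisle]
    if missing:
        raise ValueError(f"items not in the aisle: {missing}")
    cart, steps, pos = [], 0, 0
    for target in shopping_list:
        # Keep walking (wrapping around) until the current target is in front of us.
        while aisle[pos] != target:
            pos = (pos + 1) % len(aisle)
            steps += 1
        cart.append(target)
    return cart, steps


def agnostic_search(aisle: List[str], shopping_list: List[str]) -> Tuple[List[str], int]:
    """Strategy 2: walk the aisle once and pick up anything that is on the list."""
    remaining = set(shopping_list)
    cart, steps = [], 0
    for pos, item in enumerate(aisle):
        if not remaining:
            break  # everything found, stop walking
        steps = pos  # moves made so far from the start of the aisle
        if item in remaining:
            cart.append(item)
            remaining.remove(item)
    return cart, steps


# Hypothetical layout: Bread is on the way to Egg.
aisle = ["bread", "milk", "egg", "soap"]
shopping_list = ["egg", "bread"]

print(ordered_search(aisle, shopping_list))   # picks egg first, then walks around for bread
print(agnostic_search(aisle, shopping_list))  # picks bread on the way to egg
```

In this toy layout the ordered shopper walks past Bread, reaches Egg, and then has to loop around the aisle to come back for Bread, while the sequence-agnostic shopper collects both in a single pass.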

We tried to extend this simple concept into a more intuitive approach. Suppose your list consists only of frozen items; then you needn’t start exploring in a random direction. You can go straight in search of a fridge and quickly put them all in your cart. As simple and intuitive as this sounds, implementing this approach raised three major questions:

  1. How do we tell an agent that all these objects belong to the frozen section? [i.e. how to learn object-object and object-region relationships]
  2. We previously assumed that the supermarket was unexplored, so how would we know where to find a fridge? [i.e. appropriate exploration techniques]
  3. What if your list contains several such clusters? Frozen stuff, vegetables, stationery items? Now what? [i.e. ]

This work tries to answer the first two questions, and I also provide a framework for integrating a solution to the third. White Paper's Link 📜

In this blog I’ll try to limit myself to the work I’ve contributed to.

The thinking process

While everything here is explained in the context of a supermarket, in our work we performed training and evaluation on household scenes in the Habitat simulator.

In the household scenario, the first question translates to: what are the object-object relationships among various static objects, and how can we leverage them to improve our search for a list of such objects?
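One simple way to picture an object-object relationship is to count how often two objects appear in the same room. The sketch below does exactly that on a handful of made-up rooms. It is only an illustration of the idea, not the method used in our work, and the helper names `cooccurrence` and `most_related` as well as the room contents are assumptions.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]


def cooccurrence(rooms: Iterable[List[str]]) -> Dict[Pair, int]:
    """Count, for every pair of objects, the number of rooms containing both."""
    counts: Counter = Counter()
    for objects in rooms:
        # Sort so that each pair is always stored in the same (a, b) order.
        for a, b in combinations(sorted(set(objects)), 2):
            counts[(a, b)] += 1
    return counts


def most_related(counts: Dict[Pair, int], obj: str) -> List[Tuple[str, int]]:
    """Rank the other objects by how often they share a room with `obj`."""
    scores: Counter = Counter()
    for (a, b), n in counts.items():
        if a == obj:
            scores[b] += n
        elif b == obj:
            scores[a] += n
    return scores.most_common()


# Made-up rooms, each listed as the static objects it contains.
rooms = [
    ["bed", "lamp", "pillow"],
    ["sofa", "tv", "lamp"],
    ["bed", "pillow"],
]

counts = cooccurrence(rooms)
print(most_related(counts, "bed"))
```

With counts like these, an agent that has just spotted a bed could reasonably look for a pillow nearby before wandering off to search for a TV.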