Google's Soda Fetching Robot

Have you ever used a smart voice assistant such as Alexa or Siri? If so, you have probably noticed that the technology keeps improving. Siri can speak in a gender-neutral voice, while Alexa can read bedtime stories in your dead grandmother's voice. Robotics has been advancing as well. However, there is still a huge gap between voice commands and autonomous robotics. Industrial robots are already used in many places, but teaching a robot what to do and what not to do is never easy.

General-purpose robots could solve many problems in response to voice commands in spaces where humans are also present. However, robots like robo-vacuums are mainly programmed to avoid touching anything except the floor. Table tennis is a game in which a robot can determine for itself whether a task was successful, and it can learn from its mistakes.

The major challenge in robotics is being quick, precise and adaptive at the same time. Sometimes, a robot is quick but not adaptive. That isn't an issue in an industrial setting, where it is quite common. But it is hard to be fast, adaptive, and precise all at once. Ping-pong is a nice microcosm of the problem because it requires both precision and speed.

What did Vincent Vanhoucke say about Google's soda fetching robot?

Vincent Vanhoucke, a Distinguished Scientist and head of robotics at Google Research, said that people develop this kind of skill by practicing. He added that you cannot read the rules of a skill and become a champion overnight; you have to practice it. Speed and precision are one thing, but the problem Google really wants to crack in its robotics labs is the intersection between human language and robotics. It has made a few interesting leaps in how well robots understand the natural language people use.

Google tackles several problems here with its natural language processing system, the Pathways Language Model, or PaLM. The first is processing and absorbing what a person is trying to say. Another challenge is for the robot to recognize what it can actually do. It may understand perfectly well when you ask it to grab a bottle from the top of the fridge, but the issue is that it cannot reach that high. Affordances are the things a robot can do with a reasonable degree of success.

For example, it can do simple jobs like moving a meter forward. A more advanced job would be asking it to find a Coke can in the kitchen. A complex job would be telling it that you spilled your Coke and asking it to mop up the spill and bring you a healthier drink.

Google's approach uses the knowledge in language models to identify and score actions that are useful for high-level instructions. In addition, it uses an affordance function, the "Can," which provides real-world grounding by determining which actions can be executed in a given environment. Used with the PaLM language model, Google calls the approach PaLM-SayCan. The robotics lab uses many robots from Everyday Robots.
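To make the idea of scoring actions concrete, here is a minimal sketch in Python. It assumes that each candidate skill has a usefulness score from the language model and an affordance score for the current environment, and that the two are combined by simple multiplication. The skill names, the numbers, and the helper pick_next_skill are all invented for illustration; this is not Google's actual code.

```python
# Hypothetical sketch: the skills, scores, and function name are assumptions
# made up for illustration, not taken from PaLM-SayCan itself.

def pick_next_skill(usefulness, affordance):
    """Return the best skill and the combined score of every skill.

    usefulness: skill -> language-model score for how much the skill
                helps with the instruction (the "Say" part)
    affordance: skill -> estimated chance the robot can carry it out
                in the current environment (the "Can" part)
    """
    combined = {
        skill: usefulness[skill] * affordance.get(skill, 0.0)
        for skill in usefulness
    }
    best = max(combined, key=combined.get)
    return best, combined


usefulness = {
    "grab the bottle from the top of the fridge": 0.6,
    "find a Coke can in the kitchen": 0.3,
    "move a meter forward": 0.1,
}
affordance = {
    "grab the bottle from the top of the fridge": 0.05,  # too high to reach
    "find a Coke can in the kitchen": 0.8,
    "move a meter forward": 0.95,
}

best, combined = pick_next_skill(usefulness, affordance)
print(best)
```

With these made-up numbers, the language model prefers the fridge-top bottle, but the low affordance score pushes that option below finding a Coke can in the kitchen, which is an action the robot can actually perform.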

These chaps know how to connect themselves to a charger for some R&R (rest and recharge). To carry out a more advanced command, however, a robot needs to work through a series of steps.

Here is a simple example:

Come to the speaker.
Look at the floor and find the spill.
Look for a mop or paper towel in the drawers, in the cabinets and on the kitchen counters.
Pick up the cleaning tool after finding it.
Close the drawer.
Move to the spill.
Wipe up the spill, and monitor whether the sponge can absorb all the liquid. If not, go wring it out in the sink and return.
Once the spill is cleaned up, wring the sponge out one more time.
Turn on the tap, wash the sponge, turn it off, and wring the sponge again.
Open the drawer and put it away. Close the drawer.
Recognize which drinks are in the kitchen and which are "healthier" than a Coke.
Look for a water bottle in the fridge, pick it up, and bring it to the person who asked for it.

It is essential to teach robots what they can and cannot do, and what they should do in different situations. An interesting challenge robotics faces is that language models are not grounded in the physical world. They are trained on extensive libraries of text, but those libraries do not interact with their environments. If you ask Google to direct you to the nearest coffee shop and Maps gives you a route with a 45-day hike and a three-day swim across a lake, you will find it funny. In the real world, though, silly mistakes have consequences.

Suppose you told a robot, "I spilled my drink, can you help?" The language model GPT-3 responds with "You could try using a vacuum cleaner." That makes some sense: a vacuum cleaner is sometimes the right tool for a mess, and the answer shows that the language model connects a vacuum cleaner with cleaning. But if Google's soda-fetching robot actually did that, it would probably fail. Vacuums are not recommended for spilled drinks, because water and electronics do not mix. At best, you would end up with a broken vacuum.

PaLM-SayCan-enabled robots are placed in a kitchen setting and trained to get better at various aspects of helping out in a kitchen. When given an instruction, they try to make a determination that weighs how useful an action is likely to be against how likely they are to carry it out successfully. In the space between those two considerations, the robots are getting smarter by the day.

Conclusion:

Affordances are not binary. It is difficult to balance three golf balls on top of each other, but we can't say that it is impossible. Likewise, a robot may not be able to open a drawer easily at first, but once it has been trained to do so, its confidence in that action rises. According to the company, an untrained robot cannot grab a bag of potato chips from a drawer. But with some instruction and a chance to practice, the odds that Google's soda-fetching robot succeeds increase significantly. The point of all the training is to enable it to figure things out.
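One way to picture a non-binary affordance is as a success rate that is updated after every practice attempt. The sketch below is only an illustration of that idea: the article does not say how the robots compute their confidence, so the simple counting used here is a stand-in, and the class SkillConfidence and the attempt history are hypothetical.

```python
# Hypothetical sketch: a plain success counter standing in for whatever
# the real robots use to estimate confidence in a skill.

class SkillConfidence:
    """Tracks a running success-rate estimate for a single skill."""

    def __init__(self, prior_successes=1, prior_failures=1):
        # Start from one imagined success and one imagined failure, so an
        # untried skill begins at 0.5 rather than at 0 or 1.
        self.successes = prior_successes
        self.failures = prior_failures

    def record(self, succeeded):
        """Update the counts after one practice attempt."""
        if succeeded:
            self.successes += 1
        else:
            self.failures += 1

    @property
    def estimate(self):
        """Current estimated chance of success, between 0 and 1."""
        return self.successes / (self.successes + self.failures)


open_drawer = SkillConfidence()
before = open_drawer.estimate

# Invented practice history: two early failures, then six successes.
for outcome in [False, False, True, True, True, True, True, True]:
    open_drawer.record(outcome)

after = open_drawer.estimate
print(before, after)
```

The estimate never has to be exactly 0 or 1. It moves gradually as the robot practices, which matches the idea that a skill is something a robot gets better at instead of something it either has or lacks.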

Thursday, 18 August 2022