GRID: Gestures and Reference Instructions for Deep robot learning
Persons participating in the project
Speech alone is sometimes not enough as the main communication channel. Misunderstandings can occur during dialogues, especially if the robot does not understand the context. The same problem arises with gesture-based communication: sometimes a gesture is not enough to express a command, and a dialogue can be very hard to sustain within the limitations of gesture signs. One example of these limitations is natural pointing.
Communication that combines speech and pointing is multi-modal, and this is what makes it so hard to achieve. When the two inputs are not correlated, the robot lacks a robust representation of the information. Deep neural architectures, inspired by representations in the human brain, could solve this problem through their capability to represent complex sentences composed of gestures and speech. They could also help us understand how the human brain combines audio-visual inputs into one percept.
The use of neurally