This post continues the inventory of research problems in secure natural communications.
The player experience has two components. Players physically send and receive bits of information, and the semantics are understood naturally. The decode engine is the player's proxy: it lets the information infrastructure interpret what the user wants and so allows for natural behavior.
The level of sophistication in the decode engine depends on the depth and breadth of capability required. At the most basic level, we use today’s keyboard, display, and coded commands, and the decode engine barely exists. But this is not “natural”. The decode engine must therefore become sophisticated enough to enable natural behavior at both the semantic and physical layers.
Research problems related to the decode engine
• Core speech recognition algorithms. The most natural way that we communicate is by speaking. Speech recognition has made enormous progress, but reaching 100% accuracy is a very hard problem, so there will always be room for improvement.
• Networked speech recognition. There are efficiencies to performing some speech recognition locally, at the device that picks up the sound. This is especially the case if the microphone is in a room and attached to a processing chip. However, if the microphone is tiny – a wearable – then there must be a way to find processing capacity and stream the bits to it.
• Handwriting recognition. Less widely used than speech recognition, but a similar application and problem.
• Gesture recognition. For some, gestures enhance the content of the message. For others, such as people with speech impairments, the gesture language is the language.
• Natural language recognition. Another hard problem! A natural way to communicate is to express in a sentence – rather than a structured form – what function one wants to achieve.
• Language translation. The Internet and World Wide Web are global. We need translation capability that provides the full power of the web to speakers of any language.
• Searching unstructured data. We now know that search is a fundamental piece of the infrastructure.
• Query by image content. There are images on the web. These should be searchable.
• Query by video content. There is video on the web. This should be searchable.
• Separating commands from data. As we communicate naturally we provide bits – either in text form, or speech, gesture, etc. Some of these bits are the content and some are the command structure. For text, we know how to separate the two. For more natural forms such as speech, we need to find approaches to make this separation as natural as possible.
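To make the last problem concrete, here is a minimal sketch of one possible way to separate command from content in an already-transcribed utterance. It assumes a spoken marker word followed by a verb from a small command vocabulary; the marker "computer", the verbs in COMMANDS, and the names Utterance and split_command are all hypothetical choices for illustration, not part of any existing system. It also shows why the problem is hard: a fixed marker word is exactly the kind of unnatural convention the research would aim to remove.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values chosen for illustration only.
WAKE_WORD = "computer"
COMMANDS = {"send", "search", "translate"}


@dataclass
class Utterance:
    command: Optional[str]  # None when the utterance is pure content
    content: str


def split_command(transcript: str) -> Utterance:
    """Treat '<wake word> <verb> ...' as a command; everything else is content."""
    words = transcript.split()
    # The marker must be the first word, ignoring case and trailing punctuation.
    if len(words) >= 2 and words[0].lower().strip(",.") == WAKE_WORD:
        verb = words[1].lower().strip(",.")
        if verb in COMMANDS:
            return Utterance(verb, " ".join(words[2:]))
    # No marker, or an unknown verb: keep the whole transcript as content.
    return Utterance(None, transcript)
```

For example, split_command("Computer, translate good morning") yields the command "translate" with the content "good morning", while a sentence that merely mentions the word "computer" mid-stream is left as content.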