
Technology allows machines to "see" what you say

 Last Update: 2022-04-26
 Source: Internet
 Author: User


From text recognition to voice input, advances in technology have made life more convenient and changed the way people interact with computers. That does not mean the technology has reached its limit. Many problems remain to be solved, such as using a machine to read lips.

Lip reading is a specialized skill: understanding what someone is saying by watching how their lips move as they speak. In normal speech the lips move in step with the sounds, and different sounds produce different lip movements, which is what makes it possible to "see" what the other person is saying. In practice, however, lip reading is hard because the differences between these lip movements are difficult to detect.

In short, lip reading takes two things: recognizing lip movement and responding to it accordingly. Machines actually have an advantage on both counts. On the one hand, capturing moving objects with image sensors is a mature and highly accurate technology, so recognizing lip movement in real time is not difficult. On the other hand, with advances in storage and semiconductor technology, machines now respond very quickly. Combined with big data and artificial intelligence algorithms, as long as the pronunciation corresponding to each lip shape is defined in the system in advance, the captured lip movements can be matched in a short time, and deciphering the content of the speech is not difficult.
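The idea of defining in advance which pronunciation goes with each lip shape can be sketched as a nearest-template lookup. This is only an illustration: the two features (mouth width and mouth opening), the template values, and the function names are assumptions made for the example and do not come from any real system.

```python
import math

# Hypothetical lip-shape templates: (mouth width, mouth opening), both
# normalised to the range 0..1. The numbers are illustrative, not measured.
LIP_TEMPLATES = {
    "a": (0.60, 0.90),  # wide open
    "i": (0.90, 0.20),  # spread, nearly closed
    "u": (0.20, 0.30),  # rounded, small opening
    "m": (0.50, 0.00),  # lips closed
}


def classify_lip_shape(width, opening):
    """Return the label of the template closest to the observed lip shape."""
    return min(
        LIP_TEMPLATES,
        key=lambda label: math.dist(LIP_TEMPLATES[label], (width, opening)),
    )


def decode_frames(frames):
    """Map a sequence of (width, opening) observations to sound labels."""
    return [classify_lip_shape(width, opening) for width, opening in frames]


if __name__ == "__main__":
    observed = [(0.58, 0.85), (0.88, 0.25), (0.50, 0.02)]
    print(decode_frames(observed))
```

A lookup like this only works while the observed shape stays close to one template, so it is sensitive to anything that distorts the measurement.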


So does that mean a lip-reading system is easy to build? The answer is no. Even when the two basic requirements of lip reading are met, a more serious problem remains: interference. In everyday life, the angle of the face, the lighting, and occlusion by hair or clothing can all affect how well lip movements are captured. Add individual speaking habits, and the lips may vary far more than expected, which makes such a machine very difficult to build. Even the current non-contact visual methods, which recognize lip movements with high accuracy, make mistakes because of these interference factors.

So is there a workaround? The answer is yes: shift from reading lip shape to reading muscle movement. When we speak, the lips are moved by contracting muscles, and that muscle movement changes the face. If the fine details of the muscle movement can be captured, lip reading can be done without being affected by those interference factors. However, precisely because the muscle movements are so subtle, they are much harder to interpret.

Recently, the Intelligence and Biomechanics Team of the Department of Mechanical Engineering at Tsinghua University unveiled a novel lip-reading system. The system collects the tiny motion signals of the muscles through self-powered flexible sensors and uses a deep learning model based on prototype learning to capture and interpret lip movements, which gives it a higher accuracy rate.
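The article does not describe the team's model, so the following is only a sketch of the general idea behind prototype-based classification: each class is summarised by one representative vector, and a new sample is assigned to the class whose prototype is nearest. The three-value "muscle signal" features, the word labels, and the function names are made up for the example; in a deep learning system the vectors would be learned embeddings rather than hand-written numbers.

```python
import math
from collections import defaultdict


def build_prototypes(samples):
    """Average the feature vectors of each class into a single prototype."""
    grouped = defaultdict(list)
    for label, vector in samples:
        grouped[label].append(vector)
    return {
        label: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(prototypes, vector):
    """Return the label whose prototype is closest to the given vector."""
    return min(prototypes, key=lambda label: math.dist(prototypes[label], vector))


# Made-up feature vectors standing in for processed sensor signals.
TRAINING = [
    ("hello", (0.9, 0.1, 0.3)),
    ("hello", (0.8, 0.2, 0.4)),
    ("thanks", (0.1, 0.9, 0.7)),
    ("thanks", (0.2, 0.8, 0.6)),
]

prototypes = build_prototypes(TRAINING)

if __name__ == "__main__":
    print(predict(prototypes, (0.85, 0.15, 0.30)))
```

One property of this design is that a new word can be supported by adding labelled samples for it and recomputing the prototypes, without changing the prediction code.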

The publication of this result also advances research on human-computer interaction and on restoring basic voice communication for deaf people. At this stage, however, the technology still has to solve the problem of sample size. Because machine lip reading depends on the size of its library, in theory, as more and more lip-movement samples are added to the database, the ability of machines to "see" and understand language will grow stronger.

　　Original title: Technology allows machines to "see" and understand what you say

This article is an English version of an article which is originally in the Chinese language on echemi.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to service@echemi.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.