Introduction to the principle of speech recognition technology

Automatic speech recognition (ASR) aims to enable a computer to "understand" human speech and extract the text it contains. ASR plays an important role in intelligent computer systems that can be spoken to: it effectively gives the system "ears", so that voice, the most natural and convenient means of communication, can be used for human-computer interaction in the information age.

The problems facing speech recognition are extremely difficult. Research began as early as the 1950s in countries around the world, and over the past two decades in particular many research institutions and companies at home and abroad have entered the field. Great effort has been invested and significant results have been achieved, yet a large gap remains between today's technology and a complete solution. This has not prevented steadily improving speech recognition systems from being applied successfully in many relatively constrained settings.

Today, speech recognition has evolved into a comprehensive technology that draws on several disciplines, including acoustics, linguistics, digital signal processing, and statistical pattern recognition. Modern speech recognition systems have been applied successfully in many scenarios, and the techniques used differ from task to task. The figure below is a schematic diagram of a speech recognition system for a relatively general task.

Building a speech recognition system involves two major parts: training and recognition. Training is usually done offline: signal processing and knowledge mining are applied to large, pre-collected speech and language databases to obtain the "acoustic model" and "language model" that the system requires. Recognition is usually done online and automatically transcribes the user's speech in real time.

The recognition process can usually be divided into two modules, a "front end" and a "back end". The main function of the front end is endpoint detection (removing unnecessary silence and non-speech segments), noise reduction, and feature extraction. The function of the back end is to use the trained acoustic model and language model to perform statistical pattern recognition (also called "decoding") on the feature vectors of the user's speech and obtain the text they contain. In addition, the back end includes an "adaptation" feedback module that learns from the user's speech and applies the necessary corrections to the acoustic model and language model, further improving recognition accuracy.
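As a rough illustration of the front end's endpoint detection step, the sketch below marks the speech region of a signal by thresholding short-time energy. It is a minimal example and not the method of any particular system: the function names, the frame length of 160 samples, the hop of 80 samples, and the energy threshold of 0.01 are all assumptions chosen for illustration, and practical systems use more robust detectors.

```python
import math


def frame_energies(samples, frame_len=160, hop=80):
    """Return the mean squared amplitude of each overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def detect_endpoints(samples, frame_len=160, hop=80, threshold=0.01):
    """Return (start, end) sample indices of the speech region, or None.

    A frame counts as speech when its energy exceeds `threshold`
    (an illustrative value; real systems tune or adapt it).
    """
    energies = frame_energies(samples, frame_len, hop)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None  # nothing but silence
    start = active[0] * hop
    end = active[-1] * hop + frame_len
    return start, end


# Synthetic input: silence, a 200 Hz tone sampled at 8 kHz, then silence.
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(1600)]
signal = [0.0] * 800 + tone + [0.0] * 800
print(detect_endpoints(signal))
```

The samples between the returned indices would then go on to noise reduction and feature extraction, while the leading and trailing silence is discarded.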

History and current status of speech recognition technology

Speech recognition research began around the 1950s, when AT&T Bell Labs built Audrey, the first speech recognition system. It recognized the ten English digits and was based on formant extraction.

In the 1960s, the use of computers drove the development of speech recognition. The important results of this period were the introduction of dynamic programming (DP) and linear predictive coding (LPC) analysis. The latter provided a production model for the speech signal and had a profound impact on the development of speech recognition.

In the 1970s, the field made great progress. In theory, linear prediction (LP) was developed further, dynamic time warping (DTW) became largely mature, and, most notably, the theories of vector quantization (VQ) and the hidden Markov model (HMM) were proposed. In practice, speaker-dependent isolated-word recognition systems based on linear predictive cepstra and DTW were implemented.
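To make the DTW-based isolated-word approach mentioned above more concrete, the following sketch aligns two sequences of different lengths with dynamic programming and picks the closest stored template. It is a simplified illustration under stated assumptions: each frame is a single number rather than a cepstral feature vector, the local cost is the absolute difference, and the names `dtw_distance`, `recognize`, and the template values are hypothetical.

```python
def dtw_distance(a, b):
    """Return the minimum cumulative alignment cost between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances while b repeats its frame
                cost[i][j - 1],      # b advances while a repeats its frame
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


# Hypothetical one-dimensional "feature" templates for two words.
templates = {"one": [1, 3, 5, 3, 1], "two": [5, 5, 1, 1, 5]}
# The same contour as "one", spoken more slowly.
print(recognize([1, 1, 3, 5, 5, 3, 1], templates))
```

Because the warping path may repeat frames, a slower or faster rendition of the same word can still match its template closely, which is what made DTW suitable for speaker-dependent isolated-word recognition.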

In the 1980s, Mel-frequency cepstral coefficient (MFCC) feature extraction and the in-depth use of HMMs advanced speech recognition further. In theory, the speech recognition problem came to be described more and more completely; in practice, more efficient decoding algorithms were gradually developed.

Since the 1990s, driven by the US Department of Defense's DARPA evaluations, the EARS program, the more recent GALE program, and China's 863 Program, a large number of high-level research institutions and companies have entered the field, greatly advancing the development and application of speech recognition technology. Speech recognition systems have gradually evolved from simple tasks (small vocabulary, isolated words, speaker-dependent recognition, quiet environments) to large-vocabulary, continuous-speech, speaker-independent recognition in noisy environments. They have also progressed from plain speech recognition to speech translation, and from laboratory systems to commercial systems.
