What is the potential of ReRAM, and what is its role in artificial intelligence?

As AI capabilities migrate toward the edge, they will enable more AI applications, and these applications will demand ever more analytical power and intelligence so that systems can make operational decisions locally, whether partially or fully autonomously, as in self-driving cars.

Machine learning has two basic stages: training and inference. Artificial neural networks, designed to imitate the way the brain works, are first exposed to large amounts of known data (such as pictures of dogs and cats) so that they learn to recognize what each object looks like and how the objects differ. The trained neural network, or model, can then apply what it has learned to reason about new data presented to it, for example, determining whether a given image shows a dog or a cat.
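The two stages described above can be sketched in a few lines of code. The example below uses a single perceptron on an invented, linearly separable "cat vs. dog" feature set; the feature values and class mapping are purely illustrative stand-ins for a real neural network and dataset.

```python
# Minimal sketch of the two machine-learning stages: "training" fits a
# model to labelled examples, "inference" applies the trained model to
# new, unseen data. All numbers here are invented for illustration.

def train(samples, labels, epochs=20, lr=0.1):
    """Training stage: adjust weights until known examples are classified."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred                      # 0 when already correct
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def infer(model, x):
    """Inference stage: apply the learned weights to unseen data."""
    w, b = model
    return "dog" if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else "cat"

# Toy features (ear size, snout length); dogs score higher on both axes.
samples = [(0.2, 0.1), (0.3, 0.2), (0.8, 0.9), (0.9, 0.7)]
labels = [0, 0, 1, 1]                           # 0 = cat, 1 = dog
model = train(samples, labels)
print(infer(model, (0.85, 0.8)))                # classify a new example
```

The split matters for the article's argument: `train` is the compute-heavy step done in data centers, while `infer` is the lightweight step that edge devices must run locally.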

At present, most training is carried out in data centers, with only a small share happening at the edge. Big companies like Google, Facebook, Amazon, Apple, and Microsoft all hold massive amounts of user data, so they can feed their server farms enough data to conduct industrial-scale AI training and improve their algorithms. The training phase requires very fast processors, such as GPUs or Google's TPU (Tensor Processing Unit).

Inference occurs after an edge device collects data (such as a picture of a building or a facial photo) and passes it to an inference engine for classification. Cloud-based AI is unacceptable for many applications because of its inherent latency. For example, a self-driving car must make real-time decisions about the objects it sees, which is impossible with a cloud-based AI architecture.

Traditional CPUs are not well suited to these tasks, and high-end GPUs consume a lot of energy and are expensive. Edge inference requires cheaper, lower-power chips that can quickly run a neural network to recognize an animal, identify a face, locate a tumor, or translate German into English.

Today, more than 30 companies are developing dedicated AI hardware for use in smartphones, tablets, and other edge devices to improve the efficiency of these professional computing tasks.

According to market analysis forecasts, from 2017 to 2021, the global AI chip market will grow at a compound annual growth rate of 54%. The key driver of this growth is the powerful hardware performance that can meet the requirements of machine learning.

Eliminate memory bottlenecks

All AI processors rely on data sets, models of "learned" object categories (such as images and sounds), to identify objects. Identifying and classifying each object requires many memory accesses. The biggest challenge facing today's engineers is how to overcome the memory speed and power bottlenecks of existing architectures: achieving faster data access while reducing the energy cost of each access.

Storing training data as close as possible to the AI processor core yields the best speed and energy efficiency. However, the memory architecture used in current designs was created years ago, when no practical alternatives existed: a traditional combination of fast but small-capacity embedded SRAM and large-capacity but slower external DRAM. When the training model is stored this way, the frequent, large-scale data exchanges between embedded SRAM, external DRAM, and the neural network increase energy consumption and transmission delays. In addition, both SRAM and DRAM are volatile memories, which limits the ability to save energy in standby.
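A back-of-envelope calculation shows why this data movement dominates the power budget. The per-access energy figures below are illustrative assumptions (roughly the order of magnitude often quoted for older process nodes), and the model size and access rate are hypothetical, not measured values.

```python
# Illustrative energy cost of repeatedly reading model weights from
# on-chip SRAM vs. off-chip DRAM. All constants are assumptions chosen
# only to show the relative gap, not vendor-measured numbers.

PJ_PER_ACCESS = {
    "on-chip SRAM (32-bit)": 5.0,      # assumed per-word read energy
    "off-chip DRAM (32-bit)": 640.0,   # assumed: >100x costlier per word
}

WEIGHTS = 10_000_000    # hypothetical model: 10M 32-bit weights
PASSES = 30             # hypothetical: full model re-read 30x per second

for mem, pj in PJ_PER_ACCESS.items():
    watts = WEIGHTS * PASSES * pj * 1e-12    # pJ/s -> W
    print(f"{mem}: {watts * 1000:.1f} mW just to read the weights")
```

Under these assumptions the same workload costs about 1.5 mW from on-chip memory but nearly 200 mW from external DRAM, which is the gap a memory-centric, on-die architecture aims to close.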

Figure 1: Memory is at the center of the AI architecture.

Storing the entire training model directly on the AI processor die in high-density, high-speed, low-power non-volatile memory achieves higher energy efficiency and speed. With this new memory-centric architecture, the entire training model or knowledge base can be placed on-chip and connected directly to the neural network, with the potential for large energy savings and significant performance improvement, greatly extending battery life and delivering a better user experience. Today, several new-generation memory technologies are competing to achieve this goal.

The potential of ReRAM

The ideal non-volatile embedded memory for AI applications should have the following characteristics: easy to manufacture, easy to integrate into the back end of a standard CMOS process, easy to scale to advanced nodes, available in volume, and able to meet the energy-consumption and speed requirements of these applications.

Resistive RAM (ReRAM) scales better than magnetic RAM (MRAM) or phase-change memory (PCM), an important factor when considering 14nm, 12nm, or even 7nm process nodes. Those other technologies require more complex and expensive manufacturing processes than ReRAM, and they consume more energy.

Figure 2: ReRAM can fill the gap in memory technology.

For example, Crossbar's ReRAM nanofilament technology can scale below 10nm without affecting performance. ReRAM is based on a simple device structure, uses CMOS-friendly materials and standard manufacturing processes, and can be produced in existing CMOS fabs. Because it integrates in a low-temperature, back-end-of-line process, multiple ReRAM array layers can be stacked on a CMOS logic wafer to build 3D ReRAM storage.

AI needs the best performance per watt, especially in low-power edge devices. The energy efficiency of ReRAM can reach five times that of DRAM (up to 1,000 reads per nanojoule) while delivering better overall read performance than DRAM: up to 12.8 GB/s with random latency under 20ns.
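A quick sanity check of those figures, using only the numbers quoted in the text (the 50 MB model size below is a hypothetical example, and the implied DRAM energy is derived from the "five times" claim, not a quoted figure):

```python
# Unpack the ReRAM figures quoted above: 1,000 reads per nanojoule,
# 12.8 GB/s read throughput, and <20ns random latency.

reads_per_nanojoule = 1_000
energy_per_read_pj = 1_000 / reads_per_nanojoule   # 1 nJ = 1,000 pJ
print(energy_per_read_pj, "pJ per ReRAM read")     # -> 1.0 pJ per read

# "Five times the energy efficiency of DRAM" implies, under the same
# metric, roughly 5 pJ per DRAM read (derived, not quoted).
implied_dram_pj = 5 * energy_per_read_pj
print(implied_dram_pj, "pJ per DRAM read (implied)")

# At 12.8 GB/s, time to stream a hypothetical 50 MB on-chip model once:
model_bytes = 50 * 1e6
seconds = model_bytes / (12.8 * 1e9)
print(f"{seconds * 1e3:.2f} ms to read a 50 MB model once")
```

At roughly 1 pJ per read, even millions of memory accesses per classification stay within a battery-powered device's energy budget, which is the practical point behind the quoted numbers.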

Memory-centric architecture

Scientists have been exploring brain-inspired computing paradigms, trying to achieve higher energy efficiency by mimicking the interaction of neurons and synapses in the central nervous system. ReRAM-based artificial synapses are a promising way to implement these high-density, scalable synaptic arrays in neuromorphic architectures. By enabling AI at the edge, ReRAM is likely to play an important role in current and emerging AI research.