Efficient PIM Architectures for Data-Intensive Applications
Data-intensive applications have special characteristics that general-purpose processors cannot support efficiently. We have found that PIM (Processor-In-Memory) architectures can provide the very high memory bandwidth these applications require, as well as a parallel computational environment that they can exploit.
However, using PIM modules effectively requires a new computational paradigm. We are investigating architectural techniques that efficiently exploit the high memory bandwidth and the parallel computation available inside PIM modules.
Data-intensive applications, such as media and database applications, place ever-increasing demands on computing systems. Supporting these applications is one of the top design considerations for future microprocessors.
However, these data-intensive applications require a huge amount of data transfer and have operation characteristics quite different from those a general-purpose processor is designed to handle.
Data-intensive applications perform a tremendous number of data accesses. Because of the memory-wall problem, the penalty of each access accumulates over this huge number of accesses, and performance is severely degraded.
Data-intensive applications have special characteristics to which a general-purpose processor is not well suited. Some of the contrasting characteristics are shown in the table below.
Characteristics    Data-Intensive Applications    General-Purpose Processing
Operations         Add, Sub, Abs, And, ...        Add, Sub, Mult, Div, ...
Operand Size       8-bit data                     32- or 64-bit data
Parallelism        Massively data parallel        Limited parallelism supported
Data Amount        Massive amount of data         Very small cache
Execution Style    Stream-based (data-flow)       Control-flow
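To make the left-hand column of the table concrete, the following is a minimal sketch of one kernel with those characteristics: a sum of absolute differences over 8-bit data. The choice of this kernel and the name `sad_8bit` are illustrative assumptions, not taken from the project description. It uses only Sub, Abs, and Add on 8-bit operands, and every element pair is independent, which is what makes such a kernel massively data parallel.

```python
def sad_8bit(block_a, block_b):
    """Sum of absolute differences over two equal-length sequences of 8-bit values.

    Illustrative kernel only (an assumption, not the project's actual code).
    """
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same length")
    total = 0
    for a, b in zip(block_a, block_b):
        if not (0 <= a <= 255 and 0 <= b <= 255):
            raise ValueError("operands must fit in 8 bits")
        # Sub and Abs act on one element pair at a time; the pairs do not
        # depend on each other, so they could all be processed in parallel.
        total += abs(a - b)  # Add accumulates the per-pair results
    return total
```

For example, `sad_8bit(bytes([10, 200, 0]), bytes([12, 100, 255]))` returns 357 (2 + 100 + 255). The loop here is sequential only because it is written in plain Python; the point of the sketch is the operation mix and the independence of the element pairs.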
Within these data-intensive applications, there are core operations that consume the majority of the overall execution time, taking up to 90% of an application's total execution time.
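The 90% figure also bounds how much an application can gain from accelerating only its core operations. The sketch below applies Amdahl's law, which the text above does not invoke by name; it is used here as an assumption to show the arithmetic. The function name and the example speedup values are hypothetical.

```python
def overall_speedup(core_fraction, core_speedup):
    """Amdahl's law: whole-application speedup when only the core part is accelerated.

    core_fraction: share of execution time spent in core operations (0.0 to 1.0)
    core_speedup:  factor by which the core operations alone are accelerated
    """
    if not 0.0 <= core_fraction <= 1.0:
        raise ValueError("core_fraction must be between 0 and 1")
    if core_speedup <= 0:
        raise ValueError("core_speedup must be positive")
    remaining = 1.0 - core_fraction            # time that is not accelerated
    accelerated = core_fraction / core_speedup  # time left in the core part
    return 1.0 / (remaining + accelerated)
```

With a core fraction of 0.9, a hypothetical 10x acceleration of the core operations gives `overall_speedup(0.9, 10)`, about 5.26x overall, and no acceleration of the core operations alone can take the whole application beyond roughly 10x, because the remaining 10% is untouched.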
Our approach is hardware/software co-design using a PIM (Processor-In-Memory) module to execute the memory-access-intensive and computation-intensive core operations. Our computing paradigm differs from the conventional hardware/software co-design paradigm in that the operations happen where the data are located, instead of the data being supplied to a hardware module. By organizing the PIM structure efficiently, we can truly benefit from the huge improvement in memory bandwidth (and the drastically reduced data transfers between the processor and memory) and from parallel execution.
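The reduction in processor-memory traffic can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a description of the actual PIM design: it assumes that a conventional system moves every operand across the processor-memory bus, that a PIM module returns only the final results, and it ignores command and control traffic. All names and default sizes are hypothetical.

```python
def bus_bytes_conventional(num_elements, operands_per_element=2, operand_bytes=1):
    """Bytes crossing the bus when every operand is shipped to the processor."""
    return num_elements * operands_per_element * operand_bytes


def bus_bytes_pim(num_results, result_bytes=4):
    """Bytes crossing the bus when only results leave the memory module."""
    return num_results * result_bytes


def traffic_reduction(num_elements, num_results):
    """Ratio of conventional bus traffic to PIM bus traffic in this toy model."""
    return bus_bytes_conventional(num_elements) / bus_bytes_pim(num_results)
```

Under these assumptions, comparing two blocks of 256 8-bit elements moves 512 bytes in the conventional case but a single 4-byte result in the PIM case, so `traffic_reduction(256, 1)` evaluates to 128.0. Real ratios depend on the module organization and on how much control traffic the processor must still issue.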
Our expected performance improvement is illustrated in the diagram below.
PIM (Processor-In-Memory) Architecture for Motion
Jung-Yup Kang, Sandeep Gupta, Saurabh Shah, and Jean-Luc Gaudiot
Proceedings of IEEE 14th International Conference on Application-Specific Systems, Architectures and Processors (ASAP2003), pp. 273-283, The Hague, The Netherlands, June 24-26, 2003
An Efficient PIM (Processor-In-Memory) Architecture for
Jung-Yup Kang, Saurabh Shah, Sandeep Gupta and Jean-Luc Gaudiot
Technical Report PASCAL-2003-01, University of California at Irvine, February 2003
NSF Interim Report