Summary: Researchers have developed the first sort-in-memory hardware system capable of tackling complex, nonlinear sorting tasks without traditional comparators. Using a novel Digit Read mechanism and Tree Node Skipping algorithm, the team demonstrated a fast, energy-efficient, and scalable architecture built on memristors.
Benchmark tests showed dramatic gains in speed, energy, and area efficiency compared to conventional ASIC-based sorters. This advance paves the way for smarter, high-performance hardware in AI, big data, and edge computing.
Key Facts:
- Comparator-Free Design: New Digit Read and TNS algorithms eliminate comparator bottlenecks.
- Impressive Gains: Achieved up to 7.7× higher throughput and 160× greater energy efficiency than ASIC sorters.
- Versatile Applications: Validated in pathfinding, neural networks, and big data workloads.
Source: Peking University
A research team led by Prof. Yang Yuchao from the School of Electronic and Computer Engineering at Peking University Shenzhen Graduate School has achieved a global breakthrough by developing the first sort-in-memory hardware system tailored for complex, nonlinear sorting tasks.
Published in Nature Electronics, the study titled “A fast and reconfigurable sort-in-memory system based on memristors” proposes a comparator-free architecture, overcoming one of the toughest challenges in the field of processing-in-memory (PIM) technology.

Background
Sorting is a fundamental computing task, but its nonlinear nature makes it difficult to accelerate using traditional hardware. While memristor-based PIM architectures have shown promise for linear operations, they have long struggled with sorting.
Prof. Yang’s team addressed this by eliminating the need for comparators and introducing a novel Digit Read mechanism, along with a new algorithm and hardware design that reimagines how sorting can be performed within memory.
Why It Matters
This work represents a significant step forward in the evolution of PIM technology, from linear matrix operations to nonlinear, high-complexity tasks like sorting.
By proposing a scalable and reconfigurable sorting framework, the team provides a high-throughput, energy-efficient solution that meets the performance demands of modern big data and AI applications.
Key Findings
The study presents a comparator-free sorting system built on a one-transistor–one-resistor (1T1R) memristor array, using a Digit Read mechanism that replaces traditional compare-select logic and significantly enhances computational efficiency.
The team also developed the Tree Node Skipping (TNS) algorithm, which speeds up sorting by reusing traversal paths and reducing unnecessary operations. To scale performance across diverse datasets and configurations, three Cross-Array TNS (CA-TNS) strategies were introduced.
The Multi-Bank strategy partitions large datasets across arrays for parallel processing; Bit-Slice distributes bit widths to enable pipelined sorting; and Multi-Level leverages memristors’ multi-conductance states to enhance intra-cell parallelism.
Together, these innovations form a flexible and adaptable sorting accelerator capable of handling varying data widths and complexities.
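To give a flavor of how sorting can proceed without pairwise comparisons, here is a hedged software analogue: the hardware reads stored digits directly from memristor cells, which can be modeled as most-significant-bit radix partitioning where empty branches are skipped, loosely mirroring Tree Node Skipping. This is an illustrative sketch, not the paper's algorithm; all names are invented for the example.

```python
def digit_read_sort(values, bit_width=8):
    """Sort non-negative integers without pairwise comparisons.

    A software analogue of digit-read sorting: each step inspects one
    bit position across a group of values (the "Digit Read"), splits
    the group on that bit, and descends; empty branches are skipped,
    loosely analogous to Tree Node Skipping.
    """
    def recurse(group, bit):
        if len(group) <= 1 or bit < 0:
            return group
        # Read one digit (bit) across the whole group at once.
        zeros = [v for v in group if not (v >> bit) & 1]
        ones = [v for v in group if (v >> bit) & 1]
        out = []
        # Skip empty tree nodes entirely instead of traversing them.
        if zeros:
            out.extend(recurse(zeros, bit - 1))
        if ones:
            out.extend(recurse(ones, bit - 1))
        return out

    return recurse(list(values), bit_width - 1)


print(digit_read_sort([23, 5, 170, 7, 5, 64]))  # [5, 5, 7, 23, 64, 170]
```

Because every step is a digit read rather than a compare-select operation, the work maps naturally onto reading conductance states from a memory array, which is what removes the comparator bottleneck.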
Application Demonstrations
To validate real-world performance, the team fabricated a memristor chip and integrated it with FPGA and PCB hardware to build a complete, end-to-end demonstration system. In benchmark tests, it delivered up to 7.70× higher sorting throughput, 160.4× higher energy efficiency, and 32.46× greater area efficiency compared to leading ASIC-based sorting systems.
The system also proved effective in practical applications: in Dijkstra path planning, it successfully computed the shortest paths between 16 Beijing Metro stations with low latency and power consumption.
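In Dijkstra's algorithm, the step that dominates is repeatedly selecting the unvisited node with the minimum tentative distance, and it is this min-selection (sorting) workload that the in-memory sorter accelerates. The sketch below illustrates the algorithm on a toy graph, with a software heap standing in for the hardware sorter; the graph and names are illustrative, not the 16-station metro map from the paper.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path distances from source over a weighted digraph.

    graph: dict mapping node -> list of (neighbor, edge_weight).
    The heap below performs the repeated minimum selection; in the
    demonstration system, that is the step offloaded to the sorter.
    """
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    frontier = [(0, source)]
    while frontier:
        d, u = heapq.heappop(frontier)  # min-selection step
        if d > dist[u]:
            continue  # stale queue entry; a shorter path was found
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(frontier, (dist[v], v))
    return dist


graph = {
    "A": [("B", 2), ("C", 5)],
    "B": [("C", 1), ("D", 4)],
    "C": [("D", 1)],
    "D": [],
}
print(dijkstra(graph, "A"))  # {'A': 0, 'B': 2, 'C': 3, 'D': 4}
```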
In neural network inference, it enabled run-time tunable sparsity by integrating TNS with memristor-based matrix-vector multiplication in the PointNet++ model, achieving a 15× speedup and a 67.1× improvement in energy efficiency.
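Run-time tunable sparsity is essentially a top-k selection problem: keeping only the largest-magnitude activations is itself a partial sorting task, which is why a fast in-memory sorter helps. The following sketch shows the idea in plain Python; it is illustrative only and does not reproduce the paper's TNS integration with matrix-vector multiplication.

```python
def prune_activations(activations, sparsity):
    """Zero out all but the top-(1 - sparsity) fraction by magnitude.

    The threshold selection is the sort-like step a hardware sorter
    would accelerate; `sparsity` can be changed at run time.
    """
    k = max(1, round(len(activations) * (1.0 - sparsity)))
    # Find the k-th largest magnitude (a partial-sort operation).
    threshold = sorted((abs(a) for a in activations), reverse=True)[k - 1]
    return [a if abs(a) >= threshold else 0.0 for a in activations]


acts = [0.9, -0.1, 0.4, -0.7, 0.05, 0.3]
print(prune_activations(acts, sparsity=0.5))
# [0.9, 0.0, 0.4, -0.7, 0.0, 0.0]
```

Raising `sparsity` at inference time prunes more activations, trading a little accuracy for less compute, without retraining or reloading weights.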
These results highlight the system’s broad applicability in both conventional and AI-driven workloads.
Future Implications
This work redefines what’s possible in processing-in-memory systems. By demonstrating a flexible, efficient, and scalable sorting system, Prof. Yang’s team has opened the door for next-generation intelligent hardware capable of powering AI, real-time analytics, and edge computing. It lays the foundation for future nonlinear computation acceleration, pushing the boundaries of what memristor-based systems can achieve.
About this AI and computational neuroscience research news
Author: Jiang Zhang
Source: Peking University
Contact: Jiang Zhang – Peking University
Image: The image is credited to Neuroscience News
Original Research: Closed access.
“A fast and reconfigurable sort-in-memory system based on memristors” by Yang Yuchao et al. in Nature Electronics
Abstract
A fast and reconfigurable sort-in-memory system based on memristors
Sorting is a fundamental task in modern computing systems. Hardware sorters are typically based on the von Neumann architecture, and their performance is limited by the data transfer bandwidth and CMOS memory.
Sort-in-memory using memristors could help overcome these limitations, but current systems still rely on comparison operations so that sorting performance remains limited.
Here we describe a fast and reconfigurable sort-in-memory system that uses digit reads of one-transistor–one-resistor memristor arrays.
We develop digit-read tree node skipping, which supports various data quantities and data types. We extend this approach with the multi-bank, bit-slice and multi-level strategies for cross-array tree node skipping.
We experimentally show that our comparison-free sort-in-memory system can improve throughput by 7.70×, energy efficiency by 160.4× and area efficiency by 32.46× compared with conventional sorting systems.
To illustrate the potential of the approach to solve practical sorting tasks, as well as its compatibility with other compute-in-memory schemes, we apply it to Dijkstra’s shortest path search and neural network inference with in situ pruning.