Intel Customer Spotlight
When the front-end application needs to query certain data, it looks up the Key in the hash table (which may take one or more accesses), obtains the offset at which the Value is stored in the data file, and finally reads the data file to get the desired Value.
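The two-step lookup described above can be sketched as follows. This is a minimal illustration, not Feed-Cube's actual implementation: the class name `FeedCubeSketch`, the in-memory `bytearray` standing in for the data file, and the length-prefixed record layout are all assumptions made for the example.

```python
import struct


class FeedCubeSketch:
    """Toy store: a hash table of offsets plus an append-only data buffer."""

    def __init__(self):
        self.index = {}          # Key -> offset of the record (the "hash table")
        self.data = bytearray()  # hypothetical stand-in for the data file

    def put(self, key: str, value: bytes) -> None:
        offset = len(self.data)
        # Assumed record layout: 4-byte little-endian length, then the value.
        self.data += struct.pack("<I", len(value)) + value
        self.index[key] = offset

    def get(self, key: str):
        # Step 1: query the hash table by Key to obtain the offset.
        offset = self.index.get(key)
        if offset is None:
            return None
        # Step 2: read the Value from the data file at that offset.
        (length,) = struct.unpack_from("<I", self.data, offset)
        start = offset + 4
        return bytes(self.data[start:start + length])
```

A caller would use `store.put("doc:1", b"payload")` followed by `store.get("doc:1")`; a missing Key returns `None` without touching the data buffer.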
Although the DRAM-based Baidu Feed-Cube has consistently performed well under high concurrency (millions of queries per second) and massive data storage (petabyte level), it faces new challenges as the Baidu Feed Stream services continue to expand. Building a large memory pool from expensive DRAM drives up Baidu’s TCO, while the limited capacity of DRAM restricts further gains in Feed-Cube’s processing capability.
Enhancement with Intel® Optane™ DC persistent memory
In response to these challenges, Baidu tried higher-performance non-volatile memory (NVM)-based storage devices, such as NVMe* SSDs, to store data files and hash tables in Feed-Cube. To verify the system performance with NVMe SSDs, Baidu conducted comparative testing with two Feed-Cube clusters based on DRAM and NVMe SSD respectively.
The test results show that there are three key issues with Feed-Cube using NVMe SSD compared to Feed-Cube using DRAM:
- In a scenario where a large concurrent application was used, the NVMe SSD experienced serious queuing delay and % QoS guarantee could not be achieved in a high queue depth (for example, greater than 1,);
- In tests with large-capacity data storage, the NVMe SSD showed diminishing returns: the more data that was deployed, the longer queries took to execute and the lower the disk space utilization rate became;
- There is still a big gap between the I/O speed of the NVMe SSD and that of DRAM. Therefore, it is still necessary to deploy a large amount of DRAM as a cache in the system to ensure performance.
Intel Optane DC persistent memory provides a new way to resolve these issues. Compared to SSDs, this new product, which revolutionizes memory and storage architecture, offers higher read and write performance, lower latency and higher endurance, giving it broad advantages in multi-user, high-concurrency and high-capacity environments.
In view of this, Baidu first introduced Intel Optane DC persistent memory to store the data files in Feed-Cube, while still using DRAM to store the hash tables. The purpose of this hybrid configuration was to verify the performance of Intel Optane DC persistent memory in Feed-Cube while minimizing the impact on Feed-Cube's overall performance. The data files were moved first because Feed-Cube reads the hash tables far more often than the data files when querying values, so replacing the memory behind the data files affects fewer accesses.
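The hybrid placement can be illustrated with a short sketch in which the hash table stays in an ordinary in-process dictionary (DRAM) and the data file is accessed through a memory mapping. The source does not say how Feed-Cube accesses persistent memory, so this rests on an assumption: that the persistent memory is exposed as a memory-mappable file. The class `HybridStore`, the record layout, and the temporary file used in `demo()` are hypothetical; an ordinary file stands in for the persistent-memory device.

```python
import mmap
import os
import struct
import tempfile


class HybridStore:
    """Index kept in DRAM; values kept in a memory-mapped data file."""

    def __init__(self, path: str, capacity: int = 1 << 20):
        self.index = {}  # hash table: Key -> offset, held in DRAM
        self.file = open(path, "w+b")
        self.file.truncate(capacity)  # pre-size the file so it can be mapped
        # Assumption: on a real system, path would sit on persistent memory.
        self.buf = mmap.mmap(self.file.fileno(), capacity)
        self.capacity = capacity
        self.tail = 0  # next free offset in the data file

    def put(self, key: str, value: bytes) -> None:
        record = struct.pack("<I", len(value)) + value
        end = self.tail + len(record)
        if end > self.capacity:
            raise ValueError("data file is full")
        self.buf[self.tail:end] = record
        self.index[key] = self.tail
        self.tail = end

    def get(self, key: str):
        offset = self.index.get(key)  # frequent access: served from DRAM
        if offset is None:
            return None
        (length,) = struct.unpack_from("<I", self.buf, offset)
        start = offset + 4
        return self.buf[start:start + length]  # one read of the mapped file

    def close(self) -> None:
        self.buf.close()
        self.file.close()


def demo() -> bytes:
    """Store and fetch one value using a temporary file as the data file."""
    with tempfile.TemporaryDirectory() as tmp:
        store = HybridStore(os.path.join(tmp, "data.bin"), capacity=4096)
        store.put("doc:1", b"payload")
        value = store.get("doc:1")
        store.close()
        return value
```

In this layout a miss is resolved entirely from the DRAM index, and a hit touches the mapped data file only once, which mirrors the access pattern the paragraph above relies on.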
To enable Intel Optane DC persistent memory to be successfully applied to Feed-Cube, Baidu and Intel carried out all-round optimization of the system hardware, the operating system, cores and other components. First, both parties deployed Feed-Cube on a platform built with 2nd Generation Intel® Xeon® Scalable processors, which not only offer strong computing power but are also a “good match” for Intel Optane DC persistent memory. Second, to meet the Feed-Cube application requirements, Intel added a driver supporting Intel Optane DC persistent memory to the server BIOS and applied related patches to Baidu’s self-developed Linux* kernel 4x, so that the performance potential of the new hardware could be fully realized.
After completing this series of optimizations, Baidu carried out a comparison test between the configuration using DRAM only and the hybrid configuration, simulating the large-scale concurrent access that can happen in a real-life scenario. In the test a setting of , QPS (Queries Per Second) was used with sets of Key-Value pairs being retrieved per access and therefore, the total access pressure on the system was 20 million. The test results are shown in Figure 3 and Table 1.