Background: Large-scale data poses new challenges for efficient storage and access. Because
emerging schema-free databases differ from RDBMS in programming interface and database schema,
they cannot replace RDBMS completely. Therefore, for a long period to come, using schema-free
databases alongside RDBMS to relieve the access bottleneck will remain a widely adopted approach
to big data access in industry and academia.
Objective: Because schema-free databases offer high performance and extensibility, they are
generally used as data caches. However, there are few effective solutions for maintaining a high
cache hit rate, and frequently accessed data is not always guaranteed to be in the cache.
Method: This paper describes Patent Publication Number CN103631972A, titled "Method and System
for column-aware data caching", issued by the State Intellectual Property Office of the P.R.C. on
December 23, 2013. The caching process comprises determining whether a request is a cache hit or
miss, updating the column access frequency, and change data capture. To increase the cache hit
rate, the patent bases cache replacement on column access frequency. Column access frequency is
updated, and cache replacement maintained, in three situations: transactional updates,
non-transactional queries, and the cache listener. Transactional updates synchronize database updates to the cache
system. Non-transactional query and cache listener will rectify the column access frequency using
Results: There are four results. First, column-aware data caching achieves low query time and
high throughput. Second, dynamic cache replacement based on column access frequency improves the
cache hit rate and guarantees eventual cache consistency. Third, the cache listener removes
expired data so that hot data stays in the cache. Finally, the column-aware data caching system
is transparent to developers. Note that cache consistency in this paper differs slightly from
the cache coherency issue in distributed environments.
Conclusion: The idea and a disclosed embodiment of a patent (Patent CN103631972A, issued by
the State Intellectual Property Office of the P.R.C.) are presented; the patent is based on a
distributed cache management system. In one disclosed embodiment, the system comprises an access
judge, a frequency counter, a change data collector, and a data cache. The patent's applicability
is illustrated by its efficient solution to automatic cache management.
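
The four components named in the embodiment can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the patented implementation: the class name ColumnAwareCache, the load_column callback, the per-column eviction granularity, and the lowest-count eviction rule are all hypothetical choices made here to show how column access frequency could drive cache replacement.

```python
from collections import Counter


class ColumnAwareCache:
    """Toy cache that evicts whole columns by access frequency (illustrative only)."""

    def __init__(self, max_columns, load_column):
        self.max_columns = max_columns   # capacity, counted in cached columns (assumption)
        self.load_column = load_column   # hypothetical callback that reads a column from the RDBMS
        self.data = {}                   # data cache: column name -> {row key: value}
        self.frequency = Counter()       # frequency counter: column name -> access count
        self.hits = 0
        self.misses = 0

    def get(self, column, row_key):
        # Frequency counter: every query on a column raises its count.
        self.frequency[column] += 1
        # Access judge: decide whether the request is a cache hit or a miss.
        if column in self.data:
            self.hits += 1
        else:
            self.misses += 1
            self._admit(column)
        return self.data[column].get(row_key)

    def _admit(self, column):
        # Replacement: when full, evict the cached column with the lowest access count.
        if len(self.data) >= self.max_columns:
            coldest = min(self.data, key=lambda c: self.frequency[c])
            del self.data[coldest]
        self.data[column] = dict(self.load_column(column))

    def on_database_update(self, column, row_key, value):
        # Change data collector: propagate a committed database update to the cache.
        if column in self.data:
            self.data[column][row_key] = value


# Usage with an in-memory stand-in for the database (hypothetical data).
table = {"name": {1: "Ann", 2: "Bob"}, "email": {1: "a@x.org"}, "age": {1: 30}}
cache = ColumnAwareCache(max_columns=2, load_column=lambda c: table[c])
cache.get("name", 1)
cache.get("name", 2)
cache.get("email", 1)
cache.get("age", 1)  # cache is full, so the least accessed cached column ("email") is evicted
```

In this sketch the frequently queried "name" column survives the eviction while "email" is dropped, which is the behavior a frequency-based replacement policy is meant to produce. Expiry handling by a cache listener is omitted for brevity.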