Some Open Questions
The Kafka paper mentions that Kafka has no built-in cache and relies on the file system’s page cache instead. Is this design applicable to other software? For example, could InnoDB drop its built-in cache and use the page cache as well?
Kafka does rely heavily on the file system for storing and caching messages [1]. Rather than keeping as much data as possible in memory and flushing it to the file system when memory runs out, Kafka writes all data immediately to a persistent log on the file system, without necessarily flushing it to disk. In effect, the data is simply transferred into the kernel’s page cache [1].
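The difference between "written to the file system" and "flushed to disk" can be sketched in a few lines of Python. This is only an illustration of the operating-system behavior described above, not Kafka's actual code; the `append_record` helper, the `durable` flag, and the `segment.log` file name are hypothetical names chosen for the example.

```python
import os
import tempfile


def append_record(fd: int, payload: bytes, durable: bool = False) -> int:
    """Append one record to the log file behind fd (hypothetical helper).

    os.write() returns once the bytes have been handed to the kernel,
    which normally means they sit in the page cache and reach the disk
    later. os.fsync() asks the kernel to write them back right away.
    """
    written = os.write(fd, payload)
    if durable:
        os.fsync(fd)
    return written


log_path = os.path.join(tempfile.mkdtemp(), "segment.log")
fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
try:
    append_record(fd, b"message-1\n")                # page cache only
    append_record(fd, b"message-2\n", durable=True)  # forced write-back
finally:
    os.close(fd)

# A reader sees both records either way, because reads are served
# through the same page cache.
with open(log_path, "rb") as f:
    contents = f.read()
```

Both records are visible to readers immediately; the `durable` flag only changes when the bytes are pushed to the storage device.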
InnoDB is a bit different. It has its own buffer pool for caching data, and innodb_flush_method=O_DIRECT is recommended in most cases, which bypasses the operating system’s file cache [2]. Without it, the same pages could end up cached twice, once in the buffer pool and once in the page cache. The underlying reason is that InnoDB’s design and usage patterns differ from Kafka’s. Kafka is a messaging system that deals with a large number of small messages appended to a log, whereas InnoDB is a database engine that needs to handle transactions efficiently and maintain ACID properties, so it benefits from controlling its own caching.
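To make the idea of an application-managed buffer pool concrete, here is a toy sketch: a small LRU cache of fixed-size file pages. It is not how InnoDB is implemented; the `ToyBufferPool` class, the 4096-byte `PAGE_SIZE`, and the `table.dat` file are all assumptions made for illustration. The sketch also does not use O_DIRECT, so its reads still pass through the OS page cache.

```python
import os
import tempfile
from collections import OrderedDict

PAGE_SIZE = 4096  # illustrative page size, chosen for the example


class ToyBufferPool:
    """A tiny LRU cache of fixed-size file pages (hypothetical)."""

    def __init__(self, path: str, capacity: int) -> None:
        self.file = open(path, "rb")
        self.capacity = capacity
        self.pages: "OrderedDict[int, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_page(self, page_no: int) -> bytes:
        if page_no in self.pages:
            # Cache hit: mark the page as most recently used.
            self.hits += 1
            self.pages.move_to_end(page_no)
            return self.pages[page_no]
        # Cache miss: read the page from the file and cache it.
        self.misses += 1
        self.file.seek(page_no * PAGE_SIZE)
        data = self.file.read(PAGE_SIZE)
        self.pages[page_no] = data
        if len(self.pages) > self.capacity:
            # Evict the least recently used page.
            self.pages.popitem(last=False)
        return data

    def close(self) -> None:
        self.file.close()


# Build a three-page file where page n is filled with byte value n.
path = os.path.join(tempfile.mkdtemp(), "table.dat")
with open(path, "wb") as f:
    for n in range(3):
        f.write(bytes([n]) * PAGE_SIZE)

pool = ToyBufferPool(path, capacity=2)
for page_no in (0, 1, 0, 2, 1):
    pool.get_page(page_no)
pool.close()
```

The point of owning the cache is that the application decides what to keep and what to evict, instead of leaving those decisions to the kernel's general-purpose policy.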
So, while it’s theoretically possible for other software to use the file system’s page cache like Kafka does, whether or not it’s a good idea depends on the specific requirements and characteristics of the software. It’s always important to consider the trade-offs involved when designing or choosing a caching strategy.