speedb-io / speedb

A RocksDB compliant high performance scalable embedded key-value store

Home Page: https://www.speedb.io/

Potential Memory Consistency Issues with the SharedOptions class API

mrambacher opened this issue

The SharedOptions class as redesigned exposes some internal objects as raw pointers:
const Cache* GetCache() const { return cache_.get(); }

This can leave the caller holding a dangling pointer. For example:
Cache *cache;
{
  SharedOptions so;
  cache = so.GetCache();
}

Once so goes out of scope, cache may point to an object that has already been destroyed.
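
The lifetime problem can be reproduced with a minimal sketch. The `Cache` and `SharedOptions` types below are simplified, hypothetical stand-ins rather than the real Speedb classes, and the sketch assumes the `SharedOptions` is the only owner of its cache (in practice an `Options` populated from it may keep the cache alive). The helper `LiveCachesAfterScope` is likewise made up for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the real Cache; it only counts live instances.
struct Cache {
  static int live;  // number of Cache objects currently alive
  Cache() { ++live; }
  ~Cache() { --live; }
};
int Cache::live = 0;

// Hypothetical stand-in for SharedOptions, assumed to be the sole owner.
class SharedOptions {
 public:
  SharedOptions() : cache_(std::make_shared<Cache>()) {}
  // Raw-pointer getter, shaped like the one quoted in the issue.
  const Cache* GetCache() const { return cache_.get(); }

 private:
  std::shared_ptr<Cache> cache_;
};

// Returns how many Cache objects are alive once the SharedOptions is gone.
int LiveCachesAfterScope() {
  const Cache* cache = nullptr;
  {
    SharedOptions so;
    cache = so.GetCache();
    assert(Cache::live == 1);  // the cache exists while `so` is in scope
  }
  // `cache` is still non-null here, but the object it pointed to has been
  // destroyed, so dereferencing it would be undefined behavior.
  (void)cache;
  return Cache::live;
}
```

Under these assumptions `LiveCachesAfterScope()` returns 0: the pointer outlives the object it refers to.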

Additionally, this requires things like DBOptions and ColumnFamilyOptions to be friend classes. This means that a generic UseCase could not access these fields.

Instead, the SharedOptions class should have APIs like "const std::shared_ptr<Cache> GetCache() const".

The exposed objects are not internal, they are usually set by the user and always accessible via the Options.

Your example wouldn't compile, as you must declare your cache variable as const Cache* cache;

The API you suggest:
const std::shared_ptr<Cache> GetCache() const;

This returns a shared_ptr to a non-const Cache, allowing the caller to modify the wrapped Cache, which is exactly what the current API prevents.

It does make sense to consider (in the next phase) returning:
std::shared_ptr<const Cache> GetCache() const;

This would prevent the caller from modifying the wrapped Cache and would avoid the potential issue of holding a dangling Cache*.
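
As a rough sketch of what such a getter could look like, using simplified, hypothetical stand-ins for `Cache` and `SharedOptions` (not the real Speedb declarations): a `std::shared_ptr<Cache>` converts implicitly to `std::shared_ptr<const Cache>`, so the caller shares ownership but can only call const members.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for the real Cache interface.
class Cache {
 public:
  explicit Cache(std::size_t capacity) : capacity_(capacity) {}
  std::size_t GetCapacity() const { return capacity_; }
  void SetCapacity(std::size_t capacity) { capacity_ = capacity; }

 private:
  std::size_t capacity_;
};

// Hypothetical stand-in for SharedOptions with a shared, read-only getter.
class SharedOptions {
 public:
  explicit SharedOptions(std::size_t capacity)
      : cache_(std::make_shared<Cache>(capacity)) {}
  // shared_ptr<Cache> converts implicitly to shared_ptr<const Cache>.
  std::shared_ptr<const Cache> GetCache() const { return cache_; }

 private:
  std::shared_ptr<Cache> cache_;
};

// Reads the capacity after the SharedOptions that created the cache is gone.
std::size_t CapacityAfterOwnerIsGone() {
  std::shared_ptr<const Cache> cache;
  {
    SharedOptions so(1024);
    cache = so.GetCache();  // shares ownership with `so`
  }
  // cache->SetCapacity(1);  // would not compile: SetCapacity is non-const
  return cache->GetCapacity();  // safe: `cache` keeps the object alive
}
```

Here the returned pointer stays valid after `so` is destroyed, at the cost of one reference-count update per call.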

Having said that, this interface exists mainly to let users query the configuration of the exposed objects. I do not expect users to hold the pointer for long, so the risk exists but is very unlikely to cause an actual issue.

The example compiles fine if you change Cache* to const Cache*, but that is beside the point.

These APIs should not expose raw pointers but should expose the shared ones instead.

And there is no reason why they should be std::shared_ptr<const Cache> either. For example, I can create a SharedOptions, use it to populate an Options (via EnableSpeedbFeatures), and then get the Cache from that Options. For example, the following code:

TEST_F(SharedOptionsTest, MJR) {
  size_t total_ram_size_bytes =
      4 * SharedOptions::kWbmPerCfSizeIncrease * 4 + 1;
  size_t delayed_write_rate = 256 * 1024 * 1024;
  int total_threads = 8;

  Options opts;
  SharedOptions so(total_ram_size_bytes, total_threads, delayed_write_rate);
  auto cache = so.GetCache();
  printf("MJR: CacheSize was = %zu\n", cache->GetCapacity());
  opts.EnableSpeedbFeatures(so);
  auto bbto = opts.table_factory->GetOptions<BlockBasedTableOptions>();
  ASSERT_NE(bbto, nullptr);
  bbto->block_cache->SetCapacity(2*1024*1024);
  printf("MJR: CacheSize is now = %zu\n", cache->GetCapacity());
}

compiles and works fine:
[ RUN ] SharedOptionsTest.MJR
MJR: CacheSize was = 1
MJR: CacheSize is now = 2097152
[ OK ] SharedOptionsTest.MJR (1 ms)
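
The same effect can be shown without the Speedb test harness. In this sketch `Cache`, `Options`, `SharedOptions`, and `ApplyTo` are hypothetical simplifications (`ApplyTo` loosely plays the role of `EnableSpeedbFeatures`); the point is only that a `const Cache*` is a read-only view, not a guarantee that the object never changes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for the real Cache interface.
class Cache {
 public:
  explicit Cache(std::size_t capacity) : capacity_(capacity) {}
  std::size_t GetCapacity() const { return capacity_; }
  void SetCapacity(std::size_t capacity) { capacity_ = capacity; }

 private:
  std::size_t capacity_;
};

// Hypothetical stand-in for Options: it holds a mutable shared cache.
struct Options {
  std::shared_ptr<Cache> block_cache;
};

// Hypothetical stand-in for SharedOptions.
class SharedOptions {
 public:
  explicit SharedOptions(std::size_t capacity)
      : cache_(std::make_shared<Cache>(capacity)) {}
  const Cache* GetCache() const { return cache_.get(); }
  // Assumed helper: hands the same cache object to an Options instance.
  void ApplyTo(Options& opts) const { opts.block_cache = cache_; }

 private:
  std::shared_ptr<Cache> cache_;
};

// Changes the capacity through Options and reads it back through the
// const pointer obtained from SharedOptions.
std::size_t CapacitySeenThroughConstPointer() {
  SharedOptions so(1);
  const Cache* view = so.GetCache();  // read-only view
  Options opts;
  so.ApplyTo(opts);
  opts.block_cache->SetCapacity(2 * 1024 * 1024);  // mutable alias
  return view->GetCapacity();  // observes the change made via `opts`
}
```

Both handles refer to one object, so const-qualifying the getter's return type restricts only that access path.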

What you are trying to prevent is someone replacing the actual Cache pointer, which is already prevented by returning a const&.

Obviously you may access the Cache* via the Options, and one may do whatever one wishes to the entities of a specific CF. However, a single SharedOptions instance is meant to configure a set of CFs to share the same entities in a well-defined manner (as Speedb deems best for its users). Tampering with these entities at the CF level cannot be prevented, but Speedb explicitly discourages it in the guidelines for using the feature. Nevertheless, when designing the interface of SharedOptions, we aim for an interface that is easy to use correctly and hard to use incorrectly.

There are 2 reasons for moving the data members out of the public interface of SharedOptions and preventing their modifications:

  • There are dependencies among the entities contained in SharedOptions, so they need to be modified in a controlled manner.
  • As public data members, the shared pointers themselves could be replaced altogether, violating the core design principle of this feature.
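
The second reason can be illustrated with a small sketch. Both classes below are hypothetical and far simpler than the real SharedOptions; they only contrast a public `shared_ptr` member with a private one behind a read-only accessor.

```cpp
#include <cassert>
#include <memory>

// Hypothetical, empty stand-in for the real Cache.
struct Cache {};

// Variant A (hypothetical): the shared pointer is a public data member.
struct PublicSharedOptions {
  std::shared_ptr<Cache> cache = std::make_shared<Cache>();
};

// Variant B (hypothetical): private member, read-only accessor.
class PrivateSharedOptions {
 public:
  std::shared_ptr<const Cache> GetCache() const { return cache_; }

 private:
  std::shared_ptr<Cache> cache_ = std::make_shared<Cache>();
};

// Two consumers are configured from the same PublicSharedOptions, but the
// public member is reassigned in between. Returns whether they still share.
bool StillSharedAfterReassignment() {
  PublicSharedOptions so;
  std::shared_ptr<Cache> first_cf = so.cache;
  so.cache = std::make_shared<Cache>();  // compiles: the member is public
  std::shared_ptr<Cache> second_cf = so.cache;
  return first_cf == second_cf;
}

// With the private variant, the pointer cannot be swapped from outside.
bool SharedThroughAccessor() {
  PrivateSharedOptions so;
  std::shared_ptr<const Cache> first_cf = so.GetCache();
  // so.cache_ = std::make_shared<Cache>();  // would not compile: private
  std::shared_ptr<const Cache> second_cf = so.GetCache();
  return first_cf == second_cf;
}
```

In variant A the two consumers silently end up with different caches, whereas variant B always hands out the same object.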

Thanks for your comments. We will take them into consideration when we move on to the next phase of this feature's development.