flosse / rust-json-file-store

A simple JSON file store written in Rust.

Q: Is the single file json store mode safe for use through multiple processes at once?

theawless opened this issue

Thank you for the awesome library!

I am writing a CLI that uses JFS. Users might invoke this CLI in parallel.
I can see unit tests that validate multithreaded access, but I couldn't find anything concrete for multi-process access.

To be honest, I don't know; I have never used this crate in such a context.

If you don't mind, write some tests, and if necessary we can look together at how to solve it.

I wrote the following code using jfs = "0.7.0".

use std::env;
use std::io::Error;

use jfs::{Config, Store};

fn build() -> Result<Store, Error> {
    let config = Config {
        indent: 2,
        pretty: true,
        single: true,
    };
    Store::new_with_cfg("test", config)
}

fn main() -> Result<(), Error> {
    // Number of counters to increment, taken from the first CLI argument.
    let args = env::args().collect::<Vec<String>>();
    let n = args[1].parse::<u32>().unwrap();

    let store = build()?;
    for i in 0..n {
        let key = format!("{}_count", i);
        // Read-modify-write: a missing key (or any read error) counts as 0.
        let val = store.get::<u32>(&key).unwrap_or_default();
        let val = val + 1;
        store.save_with_id(&val, &key)?;
    }

    Ok(())
}

For each key from 0_count to (n-1)_count, it should read the value, add one to it, and store the result under the same key.

I ran this code 100 times with n = 100, first without parallelism and then with 8 parallel jobs.

parallel -j1 ./target/release/rust-test 100 ::: {1..100}

{
  "0_count": 100,
  "10_count": 100,
  "11_count": 100,
  "12_count": 100,
  "13_count": 100,
  "14_count": 100,
  "15_count": 100,
  "16_count": 100,
  "17_count": 100,
  "18_count": 100,
  "19_count": 100,
...

parallel -j8 ./target/release/rust-test 100 ::: {1..100}

{
  "0_count": 27,
  "10_count": 22,
  "11_count": 18,
  "12_count": 21,
  "13_count": 26,
  "14_count": 30,
  "15_count": 22,
  "16_count": 18,
  "17_count": 18,
  "18_count": 18,
  "19_count": 26,
...

I also noticed that fs2 locking is used just before each read and each write, and it prevents other processes from locking the file while one process holds an exclusive lock. Hence I think data races on the file couldn't happen in either run, and the JSON file didn't get corrupted. In that sense, jfs is probably safe for access from multiple processes.

It does not, however, hold the file lock for the entire lifetime of the store object. Between a get call and the following save call, other processes may already have updated the file, and the save then overwrites their changes. That would explain why the counts in the -j8 run are well below 100. Holding the lock for the whole duration could be a feature, but I think it would needlessly complicate the cases that have nothing to do with multiple processes. In such cases, a real DB would be a better option anyway.
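
A minimal sketch of what holding a lock across the whole read-modify-write could look like, using only the standard library. This is not jfs or fs2 API: LockFile, increment and run_workers are hypothetical names, the counter lives in a plain text file rather than a JSON store, and threads stand in for the parallel processes. The lock here is a separate file created with create_new, which fails if the file already exists. One known weakness of this approach is that a process that crashes while holding the lock leaves a stale lock file behind.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, ErrorKind};
use std::path::{Path, PathBuf};
use std::thread;
use std::time::Duration;

/// Hypothetical guard: owns a lock file and removes it when dropped.
struct LockFile {
    path: PathBuf,
}

impl LockFile {
    /// Retry until the lock file can be created. `create_new` fails with
    /// `AlreadyExists` while another process (or thread) holds the lock.
    fn acquire(path: &Path) -> io::Result<LockFile> {
        loop {
            match OpenOptions::new().write(true).create_new(true).open(path) {
                Ok(_) => return Ok(LockFile { path: path.to_path_buf() }),
                Err(e) if e.kind() == ErrorKind::AlreadyExists => {
                    thread::sleep(Duration::from_millis(1));
                }
                Err(e) => return Err(e),
            }
        }
    }
}

impl Drop for LockFile {
    fn drop(&mut self) {
        // Best effort: a crash before this point leaves a stale lock file.
        let _ = fs::remove_file(&self.path);
    }
}

/// Read-modify-write on a plain counter file, done entirely under the lock.
fn increment(counter: &Path, lock: &Path) -> io::Result<u32> {
    let _guard = LockFile::acquire(lock)?;
    // A missing or unreadable counter counts as 0, like unwrap_or_default above.
    let current = fs::read_to_string(counter)
        .ok()
        .and_then(|s| s.trim().parse::<u32>().ok())
        .unwrap_or_default();
    let next = current + 1;
    fs::write(counter, next.to_string())?;
    Ok(next)
    // `_guard` is dropped here, which releases the lock.
}

/// Run `workers` threads (stand-ins for processes), each incrementing `n` times,
/// and return the final counter value.
fn run_workers(dir: &Path, workers: u32, n: u32) -> io::Result<u32> {
    let counter = dir.join("count.txt");
    let lock = dir.join("count.lock");
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let (c, l) = (counter.clone(), lock.clone());
            thread::spawn(move || -> io::Result<()> {
                for _ in 0..n {
                    increment(&c, &l)?;
                }
                Ok(())
            })
        })
        .collect();
    for h in handles {
        h.join().expect("worker panicked")?;
    }
    let text = fs::read_to_string(&counter)?;
    text.trim()
        .parse::<u32>()
        .map_err(|e| io::Error::new(ErrorKind::InvalidData, e))
}
```

Because the read and the write happen under the same lock, no increment should be lost: starting from an empty directory, 4 workers doing 25 increments each should leave the counter at 100. The same guard could in principle wrap a get/save pair on a jfs store, as long as the lock file is separate from the store file.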

I see. Good to know! Thanks a lot for that extensive explanation!!