redis / jedis

Redis Java client

Occasional high latency when using JedisCluster with a large number of threads

gondsourabh40 opened this issue

Expected behavior

Client-side latency should be minimal (<=100ms) when no latency is observed on the Redis server side.

Actual behavior

I am using the JedisCluster client in my application, and I sometimes see latency of 1000ms or more; in some instances it exceeded 3 to 4 seconds. I checked the Redis slow log but found no entries.
So I tried to reproduce the problem by running a large number of queries in parallel.

Steps to reproduce:

You can use the code below to reproduce the issue.

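  import java.util.*;
  import java.util.concurrent.*;
  import java.util.concurrent.atomic.AtomicInteger;
  import java.util.stream.IntStream;
  import redis.clients.jedis.*;
  import redis.clients.jedis.util.JedisClusterCRC16;

  // The methods below are assumed to live in a single class.
  public static void main(String[] args) throws ExecutionException, InterruptedException {
    JedisCluster jedisCluster = createJedisCluster();
    // Performance run configuration
    int sample = 10;             // Number of sample runs
    int TOTAL_THREAD = 1000;     // Number of parallel worker threads
    int TOTAL_ITERATION = 1000;  // Iterations per thread
    int MIN_THRESHOLD = 1000;    // Latency threshold in ms; slower iterations are counted

    boolean isUseJedisCluster = true; // true: test JedisCluster, false: test Jedis

    ExecutorService executorService = Executors.newFixedThreadPool(TOTAL_THREAD);
    System.out.println("Test Started with isUseJedisCluster : " + isUseJedisCluster);
    for (int i = 0; i < sample; i++) {
      AtomicInteger counter = ptOnJedis(jedisCluster, executorService, TOTAL_THREAD, TOTAL_ITERATION, MIN_THRESHOLD, isUseJedisCluster);
      int TOTAL_COMMANDS = TOTAL_ITERATION * TOTAL_THREAD;
      // Prints (slow iterations / total iterations) for this sample
      System.out.println(String.format("Total Latency : (%d/%d)", counter.get(), TOTAL_COMMANDS));
    }
    System.out.println("Test Ended");
    executorService.shutdown(); // Without this, the non-daemon worker threads keep the JVM alive
    jedisCluster.close();
  }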

  private static AtomicInteger ptOnJedis(JedisCluster jedisCluster, ExecutorService executorService, int TOTAL_THREAD, int TOTAL_ITERATION, int MIN_THRESHOLD, boolean isUseJedisCluster)
          throws ExecutionException, InterruptedException {
    String REDIS_KEY = "key"; // Key used for every operation
    AtomicInteger counter = new AtomicInteger(0); // Number of iterations whose latency was >= MIN_THRESHOLD (1000ms)
    List<Future<Object>> response = new ArrayList<>();

    for (int i = 0; i < TOTAL_THREAD; i++) {
      final Jedis jedis;
      if (isUseJedisCluster) {
        jedis = null;
      } else {
        // Jedis mode: the connection is acquired here, before the task is submitted and before any timing starts
        Connection connection = jedisCluster.getConnectionFromSlot(JedisClusterCRC16.getSlot(REDIS_KEY));
        jedis = new Jedis(connection);
      }
      Future<Object> submit = executorService.submit(() -> {
        IntStream.range(0, TOTAL_ITERATION).forEach(x -> {
          if (isCommandInvokedWithLatency(jedisCluster, jedis, REDIS_KEY, MIN_THRESHOLD, isUseJedisCluster)) {
            counter.incrementAndGet();
          }
        });
        if (!isUseJedisCluster) {
          // Close only when we are using Jedis
          jedis.close();
        }
        return null;
      });
      response.add(submit);
    }
    // Wait for every worker to finish before reporting the count
    for (Future<Object> future : response) {
      future.get();
    }
    return counter;
  }

  private static boolean isCommandInvokedWithLatency(JedisCluster jedisCluster, Jedis jedis, String KEY, int MIN_THRESHOLD, boolean isUseJedisCluster) {
    long currTime = System.currentTimeMillis();
    if (isUseJedisCluster) {
      jedisCluster.set(KEY, "value");
      jedisCluster.get(KEY);
    } else {
      jedis.set(KEY, "value");
      jedis.get(KEY);
    }
    long diff = System.currentTimeMillis() - currTime;
    // True if the SET + GET pair took at least MIN_THRESHOLD (1000ms)
    return diff >= MIN_THRESHOLD;
  }

  private static JedisCluster createJedisCluster() {
    HostAndPort localhost = new HostAndPort("127.0.0.1", 7001);
    Set<HostAndPort> jedisClusterNode = new HashSet<>();
    jedisClusterNode.add(localhost);
    ConnectionPoolConfig config = new ConnectionPoolConfig();
    config.setMinIdle(128);
    config.setMaxTotal(-1);
    JedisCluster jedisCluster = new JedisCluster(jedisClusterNode,config);
    return jedisCluster;
  }

I tested both JedisCluster and Jedis and found that Jedis performs considerably better than JedisCluster. However, we don't want to use Jedis by requesting the connection from JedisCluster ourselves, since JedisCluster does all of this internally.
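For context, the slot lookup behind JedisClusterCRC16.getSlot(REDIS_KEY) can be sketched in plain Java. This is a minimal sketch of the Redis Cluster key-distribution rule (CRC16 of the key modulo 16384, hashing only a non-empty {hash tag} when one is present), not a copy of the Jedis implementation; the class HashSlotSketch and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class HashSlotSketch {
    // CRC16 (XMODEM variant): polynomial 0x1021, initial value 0, no bit reflection.
    static int crc16(byte[] data, int start, int end) {
        int crc = 0;
        for (int i = start; i < end; i++) {
            crc ^= (data[i] & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    // Slot = CRC16(key) mod 16384. If the key contains "{...}" with at least
    // one byte between the first '{' and the next '}', only that part is hashed.
    static int slot(String key) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        int start = 0;
        int end = bytes.length;
        int open = indexOf(bytes, (byte) '{', 0);
        if (open >= 0) {
            int close = indexOf(bytes, (byte) '}', open + 1);
            if (close > open + 1) {
                start = open + 1;
                end = close;
            }
        }
        return crc16(bytes, start, end) & 0x3FFF; // 0x3FFF == 16384 - 1
    }

    private static int indexOf(byte[] bytes, byte target, int from) {
        for (int i = from; i < bytes.length; i++) {
            if (bytes[i] == target) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Every command in the test above uses the same key, so it always maps to one slot.
        System.out.println("slot(\"key\") = " + slot("key"));
        System.out.println("slot(\"foo\") = " + slot("foo"));
    }
}
```

Because the test uses a single key, the slot never changes, so in the Jedis case the lookup happens once per thread, while JedisCluster repeats it for every command.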

Run result when using JedisCluster

Test Started with isUseJedisCluster : true
Total Latency : (69/1000000)
Total Latency : (24/1000000)
Total Latency : (61/1000000)
Total Latency : (38/1000000)
Total Latency : (0/1000000)
Total Latency : (40/1000000)
Total Latency : (0/1000000)
Total Latency : (0/1000000)
Total Latency : (10/1000000)
Total Latency : (0/1000000)
Test Ended

Run result when using Jedis via getConnectionFromSlot

Test Started with isUseJedisCluster : false
Total Latency : (0/1000000)
Total Latency : (0/1000000)
Total Latency : (0/1000000)
Total Latency : (0/1000000)
Total Latency : (0/1000000)
Total Latency : (0/1000000)
Total Latency : (0/1000000)
Total Latency : (0/1000000)
Total Latency : (0/1000000)
Total Latency : (0/1000000)
Test Ended
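The counts above only say how many iterations crossed the 1000ms threshold. A percentile summary would show how the rest of the distribution looks. The following is a minimal sketch, assuming each worker stored its measured diff in a long[]; the class LatencySummary, its methods, and the demo values are hypothetical and not part of the original test.

```java
import java.util.Arrays;

final class LatencySummary {
    /** Nearest-rank percentile (0 < p <= 100) of an ascending-sorted array. */
    static long percentile(long[] sorted, double p) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        // Clamp so that very small or very large p still index a valid element.
        return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
    }

    /** Sorts a copy of the samples and formats a few common percentiles. */
    static String summarize(long[] latenciesMillis) {
        long[] sorted = latenciesMillis.clone();
        Arrays.sort(sorted);
        return String.format("p50=%dms p99=%dms p99.9=%dms max=%dms",
                percentile(sorted, 50), percentile(sorted, 99),
                percentile(sorted, 99.9), sorted[sorted.length - 1]);
    }

    public static void main(String[] args) {
        long[] demo = {2, 3, 2, 4, 1200, 3, 2, 5, 3, 2}; // made-up values, not measured data
        System.out.println(summarize(demo));
    }
}
```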

Redis / Jedis Configuration

Jedis version: 4.3.2

Redis version: 7.0.0

Java version: 1.8

Your testing and comparison process has several flaws:

  1. You are operating on only one key, String REDIS_KEY = "key";. This is not a typical workload: with a single key, every command maps to the same hash slot and therefore to the same node.

  2. For jedis.set(KEY, "value"); jedis.get(KEY);, a connection has already been created and selected for the operations. The calls just execute the commands.
    For jedisCluster.set(KEY, "value"); jedisCluster.get(KEY);, each call goes through selecting the hash slot, selecting a connection for that hash slot, sometimes even creating the connection, and then executing the command; and all of this happens TWICE. It will definitely take more time.

  3. Your TOTAL_THREAD is 1000 while the maximum connection pool size is 128. Threads will definitely wait a long time for resources (connections).
    Even worse, jedis.set and jedis.get are not affected by this waiting, because the connection was selected before the time measurement started.
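The effect described in point 3 can be illustrated without Redis at all. The following is a minimal sketch in which a Semaphore stands in for a bounded connection pool and Thread.sleep stands in for a command round trip; the class PoolWaitSketch, its parameters, and the numbers used are hypothetical. It compares a timer that includes the wait for a connection (as in the JedisCluster branch) with one that starts only after the connection is held (as in the Jedis branch).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicLong;

final class PoolWaitSketch {
    /** Returns the largest per-task latency in ms observed across all worker threads. */
    static long maxLatencyMillis(int threads, int poolSize, long workMillis, boolean timerIncludesAcquire) {
        Semaphore pool = new Semaphore(poolSize); // stand-in for a bounded connection pool
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        AtomicLong max = new AtomicLong();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < threads; i++) {
                futures.add(executor.submit(() -> {
                    long start = System.nanoTime();
                    pool.acquire(); // wait for a "connection"
                    try {
                        if (!timerIncludesAcquire) {
                            start = System.nanoTime(); // restart the clock: the wait is not measured
                        }
                        Thread.sleep(workMillis); // simulated command round trip
                    } finally {
                        pool.release();
                    }
                    max.accumulateAndGet((System.nanoTime() - start) / 1_000_000, Math::max);
                    return null;
                }));
            }
            for (Future<?> future : futures) {
                future.get();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
        return max.get();
    }

    public static void main(String[] args) {
        System.out.println("timer includes acquire: " + maxLatencyMillis(8, 2, 50, true) + " ms");
        System.out.println("timer excludes acquire: " + maxLatencyMillis(8, 2, 50, false) + " ms");
    }
}
```

With more threads than permits, the first figure grows with the queue length while the second stays close to the simulated round-trip time, which is why the two branches of the test are not directly comparable.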

GitHub issues should be used to report bugs and for detailed feature requests.
Everything else belongs in the Jedis Google Group or Jedis GitHub Discussions.

Please post general questions to Google Groups or GitHub Discussions.
These can be closed without response when posted to GitHub issues.