sewenew / redis-plus-plus

Redis client written in C++

[QUESTION] Holding pipeline objects in a container in RedisCluster mode

srgorti opened this issue

While trying to replace hiredis-vip with redis-plus-plus in our app, I am observing a crash, described below. For background, our application uses Redis Cluster with (only) Lua scripts run in pipeline mode. Currently, with hiredis-vip, a batch of commands is queued using redisClusterAppendCommand and then redisClusterGetReplies is called for all of them, where each batch may target multiple cluster nodes. Based on my understanding of redis-plus-plus, I am trying to prototype this as follows.

  1. Create a pipeline based on the key. In this first version, the Pipeline is created based on the slot ID, for which an extra get_cluster_slot function was added to ShardsPool, along the lines described here. Eventually, I plan to base Pipeline creation on the Node once I have a basic test working.
  2. Run the eval command on this pipeline.
  3. Save a pointer to the pipeline object in a map, keyed by slot ID.
  4. Continue until the batch size is reached, then call pipeline.exec on each of the pipelines that has queued commands.
  5. Collect the replies and process them.

While this works for a batch size of 1, with a batch size of 2, step 1 crashes when creating the second pipeline. I would like to understand whether this approach of saving the Pipeline objects is even supported, because the redis-plus-plus documentation recommends limiting the usage of a pipeline object to a single block. If it is supported (even if not recommended), I would appreciate any suggestions or best practices for this usage approach.
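For reference, a minimal sketch of the kind of per-slot batching described above, assuming a Pipeline can be held by value in a standard container (RedisCluster::pipeline returns one by value, so it should be movable); set stands in for the eval call, and the slot computation, exec, and error handling are omitted:

#include <sw/redis++/redis++.h>

#include <map>
#include <string>

// Illustrative only: one Pipeline per slot, created lazily and moved into the map.
static std::map<std::size_t, sw::redis::Pipeline> pipelines;

void queue_set(sw::redis::RedisCluster &cluster, std::size_t slot,
               const std::string &key, const std::string &value) {
    auto it = pipelines.find(slot);
    if (it == pipelines.end()) {
        // pipeline() returns a Pipeline by value; move it into the container
        // rather than assigning it into storage that was never constructed.
        it = pipelines.emplace(slot, cluster.pipeline(key)).first;
    }
    it->second.set(key, value);
}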

Environment:

  • OS: AmazonLinux on EC2
  • Compiler: gcc 7.3.1
  • hiredis version: v1.2.0
  • redis-plus-plus version: 1.3.10

Additional context

From the backtrace, the crash occurs while assigning (via operator=) the unordered_set member in the QueuedRedis class. This happens after the second redisCluster.pipeline(key) call but before the assignment to the object in the map completes.

(gdb) bt
#0  std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<unsigned long, false> > >::_M_deallocate_nodes (this=<optimized out>, __n=0x726f70736e617274)
    at /usr/include/c++/7/bits/hashtable_policy.h:2096
#1  std::_Hashtable<unsigned long, unsigned long, std::allocator<unsigned long>, std::__detail::_Identity, std::equal_to<unsigned long>, std::hash<unsigned long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_move_assign (__ht=..., this=0x174d150) at /usr/include/c++/7/bits/hashtable.h:1148
#2  std::_Hashtable<unsigned long, unsigned long, std::allocator<unsigned long>, std::__detail::_Identity, std::equal_to<unsigned long>, std::hash<unsigned long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::operator= (__ht=..., this=0x174d150) at /usr/include/c++/7/bits/hashtable.h:468
#3  std::unordered_set<unsigned long, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<unsigned long> >::operator= (this=0x174d150) at /usr/include/c++/7/bits/unordered_set.h:96
#4  sw::redis::QueuedRedis<sw::redis::PipelineImpl>::operator= (this=0x174d120) at /home/XXXXX/tmp/rpp/redis++/include/sw/redis++/queued_redis.h:43
...

thanks,

I made a small program to reproduce the behavior. With it, the crash happens on the very first key.

#include <iostream>
#include <map>
#include <sw/redis++/redis++.h>

struct Context {
    sw::redis::RedisCluster *prc;
} ctx;

struct SlotPipeline {
    sw::redis::Pipeline  p;
    std::size_t          s;
};

static std::map<std::size_t, SlotPipeline*> slotPipelines;

void process_pair(const std::string& k, const std::string& v, bool do_exec) {
    std::cout << "key= " << k << " val= " << v << std::endl;

    struct SlotPipeline *sp = NULL;
    std::size_t slotId = sw::redis::ShardsPool::get_cluster_slot(k);
    std::cout << "cluster-slot: " << slotId << std::endl;
    std::map<std::size_t, SlotPipeline*>::iterator it = slotPipelines.find(slotId);
    if (it == slotPipelines.end()) {
        sp = (SlotPipeline *)malloc(sizeof(*sp));
        std::cout << "sp->p: type=" << typeid(sp->p).name() << std::endl;
        sp->p = ctx.prc->pipeline(k);
        slotPipelines[slotId] = sp;
    } else {
        sp = it->second;
    }
    sp->p.set(k, v);

    if (do_exec) {
        // TBD: fill this in
    }
}

int main()
{
    ctx.prc = new sw::redis::RedisCluster("tcp://127.0.0.1:30001");

    std::string keys[] = {
        "violet", "indigo", "blue", "green", "yellow", "orange", "red", ""
    };
    std::string values[] = {
        "flowers", "dyes", "skies", "trees", "bananas", "oranges", "apples"
    };

    for (int i = 0; !keys[i].empty(); i++) {
        process_pair(keys[i], values[i], false);
    }
    process_pair("black", "nights", true);
}

Output

 ./rpp-ex3
key= violet val= flowers
cluster-slot: 4135
sp->p: type=N2sw5redis11QueuedRedisINS0_12PipelineImplEEE
zsh: segmentation fault  ./rpp-ex3
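For reference, and independent of redis-plus-plus itself: frame #0 of the backtrace above shows _M_deallocate_nodes being called with a garbage pointer, which is consistent with assigning into storage obtained with malloc, since malloc does not run constructors. A minimal sketch of that pattern, with the undefined-behaviour line commented out:

#include <cstdlib>
#include <new>
#include <unordered_set>

struct Holder {
    std::unordered_set<unsigned long> s;   // non-trivial member
};

int main() {
    // malloc returns raw memory: Holder::s is never constructed here, so
    // assigning to it is undefined behaviour (the assignment tries to release
    // "existing" buckets through garbage pointers).
    Holder *raw = static_cast<Holder *>(std::malloc(sizeof(Holder)));
    // raw->s = std::unordered_set<unsigned long>{1, 2, 3};   // would be UB

    // Running the constructor first (placement new), or simply using new,
    // makes the same assignment well-defined.
    Holder *ok = new (raw) Holder{};
    ok->s = std::unordered_set<unsigned long>{1, 2, 3};

    ok->~Holder();
    std::free(raw);
}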

Since I don't know how you implemented get_cluster_slot, I cannot reproduce your problem. However, your code has a bug: the length of keys is different from the length of values, and I'm not sure whether that caused the problem.

Also, please catch possible exceptions thrown by redis-plus-plus (you can wrap the call to process_pair in a try-catch block) and print the exception message to see what happens.

Thanks for your response. I fixed the difference in the lengths of keys and values by adding an empty string "" at the end of values. However, I am still seeing the crash.

The diff for get_cluster_slot (relative to 1.3.10) is pasted below. The changes are limited to shards_pool.h and shards_pool.cpp.

--- shards_pool.h	2023-07-09 12:21:34.000000000 +0000
+++ ../../../../../redis-plus-plus/src/sw/redis++/shards_pool.h	2023-09-23 09:36:57.450514555 +0000
@@ -63,6 +63,10 @@

     Shards shards();

+    static std::size_t get_cluster_slot(const StringView &key) {
+        return ShardsPool::_cluster_slot(key);
+    }
+
 private:
     void _move(ShardsPool &&that);

@@ -80,6 +84,8 @@

     std::pair<SlotRange, Node> _parse_slot_info(redisReply &reply) const;

+    static std::size_t _cluster_slot(const StringView &key);
+
     // Get slot by key.
     std::size_t _slot(const StringView &key) const;


--- shards_pool.cpp	2023-07-09 12:21:34.000000000 +0000
+++ ../../../../../redis-plus-plus/src/sw/redis++/shards_pool.cpp	2023-09-23 09:38:34.619876365 +0000
@@ -277,7 +277,7 @@
     }
 }

-Slot ShardsPool::_slot(const StringView &key) const {
+std::size_t ShardsPool::_cluster_slot(const StringView &key) {
     // The following code is copied from: https://redis.io/topics/cluster-spec
     // And I did some minor changes.

@@ -307,6 +307,10 @@
     return crc16(k + s + 1, e - s - 1) & SHARDS;
 }

+Slot ShardsPool::_slot(const StringView &key) const {
+    return ShardsPool::_cluster_slot(key);
+}
+
 Slot ShardsPool::_slot() const {
     return _random(0, SHARDS);
 }

regards,

Sorry, but I tried your code and cannot reproduce your problem. Can you wrap your code in a try-catch block to see whether any exception is thrown by redis-plus-plus? You can output the error message to see if something bad happens.

Regards

Thanks for your response. If it does not crash for you, that's a good sign for me; maybe it's a difference in the compiler version.
I have added the exception handling and removed the slot-ID code. With my g++ version, it still crashes.

#include <iostream>
#include <map>
#include <sw/redis++/redis++.h>

struct Context {
    sw::redis::RedisCluster *prc;
} ctx;

struct SlotPipeline {
    sw::redis::Pipeline  p;
    std::size_t          s;
};

static std::map<std::size_t, SlotPipeline*> slotPipelines;

void process_pair(const std::string& k, const std::string& v, bool do_exec) {
    std::cout << "key= " << k << " val= " << v << std::endl;

    struct SlotPipeline *sp = NULL;
    sp = (SlotPipeline *)malloc(sizeof(*sp));
    std::cout << "sp->p: type=" << typeid(sp->p).name() << std::endl;
    sp->p = ctx.prc->pipeline(k);   // crashes here
    sp->p.set(k, v);

    if (do_exec) {
        // TBD: fill this in
    }
}

int main()
{
    ctx.prc = new sw::redis::RedisCluster("tcp://127.0.0.1:30001");

    std::string keys[] = {
        "violet", "indigo", "blue", "green", "yellow", "orange", "red", ""
    };
    std::string values[] = {
        "flowers", "dyes", "skies", "trees", "bananas", "oranges", "apples", ""
    };

    for (int i = 0; !keys[i].empty(); i++) {
        try {
            process_pair(keys[i], values[i], false);
        }
        catch (const sw::redis::ReplyError &err) {
            std::cout << "process_pair: ReplyError: " << err.what() << std::endl;
        }
        catch (const sw::redis::TimeoutError &err) {
            // reading or writing timeout
            std::cout << "process_pair: TimeoutError: " << err.what() << std::endl;
        }
        catch (const sw::redis::ClosedError &err) {
            // the connection has been closed
            std::cout << "process_pair: ClosedError: " << err.what() << std::endl;
        }
        catch (const sw::redis::IoError &err) {
            // there's an IO error on the connection
            std::cout << "process_pair: IoError: " << err.what() << std::endl;
        }
        catch (const sw::redis::Error &err) {
            // other errors
            std::cout << "process_pair: Error: " << err.what() << std::endl;
        }
    }
    process_pair("black", "nights", true);
}

Compiler version:

/usr/bin/g++ -v
Using built-in specs.
COLLECT_GCC=/usr/bin/g++
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/7/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-libmpx --enable-libsanitizer --enable-gnu-indirect-function --enable-libcilkrts --enable-libatomic --enable-libquadmath --enable-libitm --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
gcc version 7.3.1 20180712 (Red Hat 7.3.1-17) (GCC)

Does it throw any exception when it crashes, i.e. does any of your try-catch blocks catch an exception? If it does, please show me the error info.

Also, please check whether you have multiple versions of hiredis installed. If you do, uninstall all but one of them. Otherwise, you might run into some weird problems.

I'm on vacation, and sorry for the late reply...

Regards

Does it throw any exception when it crashes, i.e. does any of your try-catch blocks catch an exception? If it does, please show me the error info.

No error or exception is seen. That is what makes this issue harder to debug.

Also, please check whether you have multiple versions of hiredis installed. If you do, uninstall all but one of them. Otherwise, you might run into some weird problems.

To avoid this situation, I have stopped linking both the test app and the real app against hiredis-vip, and I have also moved to static linking with libredis++.a and libhiredis.a. With this, no other version of hiredis can influence the executable.

Meanwhile, without using pipelines, our application is now working fine with redis-plus-plus. The only issue is that we are not getting the performance we were getting by using pipelining with hiredis-vip.
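For reference, a minimal sketch of the block-scoped pipeline pattern the documentation recommends, which keeps the batching while avoiding long-lived Pipeline objects; the function name, the use of SET, and the assumption that all keys in a batch map to the same slot (e.g. by sharing a hash tag) are illustrative, not from the original application:

#include <sw/redis++/redis++.h>

#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Illustrative only: queue a batch of SETs whose keys map to the same slot,
// then execute them in one round trip.
void flush_batch(sw::redis::RedisCluster &cluster,
                 const std::vector<std::pair<std::string, std::string>> &batch) {
    if (batch.empty()) {
        return;
    }

    // The Pipeline lives only inside this block, as the documentation suggests.
    auto pipe = cluster.pipeline(batch.front().first);
    for (const auto &kv : batch) {
        pipe.set(kv.first, kv.second);
    }

    auto replies = pipe.exec();
    for (std::size_t i = 0; i < batch.size(); ++i) {
        // The i-th reply corresponds to the i-th queued command.
        std::cout << batch[i].first << " -> " << replies.get<bool>(i) << std::endl;
    }
}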

I'd suggest you use a clean environment without hiredis-vip installed. Otherwise, you might run into some weird problems, like the ones this issue mentions.

Regards

Closing the issue, since there's no update.

Regards