Scylla coredumped (on_internal_error) after log message "table - Unable to load SSTable" (to be solved with newer Manager release)
timtimb0t opened this issue
Packages
Scylla version: 2025.1.0~rc2-20250216.6ee17795783f
with build-id 8fc682bcfdf0a8cd9bc106a5ecaa68dce1c63ef6
Kernel Version: 6.8.0-1021-aws
Issue description
Scylla dumped core during the disrupt_mgmt_repair_cli nemesis. The exact sequence that led to the crash is unclear, but the nemesis had created repair tasks for Scylla.
2025-02-19T00:48:06.703+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !ERR | scylla[22048]: [shard 13:strm] table - Unable to load SSTable /var/lib/scylla/data/keyspace1/standard1-690fe690ee0d11ef98ea7312299a522c/me-3gnx_01wb_28l8g2kppu311ynbvt-big-Data.db that belongs to tablets 36 and 37, at: 0x257814e 0x2577be0 0x2b367d7 0x2eb6124 0x207d681 0x207cd1b 0x2a694a1 0x2a692be 0x1f85d90 0x30851e7 0x4dcbf7f 0x4d8f2ca /opt/scylladb/libreloc/libc.so.6+0x97c16 /opt/scylladb/libreloc/libc.so.6+0x11bb0b
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}, seastar::future<void>::then_impl_nrvo<streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}, seastar::future<void> >(streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#4}, seastar::future<void>::then_impl_nrvo<streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#4}, seastar::future<void> >(streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#4}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#4}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
--------
seastar::internal::coroutine_traits_base<void>::promise_type
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<seastar::smp::submit_to<mutation_writer::multishard_writer::consume(unsigned int)::$_0>(unsigned int, seastar::smp_submit_to_options, mutation_writer::multishard_writer::consume(unsigned int)::$_0&&)::{lambda()#1}, false>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void>::finally_body<seastar::smp::submit_to<mutation_writer::multishard_writer::consume(unsigned int)::$_0>(unsigned int, seastar::smp_submit_to_options, mutation_writer::multishard_writer::consume(unsigned int)::$_0&&)::{lambda()#1}, false> >(seastar::future<void>::finally_body<seastar::smp::submit_to<mutation_writer::multishard_writer::consume(unsigned int)::$_0>(unsigned int, seastar::smp_submit_to_options, mutation_writer::multishard_writer::consume(unsigned int)::$_0&&)::{lambda()#1}, false>&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::finally_body<seastar::smp::submit_to<mutation_writer::multishard_writer::consume(unsigned int)::$_0>(unsigned int, seastar::smp_submit_to_options, auto:1&&)::{lambda()#1}, false>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
--------
N7seastar12continuationINS_8internal22promise_base_with_typeIvEEZNS_6futureIvE16handle_exceptionIZN15mutation_writer17multishard_writer7consumeEjE3$_1Qoooooosr3stdE16is_invocable_r_vINS4_IT_EETL0__NSt15__exception_ptr13exception_ptrEEaaeqsr3stdE12tuple_size_vINSt11conditionalIXsr3stdE9is_same_vINS1_18future_stored_typeIJSA_EE4typeENS1_9monostateEEESt5tupleIJEESL_IJSJ_EEE4typeEELi0Esr3stdE16is_invocable_r_vIvSD_SF_Eaaeqsr3stdE12tuple_size_vISP_ELi1Esr3stdE16is_invocable_r_vISA_SD_SF_Eaagtsr3stdE12tuple_size_vISP_ELi1Esr3stdE16is_invocable_r_vISP_SD_SF_EEES5_OSA_EUlSQ_E_ZNS5_17then_wrapped_nrvoIS5_SR_EENS_8futurizeISA_E4typeEOT0_EUlOS3_RSR_ONS_12future_stateISK_EEE_vEE
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::internal::complete_when_all<seastar::internal::extract_values_from_futures_vector<seastar::future<void> >, seastar::future<void> >(std::vector<seastar::future<void>, std::allocator<seastar::future<void> > >&&, std::vector<seastar::future<void>, std::allocator<seastar::future<void> > >::iterator)::{lambda(auto:1)#1}, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::internal::complete_when_all<seastar::internal::extract_values_from_futures_vector<seastar::future<void> >, seastar::future<void> >(std::vector<seastar::future<void>, std::allocator<seastar::future<void> > >&&, std::vector<seastar::future<void>, std::allocator<seastar::future<void> > >::iterator)::{lambda(auto:1)#1}>(seastar::internal::complete_when_all<seastar::internal::extract_values_from_futures_vector<seastar::future<void> >, seastar::future<void> >(std::vector<seastar::future<void>, std::allocator<seastar::future<void> > >&&, std::vector<seastar::future<void>, std::allocator<seastar::future<void> > >::iterator)::{lambda(auto:1)#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::internal::complete_when_all<seastar::internal::extract_values_from_futures_vector<seastar::future<void> >, seastar::future<void> >(std::vector<auto:2, std::allocator<auto:2> >&&, std::vector<auto:2, std::allocator<auto:2> >::iterator)::{lambda(auto:1)#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<mutation_writer::multishard_writer::operator()()::$_0, true>::operator()(seastar::future<void>&&)::{lambda(auto:1&&)#1}, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void>::finally_body<mutation_writer::multishard_writer::operator()()::$_0, true>::operator()(seastar::future<void>&&)::{lambda(auto:1&&)#1}>(seastar::future<void>::finally_body<mutation_writer::multishard_writer::operator()()::$_0, true>::operator()(seastar::future<void>&&)::{lambda(auto:1&&)#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::finally_body<mutation_writer::multishard_writer::operator()()::$_0, true>::operator()(seastar::future<void>&&)::{lambda(auto:1&&)#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<unsigned long>, mutation_writer::multishard_writer::operator()()::$_1, seastar::future<void>::then_impl_nrvo<mutation_writer::multishard_writer::operator()()::$_1, seastar::future<unsigned long> >(mutation_writer::multishard_writer::operator()()::$_1&&)::{lambda(seastar::internal::promise_base_with_type<unsigned long>&&, mutation_writer::multishard_writer::operator()()::$_1&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<unsigned long>, seastar::future<unsigned long>::finally_body<mutation_writer::distribute_reader_and_consume_on_shards(seastar::lw_shared_ptr<schema const>, dht::sharder const&, mutation_reader, std::function<seastar::future<void> (mutation_reader)>, utils::phased_barrier::operation&&)::$_0::operator()(mutation_writer::multishard_writer&, utils::phased_barrier::operation&) const::{lambda()#1}, true>, seastar::future<unsigned long>::then_wrapped_nrvo<seastar::future<unsigned long>, seastar::future<unsigned long>::finally_body<mutation_writer::distribute_reader_and_consume_on_shards(seastar::lw_shared_ptr<schema const>, dht::sharder const&, mutation_reader, std::function<seastar::future<void> (mutation_reader)>, utils::phased_barrier::operation&&)::$_0::operator()(mutation_writer::multishard_writer&, utils::phased_barrier::operation&) const::{lambda()#1}, true> >(seastar::future<unsigned long>::finally_body<mutation_writer::distribute_reader_and_consume_on_shards(seastar::lw_shared_ptr<schema const>, dht::sharder const&, mutation_reader, std::function<seastar::future<void> (mutation_reader)>, utils::phased_barrier::operation&&)::$_0::operator()(mutation_writer::multishard_writer&, utils::phased_barrier::operation&) const::{lambda()#1}, true>&&)::{lambda(seastar::internal::promise_base_with_type<unsigned long>&&, seastar::future<unsigned long>::finally_body<mutation_writer::distribute_reader_and_consume_on_shards(seastar::lw_shared_ptr<schema const>, dht::sharder const&, mutation_reader, std::function<seastar::future<void> (mutation_reader)>, utils::phased_barrier::operation&&)::$_0::operator()(mutation_writer::multishard_writer&, utils::phased_barrier::operation&) const::{lambda()#1}, true>&, seastar::future_state<unsigned long>&&)#1}, unsigned long>
--------
seastar::internal::do_with_state<std::tuple<mutation_writer::multishard_writer, utils::phased_barrier::operation>, seastar::future<unsigned long> >
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, repair_writer_impl::create_writer(seastar::lw_shared_ptr<repair_writer>)::$_0, seastar::future<unsigned long>::then_impl_nrvo<repair_writer_impl::create_writer(seastar::lw_shared_ptr<repair_writer>)::$_0, seastar::future<void> >(repair_writer_impl::create_writer(seastar::lw_shared_ptr<repair_writer>)::$_0&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, repair_writer_impl::create_writer(seastar::lw_shared_ptr<repair_writer>)::$_0&, seastar::future_state<unsigned long>&&)#1}, unsigned long>
--------
N7seastar12continuationINS_8internal22promise_base_with_typeIvEEZNS_6futureIvE16handle_exceptionIZN18repair_writer_impl13create_writerENS_13lw_shared_ptrI13repair_writerEEE3$_1Qoooooosr3stdE16is_invocable_r_vINS4_IT_EETL0__NSt15__exception_ptr13exception_ptrEEaaeqsr3stdE12tuple_size_vINSt11conditionalIXsr3stdE9is_same_vINS1_18future_stored_typeIJSC_EE4typeENS1_9monostateEEESt5tupleIJEESN_IJSL_EEE4typeEELi0Esr3stdE16is_invocable_r_vIvSF_SH_Eaaeqsr3stdE12tuple_size_vISR_ELi1Esr3stdE16is_invocable_r_vISC_SF_SH_Eaagtsr3stdE12tuple_size_vISR_ELi1Esr3stdE16is_invocable_r_vISR_SF_SH_EEES5_OSC_EUlSS_E_ZNS5_17then_wrapped_nrvoIS5_ST_EENS_8futurizeISC_E4typeEOT0_EUlOS3_RST_ONS_12future_stateISM_EEE_vEE
--------
seastar::internal::when_all_state_component<seastar::future<void> >
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<std::tuple<> >::discard_result()::{lambda((auto:1&&)...)#1}, seastar::future<std::tuple<> >::then_impl_nrvo<seastar::future<std::tuple<> >::discard_result()::{lambda((auto:1&&)...)#1}, seastar::future<void> >(seastar::future<std::tuple<> >::discard_result()::{lambda((auto:1&&)...)#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::future<std::tuple<> >::discard_result()::{lambda((auto:1&&)...)#1}&, seastar::future_state<std::tuple<> >&&)#1}, std::tuple<> >
--------
N7seastar12continuationINS_8internal22promise_base_with_typeIvEEZNS_6futureIvE16handle_exceptionIZN13repair_writer20wait_for_writer_doneEvE3$_0Qoooooosr3stdE16is_invocable_r_vINS4_IT_EETL0__NSt15__exception_ptr13exception_ptrEEaaeqsr3stdE12tuple_size_vINSt11conditionalIXsr3stdE9is_same_vINS1_18future_stored_typeIJS9_EE4typeENS1_9monostateEEESt5tupleIJEESK_IJSI_EEE4typeEELi0Esr3stdE16is_invocable_r_vIvSC_SE_Eaaeqsr3stdE12tuple_size_vISO_ELi1Esr3stdE16is_invocable_r_vIS9_SC_SE_Eaagtsr3stdE12tuple_size_vISO_ELi1Esr3stdE16is_invocable_r_vISO_SC_SE_EEES5_OS9_EUlSP_E_ZNS5_17then_wrapped_nrvoIS5_SQ_EENS_8futurizeIS9_E4typeEOT0_EUlOS3_RSQ_ONS_12future_stateISJ_EEE_vEE
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<repair_meta::stop()::{lambda()#1}::operator()() const::{lambda()#1}, true>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void>::finally_body<repair_meta::stop()::{lambda()#1}::operator()() const::{lambda()#1}, true> >(seastar::future<void>::finally_body<repair_meta::stop()::{lambda()#1}::operator()() const::{lambda()#1}, true>&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::finally_body<repair_meta::stop()::{lambda()#1}::operator()() const::{lambda()#1}, true>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
2025-02-19T00:48:06.703+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: Aborting on shard 13, in scheduling group streaming.
2025-02-19T00:48:06.703+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: Backtrace:
2025-02-19T00:48:06.703+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x4daf1eb
2025-02-19T00:48:06.703+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x4da64b7
2025-02-19T00:48:06.703+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x4dcb9e8
2025-02-19T00:48:06.703+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: /opt/scylladb/libreloc/libc.so.6+0x40d0f
2025-02-19T00:48:06.703+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: /opt/scylladb/libreloc/libc.so.6+0x99bd3
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: /opt/scylladb/libreloc/libc.so.6+0x40c5d
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: /opt/scylladb/libreloc/libc.so.6+0x28901
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x2b36852
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x2eb6124
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x207d681
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x207cd1b
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x2a694a1
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x2a692be
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x1f85d90
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x30851e7
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x4dcbf7f
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: 0x4d8f2ca
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: /opt/scylladb/libreloc/libc.so.6+0x97c16
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: /opt/scylladb/libreloc/libc.so.6+0x11bb0b
2025-02-19T00:48:06.704+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | systemd[1]: Created slice system-systemd\x2dcoredump.slice - Slice /system/systemd-coredump.
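The raw backtrace above is spread over one journal line per frame, so the addresses have to be collected before they can be passed to a symbolizer. A minimal sketch of that extraction step, assuming the journal line layout shown above; `extract_frames` and `FRAME_RE` are hypothetical helper names, not part of any Scylla tooling:

```python
import re

# Tail of a journal line holding one frame: either a bare hex address
# (0x4daf1eb) or a library-relative one (/path/libc.so.6+0x40d0f).
FRAME_RE = re.compile(r"scylla\[\d+\]:\s+((?:/\S+\+)?0x[0-9a-fA-F]+)\s*$")


def extract_frames(log_text):
    """Return the frames of the first 'Backtrace:' block, in log order."""
    frames = []
    in_backtrace = False
    for line in log_text.splitlines():
        if not in_backtrace:
            # Skip everything until the line that announces the backtrace.
            if line.rstrip().endswith("Backtrace:"):
                in_backtrace = True
            continue
        match = FRAME_RE.search(line)
        if match is None:
            # First non-frame line (e.g. the systemd message) ends the block.
            break
        frames.append(match.group(1))
    return frames


if __name__ == "__main__":
    prefix = "2025-02-19T00:48:06.703+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | "
    sample = "\n".join([
        prefix + "scylla[22048]: Aborting on shard 13, in scheduling group streaming.",
        prefix + "scylla[22048]: Backtrace:",
        prefix + "scylla[22048]: 0x4daf1eb",
        prefix + "scylla[22048]: /opt/scylladb/libreloc/libc.so.6+0x40d0f",
        prefix + "systemd[1]: Created slice system-systemd\\x2dcoredump.slice",
    ])
    # One frame per line, ready to paste into a symbolizer.
    print("\n".join(extract_frames(sample)))
```

The resulting list keeps the frames in the order they were logged, so it can be resolved against the binary that matches the build-id reported under Packages.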
Decoded backtrace:
2025-02-19T00:48:06.703+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]: Backtrace:
2025-02-19T00:48:06.703+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 !INFO | scylla[22048]:
[Backtrace #0]
void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at ./build/release/seastar/./seastar/include/seastar/util/backtrace.hh:69
seastar::print_with_backtrace(seastar::backtrace_buffer&, bool) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/reactor.cc:805
seastar::print_with_backtrace(char const*, bool) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/reactor.cc:850
(inlined by) seastar::sigabrt_action() at ./build/release/seastar/./build/release/seastar/./seastar/src/core/reactor.cc:4005
(inlined by) seastar::install_oneshot_signal_handler<6, (void (*)())(&seastar::sigabrt_action)>()::{lambda(int, siginfo_t*, void*)#1}::operator()(int, siginfo_t*, void*) const at ./build/release/seastar/./build/release/seastar/./seastar/src/core/reactor.cc:3982
(inlined by) seastar::install_oneshot_signal_handler<6, (void (*)())(&seastar::sigabrt_action)>()::{lambda(int, siginfo_t*, void*)#1}::__invoke(int, siginfo_t*, void*) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/reactor.cc:3977
/data/scylla-s3-reloc.cache/by-build-id/8fc682bcfdf0a8cd9bc106a5ecaa68dce1c63ef6/extracted/scylla/libreloc/libc.so.6: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=b871aacf7b252210b87d2e5dbea81bda8ada61f1, for GNU/Linux 3.2.0, not stripped
__GI___sigaction at :?
__pthread_kill_implementation at ??:?
__GI_raise at :?
__GI_abort at :?
seastar::on_internal_error(seastar::logger&, std::basic_string_view<char, std::char_traits<char> >) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/on_internal_error.cc:56
replica::tablet_storage_group_manager::compaction_group_for_sstable(seastar::lw_shared_ptr<sstables::sstable> const&) const at ././replica/table.cc:1191
replica::table::compaction_group_for_sstable(seastar::lw_shared_ptr<sstables::sstable> const&) const at ././replica/table.cc:1209
(inlined by) replica::table::do_add_sstable_and_update_cache(seastar::lw_shared_ptr<sstables::sstable>, seastar::bool_class<sstables::offstrategy_tag>, bool) at ././replica/table.cc:1330
replica::table::add_sstable_and_update_cache(seastar::lw_shared_ptr<sstables::sstable>, seastar::bool_class<sstables::offstrategy_tag>) at ././replica/table.cc:1340
streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}::operator()() const at ././streaming/consumer.cc:67
seastar::future<void> std::__invoke_impl<seastar::future<void>, streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&>(std::__invoke_other, streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/invoke.h:61
(inlined by) std::__invoke_result<streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&>::type std::__invoke<streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&>(streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/invoke.h:96
(inlined by) std::invoke_result<streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&>::type std::invoke<streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&>(streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/functional:120
(inlined by) auto seastar::internal::future_invoke<streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&, seastar::internal::monostate>(streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&, seastar::internal::monostate&&) at ././seastar/include/seastar/core/future.hh:1149
(inlined by) seastar::future<void>::then_impl_nrvo<streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}, seastar::future<void> >(streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&, seastar::future_state<seastar::internal::monostate>&&)#1}::operator()(seastar::internal::promise_base_with_type<void>&&, streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&, 
seastar::future_state<seastar::internal::monostate>&&) const::{lambda()#1}::operator()() const at ././seastar/include/seastar/core/future.hh:1465
(inlined by) _ZN7seastar8futurizeINS_6futureIvEEE22satisfy_with_result_ofITkSt9invocableZZNS2_14then_impl_nrvoIZZZN9streaming23make_streaming_consumerENS_13basic_sstringIcjLj15ELb1EEERNS_7shardedIN7replica8databaseEEERNS9_IN2db4view12view_builderEEEmNS6_13stream_reasonENS_10bool_classIN8sstables15offstrategy_tagEEEN5utils11tagged_uuidIN7service14session_id_tagEEEENK3$_0clE15mutation_readerENKUlSU_E_clESU_EUlvE1_S2_EET0_OT_ENKUlONS_8internal22promise_base_with_typeIvEERSW_ONS_12future_stateINS10_9monostateEEEE_clES13_S14_S18_EUlvE_EEvS13_SZ_ at ././seastar/include/seastar/core/future.hh:1996
(inlined by) seastar::future<void>::then_impl_nrvo<streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}, seastar::future<void> >(streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&, seastar::future_state<seastar::internal::monostate>&&)#1}::operator()(seastar::internal::promise_base_with_type<void>&&, streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&, 
seastar::future_state<seastar::internal::monostate>&&) const at ././seastar/include/seastar/core/future.hh:1461
(inlined by) seastar::continuation<seastar::internal::promise_base_with_type<void>, streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}, seastar::future<void>::then_impl_nrvo<streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}, seastar::future<void> >(streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, streaming::make_streaming_consumer(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::sharded<replica::database>&, seastar::sharded<db::view::view_builder>&, unsigned long, streaming::stream_reason, seastar::bool_class<sstables::offstrategy_tag>, utils::tagged_uuid<service::session_id_tag>)::$_0::operator()(mutation_reader) const::{lambda(mutation_reader)#1}::operator()(mutation_reader) const::{lambda()#3}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>::run_and_dispose() at 
././seastar/include/seastar/core/future.hh:727
seastar::reactor::run_some_tasks() at ./build/release/seastar/./build/release/seastar/./seastar/src/core/reactor.cc:2617
seastar::reactor::do_run() at ./build/release/seastar/./build/release/seastar/./seastar/src/core/reactor.cc:3257
std::_Function_handler<void (), seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0>::_M_invoke(std::_Any_data const&) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/reactor.cc:4566
seastar::posix_thread::start_routine(void*) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:591
start_thread at ??:?
__clone3 at :?
Note that nodes 1, 2, 3, 5, 6, and 11 were affected at the same time.
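To check which nodes and tablets hit the error, the node logs can be scanned for the message. Below is a minimal Python sketch, assuming the line format matches the log lines quoted in this issue; the helper names (`LoadFailure`, `parse_unable_to_load`) are made up for illustration and the pattern may need adjusting for other versions.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Pattern derived only from the log lines quoted in this issue (assumption:
# other Scylla versions or log shippers may format the line differently).
# The timestamp/node prefix is optional because some quoted lines omit it.
_UNABLE_TO_LOAD = re.compile(
    r"^(?:(?P<ts>\S+)\s+(?P<node>\S+)\s+)?!ERR\s+\|\s+scylla\[(?P<pid>\d+)\]:\s+"
    r"\[shard\s+(?P<shard>\d+):(?P<group>\w+)\]\s+table - Unable to load SSTable\s+"
    r"(?P<path>\S+)\s+that belongs to tablets\s+(?P<first>\d+)\s+and\s+(?P<second>\d+)"
)


class LoadFailure(NamedTuple):
    """One 'Unable to load SSTable' event (hypothetical helper type)."""
    timestamp: Optional[str]
    node: Optional[str]
    shard: int
    sstable: str
    tablets: Tuple[int, int]


def parse_unable_to_load(line: str) -> Optional[LoadFailure]:
    """Return the parsed event, or None if the line is not this error."""
    m = _UNABLE_TO_LOAD.match(line.strip())
    if m is None:
        return None
    return LoadFailure(
        timestamp=m["ts"],
        node=m["node"],
        shard=int(m["shard"]),
        sstable=m["path"],
        tablets=(int(m["first"]), int(m["second"])),
    )
```

Running `parse_unable_to_load` over each line of the collected node logs and grouping the results by `node` and `timestamp` would show whether the failures line up in time across nodes.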
Impact
Node core dumped
How frequently does it reproduce?
Describe the frequency with how this issue can be reproduced.
Installation details
Cluster size: 8 nodes (i4i.4xlarge)
Scylla Nodes used in this run:
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-9 (18.171.60.228 | 10.3.2.214) (shards: 2)
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-8 (18.170.54.71 | 10.3.3.0) (shards: 14)
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-7 (3.9.139.192 | 10.3.2.238) (shards: 14)
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-6 (35.177.132.251 | 10.3.0.23) (shards: 14)
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-5 (13.40.43.146 | 10.3.1.215) (shards: 14)
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-4 (3.249.249.173 | 10.4.3.103) (shards: 14)
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-3 (34.254.221.111 | 10.4.3.73) (shards: 14)
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-2 (54.75.24.62 | 10.4.1.180) (shards: 14)
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-12 (3.10.140.46 | 10.3.3.217) (shards: 2)
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-11 (13.40.55.250 | 10.3.0.254) (shards: 14)
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-10 (13.61.181.244 | 10.0.1.234) (shards: 2)
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-1 (34.240.85.84 | 10.4.2.182) (shards: 14)
OS / Image: ami-089e047033a16995a ami-0c34f939e95d0c640 ami-06abf4cbfd7fb8f20
(aws: undefined_region)
Test: longevity-multi-dc-rack-aware-with-znode-dc-test
Test id: 17b48f6e-2559-4fc7-b6ee-7c717c3acda7
Test name: scylla-2025.1/longevity/longevity-multi-dc-rack-aware-with-znode-dc-test
Test method: longevity_test.LongevityTest.test_custom_time
Test config file(s):
Logs and commands
- Restore Monitor Stack command:
$ hydra investigate show-monitor 17b48f6e-2559-4fc7-b6ee-7c717c3acda7
- Restore monitor on AWS instance using Jenkins job
- Show all stored logs command:
$ hydra investigate show-logs 17b48f6e-2559-4fc7-b6ee-7c717c3acda7
Logs:
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-6 - https://cloudius-jenkins-test.s3.amazonaws.com/17b48f6e-2559-4fc7-b6ee-7c717c3acda7/20250218_151815/multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-6-17b48f6e.tar.zst
- multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-12 - https://cloudius-jenkins-test.s3.amazonaws.com/17b48f6e-2559-4fc7-b6ee-7c717c3acda7/20250218_151815/multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-12-17b48f6e.tar.zst
- db-cluster-17b48f6e.tar.zst - https://cloudius-jenkins-test.s3.amazonaws.com/17b48f6e-2559-4fc7-b6ee-7c717c3acda7/20250219_011024/db-cluster-17b48f6e.tar.zst
- sct-runner-events-17b48f6e.tar.zst - https://cloudius-jenkins-test.s3.amazonaws.com/17b48f6e-2559-4fc7-b6ee-7c717c3acda7/20250219_011024/sct-runner-events-17b48f6e.tar.zst
- sct-17b48f6e.log.tar.zst - https://cloudius-jenkins-test.s3.amazonaws.com/17b48f6e-2559-4fc7-b6ee-7c717c3acda7/20250219_011024/sct-17b48f6e.log.tar.zst
- loader-set-17b48f6e.tar.zst - https://cloudius-jenkins-test.s3.amazonaws.com/17b48f6e-2559-4fc7-b6ee-7c717c3acda7/20250219_011024/loader-set-17b48f6e.tar.zst
- monitor-set-17b48f6e.tar.zst - https://cloudius-jenkins-test.s3.amazonaws.com/17b48f6e-2559-4fc7-b6ee-7c717c3acda7/20250219_011024/monitor-set-17b48f6e.tar.zst
- builder-17b48f6e.log.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/17b48f6e-2559-4fc7-b6ee-7c717c3acda7/upload_20250219_011312/builder-17b48f6e.log.tar.gz
@denesb - please assign
@timtimb0t please provide more context. What the test was doing? Load and stream? Repair?
@raphaelsc, the latest nemesis before the failure was disrupt_destroy_data_then_rebuild. It stops the Scylla service on the target node, removes the non-system SSTable files, then starts Scylla and initiates the rebuild process (which was successful).
Then, about half an hour later, the disrupt_mgmt_repair_cli nemesis started. It created a repair task on the target node and failed (apparently because the target node coredumped).
The problem might be that the test uses the old, unsafe repair API, which is inherently racy with other tablet-related activities like split. @asias we're going to deprecate the old repair API, and tablet tests should all switch to the new one, right?
should we wire our native nodetool repair tool into the new api?
#22905 ?
But this is not good enough. We can't have 'unsafe' repair API - either we fix it (and get the above into 2025.1.0!) or we make it safe.
AFAIK, the plan is to deprecate the old unsafe API. If repair doesn't go through the coordinator like it does with the new API, we're susceptible to the races.
@timtimb0t - please confirm that the repair was done via the Manager. Also - which manager version was used here?
@karol-kokoszka , @Michal-Leszczynski - when will Manager use the new Tablets friendly API? We need to get it out there, soon.
Here is the related issue in manager repo scylladb/scylla-manager#4188
There is no PR for it yet.
Repair was initiated via Manager but never finished.
SCT log:
< t:2025-02-19 00:53:59,418 f:nemesis.py l:5569 c:sdcm.nemesis p:DEBUG > sdcm.nemesis.SisyphusMonkey: <<<<<<<<<<<< Finished disruption disrupt_mgmt_repair_cli (MgmtRepairCli nemesis) with status 'failed' <<<<<<<<<<<<
Node log (1 min after the coredump):
2025-02-19T00:49:10.708+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-5 !INFO | scylla-manager-agent[5642]: {"L":"INFO","T":"2025-02-19T00:49:10.417Z","M":"http: proxy error: context canceled"}
2025-02-19T00:49:10.708+00:00 multi-dc-rackaware-with-znode-dc-20-db-node-17b48f6e-5 !INFO | scylla-manager-agent[5642]: {"L":"ERROR","T":"2025-02-19T00:49:10.417Z","N":"http","M":"GET /gossiper/endpoint/live/","from":"10.4.1.113:48894","status":502,"bytes":0,"duration":"97ms","S":"github.com/scylladb/go-log.Logger.log\n\tgithub.com/scylladb/go-log@v0.0.7/logger.go:101\ngithub.com/scylladb/go-log.Logger.Error\n\tgithub.com/scylladb/go-log@v0.0.7/logger.go:84\nmain.(*logEntry).Write\n\tgithub.com/scylladb/scylla-manager/v3/pkg/cmd/agent/log.go:53\nmain.newRouter.RequestLogger.RequestLogger.func5.1.1\n\tgithub.com/go-chi/chi/v5@v5.1.0/middleware/logger.go:52\nmain.newRouter.RequestLogger.RequestLogger.func5.1\n\tgithub.com/go-chi/chi/v5@v5.1.0/middleware/logger.go:56\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2220\ngithub.com/go-chi/chi/v5.(*Mux).ServeHTTP\n\tgithub.com/go-chi/chi/v5@v5.1.0/mux.go:90\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:3210\nnet/http.(*conn).serve\n\tnet/http/server.go:2092"}
Manager version according to log:
"Scylla Manager Agent","version":"3.4.1-0.20250107.48d43ab3e"
@karol-kokoszka , @Michal-Leszczynski - when will Manager use the new Tablets friendly API? We need to get it out there, soon.
We will make an SM release shortly after the Scylla version that exposes the new tablet-friendly API is released.
I am finishing the work on the PR locally and will send it for review during this week.
For 2025.1.0, we will use it only for the full tablet table repair (#22418).
For 2025.1.1, we will use it for the partial tablet table repair as well (#22417).
Releasing after the Scylla release makes less sense than releasing before it...
I opened #23008 for refusing tablet repair via the old repair.
Any other action item for my team here?
Document this?
AFAIK we don't have documentation for our API other than the self-generated one, which will be updated to mention that this repair API will not be supported for tablet tables.
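The safety net proposed in #23008 (refusing tablet repair via the old repair path) can be pictured roughly as below. This is only an illustrative Python sketch of the idea, not Scylla's implementation; every name in it (`LegacyRepairRefused`, `ensure_legacy_repair_allowed`, `tablet_keyspaces`) is hypothetical.

```python
from typing import Set


class LegacyRepairRefused(Exception):
    """Raised when the old repair path is asked to repair a tablet keyspace."""


def ensure_legacy_repair_allowed(keyspace: str, tablet_keyspaces: Set[str]) -> None:
    """Reject the request up front instead of racing with tablet operations.

    `tablet_keyspaces` is a hypothetical stand-in for however the server
    knows which keyspaces use tablets.
    """
    if keyspace in tablet_keyspaces:
        raise LegacyRepairRefused(
            f"keyspace {keyspace!r} uses tablets; use the tablet-aware repair API"
        )
```

With such a check in place, a client still using the old API would get an immediate error for a tablet keyspace rather than starting a repair that can race with activities like split.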
Had the same issue in a scale test with nemesis.
First had a manager repair that never completed (as part of the disrupt_run_unique_sequence nemesis), then started to hit coredumps with the messages:
!ERR | scylla[5434]: [shard 0:strm] table - Unable to load SSTable /var/lib/scylla/data/keyspace1/standard1-f404a0b0f2ae11ef9821afbf3cfb5e1e/me-3go2_17sw_4e6g02mq6pg2pzdoja-big-Data.db that belongs to tablets 100 and 101, at: 0x257814e 0x2577be0 0x2b367d7 0x2eb6124 0x207d681 0x207cd1b 0x2a694a1 0x2a692be 0x1f85d90 0x30851e7 0x3083efc 0x2c51b79 0x2c51552 0x29273de 0x292f48d /opt/scylladb/libreloc/libc.so.6+0x2a087 /opt/scylladb/libreloc/libc.so.6+0x2a14a 0x309dd84
IIUC we need a new manager release, don't we?
@mykaul what's the plan for testing? Shall we wait for the new Manager release?
Also, what happens if one tries old manager with 2025.1?
BTW, why is it still "triage" and how come is it only P2?
Because no one touched the labels since the issue was created.
I can up the priority, but it's not clear how that would help. All the coding has already been done; we are waiting for releases.
There is #23008 too, but that is just a safety net; it will not have an effect on Manager testing.
IIUC we need a new manager release, don't we? @mykaul what's the plan for testing? Shall we wait for the new Manager release?
This was already discussed above - #22954 (comment)
Also, what happens if one tries old manager with 2025.1?
It's not discussed.
The plan according to that comment is to release after 2025.1.0, but we need to test 2025.1.
Luckily I participated in the Manager planning meeting that ended a minute ago, where we did discuss this.
!ERR | scylla[5434]: [shard 0:strm] table - Unable to load SSTable /var/lib/scylla/data/keyspace1/standard1-f404a0b0f2ae11ef9821afbf3cfb5e1e/me-3go2_17sw_4e6g02mq6pg2pzdoja-big-Data.db that belongs to tablets 100 and 101, at: 0x257814e 0x2577be0 0x2b367d7 0x2eb6124 0x207d681 0x207cd1b 0x2a694a1 0x2a692be 0x1f85d90 0x30851e7 0x3083efc 0x2c51b79 0x2c51552 0x29273de 0x292f48d /opt/scylladb/libreloc/libc.so.6+0x2a087 /opt/scylladb/libreloc/libc.so.6+0x2a14a 0x309dd84
Looks similar to #22707