SQL syntax error:
lamaeldo opened this issue
What happens?
When running training sessions of M, I get the following error:
splink.exceptions.SplinkException: Error executing the following sql for table __splink__m_u_counts
(__splink__m_u_counts_f8b550e08):
CREATE TABLE __splink__m_u_counts_f8b550e08
AS
(WITH __splink__df_comparison_vectors as (select * from __splink__df_comparison_vectors_2b87b76bf),
__splink__df_match_weight_parts as (
select "source_dataset_l","source_dataset_r","unique_id_l","unique_id_r",match_key
from __splink__df_comparison_vectors
),
__splink__df_predict as (
select
log2(cast(0.0006382379750372257 as float8) * ) as match_weight,
CASE WHEN THEN 1.0 ELSE (cast(0.0006382379750372257 as float8) * )/(1+(cast(0.0006382379750372257 as float8) * )) END as match_probability,
"source_dataset_l","source_dataset_r","unique_id_l","unique_id_r",match_key
from __splink__df_match_weight_parts
order by 1
)
select 0 as comparison_vector_value,
sum(match_probability * 1) /
sum(1) as m_count,
sum((1-match_probability) * 1) /
sum(1) as u_count,
'_probability_two_random_records_match' as output_column_name
from __splink__df_predict
)
Error was: Parser Error: syntax error at or near ")"
It seems like there's a term missing in the calculation of the match_weight.
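One way SQL of this shape could arise, offered only as a guess at the mechanism: if the product of per-comparison Bayes factors is built by joining a list of column expressions, an empty list leaves nothing after the `*`. That would also fit the `__splink__df_match_weight_parts` CTE above, which selects no comparison columns at all. The helper below is hypothetical and is not Splink's actual code; the name `build_match_weight_sql` and the `bf_*` column names are made up for illustration.

```python
def build_match_weight_sql(prior_odds, bayes_factor_columns):
    """Hypothetical helper (not Splink's code): build a match_weight
    expression as prior odds times the product of Bayes factor columns."""
    # Joining an empty list yields "", which leaves a dangling "*" in the SQL.
    product = " * ".join(bayes_factor_columns)
    return f"log2(cast({prior_odds} as float8) * {product}) as match_weight"


# With comparison columns present, the expression is well formed.
ok_sql = build_match_weight_sql(
    "0.0006382379750372257", ["bf_first_name", "bf_surname"]
)

# With no comparison columns, the expression has the same shape as the
# failing line in the error message.
broken_sql = build_match_weight_sql("0.0006382379750372257", [])

print(ok_sql)
print(broken_sql)
```

If this guess is right, the thing to check is whether the model had any comparisons left to train on when the failing session started.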
To Reproduce
I am using Splink 3.9.14 and DuckDB 10.2. I was actually debugging a separate performance issue (inference was taking about 10 minutes for ~40M comparisons, when I expected to manage around a billion in that time based on the benchmarks). I had made some edits to Splink's code, so I uninstalled and reinstalled the package.
I have a list of rules, and I loop over them, calling linker.estimate_parameters_using_expectation_maximisation(rule) for each. The full notebook is attached:
error_reproduce.ipynb.txt
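A minimal sketch of the loop described above, written so that one failing rule does not stop the rest and the offending rule can be identified. It assumes `linker` is an already configured Splink 3 linker and `rules` is a list of blocking-rule strings; `train_over_rules` is a hypothetical helper and is not taken from the attached notebook.

```python
def train_over_rules(linker, rules):
    """Run one EM training session per rule and collect any failures.

    `linker` is assumed to expose Splink 3's
    estimate_parameters_using_expectation_maximisation method.
    Returns a dict mapping each failing rule to the exception it raised.
    """
    failures = {}
    for rule in rules:
        try:
            linker.estimate_parameters_using_expectation_maximisation(rule)
        except Exception as exc:  # broad on purpose: we only want to log and move on
            failures[rule] = exc
    return failures
```

Printing the returned dict after a run shows which rule, if any, produced the malformed SQL.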
OS:
Windows 11
Splink version:
3.9.14
Have you tried this on the latest master branch?
- I agree
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- I agree
This fixed itself on my end with seemingly no change, so I assume you won't be able to reproduce it.