A simple length ratio filter shouldn't require manually updating pyhash submodules
kpu opened this issue
I wanted to do a length ratio filter.
From the UI it looks like this is only available through opusfilter, so I ran pip install opusfilter.
That failed because pyhash won't compile.
building '_pyhash' extension
x86_64-pc-linux-gnu-gcc -Wsign-compare -DNDEBUG -march=native -O3 -pipe -fPIC -DSUPPORT_INT128=1 -Isrc/pybind11/include -Isrc/highwayhash -I/tmp/pip-build-env-5gdw27bb/overlay/lib/python3.11/site-packages/pybind11/include -I/home/kpu/hplt/opuscleaner/include -I/usr/include/python3.11 -c src/Hash.cpp -o build/temp.linux-x86_64-cpython-311/src/Hash.o -std=c++17 -fvisibility=hidden -g0 -march=native -std=c++14
In file included from src/pybind11/include/pybind11/cast.h:16,
from src/pybind11/include/pybind11/attr.h:13,
from src/pybind11/include/pybind11/pybind11.h:13,
from src/Hash.h:8,
from src/Hash.cpp:1:
src/pybind11/include/pybind11/detail/type_caster_base.h: In function ‘std::string pybind11::detail::error_string()’:
src/pybind11/include/pybind11/detail/type_caster_base.h:482:26: error: invalid use of incomplete type ‘PyFrameObject’ {aka ‘struct _frame’}
482 | frame = frame->f_back;
| ^~
In file included from /usr/include/python3.11/Python.h:42,
from src/pybind11/include/pybind11/detail/common.h:186,
from src/pybind11/include/pybind11/pytypes.h:12,
from src/pybind11/include/pybind11/cast.h:13:
/usr/include/python3.11/pytypedefs.h:22:16: note: forward declaration of ‘PyFrameObject’ {aka ‘struct _frame’}
22 | typedef struct _frame PyFrameObject;
| ^~~~~~
In file included from /usr/include/python3.11/Python.h:38:
src/pybind11/include/pybind11/pybind11.h: In function ‘pybind11::function pybind11::detail::get_type_override(const void*, const type_info*, const char*)’:
src/pybind11/include/pybind11/pybind11.h:2348:54: error: ‘PyCodeObject’ {aka ‘struct PyCodeObject’} has no member named ‘co_varnames’; did you mean ‘co_names’?
2348 | locals, PyTuple_GET_ITEM(f_code->co_varnames, 0)
| ^~~~~~~~~~~
/usr/include/python3.11/pyport.h:24:38: note: in definition of macro ‘_Py_CAST’
24 | #define _Py_CAST(type, expr) ((type)(expr))
| ^~~~
/usr/include/python3.11/cpython/tupleobject.h:30:38: note: in expansion of macro ‘_PyTuple_CAST’
30 | #define PyTuple_GET_ITEM(op, index) (_PyTuple_CAST(op)->ob_item[index])
| ^~~~~~~~~~~~~
src/pybind11/include/pybind11/pybind11.h:2348:29: note: in expansion of macro ‘PyTuple_GET_ITEM’
2348 | locals, PyTuple_GET_ITEM(f_code->co_varnames, 0)
| ^~~~~~~~~~~~~~~~
In file included from src/Halftime.h:9,
from src/Hash.cpp:16:
src/halftime/halftime-hash.hpp: In instantiation of ‘struct halftime_hash::advanced::{anonymous}::RepeatWrapper<halftime_hash::advanced::{anonymous}::BlockWrapper256, 2>’:
src/halftime/halftime-hash.hpp:842:9: required from ‘void halftime_hash::advanced::{anonymous}::Hash(const uint64_t*, const char*, size_t, uint64_t*) [with BlockWrapper = RepeatWrapper<BlockWrapper256, 2>; unsigned int dimension = 5; unsigned int in_width = 3; unsigned int encoded_dimension = 9; unsigned int out_width = 5; uint64_t = long unsigned int; size_t = long unsigned int]’
src/halftime/halftime-hash.hpp:1039:1: required from ‘void halftime_hash::advanced::V4Avx2(const uint64_t*, const char*, size_t, uint64_t*) [with unsigned int dimension = 5; unsigned int in_width = 3; unsigned int encoded_dimension = 9; unsigned int out_width = 5; uint64_t = long unsigned int; size_t = long unsigned int]’
src/halftime/halftime-hash.hpp:1092:1: required from here
src/halftime/halftime-hash.hpp:869:9: warning: ignoring attributes on template argument ‘halftime_hash::advanced::{anonymous}::BlockWrapper256::Block’ {aka ‘__m256i’} [-Wignored-attributes]
869 | using Block = Repeat<InnerBlock, count>;
| ^~~~~
src/halftime/halftime-hash.hpp: In static member function ‘static halftime_hash::advanced::{anonymous}::EhcBadger<BlockWrapper, dimension, in_width, encoded_dimension, out_width, fanout>::Block halftime_hash::advanced::{anonymous}::EhcBadger<BlockWrapper, dimension, in_width, encoded_dimension, out_width, fanout>::MixOne(Block, Block, uint64_t) [with BlockWrapper = halftime_hash::advanced::{anonymous}::RepeatWrapper<halftime_hash::advanced::{anonymous}::BlockWrapper256, 2>; unsigned int dimension = 6; unsigned int in_width = 3; unsigned int encoded_dimension = 7; unsigned int out_width = 2; unsigned int fanout = 8]’:
src/halftime/halftime-hash.hpp:468:16: note: the ABI for passing parameters with 64-byte alignment has changed in GCC 4.6
468 | static Block MixOne(Block accum, Block input, uint64_t entropy) {
| ^~~~~~
error: command '/usr/bin/x86_64-pc-linux-gnu-gcc' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pyhash
Failed to build pyhash
It appears pyhash has an ancient pybind11, so I updated that submodule.
git clone https://github.com/flier/pyfasthash
cd pyfasthash/
git submodule init
git submodule update
cd src/pybind11
git pull https://github.com/pybind/pybind11.git
cd ../..
pip3 install .
Seems a bit much for a length ratio filter.
You should use this one:
https://github.com/hplt-project/OpusCleaner/blob/main/opuscleaner/filters/src_trg_ratio.json
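For reference, the idea behind a length ratio filter fits in a few lines of plain Python with no compiled dependencies. This is only a sketch and not the code behind src_trg_ratio.json: the function names, the whitespace-token length measure, the tab-separated input format, and the default threshold of 2.0 are all assumptions made for illustration.

```python
def keep_pair(src: str, trg: str, max_ratio: float = 2.0) -> bool:
    """Return True if the longer side has at most max_ratio times
    as many whitespace-separated tokens as the shorter side."""
    src_len = len(src.split())
    trg_len = len(trg.split())
    # Drop pairs with an empty side; the ratio is undefined for them.
    if src_len == 0 or trg_len == 0:
        return False
    return max(src_len, trg_len) / min(src_len, trg_len) <= max_ratio


def filter_pairs(lines, max_ratio: float = 2.0):
    """Yield only the tab-separated 'source<TAB>target' lines that pass
    the ratio check. Lines without exactly two columns are skipped."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            continue
        src, trg = fields
        if keep_pair(src, trg, max_ratio):
            yield line
```

To use it as a stream filter, something like sys.stdout.writelines(filter_pairs(sys.stdin)) would do, after importing sys.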
D'oh