facebookresearch / LASER

Language-Agnostic SEntence Representations

embed_sentences Path parameters

julianpollmann opened this issue

embed_sentences() in embed.py takes a pathlib.Path for some parameters (e.g. ifname), but passing a Path/PosixPath results in an error.

Steps to reproduce:

from pathlib import Path

embed.embed_sentences(
    ifname=Path("some/overlaps/path.en"),
    encoder=encoder_model,
    token_lang=token_lang,
    spm_model=Path("spm/model/path/laser2.spm"),
    output=Path("some/output/path.emb"),
    verbose=True,
)

This leads to the following error:
can only concatenate str (not "PosixPath") to str
The error is caused by passing a Path to Token(), which then runs a subprocess with this Path.
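
The failure mode can be reproduced without LASER. The sketch below is not the LASER code: build_command and build_command_fixed are hypothetical helpers, and "tokenize.sh" is a made-up script name. It only assumes that the command line is assembled by string concatenation, as the error message suggests, and shows that converting the path with os.fspath() avoids the TypeError.

```python
import os
from pathlib import Path


def build_command(script: str, input_path) -> str:
    # Hypothetical helper: concatenates the path directly into a command string.
    # Raises TypeError when input_path is a Path, because str + Path is undefined.
    return script + " < " + input_path


def build_command_fixed(script: str, input_path) -> str:
    # Same helper, but os.fspath() turns a str or Path into a plain str first.
    return script + " < " + os.fspath(input_path)


path = Path("some/overlaps/path.en")

try:
    build_command("tokenize.sh", path)
    message = ""
except TypeError as err:
    # Same class of error as reported in this issue.
    message = str(err)

fixed = build_command_fixed("tokenize.sh", path)
print(message)
print(fixed)
```

Until a fix lands, converting the paths with str() at the call site may serve as a workaround, assuming embed_sentences() accepts plain strings for these parameters.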

Python 3.10.9
Laser latest

Hi @julianpollmann, just commented on the PR!