pzelasko / daseg

Dialog Acts SEGmentation: Tools for dialog act research

Simple Inference from a transcript

YoussephAhmed opened this issue

First of all, thank you for your awesome work. Second, is there a sample for running simple inference: input a full transcript and output the DA boundaries and classes?
All I found in the command-line examples was for evaluation, which needs the entire datasets.
Thanks in advance.

Hmm, I don't think I have a script that processes a raw transcript, but you should be able to get away with writing something like:

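from pathlib import Path

from daseg.data import DialogActCorpus, Call, FunctionalSegment
from daseg import TransformerModel

# Placeholder: point this at your trained model.
model_path = '<your-model-path>'

# Wrap the whole transcript in a single segment; its dialog act is unknown, hence None.
dialog = DialogActCorpus(dialogues={
    'call0': Call([
        FunctionalSegment(text='<your-full-transcript>', dialog_act=None, speaker='A'),
    ])
})

model = TransformerModel.from_path(Path(model_path), device="cpu")

results = model.predict(
    dataset=dialog,
    batch_size=1,
)

# The recognized dialog, as returned by predict().
dialog_recognized = results["dataset"]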

Thanks for the quick response. When I passed None as the dialog_act argument, I got a KeyError, so I added this condition to the convert_examples_to_features function in utils_ner.py:

tokens.extend(word_tokens)
if label is None:
    label = label_list[0]
word_labels = [label_map[label]] + [pad_token_label_id] * (len(word_tokens) - 1)

I'm not sure whether this is the correct fix, but it works now. I also wasn't sure which key to pick from label_list, so I chose the first one. I hope this doesn't affect the inference results :)

It should be OK; the input value of dialog_act is discarded during inference anyway.
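
For anyone hitting the same KeyError, here is a minimal standalone sketch of the fallback described above. It is not the actual code from utils_ner.py: the helper name `word_label_ids`, the example label names, and the pad id of -100 (the default `ignore_index` of PyTorch's `CrossEntropyLoss`) are assumptions made for illustration. It only shows that a missing label gets mapped to a placeholder id for the first sub-token of a word, which matches the reply above that the input dialog act is not used at inference time.

```python
from typing import List, Optional

# Assumed pad label id; -100 is the default ignore_index of PyTorch's CrossEntropyLoss.
PAD_TOKEN_LABEL_ID = -100


def word_label_ids(
    word_tokens: List[str],
    label: Optional[str],
    label_list: List[str],
    pad_token_label_id: int = PAD_TOKEN_LABEL_ID,
) -> List[int]:
    """Hypothetical helper: label ids for the sub-tokens of a single word."""
    label_map = {name: i for i, name in enumerate(label_list)}
    if label is None:
        # No gold label at inference time: fall back to an arbitrary placeholder.
        label = label_list[0]
    # Only the first sub-token carries the label; the rest get the pad id.
    return [label_map[label]] + [pad_token_label_id] * (len(word_tokens) - 1)


# Example with made-up label names and a word split into two sub-tokens.
labels = ["O", "B-Statement"]
ids_without_label = word_label_ids(["tran", "##script"], None, labels)
ids_with_label = word_label_ids(["tran", "##script"], "B-Statement", labels)
```

Since the placeholder only fills the label slot of the input features, any entry of label_list would serve equally well here.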

Okay, thanks.