maszhongming / MatchSum

Code for ACL 2020 paper: "Extractive Summarization as Text Matching"

Plug & Play with "model = torch.load('MatchSum_cnndm_bert.ckpt').to(device)"

chrisdoyleIE opened this issue

Hi all,

I can load the model into a Python environment with the line model = torch.load('MatchSum_cnndm_bert.ckpt').to(device), provided that your model.py file is in the same directory and device is cuda.

I want to run some forward passes on a sample document, but I am confused by your input format. For example, the code snippet below yields the following error:

import torch
import transformers
from transformers import BertTokenizer  # transformers>=3.0.0
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = torch.load('MatchSum_cnndm_bert.ckpt').to(device)

with open("some_test_file.txt", "r") as handler:
    input_ids = tok( [handler.read()] )["input_ids"]

test_forward = model(input_ids, candidate_id=None, summary_id=None)  # Error, see below
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-424e3757a895> in <module>
----> 1 model.forward(torch.tensor([0]))

TypeError: forward() missing 2 required positional arguments: 'candidate_id' and 'summary_id'

Any clarification on what the jsonl headers refer to would be greatly appreciated. Specifically, I would like to know how to use the plug-and-play line of code included in your README.md.

Sorry for the confusion. The checkpoint cannot be run directly on a raw document. Before you test, you still need to train a model (or use some other method) to select the important sentences in each document, then use our preprocessing code to build the candidate summaries from those sentences, and only then use our trained model to pick the final summary. You may find the discussion in issue #4 and issue #5 helpful.
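
For anyone trying to picture the candidate step described above, here is a minimal sketch of the idea only, not the repository's preprocessing code. It assumes the important sentences have already been picked by some other method; the function name build_candidates, the example sentences, the selected indices, and the combination sizes (2 and 3) are all hypothetical choices made for illustration. The real pipeline also tokenizes the candidates and writes them to jsonl, which is not shown here.

```python
from itertools import combinations


def build_candidates(sentences, selected_idx, sizes=(2, 3)):
    """Enumerate candidate summaries from pre-selected sentence indices.

    Hypothetical helper: every combination of `k` selected sentences
    (for each k in `sizes`) becomes one candidate, kept in document order.
    """
    candidates = []
    for k in sizes:
        for combo in combinations(sorted(selected_idx), k):
            text = " ".join(sentences[i] for i in combo)
            candidates.append((combo, text))
    return candidates


# Toy document, already split into sentences.
sentences = [
    "The city council approved the new budget on Monday.",
    "The meeting lasted four hours.",
    "Funding for public transport will increase next year.",
    "Several council members opposed the plan.",
    "Coffee was served during the break.",
    "The budget takes effect in January.",
]

# Assumed output of a separate sentence-selection step (not shown).
selected_idx = [0, 2, 3, 5]

candidates = build_candidates(sentences, selected_idx)
for combo, text in candidates[:3]:
    print(combo, "->", text)
```

Each candidate would then be tokenized and scored against the document by the trained model, which is why forward() expects candidate_id alongside the document ids rather than the document alone.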