fhamborg / Giveme5W1H

Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?

Results are not encouraging

iitrsamrat opened this issue

Describe the bug
The extracted answers for who did what to whom, when, where, and how are not encouraging.
Who - should not have NER tag O
Whom - should not have NER tag O
Where - should not have NER tag O
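
To make the filtering rule above concrete, here is a minimal sketch of the post-filter I have in mind. It does not use the Giveme5W1H API: the `Candidate` tuple, its `ner` field, and the sample values are hypothetical stand-ins for whatever the extractor returns, and the sample candidates are illustrative, not real output.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical container for one extracted answer (not a library type)."""
    question: str  # e.g. 'who', 'whom', 'where', 'what'
    text: str      # the extracted phrase
    ner: str       # NER tag of the phrase; 'O' means "not a named entity"
    score: float   # extractor score


# Questions whose answers are expected to be named entities.
ENTITY_QUESTIONS = {"who", "whom", "where"}


def drop_non_entities(candidates: List[Candidate]) -> List[Candidate]:
    """Drop who/whom/where candidates tagged 'O'; keep all other questions."""
    return [
        c for c in candidates
        if c.question not in ENTITY_QUESTIONS or c.ner != "O"
    ]


# Illustrative input only (assumed tags, not taken from a real run).
candidates = [
    Candidate("who", "Tomy", "PERSON", 1.0),
    Candidate("who", "it", "O", 1.0),
    Candidate("where", "Indian Ocean", "LOCATION", 0.7),
    Candidate("what", "was hit by a vicious storm", "O", 1.0),
]
kept = drop_non_entities(candidates)
```

Only the who/whom/where candidates are filtered; "what" answers are verb phrases and would all be tagged O, so they pass through unchanged.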

Also, when run on a whole news corpus, the results do not make any sense. Running on individual sentences produces better results.

To Reproduce
The results below were obtained after filtering Who, Whom, Where, etc. by NER tag.
Run the extractor on the following sample text to see the output.
Running the extractor sentence by sentence gives slightly better results, but it still does not extract all events properly.
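
For reference, the sentence-by-sentence run can be sketched roughly as follows. This is an assumption-laden outline, not the exact script: `extract_fn` is a hypothetical callable that wraps the extractor and returns answers per question, and the regex splitter is deliberately naive (it will also split after abbreviations such as "Q.").

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical answer shape: question -> list of (text, score) pairs.
Answers = Dict[str, List[Tuple[str, float]]]


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def extract_per_sentence(
    text: str, extract_fn: Callable[[str], Answers]
) -> List[Tuple[str, Answers]]:
    """Run the supplied extractor on each sentence and pair it with its answers."""
    return [(sentence, extract_fn(sentence)) for sentence in split_sentences(text)]


def fake_extract(sentence: str) -> Answers:
    """Stand-in for the real extractor: treats the first token as 'who'."""
    return {"who": [(sentence.split()[0], 1.0)]}


results = extract_per_sentence("He was rescued. It took three days.", fake_extract)
```

Keeping each sentence next to its answers makes it easier to see which events are missed, at the cost of losing cross-sentence context such as pronoun references.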
Article 1:
Indian Navy Commander Abhilash Tomy, who suffered a severe back injury in September after his yacht was hit by a vicious storm with 14-metre-high waves mid-way across south Indian Ocean, says it will take him another couple of years to fully recover from the injury.

He was participating in the Golden Globe Race 2018(GGR) representing India in the historic race without modern navigation aids. He had to drop out of the competition due to the severity of the hit and was rescued after three days from Indian Ocean.

In an exclusive interview to CNN-News18, Commander Tomy says the experience of going through the storm was a once-in-a-lifetime moment and a complete ‘paisa vasool’. Excerpts:

Q. Yours has been an incredible journey. How does it feel to be alive?

It's good.

Not yet, not completely. My neurosurgeon has said that I've recovered, but I think I'm not back to being normal yet. It'll take another couple of years to be normal and then I'll start working on my fitness.

Q. You're the only Asian to have competed in the Golden Globe. Do you remember the moment when it happened? What you were thinking back then?

Very well. I've been through storms before…I usually look forward to storms. It's a once-in-a-life moment. You get a 30 knot storm almost every week, but a 70 knot storm is...

Q. It doesn't sound fun at all. Commander Tomy: For me, it's paisa vasool. I get my money back if I go through a storm like that. Otherwise I can always stay at home, you know what I mean. We were given some sort of warning about the storm. Nobody had any idea that this would turn out to be so bad. So the night before the 20th, I prepared the boat, took the main sail down and latched it completely. I checked the mast, checked all the split pins, I checked for cracks, damages or anything that could go wrong. I did full inspection of the boat and was prepared for the storm the next day.

When you're in a storm like this, you adjust the boat in the direction of the waves so that the boat would sail. With waves coming from two directions, it became very difficult to judge which the better way to put the boat is. If you put it in one direction, one wave slaps you. The other way round, the other wave slaps you. The first wave got to me close to noon (India time). The boat had a knock down, its boom broke. I went inside and it was a complete mess, everything was thrown out everywhere. I put things back in place ... the gas was leaking and I put the galley back in place, switched off the gas.

The side glass broke and diesel was leaking out, float boats were stuck on the roof. I put everything back, all the charts, everything back in place. Then I went out and started sailing the boat again and there was a second knock down and I found myself on top of the mast. I fell from there on top of the boom. The mast was about nine meters high and the waves were 15 meters high. I don't know by what height I fell. But I landed on the boom, fell on the deck and I thought there was something wrong with my back. I went inside and again started cleaning up the mess. Finally, after 30 minutes, I stood up and my knees were not obeying me. I was collapsing. I thought I would lie down for some time and maybe after 30 minutes I'll sail again. But it didn't improve. So I pulled myself on to the bunk and secured myself there.

Q. What happened in those three days when you were in the ocean all by yourself? It was like a black hole...how did you survive? What do you drink, eat? What was going on in your mind?

I'm lying in my bunk and the race organisers (are) asking questions and they want me to activate my EPER, which is the emergency processing indication radiator beacon. I don't know the extent of my injury — I think it's the lower back as my back has become stiff. And I decide to wait for a day, 24 hours. And maybe if the back is better, I can take the boat to Mauritius or to Australia.

Q. At any point during those three days did you think, ‘Ab toh main gaya’ (I may not survive)?

Not a chance. I'm a reconnaissance pilot in the navy. I've done this exercise so many times — gone looking for survivors, been the survivor, you know, in simulated drills. I know how this plays out.

Q. You clearly have recovered from the trauma. Has your family given an ultimatum of no more sailing?

There’s nothing like that. In fact, I was speaking to my wife one day and was kind of testing the waters. I told her, ‘Five years ago if I would have introduced myself, say as an IT engineer or something like that, would you have married me?’ She said no. I told her, ‘See, I'm a sailor, that's why you like me. So maybe I should go back to sailing’. Then she got the drift and said, ‘You get better, get fit, then maybe you can’.

Q. What about your mom?

We were having this conversation day before yesterday. She asked me, ‘I hope you're not planning to do something like this again’. And I said, ‘Maybe if I get fit I could think of these things’. And mom said that's

Output:
-- who --
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
('Tomy', 1.0)
-- what --
('was hit by a vicious storm', 1.0)
('has said that I', 1.0)
("'ve recovered", 1.0)
('think I', 1.0)
("'m not back to being normal yet", 1.0)
('was participating in the Golden Globe Race 2018 in the historic race without modern navigation aids', 1.0)
('will take him', 1.0)
("'ll take another couple and then I", 1.0)
("'ll start working on my fitness", 1.0)
('had to drop out of the competition and was rescued after three days', 1.0)
('has been an incredible journey', 1.0)
("'s good", 1.0)
("'ve been through storms before … I", 1.0)
('look forward to storms', 1.0)
("'re the only Asian", 1.0)
('happened', 1.0)
('get my money', 1.0)
('go through a storm like that', 1.0)
('were thinking back then', 1.0)
('can always stay at home', 1.0)
('mean', 1.0)
("'s a once-in-a-life moment", 1.0)
('prepared the boat , took the main sail and latched it', 1.0)
('get a 30 knot storm', 1.0)
('checked the mast , checked all the split pins , I', 1.0)
('checked for cracks', 1.0)
("does n't sound fun", 1.0)
('did full inspection and was prepared for the storm', 1.0)
("'s paisa vasool", 1.0)
('know what I', 1.0)
('went inside', 1.0)
('put things', 1.0)
('put the galley', 1.0)
('would sail', 1.0)
("'re in a storm", 1.0)
('adjust the boat', 1.0)
('became very difficult to judge', 1.0)
('put everything', 1.0)
('put it in one direction', 1.0)
('went out and started sailing the boat and there was a second knock down and I', 1.0)
('found myself on top of the mast', 1.0)
('slaps you', 1.0)
('fell from there', 1.0)
('had a knock down', 1.0)
('broke', 1.0)
('got to me close to noon ( India time', 1.0)
("do n't know by what height I", 1.0)
('fell', 1.0)
('was a complete mess', 1.0)
('landed on the boom , fell on the deck and I', 1.0)
('thought there', 1.0)
('was thrown out everywhere', 1.0)
('went inside and again started cleaning up the mess', 1.0)
('was leaking', 1.0)
('stood up', 1.0)
('were not obeying me', 1.0)
('was collapsing', 1.0)
('thought I', 1.0)
('would lie down for some time and maybe after 30 minutes', 1.0)
("'ll sail again", 1.0)
('pulled myself on to the bunk and secured myself', 1.0)
('was about nine meters', 1.0)
("'m lying in my bunk", 1.0)
('to activate my EPER', 1.0)
("do n't know the extent", 1.0)
('think it', 1.0)
('has become stiff', 1.0)
('decide to wait for a day , 24 hours', 1.0)
('were in the ocean all by yourself', 1.0)
('can take the boat or to Australia', 1.0)
('may not survive', 1.0)
("'m a reconnaissance pilot", 1.0)
("'ve done this exercise", 1.0)
('know how this', 1.0)
("'s the lower back as my back", 1.0)
('is better', 1.0)
('was speaking to my wife and was kind', 1.0)
('told her', 1.0)
('would have introduced myself', 1.0)
("'m a sailor", 1.0)
('have recovered from the trauma', 1.0)
('should go back to sailing', 1.0)
('hope you', 1.0)
('get fit I', 1.0)
('could think of these things ’', 1.0)
('get better , get fit', 1.0)
('can ’', 1.0)
('got the drift and said , ‘ You , then maybe you', 1.0)
("'re not planning to do something again", 1.0)
('asked me', 1.0)
('expects of me', 1.0)
-- where --
('Indian Ocean', 0.7)
('India', 0.7)
-- why --
('his yacht', 0.5860000000000001)
('it', 0.5860000000000001)
('He', 0.5860000000000001)
('Commander Tomy', 0.5860000000000001)
('the experience of going through the storm', 0.5860000000000001)
('Q. Yours', 0.5860000000000001)
('you', 0.5860000000000001)
('I', 0.5860000000000001)
('It', 0.5860000000000001)
('You', 0.5860000000000001)
('We', 0.5860000000000001)
('The mast', 0.5860000000000001)
('there', 0.5860000000000001)
('me', 0.5860000000000001)
-- how --
('a severe back injury in September after his yacht was', 1.0)
('a vicious storm with 14-metre-high waves mid-way across south Indian', 1.0)
('with 14-metre-high waves mid-way across south Indian Ocean , says', 1.0)
('waves mid-way across south Indian Ocean , says it will', 1.0)
('across south Indian Ocean , says it will take him', 1.0)
('to fully recover from the injury .', 1.0)
('the historic race without modern navigation aids .', 1.0)
('without modern navigation aids .', 1.0)
('from Indian Ocean .', 1.0)
('not back to being normal yet .', 1.0)
('competition due to the severity of the hit and was', 1.0)
('a once-in-a-lifetime moment and a complete ‘ paisa vasool ’', 1.0)
('a complete ‘ paisa vasool ’ .', 1.0)
('an exclusive interview to CNN-News18 , Commander Tomy says the', 1.0)
('an incredible journey .', 1.0)
('be alive ?', 1.0)
('thinking back then ?', 1.0)
("'s good .", 1.0)
('Not yet , not completely .', 1.0)
('not completely .', 1.0)
('being normal yet .', 1.0)
('normal yet .', 1.0)
("be normal and then I 'll start working on my", 1.0)
('only Asian to have competed in the Golden Globe .', 1.0)
('money back if I go through a storm like that', 1.0)
('Very well .', 1.0)
('I usually look forward to storms .', 1.0)
('look forward to storms .', 1.0)
('a once-in-a-life moment .', 1.0)
('storm almost every week , but a 70 knot storm', 1.0)
('Otherwise I can always stay at home , you know', 1.0)
('can always stay at home , you know what I', 1.0)
('so bad .', 1.0)
('the main sail down and latched it completely .', 1.0)
('it completely .', 1.0)
('go wrong .', 1.0)
('did full inspection of the boat and was prepared for', 1.0)
('very difficult to judge which the better way to put', 1.0)
('the better way to put the boat is .', 1.0)
('things back in place ... the gas was leaking and', 1.0)
('galley back in place , switched off the gas .', 1.0)
('everything back , all the charts , everything back in', 1.0)
('everything back in place .', 1.0)
('went inside and it was a complete mess , everything', 1.0)
('a complete mess , everything was thrown out everywhere .', 1.0)
('out everywhere .', 1.0)
('meters high and the waves were 15 meters high .', 1.0)
('meters high .', 1.0)
('something wrong with my back .', 1.0)
('went inside and again started cleaning up the mess .', 1.0)
("and maybe after 30 minutes I 'll sail again .", 1.0)
('Finally , after 30 minutes , I stood up and', 1.0)
('the lower back as my back has become stiff .', 1.0)
('lower back as my back has become stiff .', 1.0)
('a black hole ... how did you survive ?', 1.0)
('And maybe if the back is better , I can', 1.0)
('become stiff .', 1.0)
('I pulled myself on to the bunk and secured myself', 1.0)
('maybe I should go back to sailing', 1.0)
('is better , I can take the boat to Mauritius', 1.0)
("me , it 's paisa vasool", 1.0)
('toh main gaya ’ ( I may not survive )', 1.0)
("I 've recovered , but I think I 'm not", 1.0)
('so many times — gone looking for survivors , been', 1.0)
('in simulated drills .', 1.0)
('You clearly have recovered from the trauma .', 1.0)
('So maybe I should go back to sailing ’ .', 1.0)
('go back to sailing ’ .', 1.0)
('get better , get fit , then maybe you can', 1.0)
('then maybe you can ’ .', 1.0)
('‘ Maybe if I get fit I could think of', 1.0)
('get fit I could think of these things ’ .', 1.0)
("'s exactly what she expects of me .", 1.0)
('the', 1.0)
('this would turn out to be so bad', 1.0)
('some time and maybe after', 1.0)
('the storm', 1.0)
('cracks , damages or anything that could go wrong', 1.0)
('survivors , been the survivor , you know , in', 1.0)
('his yacht was hit by a vicious storm with 14-metre-high', 1.0)
('my back has become stiff', 1.0)
('an IT engineer or something like that , would you', 1.0)
('that the boat would sail', 1.0)

Expected behavior
Expected a more meaningful who-did-what-to-whom listing, one event at a time.

Versions (please complete the following information):
OS: MacOS 10.13.6
Python Version 3.6

Have a look at our paper, which you can find in our readme. We gladly accept PRs improving the extraction performance.