Supporting multiple instances of a built-in entity

Question

srozga opened this issue 6 years ago · comments

I have a conversational flow that supports collecting data to execute a stock trade. It gathers about six pieces of data, such as an action (buy/sell), expiration (today, good until canceled), type (market, limit), quantity and ticker. If the type is limit, I also need a limit price.

Turns out that limit price and the quantity are both number entities (duh). This is great because I get to use the power that LUIS number entities bring to the picture. However, it gets weird in the API Callback because I get an array of 2 values for that entity. What is the best way to "tag" the builtin.number values? My approach is to create entities called LimitPrice and Quantity and have my trade code resolve the value from the number entity by matching the text of LimitPrice to the right builtin.number entity. Same for Quantity. Is this the best approach?
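
The text-matching idea above could be sketched roughly as follows. This is only an illustration: the `NumberEntity` shape and the `resolveByText` helper are hypothetical names, not part of the SDK, and the real callback payload may look different.

```typescript
// Hypothetical shape for one builtin.number result: the raw text the user
// typed plus the resolved numeric value. Not the SDK's actual type.
interface NumberEntity {
  text: string;
  value: number;
}

// Given the text labeled as LimitPrice (or Quantity), return the resolved
// value of the builtin.number entity whose text matches it.
function resolveByText(labelText: string, numbers: NumberEntity[]): number | undefined {
  const key = labelText.trim().toLowerCase();
  const hit = numbers.find(n => n.text.trim().toLowerCase() === key);
  return hit ? hit.value : undefined;
}

// Example: two numbers extracted from one utterance.
const extracted: NumberEntity[] = [
  { text: "one hundred", value: 100 },
  { text: "54.32", value: 54.32 },
];
const quantity = resolveByText("one hundred", extracted);
const limitPrice = resolveByText("54.32", extracted);
```

One caveat with matching purely by text: if both numbers are written identically (say, quantity 50 at a limit of 50), the match is ambiguous and position in the utterance would be needed as a tie-breaker.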

Thanks!
-s

Lars Liden · Answer 1 · Thu Jun 14 2018 05:02:35 GMT+0800 (China Standard Time)

I'd experiment with training your own custom entities to recognize Limit Price and Quantity and avoid the builtin.number altogether. With the surrounding context, the custom entities should be able to learn the number extraction pretty easily.

Szymon Rozga · Answer 2 · Thu Jun 14 2018 20:19:53 GMT+0800 (China Standard Time)

I'll go with that approach. Should be easy for sure.

One of the benefits of using a builtin.number, or even builtin.datetime, is that I can write the number as numerical or words and it resolves it correctly. Likewise, builtin.datetime lets us do fun things like "tomorrow at 1pm" or "a week from this friday" and resolves the utterance to the right date, time or datetime representation. This is particularly relevant if integrating with voice systems like Alexa (If you use ASK's LITERAL entity, it passes through the utterance as words).

Now, I can definitely import libraries like Chrono or Words To Numbers, but that's extra dev work and devs are lazy. May be worth considering making this easier.

Thanks!
-s

Szymon Rozga · Answer 3 · Thu Jun 14 2018 20:36:03 GMT+0800 (China Standard Time)

This also brings up a second issue: I have trained quite a few dialogs using the builtin.number entity, and it seems I can't go into the UI to edit any of the initial entity extraction. Might be a bug. I'll open an issue in the UI repo.

Lars Liden · Answer 4 · Thu Jun 14 2018 23:28:18 GMT+0800 (China Standard Time)

builtin.datetime is very powerful; I wouldn't suggest trying to learn that. For numbers, however, you should be able to learn written numbers (i.e. "two") with only a handful of examples.

Lars Liden · Answer 5 · Thu Jun 14 2018 23:30:46 GMT+0800 (China Standard Time)

With regard to editing, it's one of the reasons we are currently in limited "Labs" release. There are still some larger features (including editing) that we are currently working on. That being said, you should be able to change the entity extraction (but then you'll lose the remaining half of the dialog). Are you unable to do that?

Szymon Rozga · Answer 6 · Thu Jun 14 2018 23:37:43 GMT+0800 (China Standard Time)

@LarsLiden I opened an issue in the UI repo for it. microsoft/ConversationLearner-UI#605

Szymon Rozga · Answer 7 · Fri Jun 15 2018 03:53:23 GMT+0800 (China Standard Time)

I would guess though that training all types of numbers with utterances like "four hundred fifty two thousands and thirty five" would take a bit of time. Likewise, although I'm using numbers for now, I expect to include the currency entity as well. For example, a trade limit price might be communicated as "fifty four dollars and thirty two cents".

This is for sure a bit of a contrived example, but it seems it should be easier to correlate multiple prebuilt instances of the same type with their intent. Maybe a named entity could be associated with a prebuilt type in certain turns?

Lars Liden · Answer 8 · Fri Jun 15 2018 04:21:10 GMT+0800 (China Standard Time)

We'll put some more thought into this. It's definitely something we need to address.

Szymon Rozga · Answer 9 · Fri Jun 15 2018 04:24:02 GMT+0800 (China Standard Time)

Thanks! I'll work around for now.

Lars Liden · Answer 10 · Sat Jun 16 2018 01:23:42 GMT+0800 (China Standard Time)

Incidentally, the full text string is passed back to the EntityDetectionCallback, so you could probably use a simple adjacency algorithm to predict the association.

Let's say the user types: "4 apples and 3 pears"
You'd get: Number: [4,3] Fruit: [apple, pear]

You can then associate "4" with "apple" and "3" with "pear" based on their positions in the user utterance.
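
A rough sketch of that adjacency idea, assuming the callback gives you the full utterance plus the extracted texts in the order they appear. The `Span`, `locate` and `associate` names are made up for illustration and are not SDK APIs.

```typescript
// Hypothetical helper type: an extracted text and its offset in the utterance.
interface Span {
  text: string;
  start: number;
}

// Locate each value in the utterance, scanning left to right so that repeated
// values map to successive occurrences. Assumes values arrive in utterance order.
function locate(utterance: string, values: string[]): Span[] {
  const spans: Span[] = [];
  let from = 0;
  for (const v of values) {
    const start = utterance.indexOf(v, from);
    if (start === -1) continue; // value not found verbatim; skip it
    spans.push({ text: v, start });
    from = start + v.length;
  }
  return spans;
}

// Pair each number with the closest label that has not been claimed yet.
function associate(utterance: string, numbers: string[], labels: string[]): Map<string, string> {
  const numSpans = locate(utterance, numbers);
  const labelSpans = locate(utterance, labels);
  const used = new Set<number>();
  const result = new Map<string, string>();
  for (const n of numSpans) {
    let best = -1;
    let bestDist = Infinity;
    for (let i = 0; i < labelSpans.length; i++) {
      if (used.has(i)) continue;
      const dist = Math.abs(labelSpans[i].start - n.start);
      if (dist < bestDist) {
        bestDist = dist;
        best = i;
      }
    }
    if (best >= 0) {
      used.add(best);
      result.set(labelSpans[best].text, n.text);
    }
  }
  return result;
}

const pairs = associate("4 apples and 3 pears", ["4", "3"], ["apple", "pear"]);
// pairs maps "apple" to "4" and "pear" to "3"
```

This greedy nearest-neighbor pairing is only one possible heuristic; it works for simple "number then noun" phrasing but would need more care for utterances where the number follows its label or several numbers cluster together.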

Matt Mazzola · Answer 11 · Sat Jun 16 2018 02:09:37 GMT+0800 (China Standard Time)

@srozga Can you explain a bit more about the steps you went through to create these train dialogs?
Specifically, did you do any manual manipulation of entities, such as copying/modifying them using memoryManager, or did you create an entity, use it in a train dialog, and delete it later?

I was looking through the data you sent and saw something suspicious.

On one of your training dialogs there was the input "quote aapl", and aapl was extracted as entity QuoteItem, which is all correct, but somehow the data saved was:

{
    "userText": "aapl",
    "displayText": "aapl",
    "builtinType": "LUIS",  <-- This line is incorrect, and we're not sure how this happened.
    "resolution": {}
}

Other than this odd piece of data, we have confirmed the general issue with viewing pre-builts in train dialogs and are working on a fix.

Szymon Rozga · Answer 12 · Sat Jun 16 2018 04:28:35 GMT+0800 (China Standard Time)

@mattmazzola sure. The API Callbacks that I created either remove entities and add "unknown" entities to symbolize an invalid value or do some sort of processing on the entities and then forget the values. I never did anything to modify the values themselves. During development using the UI I did run into quite a few cases where I had to change some validation logic midway into building out a train dialog, so I decided to abandon the training, rebuild my app, start it again, refresh the UI and build the dialog. I did this a lot. However, I only did this for the trade dialogs. The quote one was pretty vanilla/easy, so I'm not sure what happened.

Just played around with the UI a bit. Created a train dialog and didn't mark QuoteItem, saved that, then retrained with the right QuoteItem, and the export is looking fine.

Hope this helps.

Szymon Rozga · Answer 13 · Sat Jun 16 2018 07:24:57 GMT+0800 (China Standard Time)

One more thought comes up. I think I started this on the 621.0 version of the SDK and updated to latest once I was mostly done with the model. So I'm not sure if it's something related to that upgrade.

Matt Mazzola · Answer 14 · Thu Jun 28 2018 04:16:42 GMT+0800 (China Standard Time)

We have deployed a hotfix to the service to fix the issue with pre-built information.
To make use of the changes you also have to use the updated SDK. The develop branch has an example with the latest SDK, or you can try simply updating the @conversationlearner/sdk package to the next tag. It does have some other breaking changes though.

I think one of them is the renaming of a variable: CONVERSATION_LEARNER_APP_ID to CONVERSATION_LEARNER_MODEL_ID. We also added new onSessionStart and onSessionEndCallback signatures, but if you weren't using those I think it shouldn't require changes.

We're still deciding whether to hotfix the master branch here or simply merge, but at least you should be unblocked.

Szymon Rozga · Answer 15 · Thu Jun 28 2018 07:12:30 GMT+0800 (China Standard Time)

Thank you! I'll take a look this week and provide feedback if something comes up.