benbogin / spider-schema-gnn-global

Author implementation of Global Reasoning over Database Structures for Text-to-SQL Parsing

What is the input during decoding if previous step selects a column header / table name?

entslscheia opened this issue

I saw that there is an option self._decoder_use_graph_entities in your code. Based on this part of the code, I guess this option controls whether the decoder input is the specific embedding of the selected column header (or table name) or just a type embedding shared by all columns (or tables). I think your previous paper used the type embedding instead of the entity embedding. Am I right about this?
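To make the two alternatives concrete, here is a minimal sketch in plain Python. It is not the repository's code: the function `previous_action_input`, the lookup tables, and all vector values are hypothetical and chosen only for illustration. Only the flag name echoes the option discussed above.

```python
from typing import Dict, List

# Hypothetical toy vectors. A real parser would learn these, and would
# derive the entity vectors from the database schema.
TYPE_EMBEDDINGS: Dict[str, List[float]] = {
    "column": [1.0, 0.0],
    "table": [0.0, 1.0],
}
ENTITY_EMBEDDINGS: Dict[str, List[float]] = {
    "column_a": [0.9, 0.1],
    "column_b": [0.2, 0.7],
    "table_x": [0.1, 0.8],
}
ENTITY_TYPES: Dict[str, str] = {
    "column_a": "column",
    "column_b": "column",
    "table_x": "table",
}


def previous_action_input(entity: str, decoder_use_graph_entities: bool) -> List[float]:
    """Return the vector fed to the decoder after `entity` was selected."""
    if decoder_use_graph_entities:
        # Entity-specific input: column_a and column_b are distinguishable.
        return ENTITY_EMBEDDINGS[entity]
    # Type-only input: every column (or every table) shares one vector.
    return TYPE_EMBEDDINGS[ENTITY_TYPES[entity]]
```

With the flag off, `column_a` and `column_b` yield the same decoder input. With it on, they yield different vectors, which is the distinction the question is about.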

What I don't understand is why there is such an option. I mean, why should we consider using only a type embedding as input? It seems too coarse. Intuitively, I feel it's essential to know exactly which entity was selected in the previous step, not just its type. For example, selecting column_a and selecting column_b should influence the subsequent prediction differently. Also, I think all the other papers on Spider that I've read use the embedding of the specific entity instead of a type embedding.

I am really confused about this. Any suggestions will be greatly appreciated.

Hey, actually this setting is set to true in both papers (for the previous one, see https://github.com/benbogin/spider-schema-gnn/blob/master/train_configs/defaults.jsonnet#L32).

Indeed, the parser will work worse without this option, but the difference will not be too big. The reason is that the decoder decodes keywords based on the entity linker score. The embedding you're referring to is only given to the decoder after a column/table has been decoded, but most of the decision is based on the entity linker, which bases its score on the score given to each column/table with respect to the attended words.
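As a rough illustration of that point, the sketch below scores each column by an attention-weighted sum of per-word linking scores. This is an assumption about the general shape of such a linker, not the repository's actual formula: the function `entity_scores`, the column names, and every number are hypothetical.

```python
from typing import Dict, List


def entity_scores(attention: List[float],
                  linking: Dict[str, List[float]]) -> Dict[str, float]:
    """Score each entity as the attention-weighted sum of its linking scores.

    attention[i]       : decoder attention weight on question word i
    linking[entity][i] : similarity between the entity and question word i
    """
    return {
        entity: sum(a * s for a, s in zip(attention, sims))
        for entity, sims in linking.items()
    }


# Hypothetical question and scores, for illustration only.
question = ["show", "names", "of", "singers"]
attention = [0.1, 0.6, 0.1, 0.2]
linking = {
    "singer.name": [0.0, 0.9, 0.0, 0.3],
    "singer.age": [0.0, 0.1, 0.0, 0.3],
}

scores = entity_scores(attention, linking)
best = max(scores, key=scores.get)  # the column the decoder would pick
```

Nothing in this score depends on the embedding of the previously decoded entity. That embedding only enters as input to the next decoding step, which is consistent with the small difference described above.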

Hope that helps

commented

That makes sense. Many thanks!