Few comments
freud14 opened this issue · comments
Hi,
Here are a few comments on the library.
Let's start with the big one. I noticed that the documentation website documents almost everything in the library. Is that on purpose? In software engineering, documenting something signals that you intend to support it and that its interface should remain stable. In my opinion, you should document only the parts that are essential to the library and keep the other parts as undocumented backends. Maybe it is because you want to support training, as we can see in issue #11, so you intend for the user to use these parts for training? Anyway, just some thoughts for you. I would like to know what your intentions are.
Now, a few comments on the code I looked at.
I think there should be default values for the parameters of the AddressParser class. What I would suggest is that it use device 0 by default if it exists, and otherwise fall back to the CPU. Maybe choose a default model too.
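To make the suggestion concrete, here is a minimal sketch of the fallback logic I have in mind. Everything in it is hypothetical: `resolve_device`, `ParserConfig`, `DEFAULT_MODEL` and the `available_gpus` argument are illustrative names, not part of the library. In real code the GPU count would presumably come from something like `torch.cuda.device_count()`; it is passed in explicitly here so the sketch runs without any dependency.

```python
from typing import Optional

# Hypothetical placeholder; the real default model name is up to the library.
DEFAULT_MODEL = "default-model"


def resolve_device(device: Optional[int] = 0, available_gpus: int = 0) -> str:
    """Return 'cuda:<n>' if the requested GPU index exists, else 'cpu'."""
    if device is not None and 0 <= device < available_gpus:
        return f"cuda:{device}"
    return "cpu"


class ParserConfig:
    """Hypothetical stand-in showing constructor defaults, not the real AddressParser."""

    def __init__(
        self,
        model: str = DEFAULT_MODEL,
        device: Optional[int] = 0,
        available_gpus: int = 0,
    ) -> None:
        self.model = model
        # Falls back to the CPU when the requested GPU is not present.
        self.device = resolve_device(device, available_gpus)
```

With defaults like these, a user could construct the parser with no arguments and get GPU 0 when one is present, or the CPU otherwise.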
When tagging an address, it would be nice if the return value were a dictionary where the keys are the tags and the values are the words. For instance, instead of this:
{'350 rue des Lilas Ouest Québec Québec G1L 1B6': {'350': 'StreetNumber',
'rue': 'StreetName',
'des': 'StreetName',
'Lilas': 'StreetName',
'Ouest': 'StreetName',
'Québec': 'Province',
'G1L': 'PostalCode',
'1B6': 'PostalCode'}}
it could be something like this:
{'350 rue des Lilas Ouest Québec Québec G1L 1B6': {'StreetNumber': '350',
'StreetName': 'rue des Lilas Ouest',
'Province': 'Québec',
'PostalCode': 'G1L 1B6'}}
Notice how words sharing a tag were merged and how the keys and values are inverted. Maybe there could be a flag for getting the output the other way around. Or, better yet, you could return an object.
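The merge could be sketched roughly as follows. This is only an illustration: `group_by_tag` is a hypothetical helper, and it assumes the tagger can provide an ordered sequence of (word, tag) pairs rather than the word-keyed dictionary shown above.

```python
from typing import Dict, Iterable, List, Tuple


def group_by_tag(tagged_words: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Merge words that share a tag, keeping the words in their original order."""
    grouped: Dict[str, List[str]] = {}
    for word, tag in tagged_words:
        grouped.setdefault(tag, []).append(word)
    # Join the words of each tag into a single space-separated string.
    return {tag: " ".join(words) for tag, words in grouped.items()}


# The (word, tag) pairs from the example in this issue.
pairs = [
    ("350", "StreetNumber"),
    ("rue", "StreetName"),
    ("des", "StreetName"),
    ("Lilas", "StreetName"),
    ("Ouest", "StreetName"),
    ("Québec", "Province"),
    ("G1L", "PostalCode"),
    ("1B6", "PostalCode"),
]
result = group_by_tag(pairs)
```

Working from pairs rather than a word-keyed dictionary also avoids losing a word that appears twice in the address, since duplicate dictionary keys overwrite each other.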
Alright, that's it for now.