Table to Graph Importer CSV example
karrtikiyer opened this issue · comments
Memgraph version 2.6.1
I am trying to reproduce the one to many example shown here: https://memgraph.com/docs/gqlalchemy/how-to-guides/table-to-graph-importer
address: [] # currently needed, leave [] if no relations to define
individuals:
- foreign_key: # foreign key used for mapping;
column_name: add_id # specifies its column
reference_table: address # name of table from which the foreign key is taken
reference_key: add_id # column name in reference table from which the foreign key is taken
label: LIVES_IN # label applied to relationship created
from_entity: False # (optional) define direction of relationship created
Can someone please help me with CSV files that match this config?
Also, how should a one-to-one relationship be specified? Please help.
Thanks in advance.
Also, the "-" at the start of foreign_key seems to be misplaced in the YAML example above. Can someone help with a proper config file for a one-to-many relationship?
Just to add, this is related to a discussion on Discord.
Also, logically, the individuals and address example above is many to one, wherein 1 individual can have multiple addresses, so shouldn't the foreign key be in the address table? In that case the address: [] entry should instead hold the foreign key to the individual ID.
Could you also please provide an example config and CSV for a many-to-many relationship?
For many-to-many, should the relations be in a separate CSV file, or should we duplicate rows in the referenced table?
E.g. a Student to Teacher relationship is many to many: one student can have many teachers and one teacher can have many students. Where should we place the relationship between student and teacher? Should it go in a separate CSV file?
Hello @karrtikiyer!
For CSV files you can use the CSVLocalFileSystemImporter.
One-to-one relationships can be defined with the one_to_many_relations field. Let me know if you want me to provide an example, but you simply declare the foreign_key in the config file.
You are right about the wrong formatting, the proper way to write it would be:
one_to_many_relations:
  address: []
  individual:
    - foreign_key:
        column_name: add_id
        reference_table: address
        reference_key: add_id
      label: LIVES_IN
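For anyone checking their own config, here is a minimal sketch of how the corrected YAML above looks once parsed into Python, together with a small structural check. The helper check_one_to_many is hypothetical and not part of GQLAlchemy. The required keys are taken only from the example above, so the real importer may accept more fields (such as from_entity).

```python
# Parsed form of the corrected one_to_many_relations block above.
# Each table maps to a list; every list item holds one nested "foreign_key"
# mapping plus a sibling "label" key.
config = {
    "one_to_many_relations": {
        "address": [],
        "individual": [
            {
                "foreign_key": {
                    "column_name": "add_id",
                    "reference_table": "address",
                    "reference_key": "add_id",
                },
                "label": "LIVES_IN",
            }
        ],
    }
}

# Keys seen under foreign_key in the example (assumed to be required).
FOREIGN_KEY_FIELDS = {"column_name", "reference_table", "reference_key"}


def check_one_to_many(cfg):
    """Hypothetical helper: return a list of structural problems (empty if none)."""
    problems = []
    relations = cfg.get("one_to_many_relations", {})
    for table, mappings in relations.items():
        if not isinstance(mappings, list):
            problems.append(f"{table}: expected a list (use [] for no relations)")
            continue
        for i, mapping in enumerate(mappings):
            where = f"{table}[{i}]"
            if not isinstance(mapping, dict):
                problems.append(f"{where}: expected a mapping")
                continue
            foreign_key = mapping.get("foreign_key")
            if not isinstance(foreign_key, dict):
                problems.append(f"{where}: foreign_key must hold nested fields")
            else:
                missing = FOREIGN_KEY_FIELDS - set(foreign_key)
                if missing:
                    problems.append(f"{where}: foreign_key is missing {sorted(missing)}")
            if "label" not in mapping:
                problems.append(f"{where}: label is missing")
    return problems


print(check_one_to_many(config))  # prints []
```

If column_name, reference_table and reference_key end up as siblings of foreign_key instead of being nested under it, foreign_key parses as empty and this check reports it.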
Thanks @brunos252, an example CSV along with the corresponding config would help.
For many_to_many_relations, the config part would look like:
many_to_many_relations:
  incident_individual:
    foreign_key_from:
      column_name: inc_id
      reference_table: incident
      reference_key: inc_id
    foreign_key_to:
      column_name: ind_id
      reference_table: individuals
      reference_key: ind_id
    label: INCIDENT
Here you would need a separate associative table (incident_individual.csv) that contains the foreign key fields, for example:
inc_id,ind_id,relation
72,23,DRIVER
12,21,PASSENGER
...
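To make the role of the associative table concrete, here is a minimal standard-library sketch that turns the two sample rows above into one relationship each. It is not GQLAlchemy's implementation. Treating foreign_key_from as the source side, and keeping the extra relation column as a relationship property, are both assumptions of this sketch.

```python
import csv
import io

# The two sample rows of incident_individual.csv shown above.
ASSOCIATIVE_CSV = """inc_id,ind_id,relation
72,23,DRIVER
12,21,PASSENGER
"""

# The parts of the many_to_many_relations entry that this sketch needs.
MAPPING = {
    "foreign_key_from": {"column_name": "inc_id", "reference_table": "incident"},
    "foreign_key_to": {"column_name": "ind_id", "reference_table": "individuals"},
    "label": "INCIDENT",
}


def associative_rows_to_edges(csv_text, mapping):
    """Return one edge dict per row of the associative table."""
    from_col = mapping["foreign_key_from"]["column_name"]
    to_col = mapping["foreign_key_to"]["column_name"]
    edges = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: columns that are not foreign keys become edge properties.
        properties = {k: v for k, v in row.items() if k not in (from_col, to_col)}
        edges.append(
            {
                "from": (mapping["foreign_key_from"]["reference_table"], row[from_col]),
                "to": (mapping["foreign_key_to"]["reference_table"], row[to_col]),
                "label": mapping["label"],
                "properties": properties,
            }
        )
    return edges


for edge in associative_rows_to_edges(ASSOCIATIVE_CSV, MAPPING):
    print(edge)
```

The Student and Teacher case asked about earlier would follow the same layout: one row per student/teacher pair in its own file (a hypothetical student_teacher.csv), with no rows duplicated in the student or teacher tables.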
The individual.csv for the above example of one_to_many_relations would look like:
ind_id,name,surname,add_id
1,Tomislav,Petrov,1
2,Ivan,Horvat,3
3,Marko,Horvat,3
4,John,Doe,2
5,John,Though,4
While the address.csv would be:
add_id,street,street_num,city
1,Ilica,2,Zagreb
2,Death Valley,0,Knowhere
3,Horvacanska,3,Horvati
4,Broadway,12,New York
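As a rough illustration of what the foreign key in individual.csv expresses, the sketch below joins the two files above on add_id using only the standard library. It is not how the importer works internally, and the direction (individual to address) is an assumption based on the LIVES_IN label.

```python
import csv
import io

INDIVIDUAL_CSV = """ind_id,name,surname,add_id
1,Tomislav,Petrov,1
2,Ivan,Horvat,3
3,Marko,Horvat,3
4,John,Doe,2
5,John,Though,4
"""

ADDRESS_CSV = """add_id,street,street_num,city
1,Ilica,2,Zagreb
2,Death Valley,0,Knowhere
3,Horvacanska,3,Horvati
4,Broadway,12,New York
"""


def lives_in_edges(individual_csv, address_csv):
    """Pair each individual with the address its add_id refers to."""
    # Index the reference table by its reference_key (add_id).
    addresses = {
        row["add_id"]: row for row in csv.DictReader(io.StringIO(address_csv))
    }
    edges = []
    for person in csv.DictReader(io.StringIO(individual_csv)):
        address = addresses[person["add_id"]]  # foreign key lookup
        full_name = person["name"] + " " + person["surname"]
        edges.append((full_name, "LIVES_IN", address["street"]))
    return edges


for edge in lives_in_edges(INDIVIDUAL_CSV, ADDRESS_CSV):
    print(edge)
```

Ivan and Marko Horvat share add_id 3, so with the key stored in individual.csv one address can have several residents while each individual points to exactly one address.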
Hope this answers all your questions, and let me know if there is anything else, so we can update the how-to guide :)
Thanks @brunos252. I think that in the example below the foreign key should be in the address table and not in individual, so that every address has a link to the individual.
one_to_many_relations:
  address: []
  individual:
    - foreign_key:
        column_name: add_id
        reference_table: address
        reference_key: add_id
      label: LIVES_IN
Also @brunos252 in this link:
It states "Loading a CSV file from the local file system", but the code shown uses ParquetLocalFileSystemImporter. I think this should be CSVLocalFileSystemImporter:
importer = ParquetLocalFileSystemImporter(
    data_configuration=parsed_yaml,
    path="/home/user/table_data",
)
importer.translate(drop_database_on_start=True)
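For reference, here is a sketch of what I would expect the CSV variant to look like: the same call as in the snippet above with only the class swapped. The wrapper function import_csv_tables is hypothetical, and both import paths are assumptions, since the module location seems to differ between GQLAlchemy releases. Please check them against your installed version.

```python
def import_csv_tables(parsed_yaml, path):
    """Hypothetical wrapper: import the CSV tables under `path` into Memgraph."""
    # Assumption: one of these import paths matches the installed GQLAlchemy.
    # The import is done lazily so that defining this function has no side effects.
    try:
        from gqlalchemy.transformations.importing.loaders import (
            CSVLocalFileSystemImporter,
        )
    except ImportError:
        from gqlalchemy.loaders import CSVLocalFileSystemImporter

    importer = CSVLocalFileSystemImporter(
        data_configuration=parsed_yaml,  # the parsed YAML config (a dict)
        path=path,                       # directory that holds the CSV files
    )
    # Same call as in the quoted snippet; going by the flag name,
    # this clears the database before importing.
    importer.translate(drop_database_on_start=True)
    return importer
```

Calling import_csv_tables(parsed_yaml, "/home/user/table_data") would need a running Memgraph instance, so it is not invoked here.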
@karrtikiyer As you can see above, I created a PR to fix the bugs in the how-to, and I will try to add more to it. Thank you for your input!
Regarding the individual/address foreign keys, would it not be preferable in that case to have an associative table? Although an individual could have multiple addresses, there could also be multiple individuals at a single address, so it feels wrong to have a foreign key in the address table. This is only an example, so I don't consider it much of an issue, but feel free to comment if you think differently.
Thanks @brunos252, I understand that it is just an example. I only pointed it out because it was quoted as a one-to-many example. Ideally, I agree it should be many to many. But if we want to present it as one to many, I think the right way would be to have the foreign key in address, hence my suggestion. Overall I am okay with it.
I am closing this issue because Bruno improved the how-to guide and provided the example. @karrtikiyer if you have more questions please open a new issue or ask on Discord.