safing / mmdbmeld

Build your own .mmdb geoip database.

Any chance of an array/slice merging feature?

nonetallt opened this issue · comments

Background:

I'm working on a system that processes a large number of ip blacklists. I would like to take these blacklists and add them together into one mmdb file which I could then query once to get all the blacklists associated with the given ip.

I'm thinking that an array/slice column type would be ideal for this purpose (add "blacklists" field to each record). The problem is that it doesn't seem like this library supports the array column type. It would be awesome if there was a feature for merging arrays together using this tool.

Unfortunately, I'm not particularly familiar with Go. However, I read in the MaxMind docs that the format supports an array type, and while looking through the mmdbwriter issues and source code I saw "slices" mentioned in the context of arrays, so I'd imagine this should be possible.

Request:

Any chance of array merging or something similar getting implemented? If not, I would also appreciate any alternative ideas on how to implement what I'm after. I also considered programmatically generating a new field in the configuration for each blacklist, but that would probably end up being a pain to work with during deserialization, especially in a strictly typed language like Java.

Edit: in my case, a delimited string could also be an option but I don't see a way of merging strings like that either.

Lastly, thank you for creating this amazing tool in the first place 👍

Yes, a slice (a Go type) is a variable-size sequence, whereas an array is fixed-size in Go. So, yes, support would be possible.

This is also what I jumped to: How would you merge these blacklists? This also needs a merger for the array type.

Also, how would you split a value in a csv into fields?

> This is also what I jumped to: How would you merge these blacklists? This also needs a merger for the array type.

Say I have blacklist CSVs that look something like:

from,to,blacklist_name
127.0.0.1, 127.0.0.1, foo

from,to,blacklist_name
127.0.0.1, 127.0.0.1, bar

I would add the names foo and bar into an array type column, so the resulting blacklist is something like:

from,to,blacklist_names
127.0.0.1, 127.0.0.1, "[foo,bar]"

> Also, how would you split a value in a csv into fields?

I'd imagine by using csv escape characters, or a custom value type notation using brackets or something similar. My personal use case wouldn't necessarily require any csv values to be read as arrays; only the result should be an array with one or more values.

Try this: #6

type: array:string
The delimiter is whitespace for now, but in your case it does not matter anyway.

config: merge.mergeArrays: true

Just braindumped my ideas right now, don't have time for testing.
The biggest issue is that there is no way to sort the values and thus no easy way to remove duplicates.

Thank you so much!

It's taken me a while to get around to testing this, but the current solution seems to at least work in a minimal test of merging two files with 1 row each.

Great to hear it worked! I merged the PR, so you can switch back to master.