facebookarchive / flashback

Capture and replay real mongodb workloads

pcap_converter (possibly) does not output compatible bson file

timvaillancourt opened this issue:

Hey guys,

It's very possible I'm lost or did something wrong, but using HEAD today I could not get a usable file out of pcap_converter to replay with flashback's '--ops_filename=' option.

It seems the 'pcap_converter' Go code outputs a JSON file of operations, whereas the Go 'flashback' tool expects BSON, as seen here: https://github.com/ParsePlatform/flashback/blob/master/ops_reader.go#L67. My guess is that JSON was the previous way of doing things, before 'replay' moved to Go?

When I pass the JSON file output by pcap_converter as flashback's --ops_filename= option, flashback seems to ignore the file and executes zero operations, although the file contains hundreds of them.

If this can be resolved by changing pcap_converter to output BSON instead of JSON, I am happy to make a PR for that, but I wanted to sanity check first what went wrong here, and whether more than just the BSON/JSON format changed.

Thanks!

An update: I've made a simple script to convert the output file from JSON to BSON; however, the captured ops still seem to be ignored when running "flashback -style=real" with the BSON file. Here is an example of the converted BSON file I have now:

$ bsondump mongo.2016-10-14.flashback.bson 2>/dev/null | head -n8
{"ntoskip":0,"ts":{"$date":"1970-01-18T02:07:51.739Z"},"command":{"ismaster":1},"ntoreturn":-1,"ns":"admin.$cmd","op":"command"}
{"ntoskip":0,"ts":{"$date":"1970-01-18T02:07:51.739Z"},"command":{"ismaster":1},"ntoreturn":-1,"ns":"admin.$cmd","op":"command"}
{"ntoskip":0,"ts":{"$date":"1970-01-18T02:07:51.739Z"},"command":{"ping":1},"ntoreturn":-1,"ns":"admin.$cmd","op":"command"}
{"ntoskip":0,"ts":{"$date":"1970-01-18T02:07:51.739Z"},"command":{"buildinfo":1},"ntoreturn":-1,"ns":"admin.$cmd","op":"command"}
{"ntoskip":0,"ts":{"$date":"1970-01-18T02:07:51.739Z"},"command":{"serverStatus":1},"ntoreturn":-1,"ns":"admin.$cmd","op":"command"}
{"ntoskip":0,"ts":{"$date":"1970-01-18T02:07:51.739Z"},"command":{"isMaster":1},"ntoreturn":-1,"ns":"admin.$cmd","op":"command"}
{"ntoskip":0,"ts":{"$date":"1970-01-18T02:07:51.739Z"},"command":{"getlasterror":1},"ntoreturn":-1,"ns":"test.$cmd","op":"command"}
{"ntoskip":0,"ts":{"$date":"1970-01-18T02:07:51.739Z"},"command":{"replSetGetStatus":1},"ntoreturn":-1,"ns":"admin.$cmd","op":"command"}

Yes, pcap_converter was added before we switched everything to BSON. Reading the ops in as JSON in Go was problematic because, among other things, it broke the ordering of fields in the document and made it impossible to replay some commands. pcap_converter was just overlooked when we made the change. Just an FYI: pcap_converter was kind of a one-off effort to capture a workload on a server that was too busy for profiling. It was never used extensively and probably has more bugs than anything else, but I'm happy to review any PRs!
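
The field-ordering problem mentioned above can be reproduced with the Go standard library alone. The sketch below is an illustration, not flashback's actual code: it decodes a command document into a generic map and encodes it again. Go maps are unordered and `encoding/json` sorts map keys when marshaling, so the original order is lost. The `roundTrip` helper and the sample document are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip (hypothetical helper) decodes a JSON document into a generic
// map and encodes it again, the way a naive JSON-based ops reader might.
func roundTrip(doc string) (string, error) {
	var m map[string]interface{}
	if err := json.Unmarshal([]byte(doc), &m); err != nil {
		return "", err
	}
	out, err := json.Marshal(m) // map keys are emitted in sorted order
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// MongoDB treats the first field of a command document as the command name.
	in := `{"serverStatus":1,"repl":0}`
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(in)  // {"serverStatus":1,"repl":0}
	fmt.Println(out) // {"repl":0,"serverStatus":1}
}
```

After the round trip, `repl` comes first, so a server would read the document as an unknown `repl` command rather than `serverStatus`. BSON documents keep their elements in written order, which avoids this.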

Resolved by merge of #39. Thanks @tredman!