bediger4000 / reservoir-sampling

Reservoir sampling as a Unix filter program

Reservoir Sampling Filter

This code makes a Unix-style filter program that chooses some number of lines from stdin at random, then prints those lines to stdout.

Building and Running

$ go build reservoir.go
$ ./reservoir 7 < /usr/share/dict/words
escutcheons
insinuator's
bone
cc
identification's
segues
comer
$

Input is from stdin, output on stdout. The mandatory command line argument is the number of lines to select from stdin.

This code is based on Algorithm R, as described in Wikipedia's article on reservoir sampling.

Daily Coding Problem: Problem #911 [Medium]

This problem was asked by Facebook.

Given a stream of elements too large to store in memory, pick a random element from the stream with uniform probability.

Analysis

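Isn't this just reservoir sampling with a reservoir size of 1? The i-th element replaces the current choice with probability 1/i, so after n elements each one has been kept with probability 1/n.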

About

License: GNU General Public License v3.0