mstefaniuk / experiment4j

A Java port of https://github.com/github/dat-science

experiment4j

Summary

A Java port of GitHub's Science framework.

Motivation

I read "Move Fast, Break Nothing" by Zach Holman, and I really wanted the capabilities of GitHub's Science framework in Java 8.

I work with a large production system where we are actively trying to replace a bunch of legacy code with newer implementations. Our core goals are to:

  • Improve performance
  • Maintain backwards compatibility

With those objectives, Science is exactly what we need. A DSL that lets developers express and run these kinds of experiments quickly and consistently is vital.

Methodology

I based this implementation on reading the Science documentation, and building out the capabilities from the functionality expressed there. In other words, I did not focus on porting the implementation. No doubt, greater insight could be gained from spending more time on grokking the Ruby implementation, but it was faster for me to work directly from the docs, and build what I needed from my own understanding of the pattern.

Domain Concepts

There are only a few main domain concepts for this framework:

  • Experiment: A wrapper around two implementations of the same business logic, the Control and the Candidate, along with the rules for how to run trials.
  • Trial: A run of an Experiment that:
    1. Runs the Control and Candidate implementations in parallel (using a thread executor)
    2. Times the execution of both the Control and Candidate
    3. Compares the results for equality
    4. Pushes the comparison data (generated in 2 and 3 above) to a Publisher
    5. Returns the response of the Control to the calling code (the default; see returnChoice in the DSL example)
  • Science: A Cache for Experiments.
  • Control: The default version of business logic to be run during an Experiment. By default, the response from the Control is the one returned by an Experiment.
  • Candidate: The experimental version of business logic to be run in an Experiment. By default, the response from the Candidate is never returned by an Experiment.
  • Publisher: An interface for outputting the data generated by an Experiment, such as Control and Candidate response times, and whether the output of the two functions match. Usually, this will be an adapter to a Metrics framework, such as CodaHale Metrics.
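The five Trial steps above can be sketched in plain Java 8. This is a minimal illustration of the pattern, not experiment4j's actual implementation: `TrialSketch`, `SketchPublisher`, `Timed`, and `runTrial` are hypothetical names invented for this example, and the sketch does not handle the case where the Control or Candidate throws.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical names throughout; this is not experiment4j's code.
final class TrialSketch {

    /** Hypothetical publisher: receives both timings and the match flag. */
    interface SketchPublisher {
        void publish(String name, Duration control, Duration candidate, boolean matched);
    }

    /** A result paired with how long it took to compute. */
    static final class Timed<O> {
        final O value;
        final Duration elapsed;

        Timed(O value, Duration elapsed) {
            this.value = value;
            this.elapsed = elapsed;
        }
    }

    private static <I, O> Timed<O> time(Function<I, O> fn, I input, Clock clock) {
        Instant start = clock.instant();
        O value = fn.apply(input);
        return new Timed<>(value, Duration.between(start, clock.instant()));
    }

    static <I, O> O runTrial(String name,
                             Function<I, O> control,
                             Function<I, O> candidate,
                             I input,
                             SketchPublisher publisher,
                             ExecutorService executor) {
        Clock clock = Clock.systemUTC();

        // Steps 1 and 2: run both implementations in parallel, timing each one.
        CompletableFuture<Timed<O>> controlRun =
                CompletableFuture.supplyAsync(() -> time(control, input, clock), executor);
        CompletableFuture<Timed<O>> candidateRun =
                CompletableFuture.supplyAsync(() -> time(candidate, input, clock), executor);
        Timed<O> controlResult = controlRun.join();
        Timed<O> candidateResult = candidateRun.join();

        // Step 3: compare the two results for equality.
        boolean matched = Objects.equals(controlResult.value, candidateResult.value);

        // Step 4: push the timings and the comparison to the publisher.
        publisher.publish(name, controlResult.elapsed, candidateResult.elapsed, matched);

        // Step 5: the caller only ever sees the Control's response.
        return controlResult.value;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            String result = runTrial(
                    "full name",
                    (String[] p) -> p[0] + " " + p[1],
                    (String[] p) -> String.format("%s %s", p[0], p[1]),
                    new String[] {"George", "Washington"},
                    (name, ctl, cand, ok) -> System.out.println(name + " matched=" + ok),
                    executor);
            System.out.println(result);
        } finally {
            executor.shutdown();
        }
    }
}
```

Because the Candidate's result is only compared and published, a buggy Candidate changes what the Publisher reports but not what the caller receives.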

DSL Syntax

Some examples of the DSL syntax are found here: ExperimentTest.java

Here is a contrived example Experiment that has been annotated to expose all of the bells and whistles:

Person.java

    public class Person {
        private final String firstName;
        private final String lastName;

        public Person(String fname, String lname) {
            this.firstName = fname;
            this.lastName = lname;
        }

        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
    }

ExperimentExample.java

    // The type parameters for an Experiment are
    // #1) <I> The input type of the candidate and control functions
    // #2) <O> The output type of the candidate and control functions
    Experiment<Person, String> experiment = 

        // This experiment is named "my experiment".
        // This is important, as the name will be passed to 
        // the publisher, which can help distinguish it from other experiments.
        Experiment.<Person, String>named("my experiment")

        // The "control" method will always be performed 
        // The input type is the first type parameter (eg, Person)
        // The output type is the second type parameter (eg, String)
        .control( (Person p) ->  p.getFirstName() + " " + p.getLastName() ) 

        // The "candidate" method will be performed when 
        // "doExperimentWhen" BooleanSupplier returns true
        .candidate( (Person p) -> String.format("%s %s", p.getFirstName(), p.getLastName()) )

        // this Clock instance is used to generate the duration of the 
        // candidate and control executions.
        // It defaults to Clock.systemUTC(), but can be overridden to make testing easier
        .timedBy(Clock.systemUTC())

        // the experiment will be performed (that is, both the control and the candidate will be run)
        // when the doExperimentWhen BooleanSupplier returns true.
        // Otherwise, only the control will be run
        // In this case, the experiment will always be run 
        // See Selectors for some pre-implemented BooleanSuppliers, or code your own
        .doExperimentWhen(Selectors.always())  

        // the candidate and control results are considered "equal" when the
        // function passed to sameWhen returns true; here that is Objects.equals()
        .sameWhen(Objects::equals) 

        // By default, an Experiment returns the control result.
        // By setting the returnChoice function, you can choose
        // which result is returned instead.
        // See ReturnChoices for some pre-implemented Functions, or code your own
        .returnChoice(ReturnChoices.alwaysCandidate())

        // if both the underlying Candidate and Control methods were to throw exceptions, 
        // this is how it would be determined if they were equal
        // See SameWhens for more choices
        .exceptionsSameWhen(SameWhens.classesMatch()) 

        // this will print the timing and the match status to System.out
        .publishedBy(new PrintStreamPublisher<String>());


    // Science is a cache for experiments.
    // It takes a Supplier<Experiment> as its second argument.
    // Since an ExperimentBuilder is a Supplier<Experiment>, you can pass
    // an ExperimentBuilder as your second argument
    Science.science().experiment("my experiment", () -> experiment);

    // A single experiment can run a separate trial for each input.
    // A trial is a Function<I, O> that mirrors the input and output
    // type parameters of the candidate and control functions
    String presidentName = experiment.trial()
                                     .apply(new Person("George", "Washington"));
    assert presidentName.equals("George Washington");

    // If you put an experiment in the Science cache, you can also get
    // it from there.
    Experiment<Person, String> myExperiment = Science.science()
                                                     .experiments()
                                                     .get("my experiment");
    String authorName = myExperiment.trial()
                                    .apply(new Person("Raymond", "Carver"));

    assert authorName.equals("Raymond Carver");
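The comments above suggest coding your own selector and overriding the Clock for testing. The sketch below shows one way to do both using only the JDK. `SelectorSketch`, `percentOfCalls`, and `frozenClock` are hypothetical names for illustration and are not part of experiment4j; the only assumption about the library is what the comments above state, namely that doExperimentWhen accepts a BooleanSupplier and timedBy accepts a Clock.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.BooleanSupplier;

// Hypothetical helpers; not part of experiment4j.
final class SelectorSketch {

    /** Returns a supplier that is true for roughly `percent` out of every 100 calls. */
    static BooleanSupplier percentOfCalls(int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        // nextInt(100) yields 0..99, so 0 never fires and 100 always fires.
        return () -> ThreadLocalRandom.current().nextInt(100) < percent;
    }

    /** A clock frozen at an arbitrary fixed instant, so measured durations are deterministic. */
    static Clock frozenClock() {
        return Clock.fixed(Instant.parse("2015-01-01T00:00:00Z"), ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        BooleanSupplier tenPercent = percentOfCalls(10);
        int runs = 0;
        for (int i = 0; i < 1000; i++) {
            if (tenPercent.getAsBoolean()) {
                runs++;
            }
        }
        System.out.println("candidate would have run in " + runs + " of 1000 calls");
        System.out.println("frozen clock reads " + frozenClock().instant());
    }
}
```

Under that assumption, `percentOfCalls(10)` could be passed to doExperimentWhen to limit the Candidate's extra load to a sample of production calls, and `frozenClock()` could be passed to timedBy in a unit test.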
