inspiring71 / batcore_base

Baselines and testing framework for code reviewer recommendation

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

BaT CoRe: Baselines and testing framework for code reviewer recommendations

Framework

This repository provides a framework for testing code review recommendation algorithms.

Model creation

  • RecommenderBase is a simple interface for a generic recommender algorithm. It has a fit method that trains model on a given data and a predict method that predicts reviewers for the given pulls.
  • BanRecommenderBase is an in interface derived from RecommenderBase that implements by candidate filtering by their activity and ownership of the pull request

Dataset creation

  • DatasetBase is an interface that encapsulates any preprocessing needed for the dataset with a method preprocess.

  • GerritLoader is a class that deals with the data loader from the PullExtractor tool: loads it, reformats it, and performs some high-level preprocessing (empty data removal, bot removal, alias matching, user encoding)

  • StandardDataset an implementation of DatasetBase. Receives GerritLoader object and performs all the preprocessing that can be needed for a specific model

  • SpecialDatasets has implementations of datasets that are unique to a specific recommender

  • get_gerrit_dataset helping function for dataset creation. Dataset for a specific model can be created as get_gerrit_dataset(gerritloader_object, model_cls=RecommenderBase_class)

  • StreamLoaderBase is an interface that encapsulates iteration over data. The interface views data as a temporal stream of events and yields pairs of consecutive segments of the stream - train (any set of events), test (a pull request event)

Testing of models

  • TesterBase is class that has test_recommender method. This method takes RecommenderBase and StreamLoaderBase and iterates over the dataset, retrains model on new train data, and calculates its predictions.
  • RecTester an implementation of TesterBase that calculates metrics standard for recommendation systems (mrr, accuracy, etc)
  • SimulTester an implementation of TesterBase that tester for project-based metrics on a simulated history (Mirsaeedi, Rigby, 2020). Metrics for the simulated testing can be found in Counter.

An example of usage can be found in example.py. There are two modes to run the application; please comment/uncomment the method of your choice in the file accordingly.

Baselines

Framework contains implementations of the following models (baselines\models)

  • RevFinder, Who Should Review My Code?, Thongtanunam et al., 2015
  • Tie, Who Should Review This Change?, Xin Xia et al., 2015
  • ACRec, Who Should Comment on This Pull Request?, Jiang et al. 2017
  • cHRec, Automatically Recommending Peer Reviewers in Modern Code Review, Zanjani et al., 2015
  • CN, What Can We Learn from Code Review and Bug Assignment?, Yu et al., 2015
  • RevRec, Search-Based Peer Reviewers Recommendation in Modern Code Review, Ouni et al., 2016
  • WRC, Automatically Recommending Code Reviewers Based on Their Expertise: An Empirical Comparison, Hannebauer et al., 2016
  • xFinder, Assigning change requests to software developers, Kagdi et al., 2011

About

Baselines and testing framework for code reviewer recommendation

License:Apache License 2.0


Languages

Language:Python 100.0%