WordPress / gutenberg-license

Stores the grants of consent and tooling for Gutenberg's dual license

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Readme

This repository holds the information relevant to the requests for consent from Gutenberg's contributors to dual-license their past contributions to Gutenberg under the GPL and MPL. For more information, you can look at the initial discussion issue on GitHub as well as the initial proposal post on 21 September 2020, and updated proposal post on Make/Core.

Format of dual-license-responses.json

The dual-license-responses.json file records responses relevant to the work to update Gutenberg to be dual-licensed from #31893 and #31913.

The json has two top level nodes. The claimedEmails node includes an object in claimedEmails.responses for each email address that has contributed to Gutenberg's trunk branch (i.e., by being an author or co-author of a commit on that branch) but was not associated with any GitHub account.

The gitHubUserContributors node includes an object in gitHubUserContributors.responses for each GitHub user who has contributed to Gutenberg's trunk branch. Once a grant or denial of consent is received for a particular contributor, the relevant gitHubUserContributors.responses object will be updated with a consent boolean, with true representing a grant of consent to dual-licensing the user's past contributions and false representing a denial of consent. If there is no consent field for a particular gitHubUserContributions.responses object, that means we have not received either a grant or denial of consent.

When a relevant comment is noted for either the claimedEmails or gitHubUserContributors, information about that comment is recorded in the comment field on the relevant node.

Checking the Latest Responses

Execute node scripts/update-dual-license-responses.js to update dual-license-responses.json with the latest comments on the relevant GitHub issues. This script has been tested with node v14.16.0. In order to run this script, you must have the GitHub CLI installed and configured (the GitHub CLI handled pagination of the search results).

Running that script will download all the comments on each issue, allow the user to process each new comment individually (previously processed comments are skipped), and update the dual-license-responses.json file appropriately based on the user's input.

This script will automatically run summarize-dual-license-responses.js, which prints out some current status of the requests for consent and updates the status.md file.

How Was dual-license-responses.json Initialized?

The dual-license-responses.json json file was initialized with:

  1. All the GitHub users who have contributed to Gutenberg; and
  2. All of the email addresses that authored a commit on Gutenberg's trunk branch but was not associated with any GitHub account. We want to identify the GitHub users behind these commits and ask those GitHub users to consent to dual-license their past contributions.

How Gutenberg’s Contributors Were Identified

This data was collected on April 16, 2021

The easiest way to execute the GraphQL queries listed below is by using the GitHub CLI to execute a command along these lines gh api graphql --paginate -f query='query(...', which will handle authentication and pagination for you.

Contributors With GitHub Logins

All Merged and Open PRs

Saved as 2021-04-16_merged-or-open-prs.json

GraphQL query
query($endCursor: String) {
  repository(owner: "WordPress", name: "gutenberg") {
    pullRequests(states:[MERGED, OPEN] first: 100, after: $endCursor) {
      totalCount
      pageInfo {
        endCursor
        hasNextPage
      }
      nodes {
        state
        number
        author {
          login
        }
        mergedBy {
          login
        }
      }
    }
  }
}

Note that this query returns all PRs through the date the script is run.

Although this query captures logins of both the "author" and the person that merged each PR, the data shows that there is no one who has merged a PR who has not also authored at least one PR. In other words, including logins from the "mergedBy" field does not add any additional contributors beyond those captured by the "author" field.

The authors of the PRs were extracted by piping the output of the above api request for all PRs to jq as follows:

cat 2021-04-16_merged-or-open-prs.json \
  | jq --raw-output '.data.repository.pullRequests.nodes[].author.login' \
  | sort \
  | uniq

All Authors and Co-Authors of commits on trunk

Saved as all_commits_before_2021-04-16.json with 743 unique logins.

GraphQL query
query($endCursor: String) {
  repository(owner: "WordPress", name: "gutenberg") {
    object(expression:"trunk") {
      ... on Commit {
        history(first:100 after:$endCursor, until:"2021-04-16T00:00:00") {
          pageInfo {
            hasNextPage
            endCursor
          }
          nodes {
            committedDate
            oid
            author {
              email
              name
              user {
                login
              }
            }
            authors(first:100) {
              pageInfo {
                hasNextPage
              }
              nodes {
                email
                name
                user {
                  login
                }
              }
            }
          }
        }
      }
    }
  }
}'

Note that although that query only retrieves the first 100 "authors" on each commit (and does not paginate on that if there are more), I confirmed that there were not more than 100 authors on any commit (the highest was actually 8):

$ cat all_commits_before_2021-04-16.json \
    | jq '.data.repository.object.history.nodes[].authors.pageInfo.hasNextPage' \
    | sort \
    | uniq
false

The list of all "authors" on all these commits included every "author" returned here, so we are fine just including everyone listed as in the "authors" object in the above response:

cat all_commits_before_2021-04-16.json \
  | jq --raw-output '.data.repository.object.history.nodes[].authors.nodes[].user.login' \
  | sort \
  | uniq

Resulting data

All contributors for which we have GitHub logins were determined by combining the users with commits on trunk with the users who had opened or merged any Gutenberg PR.

These users were used to initialize the gitHubUserContributors.responses array in dual-license-responses.json.

Contributors Without An Associated GitHub Account

These users were extracted from the "co-authors" of any PRs on trunk (every commit "author" was covered by the following "coauthors" query), where the email did not have an associated github account (the "user" was null).

cat all_commits_before_2021-04-16.json \
  | jq '.data.repository.object.history.nodes[].authors.nodes[]
         | select(.user.login == null)
         | .email' --raw-output \
  | sort \
  | uniq

These emails were used to initialize the claimedEmails.responses array in dual-license-responses.json.

About

Stores the grants of consent and tooling for Gutenberg's dual license


Languages

Language:JavaScript 100.0%