google / flax

Flax is a neural network library for JAX that is designed for flexibility.

Home Page: https://flax.readthedocs.io

Track Github metrics over time

marcvanzee opened this issue · comments

We currently have little insight into how well we are maintaining our GitHub page.

It would be useful to track some metrics over time, so we can see whether we are improving or getting worse.

Some things we could track:

  • Issue resolution time: how long it takes before we close an issue (e.g., as in isitmaintained.com)
  • Number of open issues (isitmaintained.com)
  • Issue response time: how long it takes before we reply to an issue

As motivation: when querying isitmaintained.com in April 2022, we get the following scores for "issue resolution time":

  • Flax: 21d
  • JAX: 4d
  • TensorFlow: 8d
  • PyTorch: 6d

Clearly Flax has room to improve here!

Some suggestions from @cgarciae:

  • We could write a script that gets statistics per month using the GitHub API.
  • It could save the results in a CSV file.
  • We could then run a GitHub Action as a cron job and retrieve these numbers automatically every week/month.
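A minimal sketch of the per-month aggregation and CSV step, assuming the issues have already been fetched from the GitHub API as dicts with `createdAt` and `closedAt` fields. The function names, the CSV columns and the example data are made up for illustration, and fetching from the API is left out.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime
from statistics import median

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # timestamp format used in GitHub API responses
COLUMNS = ["month", "opened", "closed", "median_resolution_days"]  # hypothetical layout


def monthly_issue_stats(issues):
    """Group issues by the month they were opened in.

    `issues` is a list of dicts with "createdAt" and "closedAt" keys;
    "closedAt" is None for issues that are still open.
    """
    opened = defaultdict(int)
    resolution_days = defaultdict(list)
    for issue in issues:
        created = datetime.strptime(issue["createdAt"], TIME_FORMAT)
        month = created.strftime("%Y-%m")
        opened[month] += 1
        if issue["closedAt"] is not None:
            closed = datetime.strptime(issue["closedAt"], TIME_FORMAT)
            resolution_days[month].append((closed - created).total_seconds() / 86400)

    rows = []
    for month in sorted(opened):
        days = resolution_days[month]
        rows.append({
            "month": month,
            "opened": opened[month],
            "closed": len(days),
            # Left empty when no issue opened in this month has been closed yet.
            "median_resolution_days": round(median(days), 1) if days else "",
        })
    return rows


def write_csv(rows, out):
    """Write the monthly rows to any file-like object."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


# Made-up example data, only to show the shape of the input.
example_issues = [
    {"createdAt": "2022-03-01T00:00:00Z", "closedAt": "2022-03-03T00:00:00Z"},
    {"createdAt": "2022-03-10T00:00:00Z", "closedAt": None},
    {"createdAt": "2022-04-01T12:00:00Z", "closedAt": "2022-04-02T00:00:00Z"},
]
buffer = io.StringIO()
write_csv(monthly_issue_stats(example_issues), buffer)
print(buffer.getvalue())
```

In a scheduled job the `io.StringIO` buffer would be replaced by a real file opened with `open(path, "w", newline="")`.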

Assigning this to @cgarciae since he would like to look into this and ask some other folks who have experience with this.

Someone from the NumPy team recommended that we look at this script:

https://github.com/scientific-python/devstats-data/blob/4c022961abc4ca6061f8719d9c3387e98734b90c/query.py

It feeds this page where they have some stats about various packages:

https://devstats.scientific-python.org/

Adapting that script, I could get the following info.

Issues

[
    {
        "cursor": "Y3Vyc29yOnYyOpHOIPZ9Dw==",
        "node": {
            "number": 5,
            "title": "Flattening parameters",
            "createdAt": "2020-01-21T17:31:37Z",
            "state": "CLOSED",
            "closedAt": "2020-03-27T07:47:35Z",
            "updatedAt": "2020-03-27T07:47:35Z",
            "url": "https://github.com/google/flax/issues/5",
            "labels": {
                "edges": []
            },
            "timelineItems": {
                "totalCount": 4,
                "edges": [
                    {
                        "node": {
                            "__typename": "IssueComment",
                            "author": {
                                "login": "avital"
                            },
                            "createdAt": "2020-01-22T09:42:42Z"
                        }
                    },
                    {
                        "node": {
                            "__typename": "IssueComment",
                            "author": {
                                "login": "avital"
                            },
                            "createdAt": "2020-03-06T09:16:43Z"
                        }
                    },
                    {
                        "node": {
                            "__typename": "IssueComment",
                            "author": {
                                "login": "marcvanzee"
                            },
                            "createdAt": "2020-03-27T07:47:35Z"
                        }
                    },
                    {
                        "node": {
                            "__typename": "ClosedEvent",
                            "actor": {
                                "login": "marcvanzee"
                            }
                        }
                    }
                ]
            }
        }
    },
    ...
]
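As a rough illustration of how this output could be turned into numbers, the sketch below computes the delay until the first comment and the time until closing for a single issue node. The helper names are hypothetical, and `issue_5` is a trimmed copy of the entry shown above. Since the output does not include the issue author, this version cannot tell whether the first comment came from the author or from a maintainer.

```python
from datetime import datetime, timezone


def parse_time(value):
    """Parse a timestamp such as "2020-01-21T17:31:37Z" as an aware UTC datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def first_comment_delay(node):
    """Time between issue creation and the earliest IssueComment, or None."""
    created = parse_time(node["createdAt"])
    comment_times = [
        parse_time(edge["node"]["createdAt"])
        for edge in node["timelineItems"]["edges"]
        if edge["node"]["__typename"] == "IssueComment"
    ]
    return min(comment_times) - created if comment_times else None


def resolution_time(node):
    """Time between issue creation and closing, or None if the issue is open."""
    if node.get("closedAt") is None:
        return None
    return parse_time(node["closedAt"]) - parse_time(node["createdAt"])


# Trimmed copy of issue #5 from the output above.
issue_5 = {
    "number": 5,
    "createdAt": "2020-01-21T17:31:37Z",
    "closedAt": "2020-03-27T07:47:35Z",
    "timelineItems": {
        "edges": [
            {"node": {"__typename": "IssueComment", "createdAt": "2020-01-22T09:42:42Z"}},
            {"node": {"__typename": "IssueComment", "createdAt": "2020-03-06T09:16:43Z"}},
            {"node": {"__typename": "IssueComment", "createdAt": "2020-03-27T07:47:35Z"}},
            {"node": {"__typename": "ClosedEvent"}},
        ]
    },
}

print("first comment after:", first_comment_delay(issue_5))
print("closed after:", resolution_time(issue_5))
```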

PRs

[
    {
        "cursor": "Y3Vyc29yOnYyOpHOFYqJWQ==",
        "node": {
            "number": 1,
            "state": "CLOSED",
            "title": "Project directory restructure.",
            "createdAt": "2020-01-10T11:11:17Z",
            "baseRefName": "prerelease",
            "mergeable": "CONFLICTING",
            "author": {
                "login": "Britefury"
            },
            "authorAssociation": "CONTRIBUTOR",
            "mergedBy": null,
            "mergedAt": null,
            "reviews": {
                "totalCount": 0
            },
            "participants": {
                "totalCount": 4
            }
        }
    },
    ...
]

This is a very good start. We need to properly define what metrics we want to report. I'll create a couple of suggestions next.

Metrics

During the last N (6?) months:

  • issue-response-time: Time between creation and the first label assignment or conversion to a discussion. A response from a regular user therefore doesn't count. (Can regular users assign labels?)
  • issue-resolution-time: Time between creation and closing. Not sure what happens to issues that are converted to a discussion; maybe just ignore those and have a separate metric for discussions.
  • pr-response-time: Time between creation and the assignment of a reviewer.
  • discussion-response-time: Time between creation and the first comment.
  • discussion-resolution-time: Time between creation and being marked as answered.
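To make the first definition concrete, here is a hedged sketch of issue-response-time over the last N months. It assumes the timeline query is extended to return `LabeledEvent` and `ConvertedToDiscussionEvent` items with a `createdAt` field (neither appears in the sample output earlier in this thread), approximates a month as 30 days, and reports the median in days. All function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from statistics import median

# Timeline item types counted as a maintainer response under the definition
# above. Assumption: these match the type names returned by the timeline query.
RESPONSE_EVENTS = {"LabeledEvent", "ConvertedToDiscussionEvent"}


def parse_time(value):
    """Parse a timestamp such as "2020-01-21T17:31:37Z" as an aware UTC datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def issue_response_time(node):
    """Time from creation to the first label or discussion conversion, or None."""
    created = parse_time(node["createdAt"])
    times = [
        parse_time(edge["node"]["createdAt"])
        for edge in node["timelineItems"]["edges"]
        if edge["node"]["__typename"] in RESPONSE_EVENTS
    ]
    return min(times) - created if times else None


def median_response_days(nodes, now, months=6):
    """Median issue-response-time in days for issues created in the window.

    The window is approximated as 30 * months days before `now`. Issues that
    have not received a response yet are skipped, which makes the result
    optimistic; a real report would probably want to count them as well.
    """
    cutoff = now - timedelta(days=30 * months)
    delays = []
    for node in nodes:
        if parse_time(node["createdAt"]) < cutoff:
            continue
        delay = issue_response_time(node)
        if delay is not None:
            delays.append(delay.total_seconds() / 86400)
    return median(delays) if delays else None
```

The same pattern would apply to the other metrics by swapping the set of event types and the start and end timestamps.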