Better name for core class

afbarnard opened this issue 12 years ago · comments

I've been working on improving the documentation, and I think we need a better name for the core class, CurveData. A better name would communicate its purpose, which is to use a ranking to compute ROC and PR curves and statistics. It is not just data. It is not a curve. What do you think @kboyd, @finnkuusisto? Any ideas?

The best I can come up with so far is CurveRepresentation but I believe a much better name exists. What about showing our penchant for humor and calling it CurveDaddy?

I think it is important to take care of this prior to the first release, self-documentation and all.

Aubrey Barnard · Answer 1 · Sat Mar 02 2013 02:22:06 GMT+0800 (China Standard Time)

What about CurveSource or CurveGenerator?

Captures the ideas that the object is not itself a curve and that it can generate curves.
Raises questions about multiple curves.

Maybe plain, old "Curve" is better?

Ignores the distinction between the actual points of a curve and the abstract concept of something that can calculate the points. This can be mitigated by context and distinguishing when referring to points.
Is much more clearly the center of attention.
Thinking about multiple curves/aggregates is more natural.

I had also thought of "Ranking" but @kboyd had a counterexample that I don't remember.

Expresses more precisely what the class is but requires more understanding (i.e. that curves are generated from a ranking).
Not clearly a central class in a library about curves.

Aubrey Barnard · Answer 2 · Tue Apr 02 2013 00:41:30 GMT+0800 (China Standard Time)

@kboyd and I were talking about design the other day, and the way R handles linear models came up, which led us to consider modeling curves. So what about 'CurveModel' as the name for the core class?

Kendrick Boyd · Answer 3 · Tue Apr 02 2013 05:53:12 GMT+0800 (China Standard Time)

'CurveModel' sounds good, certainly better than 'CurveData' for what it does. Also makes it more natural to subclass for different estimation/model types such as fitting a binormal ROC curve.

Aubrey Barnard · Answer 4 · Tue Apr 02 2013 06:01:18 GMT+0800 (China Standard Time)

I am in favor of "Curve" until the complexity of things suggests otherwise. Will "CurveModel" help us understand the design/code at this point? We may eventually want both a "Curve" and a "CurveModel". At this point I just want to keep the API simple and make a release.

Kendrick Boyd · Answer 5 · Tue Apr 02 2013 06:22:04 GMT+0800 (China Standard Time)

'Curve' makes it sound like each instance defines a general curve and nothing more. But there is more going on: the curves are generated from ranking data (not the typical function used to define a curve), and both ROC and PR curves can be obtained from an instance.
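
To make the point concrete, here is a rough sketch of the idea, not the project's actual code. The class name `RankingCurves` and its methods are hypothetical, and the sketch assumes the labels are already sorted from highest to lowest score, with no ties and at least one positive and one negative example.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class RankingCurves:
    """Hypothetical stand-in for the core class (not the library's API)."""

    # True labels, ordered from the highest-scored example to the lowest.
    labels: List[bool]

    def _counts(self) -> Iterator[Tuple[int, int]]:
        """Yield (true positives, false positives) for each cut in the ranking."""
        tp = fp = 0
        yield tp, fp  # threshold above every example: nothing predicted positive
        for is_positive in self.labels:
            if is_positive:
                tp += 1
            else:
                fp += 1
            yield tp, fp

    def roc_points(self) -> List[Tuple[float, float]]:
        """Return (false positive rate, true positive rate) pairs."""
        pos = sum(self.labels)
        neg = len(self.labels) - pos
        return [(fp / neg, tp / pos) for tp, fp in self._counts()]

    def pr_points(self) -> List[Tuple[float, float]]:
        """Return (recall, precision) pairs; skips the cut with no predictions."""
        pos = sum(self.labels)
        return [
            (tp / pos, tp / (tp + fp))
            for tp, fp in self._counts()
            if tp + fp > 0
        ]
```

One instance built from a single ranking yields both kinds of curve, e.g. `RankingCurves([True, False, True, False]).roc_points()` and `.pr_points()`, which is why a name that suggests a single curve undersells it.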

Aubrey Barnard · Answer 6 · Tue Apr 02 2013 06:25:30 GMT+0800 (China Standard Time)

But perhaps it is clear from the project context that any curves are ROC/PR curves and not general mathematical functions?

Kendrick Boyd · Answer 7 · Tue Apr 02 2013 06:26:30 GMT+0800 (China Standard Time)

Probably true. I don't really have a strong preference either way, so if you prefer 'Curve', let's use that.

Aubrey Barnard · Answer 8 · Tue Apr 02 2013 06:30:14 GMT+0800 (China Standard Time)

OK, at least for now (version 0.1.0) we'll go with "Curve". We'll see if there are better names as we do design down the road. I don't mind changing names, especially if the names increase clarity/documentation. "CurveModel" is a good name so we'll keep it in our pocket for later.

Aubrey Barnard · Answer 9 · Tue Apr 02 2013 06:57:35 GMT+0800 (China Standard Time)

BTW, I felt that this was a good and productive discussion. So thanks.

Aubrey Barnard · Answer 10 · Tue Apr 02 2013 06:58:06 GMT+0800 (China Standard Time)

Resolved in commit 4915a58.

Aubrey Barnard · Answer 11 · Sun Jul 13 2014 09:41:01 GMT+0800 (China Standard Time)

This has been bugging me again. "Curve" just is not descriptive enough. "Ranking"? "CurveModel"? "ConfusionMatrixRanking"? "ClassificationThresholds"? "ClassificationAnalysis"? Surely there's something better?

Kendrick Boyd · Answer 12 · Sun Aug 10 2014 03:00:14 GMT+0800 (China Standard Time)

I agree "Curve" is not descriptive and is a bit misleading. I think it would be useful to have "Ranking" in the name. So I'd lean towards "Ranking" for simplicity, or else "ConfusionMatrixRanking" or "PredictionRanking" for something more specific.