List.duplicates to find duplicate elements in a List or Iterable
jonasfj opened this issue
It would be nice to have a `.duplicates` method that, given a `List`, returns the `Set` of elements that appear more than once in the `List`.
My particular use-case was to check if a configuration list had duplicates.
If so, throw a `FormatException`.
It's easy to check for duplicates with `list.length != list.toSet().length`.
But for good error messages, it's nice to have the list of duplicates.
It could be something like:
```dart
extension<T> on List<T> {
  /// Return elements that appear more than once in this [List].
  Set<T> duplicates() {
    final duplicates = <T>{};
    final N = length;
    for (var i = 0; i < N; i++) {
      final candidate = this[i];
      // Scan the rest of the list for a later occurrence of the candidate.
      for (var j = i + 1; j < N; j++) {
        if (candidate == this[j]) {
          duplicates.add(candidate);
          break;
        }
      }
    }
    return duplicates;
  }
}
```
It could probably be written more efficiently. Maybe one could sort by `hashCode` or something, but it's not obvious that allocating a new list and sorting is any faster than doing the naive O(n²) thing. Maybe, if `T` is an object more complex than `String` or `int`, it might make sense to sort and be smart about it.
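Sorting is not the only way around the quadratic scan. The following is a minimal sketch of a different technique: tracking already-seen elements in a `Set`, which takes roughly linear time as long as `hashCode` is consistent with `==` (as it is for `String` and `int`). The names `findDuplicates` and `checkNoDuplicates` are hypothetical and not part of any package.

```dart
/// Returns the elements that appear more than once in [values].
///
/// Assumes that `==` and `hashCode` of [T] are consistent with each other.
Set<T> findDuplicates<T>(Iterable<T> values) {
  final seen = <T>{};
  final duplicates = <T>{};
  for (final value in values) {
    // Set.add returns false when the element is already in the set.
    if (!seen.add(value)) {
      duplicates.add(value);
    }
  }
  return duplicates;
}

/// Hypothetical config check: throws if [entries] contains duplicates.
void checkNoDuplicates(List<String> entries) {
  final duplicates = findDuplicates(entries);
  if (duplicates.isNotEmpty) {
    throw FormatException('Duplicate entries: ${duplicates.join(', ')}');
  }
}

void main() {
  print(findDuplicates([1, 2, 3, 2, 1, 1])); // {2, 1}
  checkNoDuplicates(['a', 'b', 'c']); // passes without throwing
}
```

The trade-off is one extra `Set` allocation, which may or may not beat the nested loop for very short lists.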
Open questions:
- Should this operate on `List<T>` or `Iterable<T>`?
  (Maybe `Iterable<T>`, if we are to allocate a list for sorting anyway.)
- Should this be an extension method or a top-level function?
  (Maybe a function is fine; it's not exactly something you'll frequently need.)
- Should this accept an `Equality` as an optional argument?
- Should this return a `List<T>`, `Set<T>` or `Iterable<T>`?
  (Maybe `Iterable<T>` offers the best flexibility if using a sort to find duplicates.)
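To make these questions concrete, here is one possible shape, offered only as a sketch: a top-level function over `Iterable<T>` that returns a `Set<T>`. The optional `equals` and `hashCode` callbacks are hypothetical parameters standing in for an `Equality` object, and `duplicatesOf` is an invented name. The sketch relies on `HashSet` from `dart:collection`, which accepts such callbacks.

```dart
import 'dart:collection';

/// Returns one representative of each element occurring more than once.
///
/// [equals] and [hashCode] are hypothetical stand-ins for an `Equality`;
/// pass both or neither so that they stay consistent with each other.
Set<T> duplicatesOf<T>(
  Iterable<T> values, {
  bool Function(T, T)? equals,
  int Function(T)? hashCode,
}) {
  final seen = HashSet<T>(equals: equals, hashCode: hashCode);
  final duplicates = HashSet<T>(equals: equals, hashCode: hashCode);
  for (final value in values) {
    // add returns false when an equal element was already seen.
    if (!seen.add(value)) {
      duplicates.add(value);
    }
  }
  return duplicates;
}

void main() {
  final names = ['Alice', 'bob', 'alice', 'Bob', 'carol'];
  // Treat names as equal when they differ only by case.
  final result = duplicatesOf<String>(
    names,
    equals: (a, b) => a.toLowerCase() == b.toLowerCase(),
    hashCode: (s) => s.toLowerCase().hashCode,
  );
  print(result.length); // 2
}
```

Returning a `Set<T>` built with the same equality keeps the result free of repeated entries; returning a lazy `Iterable<T>` would be another reasonable choice.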
This is fairly specialized.
Removing duplicates makes sense. That's what `toSet` does.
Counting occurrences makes some sense, if you want to account for all occurrences but want to handle all the duplicates together, instead of in the original list order. You can use `groupBy` for that.
Handling only the ones that have duplicates, and not the rest, seems more uncommon.
I'd just tell people to do:
```dart
list.groupBy(id).values.where((e) => e.length > 1)
```
This is a pretty decent solution!
But it's hard to discover. Maybe I'll submit an example in the documentation for the `groupBy` function :D
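Spelled out, the grouping suggestion might look like the sketch below. It assumes the `collection` package is available as a dependency and uses its top-level `groupBy` function with an identity key; `duplicateGroups` is a hypothetical helper name.

```dart
import 'package:collection/collection.dart';

/// Groups equal elements of [values] and keeps only the groups that
/// contain more than one element.
Iterable<List<T>> duplicateGroups<T>(Iterable<T> values) =>
    groupBy<T, T>(values, (e) => e)
        .values
        .where((group) => group.length > 1);

void main() {
  final groups = duplicateGroups(['a', 'b', 'a', 'c', 'b', 'a']);
  for (final group in groups) {
    // Each group holds every occurrence of one duplicated element.
    print('${group.first} appears ${group.length} times');
  }
}
```

Because each group keeps every occurrence, this also gives the occurrence counts, which can help when building an error message.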