RamezIssac / django-slick-reporting

The Reporting Engine for Django. Create dashboards and standalone Reports and Charts.

Home Page:https://django-slick-reporting.com/

How to specify distinct=True in aggregate

oaaoaa opened this issue · comments

commented

I need to compute Count("id", distinct=True) in my report, but ComputationField seems to have no attribute that lets me specify distinct=True. In fact, the apply_aggregation method does this:

annotation = self.calculation_method(self.calculation_field)
queryset = queryset.aggregate(annotation)

The only way out I can see is to subclass ComputationField and override the apply_aggregation method. It would be better if ComputationField were changed in Slick Reporting so that it leverages the full power of Django aggregates. Why not simply add one more attribute, ComputationField.annotation, so that when creating a ComputationField the developer can specify the annotation using the Django aggregate syntax, e.g.

ComputationField.create(annotation=Count("id", distinct=True), verbose_name="Total Count")
ComputationField.create(annotation=Count("id", distinct=True, filter=Q(books__name__in=('Tom', 'John'))), verbose_name="Funky Count")

This way there is no need for q_filters or kwargs_filters when calling ComputationField.create. You could also get rid of calculation_method and calculation_field.

Hi @oaaoaa
Subclassing the computation field is the right answer for your case, correct. And it's not really hard.
The recommended way is to reuse a ComputationField you have already created rather than creating one on the fly each time you need it.
As for the annotation, you can submit a PR; I would be happy to review it.
q_filters and kwargs_filters are used in different contexts and are crucial; any solution needs to take them into account.

commented

While I have not had the time to work on a PR, I have hacked together a subclass in my code and it appears to be working well, allowing me to specify really complex aggregates.

# Import paths are assumed; adjust them to your installed versions.
from django.template.defaultfilters import date as date_filter
from slick_reporting.fields import ComputationField


class AggregateField(ComputationField):
    """
    Computation field in which the Django aggregate expression is specified directly.
    """

    # The fully specified Django aggregate expression, e.g. Count("id", distinct=True)
    aggregate_field = None

    @classmethod
    def create(cls, aggregate, method=None, field=None, name=None, verbose_name=None, is_summable=True):
        # method and field must stay unset; the aggregate expression replaces them.
        assert aggregate
        assert name
        assert not method
        assert not field

        verbose_name = verbose_name or name
        # Build a dedicated subclass that carries the aggregate as a class attribute.
        report_klass = type(
            f"ReportField_{name}",
            (cls,),
            {
                "name": name,
                "verbose_name": verbose_name,
                "aggregate_field": aggregate,
                "calculation_field": None,
                "calculation_method": None,
                "is_summable": is_summable,
            },
        )
        return report_klass

    def apply_aggregation(self, queryset, group_by=""):
        # Use the field name as the annotation alias so results can be looked up by it.
        annotation = {self.name: self.aggregate_field}
        if self.group_by_custom_querysets:
            return queryset.aggregate(**annotation)
        elif group_by:
            queryset = queryset.values(group_by).annotate(**annotation)
        else:
            queryset = queryset.aggregate(**annotation)
        return queryset

    def get_annotation_name(self):
        return self.name

    @classmethod
    def get_time_series_field_verbose_name(cls, date_period, index, dates, pattern):
        """
        Get the verbose name of a computation field that is part of a time series.
        It should be a mix of the date period of the column and its verbose name.
        :param date_period: a tuple of (start_date, end_date)
        :param index: the index of the current field in the whole dates to be calculated
        :param dates: a list of tuples representing the start and the end date
        :param pattern: the pattern name: monthly, daily, custom, ...
        :return: a verbose string
        """

        if pattern == "quarterly":
            month = int(date_filter(date_period[0], "n"))
            qtr = ((month - 1) // 3) + 1
            year = date_filter(date_period[0], "Y")
            return f"{cls.verbose_name} Q{qtr}-{year}"
        # super() keeps cls bound to this subclass, so its verbose_name is used;
        # calling ComputationField.<method>(...) directly would bind cls to the base class.
        return super().get_time_series_field_verbose_name(date_period, index, dates, pattern)
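
The create classmethod above builds a new subclass at runtime with the three-argument form of type(). As a framework-free illustration of that pattern only, here is a minimal sketch. BaseField is a hypothetical stand-in, not the slick_reporting class, and a plain string takes the place of the Django aggregate expression so the snippet runs without Django.

```python
class BaseField:
    """Hypothetical stand-in for a report field base class (not slick_reporting's)."""

    name = ""
    verbose_name = ""
    aggregate_field = None

    @classmethod
    def create(cls, aggregate, name, verbose_name=None):
        # Explicit checks instead of assert, so they still run under `python -O`.
        if aggregate is None:
            raise ValueError("aggregate is required")
        if not name:
            raise ValueError("name is required")
        # type(name, bases, namespace) returns a brand-new subclass of cls.
        return type(
            f"ReportField_{name}",
            (cls,),
            {
                "name": name,
                "verbose_name": verbose_name or name,
                "aggregate_field": aggregate,
            },
        )


# A string stands in for something like Count("id", distinct=True).
TotalCount = BaseField.create("COUNT(DISTINCT id)", name="total_count", verbose_name="Total Count")
print(TotalCount.__name__, "/", TotalCount.verbose_name)
```

Each call returns a separate class, so the aggregate lives on the class rather than on an instance, and the base class's own attributes are left untouched.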

With this field replacing the ComputationField class, we can get rid of method and field completely. I can rewrite ComputationField along the lines of AggregateField, removing all the superfluous references to field and method. If you think this is along the right lines, I can work on a PR when I get time.
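
The quarterly branch of get_time_series_field_verbose_name in the subclass above derives the quarter from the month of the period's start date. A minimal standalone sketch of that arithmetic follows. quarter_label is a hypothetical helper, and it reads the month and year from a datetime.date directly instead of going through Django's date filter.

```python
from datetime import date


def quarter_label(verbose_name: str, period_start: date) -> str:
    """Build a column label such as 'Total Count Q2-2023' from a period's start date."""
    # Months 1-3 map to quarter 1, 4-6 to quarter 2, and so on.
    quarter = (period_start.month - 1) // 3 + 1
    return f"{verbose_name} Q{quarter}-{period_start.year}"


print(quarter_label("Total Count", date(2023, 5, 1)))  # Total Count Q2-2023
```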

I have experimented with really complex aggregates such as:

    aggregate_field = Round((Count("id", filter=Exists(Action.objects.filter(action__in=('T1', 'T2'),
                                                                             attrib=OuterRef("attrib"),
                                                                             fk1__fk2=OuterRef("d1__fk2"))))) * 1.0 /
                             NullIf(Count("id", distinct=True), 0), 
                             precision=2)

and

    aggregate_field = Round(Count("d1__fk2__fk1__action", 
                                  distinct=True,
                                  filter=Q(d1__fk2__fk1__action__action__in=('P1', 'P2'))) * 1.0 /
                            NullIf(Count("id", distinct=True), 0), 
                            precision=1)
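
Both expressions above follow the same shape: a filtered count divided by a distinct count, with NullIf turning a zero denominator into NULL and `* 1.0` forcing a non-integer division. As a rough plain-Python analogue of what one result row would hold, here is a small sketch. safe_ratio is a hypothetical helper, and Python's round() may treat exact ties differently from the database's ROUND.

```python
from typing import Optional


def safe_ratio(numerator: int, denominator: int, precision: int) -> Optional[float]:
    """Mimic Round(numerator * 1.0 / NullIf(denominator, 0), precision) for one row."""
    # NullIf(denominator, 0) yields NULL for a zero denominator, and NULL
    # propagates through the division, so the whole result is NULL (None here).
    if denominator == 0:
        return None
    # Multiplying by 1.0 keeps the division from truncating to an integer.
    return round(numerator * 1.0 / denominator, precision)


print(safe_ratio(1, 3, 2))  # 0.33
print(safe_ratio(5, 0, 2))  # None
```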