rpm-software-management / rpmlint

Tool for checking common errors in rpm packages

Speed up message filtering

marxin opened this issue · comments

Right now, message filtering uses quite a long list of regular expressions (Filters in the TOML configuration).
Each emitted message is searched against these regexes one by one, so filtering can be very slow when a huge number of warnings/errors is emitted:

$ ./lint.py /tmp/binaries/python310-botocore-1.29.45-1.1.noarch.rpm -t -c configs/openSUSE
...
    Check                            Duration (in s)   Fraction (in %)  Checked files
    FilesCheck                                   9.5              95.9               

If I print all messages, I get 35K messages with the cross-directory-hard-link title. All of them are filtered out by:

Filters = [
...
    '.*cross-directory-hard-link.*',

We might want to consider having a separate list of message titles that can be searched quickly.

A prototype patch:

diff --git a/configs/openSUSE/opensuse.toml b/configs/openSUSE/opensuse.toml
index fdb7ad1c..67b3937d 100644
--- a/configs/openSUSE/opensuse.toml
+++ b/configs/openSUSE/opensuse.toml
@@ -31,6 +31,10 @@ DisallowedDirs = [
     "/etc/NetworkManager/dispatcher.d",
 ]
 
+FilterErrorTitles = [
+    'cross-directory-hard-link',
+]
+
 Filters = [
 # Stuff autobuild takes care about
     '.*invalid-version.*',
@@ -41,7 +45,6 @@ Filters = [
     '.*non-versioned-file-in-library-package.*',
     '.*hardcoded-path-in-buildroot-tag.*',
     '.*no-buildroot-tag.*',
-    '.*cross-directory-hard-link.*',
 
 # Do not validate package rpm groups
     '.*devel-package-with-non-devel-group.*',
diff --git a/rpmlint/filter.py b/rpmlint/filter.py
index db1a2c94..3519dacf 100644
--- a/rpmlint/filter.py
+++ b/rpmlint/filter.py
@@ -32,6 +32,7 @@ class Filter:
         self.strict = config.strict
         # list of filter regexes
         self.filters_regexes = [re.compile(f) for f in config.configuration['Filters']]
+        self.filter_titles = set(config.configuration['FilterErrorTitles'])
         # list of blocked filters
         self.blocked_filters = set(config.configuration['BlockedFilters'])
         # set of filters that are actually used in add_info
@@ -153,6 +154,8 @@ class Filter:
         result_no_color = f'{filename}{arch}:{line} {level}: {rpmlint_issue}{detail_output}'
         # unused-rpmlintrc-filter warnings should be skipped
         if rpmlint_issue != 'unused-rpmlintrc-filter' and rpmlint_issue not in self.blocked_filters:
+            if rpmlint_issue in self.filter_titles:
+                return
             for f in self.filters_regexes:
                 if f.search(result_no_color):
                     self.used_filters.add(f.pattern)

With the patch applied, FilesCheck drops from 9.5 s to 0.5 s:

Check time report (>1% & >0.1s):
    Check                            Duration (in s)   Fraction (in %)  Checked files
    FilesCheck                                   0.5              55.6
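
To illustrate where the saving comes from, here is a minimal standalone sketch (not rpmlint code) that counts regex searches with and without a title set. The FILTERS and FILTER_TITLES values, the helper names is_filtered and count_searches, and the synthetic message text are all assumptions made up for the example; only the idea of checking the title in a set before running the regex loop comes from the patch above.

```python
import re

# Hypothetical stand-ins for the TOML values, not the real openSUSE config.
FILTERS = [
    r'.*invalid-version.*',
    r'.*no-buildroot-tag.*',
    r'.*cross-directory-hard-link.*',
]
FILTER_TITLES = {'cross-directory-hard-link'}


def is_filtered(title, message, regexes, titles=frozenset()):
    """Return (filtered, number of regex searches performed)."""
    # Fast path: a set lookup on the message title, no regex involved.
    if title in titles:
        return True, 0
    searches = 0
    for rx in regexes:
        searches += 1
        if rx.search(message):
            return True, searches
    return False, searches


def count_searches(messages, regexes, titles=frozenset()):
    """Return (messages filtered, total regex searches) for all messages."""
    filtered = 0
    total = 0
    for title, message in messages:
        hit, n = is_filtered(title, message, regexes, titles)
        filtered += hit
        total += n
    return filtered, total


# 35K synthetic messages with the same title (the paths are invented).
messages = [
    ('cross-directory-hard-link',
     f'pkg.noarch: W: cross-directory-hard-link /a/{i} /b/{i}')
    for i in range(35_000)
]

all_regexes = [re.compile(f) for f in FILTERS]
# Mirror the patch: the title moves out of the regex list into the set.
remaining = [re.compile(f) for f in FILTERS
             if 'cross-directory-hard-link' not in f]

slow = count_searches(messages, all_regexes)
fast = count_searches(messages, remaining, FILTER_TITLES)
print(slow)  # (35000, 105000)
print(fast)  # (35000, 0)
```

In this toy setup the matching regex is last in the list, so every message costs three searches before it is filtered. With a real configuration the list is much longer, so the gap is presumably larger, but the actual numbers depend on the configuration and the messages emitted.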