saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Get access to the Salt software package repository here:

Home Page:https://repo.saltproject.io/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

[BUG] file.blockreplace does not work with UTF-8

MAH69IK opened this issue · comments

Description
Well, to be precise - sometimes it works. If you're lucky. I was lucky before, so I was very surprised when adding a couple of new lines to the file broke everything.

I use file.blockreplace for generating the nftables configuration. I have a core - a file with rules for all servers. And using file.blockreplace I insert server-specific rules. Everything worked for several weeks until I wanted to insert a couple of additional rules.

I started getting this error:

    Function: file.blockreplace
        Name: /etc/nftables.conf
      Result: False
     Comment: Encountered error managing block: Cannot perform string replacements on a binary file: /etc/nftables.conf. See the log for details.

And from /var/log/salt/minion:

2022-06-24 16:09:51,014 [salt.loaded.int.states.file:5756][ERROR   ][4021129] Encountered error managing block
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/states/file.py", line 5741, in blockreplace
    changes = __salt__["file.blockreplace"](
  File "/usr/lib/python3/dist-packages/salt/loader/lazy.py", line 149, in __call__
    return self.loader.run(run_func, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/salt/loader/lazy.py", line 1201, in run
    return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/salt/loader/lazy.py", line 1216, in _run_as
    return _func_or_method(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/salt/modules/file.py", line 2860, in blockreplace
    raise SaltInvocationError(
salt.exceptions.SaltInvocationError: Cannot perform string replacements on a binary file: /etc/nftables.conf
2022-06-24 16:09:51,021 [salt.state       :317 ][ERROR   ][4021129] Encountered error managing block: Cannot perform string replacements on a binary file: /etc/nftables.conf. See the log for details.

I started to figure it out and this is what I found. This is where the error occurred in salt/modules/file.py:

    try:
        file_encoding = __utils__["files.get_encoding"](path)
    except CommandExecutionError:
        file_encoding = None

    if __utils__["files.is_binary"](path):
        if not file_encoding:
            raise SaltInvocationError(
                "Cannot perform string replacements on a binary file: {}".format(path)
            )

files.get_encoding causes an exception but why? This is function from salt/utils/files.py. It uses several ways to determine the encoding of the file: checks ASCII, BOM and UTF-8.

It may seem that it should work because my file is UTF-8:

$ file /srv/salt/base/packages/nftables/nftables.conf 
/srv/salt/base/packages/nftables/nftables.conf: a /usr/sbin/nft -f script, UTF-8 Unicode text executable

But this is not the case. The fact is that for verification, this function reads the first 2048 bytes and tries to decode them (if we are talking about checking for UTF-8). And this is the only suitable method in my case because there is no BOM in my file and besides ASCII it contains comments in Russian.

It so happened that when I added new lines, when reading 2048 bytes, they ended with these characters: \xd1\x80\xd1\x83 \xd0\xbf\xd0\xbe \xd0\xb2\xd0\xbd\xd1. But it doesn't make sense, because in order to decode this string in UTF-8, you need to have one more byte and then you will get: \xd1\x80\xd1\x83 \xd0\xbf\xd0\xbe \xd0\xb2\xd0\xbd\xd1\x83 -> ру по вну.

I offer three options. Firstly, it is possible to decode not 2048 bytes, but the entire file. Then it will work successfully.

You can not check at all, and if the user wants to replace strings in a binary file, consider that these are his problems :)

And you can rely on the work of file which successfully understands that this is a UTF-8 file. Here is a short python code that is based on this utility and in my case successfully works on both 2048 bytes and 2049 and on the entire file:

>>> textchars = bytearray({7,8,9,10,12,13,27} | set(range(0x20, 0x100)) - {0x7f})
>>> is_binary_string = lambda bytes: bool(bytes.translate(None, textchars))
>>> with open('/srv/salt/base/packages/nftables/nftables.conf', 'rb') as fp_:
...     data = fp_.read(2048)
>>> is_binary_string(data)
False

Versions Report

salt --versions-report
$ salt --versions-report
Salt Version:
          Salt: 3004.2
 
Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.8.1
     docker-py: Not Installed
         gitdb: 4.0.5
     gitpython: 3.1.14
        Jinja2: 2.11.3
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.0
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: Not Installed
  pycryptodome: 3.9.7
        pygit2: Not Installed
        Python: 3.9.2 (default, Feb 28 2021, 17:03:44)
  python-gnupg: Not Installed
        PyYAML: 5.3.1
         PyZMQ: 20.0.0
         smmap: 4.0.0
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.4
 
System Versions:
          dist: debian 11 bullseye
        locale: utf-8
       machine: x86_64
       release: 5.10.0-10-amd64
        system: Linux
       version: Debian GNU/Linux 11 bullseye

P. S. Near the is_binary function is the is_text function, which performs a similar job and also tries to convert part of the bytes to UTF-8. That's just it reads 512 bytes. Maybe it should be brought to one size?

Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey.
Please be sure to review our Code of Conduct. Also, check out some of our community resources including:

There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar.
If you have additional questions, email us at saltproject@vmware.com. We’re glad you’ve joined our community and look forward to doing awesome things with you!

Maybe I can provide some more help in solving this problem?

Well, I am hitting very same issue with 2048 bytes being read to guess whether the file is binary, with cp.get_file.
The issue is in salt/utils/files.py in my case:

   674	def is_binary(path):
   675	    """
   676	    Detects if the file is a binary, returns bool. Returns True if the file is
   677	    a bin, False if the file is not and None if the file is not available.
   678	    """
   679	    if not os.path.isfile(path):
   680	        return False
   681	    try:
   682	        with fopen(path, "rb") as fp_:
   683	            try:
   684	                data = fp_.read(2048)
   685	                data = data.decode(__salt_system_encoding__)
   686	                return salt.utils.stringutils.is_binary(data)
   687	            except UnicodeDecodeError:
   688	                return True
   689	    except os.error:
   690	        return False

In case the file being downloaded this way looks like text in first 2048 bytes and then it's really binary, then it is being downloaded with zero length and master log is spammed with errors like:
File "/usr/lib/python3.6/site-packages/salt/utils/gitfs.py", line 3054, in serve_file data = data.decode(__salt_system_encoding__) UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 23076: invalid start byte

I propose not to rely on first 2048 bytes only...