miurahr / py7zr

7zip in python3 with ZStandard, PPMd, LZMA2, LZMA1, Delta, BCJ, BZip2, and Deflate compressions, and AES encryption.

Home Page: https://pypi.org/project/py7zr/

Provide zipfile-like in-memory `.open()`

mjsir911 opened this issue

Is your feature request related to a problem? Please describe.
Hi,

It would be nice to get a file-like object for a file inside the archive without extracting it, much like Python's zipfile does:

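from zipfile import ZipFile

with ZipFile('spam.zip') as myzip:
    # open() returns a file-like object for the member without extracting it to disk
    with myzip.open('eggs.txt') as myfile:
        print(myfile.read())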
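
For anyone who wants to try the zipfile behaviour described above without a `spam.zip` on disk, here is a minimal, self-contained sketch using only the standard library. The archive is built in memory first; the member name `eggs.txt` and its contents are made up for illustration.

```python
import io
from zipfile import ZipFile

# Build a small archive in memory so the example needs no file on disk.
buffer = io.BytesIO()
with ZipFile(buffer, "w") as archive:
    archive.writestr("eggs.txt", "spam and eggs\n")

# Reopen the same buffer for reading and read one member without extracting it.
buffer.seek(0)
with ZipFile(buffer) as myzip:
    with myzip.open("eggs.txt") as myfile:
        content = myfile.read()  # bytes, not str

print(content)
```

Note that `myfile.read()` returns bytes here as well, which is relevant to the `b'` question further down the thread.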

Describe the solution you'd like
I would like to see the same in py7zr:

with py7zr.SevenZipFile("siivagunner_pages_current.xml.7z", mode="r") as z:
    with z.open('siivagunner_pages_current.xml') as f:
        print(f.read())

Describe alternatives you've considered
Extracting to a directory works, but in this case I have a .7z archive containing a single file that I only need temporarily.

Duplicate of #117

You can use the read method for this purpose. Please see the API documentation:
https://py7zr.readthedocs.io/en/latest/api.html#py7zr.SevenZipFile.read

Here is some sample code:

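from py7zr import SevenZipFile

targets = ['siivagunner_pages_current.xml']
with SevenZipFile('siivagunner_pages_current.xml.7z', 'r') as zip:
    # read() yields (file name, in-memory file object) pairs via .items()
    for fname, bio in zip.read(targets).items():
        print(bio.read())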

Thanks for the example @miurahr

If my targets are text files, should I convert the output of bio.read() any further?
When using print(bio.read()) I see a b' at the beginning, i.e.
b'my text file content
where I would expect
my text file content

Also, regarding the example code in the documentation you linked:

import re
from py7zr import SevenZipFile

filter_pattern = re.compile(r'scripts.*')
with SevenZipFile('archive.7z', 'r') as zip:
    allfiles = zip.getnames()
    targets = [f for f in allfiles if filter_pattern.match(f)]
with SevenZipFile('archive.7z', 'r') as zip:  # <--- why this line? do we need to close and reopen?
    for fname, bio in zip.read(targets).items():
        print(f'{fname}: {bio.read(10)}...')

Are we opening the compressed file twice?
Wouldn't it work the same if we put the for loop inside the first with block?

Sorry if these are pretty basic Python questions unrelated to py7zr, but I am asking just in case.

Thanks
@abubelinha

Bing Chat answered:

When you use bio.read(), it returns a bytes object. The b' at the beginning of the output is how Python displays a bytes object. If you want a string instead, you can use the .decode() method. For example, bio.read().decode() returns the content as a string (decoded as UTF-8 by default), so printing it shows the text without the b' prefix.
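
To make the bytes-to-text step concrete, here is a small sketch that runs without py7zr installed. It assumes, based on the sample code earlier in the thread, that the extracted data arrives as a mapping of file name to an in-memory binary file object; `io.BytesIO` is used as a stand-in for that, and the helper `decode_members`, the file name, and the UTF-8 encoding are assumptions made for illustration.

```python
import io


def decode_members(members, encoding="utf-8"):
    """Read each binary file object and decode its content to str.

    `members` is assumed to be a dict of {file name: binary file object}.
    """
    return {name: bio.read().decode(encoding) for name, bio in members.items()}


# Stand-in for the mapping of extracted files; the content is made up.
raw = b"my text file content\n"
members = {"notes.txt": io.BytesIO(raw)}

texts = decode_members(members)
print(texts["notes.txt"], end="")  # prints the text without a b' prefix

# Alternative: wrap the binary object so it can be read line by line as text.
wrapped = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8")
lines = [line.rstrip("\n") for line in wrapped]
```

If the archived files use a different encoding (for example Latin-1), pass that name as `encoding`; decoding with the wrong one may raise `UnicodeDecodeError` or produce garbled text.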