webrecorder/py-wacz
Stargazers: 33
Watchers: 4
Issues: 28
Forks: 10
webrecorder/py-wacz Issues
Add --copy-pages option to copy pages.jsonl/extraPages.jsonl as-is into WACZ
Closed
2 months ago
Python in the read me file
Updated
3 months ago
Comments count: 2
AttributeError: 'NoneType' object has no attribute 'lower'
Updated
4 months ago
Windows 10 truncates read path and prevents validation
Updated
5 months ago
better documentation via `wacz --help`
Updated
8 months ago
Detecting pages
Closed
10 months ago
Comments count: 2
Canonical method for converting multiple WARC files to WACZ
Updated
a year ago
Comments count: 4
Rename compressed WARC files without .gz extension when creating WACZ
Updated
a year ago
[FEATURE] Add logs to WACZ
Closed
a year ago
[FEATURE] Add a WARC Record Iterator
Updated
2 years ago
Some commands documented to interact with WACZ files are invalid
Updated
2 years ago
Test failure under Python 3.10
Closed
2 years ago
Comments count: 3
Instructions how to create wacz from browsertrix crawl
Updated
2 years ago
zipfile.BadZipFile error during wacz creation from warc file - Windows only
Updated
2 years ago
Dev dependencies should be separated from normal dependencies
Updated
2 years ago
Allow MD5 as datapackage hash
Closed
3 years ago
Support premade page lists from a crawler
Closed
3 years ago
Command Line Return Code should be 0
Closed
3 years ago
Use psf/black for python code formatting
Closed
3 years ago
Comments count: 1
Combine the text and page index
Closed
3 years ago
Comments count: 1
Improve testing suite
Closed
3 years ago
`datapackage.json` does not pass frictionless data default profile validation
Closed
3 years ago
Comments count: 2
Validation of WACZ Format
Closed
3 years ago
Error "File size unexpectedly exceeded ZIP64 limit" occurs when using py-wacz on a large WACZ file
Closed
3 years ago
Comments count: 2
Ability to specify main page via --url / --ts flags
Closed
3 years ago
py-wacz: when adding pages from specified page list, check for https versions.
Closed
3 years ago
py-wacz: Implement test suite for py-wacz
Closed
3 years ago
py-wacz: Add a way to list pages in the WACZ
Updated
3 years ago