ian-whitestone / pyspark-vs-dask

[WIP] Comparing pyspark and dask for speed, memory/CPU usage, and ease of use


File cleanup

ian-whitestone opened this issue

When generating the fake data, the scripts started interfering with each other (they were writing to the same filenames) partway through, so I cancelled the jobs and restarted with new file prefixes.

Need to clean up the old files with the outdated prefixes.

import re

import boto3

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('dask-avro-data')

# Keys with the outdated prefixes, e.g. application-data/123.avro
pattern = re.compile(
    r'application-data/\d*\.avro|fulfillment-data/\d*\.avro|scoring-data/\d*\.avro'
)

# Materialize the listing first so we aren't deleting while paginating
objects = list(my_bucket.objects.all())

for obj in objects:
    if pattern.match(obj.key):
        obj.delete()
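
If there are a lot of matching files, deleting them one API call at a time will be slow. A minimal sketch of a batched alternative, assuming the same 'dask-avro-data' bucket and regex, using boto3's delete_objects (which takes up to 1,000 keys per request):

import re

import boto3

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('dask-avro-data')

pattern = re.compile(
    r'application-data/\d*\.avro|fulfillment-data/\d*\.avro|scoring-data/\d*\.avro'
)

# Collect the keys that still use the outdated prefixes
keys = [obj.key for obj in my_bucket.objects.all() if pattern.match(obj.key)]

# delete_objects accepts at most 1,000 keys per request, so chunk the list
for i in range(0, len(keys), 1000):
    chunk = keys[i:i + 1000]
    my_bucket.delete_objects(
        Delete={'Objects': [{'Key': key} for key in chunk]}
    )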