A fuzz tool for super mario game.
python main.py transform <input_data>
<input_data>
: Input data used for the transformation operation.
python main.py fuzz <seed_path> <crash_path>
<seed_path>
: Path to the seed data for the fuzzing operation.<crash_path>
: Path to the crash data for saving crash reports during the fuzzing process.
Game: Mario-Level-1
Required Packages: Please refer to the requirements.txt
file for the list of required packages.
Please make sure to modify the game_path
in the config.yaml
file located in the action
folder to your own game's absolute path.
南京大学软件学院软件测试课程代码作业,选题方向为程序分析-模糊测试。
实现命令行工具形式的基于变异的模糊器,设计思路如下:
- 定义测试输入:一组游戏操作的序列,五元组={left, right, up, jump, fire}
- 定义输出:o={is_mario, gold, score}, is_mario=mario是否存在画面中;gold=金币数量;score=游戏得分
- 种子调度,基于feedback
- 种子优先级排序:得分更高序列优先
- 能量分配:根据feedback结果,对照基准线分配energy
- 测试生成:六种变异算子,并使用退火算法进行变异算子调度
- 游戏执行,并截图保存
- 结果分析:使用图像识别训练模型,识别分析状态返回结果
- 结果保存:保存所有产生唯一状态的测试输入
本项目特点如下:
- 面向新的模糊目标:测试目标为经典游戏超级马里奥
- 添加新机制
- 更多样的变异算子:新增Shuffle变异算子
- 变异算子调度:使用退火算法实现
- 新颖的输出分析策略:训练图像识别模型,对游戏结果截图进行图像识别后获得游戏分数等信息
本项目架构如下:
-
action: 包含与执行相关的文件
config.yaml
: 配置文件fuzz.py
: Fuzz操作run.py
: 游戏运行transform.py
: Transform操作
-
test: 测试相关的文件
-
util: 工具类
mutator_schedule
: 变异算子调度工具output_analysis
: 输出分析工具preprocess
: 预处理工具seed_schedule
: 种子调度工具
-
command.py
: 命令解析 -
main.py
: 执行入口 -
README.md
: 项目的说明文档 -
requirements.txt
: 项目所需的依赖包列表
main.py
coloredlogs.install(level='INFO', fmt='%(asctime)s - %(levelname)s - %(message)s')
def main():
logging.info("Welcome to mario fuzz!")
parse_and_run()
if __name__ == '__main__':
main()
command.py
使用argparse
对命令行选项、参数进行解析。
增加了operation
参数,并分为transform
与fuzz
命令。
def parse_and_run():
parser = argparse.ArgumentParser(description='A fuzz tool for super mario game.')
subparsers = parser.add_subparsers(dest='operation', help='Operation to perform')
# 添加 transform 子命令
transform_parser = subparsers.add_parser('transform',
help='Transform the seed with one specific rules')
transform_parser.add_argument('input_data', help='Input data for the transform operation')
transform_parser.set_defaults(func=transform_cmd)
# 添加 fuzz 子命令
fuzz_parser = subparsers.add_parser('fuzz', help='Fuzz the game')
fuzz_parser.add_argument('seed_path', help='Path of seeds for the fuzz operation')
fuzz_parser.add_argument('crash_path', help='Path of crash data for the fuzz operation')
fuzz_parser.set_defaults(func=fuzz_cmd)
# 解析命令行参数并执行相应的操作
args = parser.parse_args()
args.func(args)
transform操作用于对指定的种子进行特定变异操作,尤其在前期开发对变异算子部分测试使用。
fuzz操作用于模糊测试,需要用户指定种子路径与crash存放路径。
本模块包括Fuzz, Transform, Run三种操作
fuzz.py
初始化使用用户定义的target_path
和crash_path
来指定种子路径和崩溃路径。
def __init__(self, target_path, crash_path):
self.target_path = target_path
self.crash_path = crash_path
首先遍历种子文件收集初始分数,然后调用SeedSchedule
进行种子调度,并根据游戏分数对种子进行能量分配,最后调用MutatorSchedule
进行变异算子调度,并对变异后返回结果分析进行对应的保存。
def run(self):
file_list = self.get_file_list()
seed_score_pairs = []
for i in tqdm(range(len(file_list)), "Loading seeds"):
tmp_ops = read_file_content(file_list[i])
_, gold, score = run.play_game(tmp_ops, len(tmp_ops))
seed_score_pairs.append((tmp_ops, score))
while True:
seed_schedule = SeedSchedule(seed_score_pairs)
selected_tuple = seed_schedule.schedule()
energy = min(len(selected_tuple[0]), c.OP_COUNT_BASELINE * (selected_tuple[1] / c.SCORE_BASELINE))
mutation_schedule = MutatorSchedule(selected_tuple, energy)
output_data, score, is_crash = mutation_schedule.schedule()
if is_crash:
self.save_crash_data(output_data)
continue
else:
seed_score_pairs.append((output_data, score))
self.save_output_data(output_data)
transform.py
使用枚举规定Transform种类,除规定实现的CharFlip, CharIns, CharDel, Havoc, Splice五种算子,增加Shuffle变异算子:
class TransformKind(Enum):
"""Transforms for the action."""
CHAR_FLIP = 0
CHAR_INS = 1
CHAR_DEL = 2
HAVOC = 3
SPLICE = 4
SHUFFLE = 5
新增的Shuffle变异算子:对操作序列本身乱序
def shuffle(self):
self.input_data = ''.join(random.sample(self.input_data, len(self.input_data)))
return self.input_data
run.py
使用pynput库将字符串的操作序列转换为游戏中的输入
class KeyboardActions:
def __init__(self):
self.keyboard = Controller()
self.time_interval = 0.5
def press_a_key(self):
self.keyboard.press('a')
time.sleep(self.time_interval)
self.keyboard.release('a')
def press_up_key(self):
self.keyboard.press(Key.up)
time.sleep(self.time_interval)
self.keyboard.release(Key.up)
def press_s_key(self):
self.keyboard.press('s')
time.sleep(self.time_interval)
self.keyboard.release('s')
def press_left_key(self):
self.keyboard.press(Key.left)
time.sleep(self.time_interval)
self.keyboard.release(Key.left)
def press_right_key(self):
self.keyboard.press(Key.right)
time.sleep(self.time_interval)
self.keyboard.release(Key.right)
def press_down_key(self):
self.keyboard.press(Key.down)
time.sleep(self.time_interval)
self.keyboard.release(Key.down)
def press_enter_key(self):
self.keyboard.press(Key.enter)
time.sleep(self.time_interval)
self.keyboard.release(Key.enter)
对游戏窗口截图,交由识别部分进行图像识别,实时得到结果
def take_screenshot(window):
# 获取窗口位置和大小
window_x, window_y, window_width, window_height = window.left, window.top, window.width, window.height
# 截取窗口图像
screenshot = pyautogui.screenshot(region=(window_x + 20, window_y, window_width - 20, window_height - 20))
# 将Pillow图像对象转换为OpenCV图像对象
opencv_image = cv2.cvtColor(np.array(screenshot), cv2.COLOR_RGB2BGR)
# 将OpenCV图像对象转换为灰度图像
gray_image = cv2.cvtColor(opencv_image, cv2.COLOR_BGR2GRAY)
boolValue = analysis.check_image(gray_image)
gold = analysis.extract_gold(opencv_image)
score = analysis.extract_score(opencv_image)
# 可以根据需要返回截图对象或者其他信息
return boolValue, gold, score
创建游戏进程,开始游戏操作
pygame_process = subprocess.Popen(['python', game_path + '/mario_level_1.py'], stdin=subprocess.PIPE)
preprocess/
mario/
存储三种状态下的mario图像素材
resources/
存储原始的mario游戏素材
getMarioPics.py
对原始mario游戏素材进行切片,得到三种状态下的mario图像素材
def main():
mario_sprites = MarioSprites()
# 保存所有图片
save_images(mario_sprites.right_frames, "./mario/small_normal", "right_small_normal")
save_images(mario_sprites.left_frames, "./mario/small_normal", "left_small_normal")
save_images(mario_sprites.left_big_normal_frames, "./mario/big_normal", "left_big_normal")
save_images(mario_sprites.right_big_normal_frames,"./mario/big_normal", "right_big_normal")
save_images(mario_sprites.right_fire_frames, "./mario/big_fire", "right_big_fire")
output_analysis/
numberTrain.py
对游戏的数字图像进行训练,通过图像中的数字轮廓,将每个数字提取出来,并将这些数字的分类结果和压平图像分别保存到 classifications.txt 和 flattened_images.txt中。
# 遍历所有轮廓
for npaContour in npaContours:
# 如果轮廓面积大于阈值
if cv2.contourArea(npaContour) > MIN_CONTOUR_AREA:
[intX, intY, intW, intH] = cv2.boundingRect(npaContour)
# 在原始图像上绘制红色矩形
cv2.rectangle(imgTrainingNumbers, (intX, intY), (intX + intW, intY + intH), (0, 0, 255), 2)
imgROI = imgThresh[intY:intY + intH, intX:intX + intW]
imgROIResized = cv2.resize(imgROI, (RESIZED_IMAGE_WIDTH, RESIZED_IMAGE_HEIGHT))
# 显示原始字符区域和调整大小后的字符区域
cv2.imshow("imgROI", imgROI)
cv2.imshow("imgROIResized", imgROIResized)
cv2.imshow("training_numbers.png", imgTrainingNumbers)
# 获取键盘输入
intChar = cv2.waitKey(0)
# 如果按下ESC键,退出程序
if intChar == 27:
sys.exit()
# 如果按下有效字符,添加到分类列表
elif intChar in intValidChars:
intClassifications.append(intChar)
# 压平字符区域并添加到数组
npaFlattenedImage = imgROIResized.reshape((1, RESIZED_IMAGE_WIDTH * RESIZED_IMAGE_HEIGHT))
npaFlattenedImages = np.append(npaFlattenedImages, npaFlattenedImage, 0)
fltClassifications = np.array(intClassifications, np.float32)
npaClassifications = fltClassifications.reshape((fltClassifications.size, 1))
print("\n训练完成!!\n")
np.savetxt("classifications.txt", npaClassifications)
np.savetxt("flattened_images.txt", npaFlattenedImages)
numberExtract.py
通过训练好的KNN模型对输入的数字图像进行识别提取。
def number_get(test_image):
......
# 创建KNN对象
kNearest = cv2.ml.KNearest_create()
# 训练KNN模型
kNearest.train(npaFlattenedImages, cv2.ml.ROW_SAMPLE, npaClassifications)
......
# 最终识别结果字符串
strFinalString = ""
# 遍历有效轮廓
for contourWithData in validContoursWithData:
# 获取数字区域
imgROI = imgThresh[contourWithData.intRectY: contourWithData.intRectY + contourWithData.intRectHeight,
contourWithData.intRectX: contourWithData.intRectX + contourWithData.intRectWidth]
# 调整数字区域大小
imgROIResized = cv2.resize(imgROI, (RESIZED_IMAGE_WIDTH, RESIZED_IMAGE_HEIGHT))
# 将数字区域展平为一维numpy数组
npaROIResized = imgROIResized.reshape((1, RESIZED_IMAGE_WIDTH * RESIZED_IMAGE_HEIGHT))
# 将数据类型转换为float32
npaROIResized = np.float32(npaROIResized)
# 使用KNN进行识别
retval, npaResults, neigh_resp, dists = kNearest.findNearest(npaROIResized, k=1)
# 获取识别结果字符
strCurrentChar = str(chr(int(npaResults[0][0])))
strFinalString = strFinalString + strCurrentChar
return strFinalString
outputAnalysis.py
提取mario图像的特征点,将特征点保存在train_des.pkl
def extract_mario_des(self):
template_image_path = '../preprocess/mario/small_normal/right_small_normal_0.png'
arr_des = []
arr_kp = []
arr_template = []
# 遍历文件
PATH_TO_TEST_IMAGES_DIR = '../preprocess/mario'
for pidImage in glob.glob(PATH_TO_TEST_IMAGES_DIR + "/*/*.png"):
template_image_path = pidImage
template_image_path = template_image_path.replace('\\', '/')
# print(template_image_path)
# 读取mario和背景
template = cv2.imread(template_image_path, 0)
if template is None:
print('Could not open or find the images!')
exit(0)
template = cv2.resize(template, None, fx=2, fy=2)
# 创建SIFT对象
sift = cv2.xfeatures2d.SIFT_create()
# 提取特征点
kp1, des1 = sift.detectAndCompute(template, None)
arr_des.append(des1)
arr_kp.append(kp1)
arr_template.append(template)
save_data('train_des.pkl', arr_des)
return arr_des, arr_kp, arr_template
提取游戏窗口截图的特征点,与已提取的mario特征点进行匹配,返回输入图像中是否包含mario
def check_image(self, screenshot):
train_des = load_des('train_des.pkl')
if screenshot is None:
print('Could not open or find the images!')
exit(0)
# 创建SIFT对象
sift = cv2.SIFT_create()
# 提取特征点
kp2, des2 = sift.detectAndCompute(screenshot, None)
bf = cv2.BFMatcher()
for i in range(len(train_des)):
# 匹配特征点
matches = bf.knnMatch(train_des[i], des2, k=2)
good = []
for m, n in matches:
if m.distance < 0.75 * n.distance:
good.append([m])
# 若匹配超过3点则认为存在
if len(good) > 3:
return True
return False
提取游戏窗口截图中的游戏分数
def extract_score(self, screen):
imgROI = screen[165:219, 140:429]
# cv2.imshow("ROI_WIN", imgROI)
# cv2.waitKey(0)
res = number_get(imgROI)
if len(res) > 6:
res = res[0:6]
# print("score:"+res)
return res
提取游戏窗口截图中的金币数
def extract_gold(self, screen):
imgROI = screen[165:219, 635:730]
# cv2.imshow("ROI_WIN", imgROI)
# cv2.waitKey(0)
res = number_get(imgROI)
if len(res) > 2:
res = res[0:2]
# print("gold:"+res)
return res
mutator_schedule/
实现了一个可以根据种子和能量进行游戏运行的函数。
def get_score(seed, energy):
is_mario, _, score = action.run.play_game(seed, energy)
return is_mario, score
以模拟退火算法为载体进行变异调度,先选中操作种子,并获取其得分。如果发生崩溃,停止操作并返回崩溃。在算法中,通过在迭代循环中在当前种子附近进行变异,并计算新种子的得分,同时监测程序是否崩溃,然后根据Metropolis准则计算接受概率,在接受概率内或得到了更高分的种子-得分对的情况下,会更新种子-得分对,循环直到迭代结束。最后函数会返回迭代完成后的最佳种子-得分对。
def schedule(self):
current_op = self.seed_score_pairs[0]
is_mario, current_score = get_score(current_op, len(current_op))
if not is_mario:
return current_op, current_score, False
best_op = current_op
best_score = current_score
for _ in tqdm(range(self.iterations)):
self.temperature *= self.cooling_rate
# 在当前字符串附近进行变化
transform_action = Transform(current_op)
new_op = transform_action.transform()
# 计算新字符串的得分
is_mario, new_score = get_score(new_op, self.energy)
if not is_mario:
return new_op, current_score, False
# 计算接受概率
acceptance_probability = math.exp((new_score - current_score) / self.temperature)
# 根据概率决定是否接受新字符串
if new_score > current_score or np.random.uniform(low=0, high=1) < acceptance_probability:
current_op = new_op
current_score = new_score
# 更新最佳字符串
if current_score > best_score:
best_op = current_op
best_score = current_score
return best_op, best_score, True
seed_schedule/
实现了一个排序算法,用于对种子序列进行排序。
def sort_tuples(tuples_list):
sorted_tuples = sorted(tuples_list, key=lambda x: x[1], reverse=True)
return sorted_tuples
实现了一个选择算法,用于对种子进行有一定权重的选择。在这里,我们将排好序的种子-分数对按一定比例切分成高分组和低分组,并在低分组中随机选取一个同高分数组一同进行随机选择。
def select_tuples(sorted_tuples, high_ratio):
total_tuples = len(sorted_tuples)
high_count = int(total_tuples * high_ratio)
high_tuples = sorted_tuples[:high_count]
random_tuple = random.choice(sorted_tuples[high_count:])
selected_tuple = random.choice(high_tuples + [random_tuple])
return selected_tuple
应用六种变异算子,生成53个种子,详细可见product/seed