simple-tools-for-machine-learning(MLTools)

Simple tools for machine learning, including computer vision, deep learning, ...

mltools

Small tools for machine learning, deep learning, and computer vision, among other things.

UI-tools: screenshots of part of the interface. See the documentation for details.


demo1	demo2

demo3	...

server: see the documentation for details.

requirements

numpy

scipy

scikit_image

tqdm

xmltodict

matplotlib

PyYAML

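Older versions of PyYAML (around 5.3) have a critical vulnerability. Because it is installed as a distutils package, use the following command when upgrading: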

pip3 install --ignore-installed PyYAML

Or pin a specific version:

pip3 install --ignore-installed PyYAML==5.4
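
To confirm that the upgrade took effect, the installed version can be compared against 5.4. The following is a minimal sketch: the helper names `parse_version` and `is_patched` are hypothetical and not part of mltools, and the 5.4 threshold is simply taken from the command above.

```python
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Turn '5.4.1' into (5, 4, 1); non-numeric suffixes such as 'b1' are ignored."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_patched(version: str, minimum: tuple = (5, 4)) -> bool:
    """Return True if the version is at least `minimum`."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        installed = metadata.version("PyYAML")
    except metadata.PackageNotFoundError:
        print("PyYAML is not installed")
    else:
        print(installed, "OK" if is_patched(installed) else "please upgrade")
```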

How to use

Tools (utils)
- xml2createML
  
  Convert labelImg format (Pascal VOC) to createML (JSON) format.
- img2xml
  
  A simple tool for generating Pascal VOC annotation files.
- json2mask
  
  Convert a labelme JSON file into a mask file for training (pass an absolute path when using it).
- json2xml
  
  Convert a labelme JSON file to a labelImg XML file.
- split
  
  Split images and annotation files with a large aspect-ratio difference (e.g. pipeline images) into images and annotation files with a 1:1 aspect ratio.
- yolo_train_val_dataset_split
  
  Automatically assign YOLO-format (txt) annotation files to training and validation datasets.
- widerface_convert
  
  Convert the widerface dataset format into an XML format that labelImg can display.
- xml2json
  
  Convert labelImg XML data to labelme JSON data.
- xml2mask
- mlfiles_standardization
  
  Convert .ml files generated by the viewer into standard labelme and labelImg annotation files (a single .ml file produces both formats if it contains both rect and polygon annotations).
- non_standardization
  
  The inverse of the previous method.
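
As an illustration of what a conversion such as xml2createML involves, the sketch below maps one Pascal VOC annotation to a createML-style entry. It is not the mltools implementation: the function name `voc_to_createml` is hypothetical, it uses only the standard library, and it assumes the usual VOC fields (`filename`, `object/name`, `object/bndbox`). Pascal VOC stores corner coordinates, whereas createML stores the box center plus width and height.

```python
import json
import xml.etree.ElementTree as ET


def voc_to_createml(xml_text: str) -> dict:
    """Convert one Pascal VOC annotation (as a string) to a createML-style dict."""
    root = ET.fromstring(xml_text)
    annotations = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        annotations.append({
            "label": obj.findtext("name"),
            "coordinates": {
                "x": (xmin + xmax) / 2,  # box center, not the top-left corner
                "y": (ymin + ymax) / 2,
                "width": xmax - xmin,
                "height": ymax - ymin,
            },
        })
    return {"image": root.findtext("filename"), "annotations": annotations}


SAMPLE = """<annotation>
  <filename>0.png</filename>
  <object>
    <name>eye</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>50</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""

# createML expects a JSON list with one entry per image.
print(json.dumps([voc_to_createml(SAMPLE)], indent=2))
```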

Image augmentation without label files

from mltools.src.augmentation.aug import NoLabelAugmentation
n = NoLabelAugmentation(["your_file_1",...,"your_file_n"], False, augNumber=3)

parameters

"""
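@ imgs : list of images to augment
@ parallel : whether to run in parallel (multiprocessing)
@ savedPath : path where results are saved; optional
@ augNumber : number of augmentations
@ augMethods : augmentation methods to use; the defaults are "noise", "rotation", "trans", "flip", "zoom"
@ optionalMethods : optional augmentation methods; defaults to an empty list; choices include crop, cutmix, cutout, distort, inpaint, mixup, mosaic, resize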
"""

codes

# random augmentation
n.go()
# only flip
n.onlyFlip()
# only noise
n.onlyNoise()
# only rotation
n.onlyRotation()
# translation
n.onlyTranslation()
# zoom
n.onlyZoom()
# crop
n.onlyCrop()
# cutmix
n.append("3.png")
n.onlyCutmix()
# distort
n.onlyDistort()
# inpaint
n.onlyInpaint(reshape=True)
# mosaic
n.onlyMosaic()
# resize
n.onlyResize()
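
Cutmix combines two images, which is presumably why a second image is appended before `onlyCutmix()` is called above. The following sketch shows the general idea on nested lists rather than real image arrays. The function `cutmix` and its `box` parameter are hypothetical and do not reflect how mltools implements the method; a real implementation would typically choose the patch at random.

```python
import copy


def cutmix(base, donor, box):
    """Paste the region `box` = (top, left, height, width) of `donor` into a copy of `base`.

    Both images are lists of rows with the same shape.
    """
    top, left, height, width = box
    mixed = copy.deepcopy(base)
    for r in range(top, top + height):
        for c in range(left, left + width):
            mixed[r][c] = donor[r][c]
    return mixed


# Two 4x4 single-channel "images": one all zeros, one all nines.
image_a = [[0] * 4 for _ in range(4)]
image_b = [[9] * 4 for _ in range(4)]

result = cutmix(image_a, image_b, box=(1, 1, 2, 2))
for row in result:
    print(row)
```

The base image is copied first, so the inputs stay unchanged and can be reused for further augmentations.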

examples

original	random augmentation	flip

noise	rotation	translation

zoom	crop	cutmix

distort	inpaint	mosaic

resize	...

Augmentation for labelImg annotations

from mltools.src.augmentation.aug_labelimg import LabelimgAugmentation
l = LabelimgAugmentation(["0.png"], ["0.xml"])

parameters

""" ...
    labels: List[str], paths to the annotation files; must correspond one-to-one with the images
"""

codes

# flip
l.onlyFlip()
# rotate
l.onlyRotate()
# translation
l.onlyTrans()
# zoom
l.onlyZoom()
# noise
l.onlyNoise()
# mosaic
l.append("3.png", "3.xml")
l.onlyMosaic()
# resize
l.onlyResize()
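
Unlike augmentation without labels, every geometric transform here also has to be applied to the bounding boxes in the XML file. As a sketch of the bookkeeping for a horizontal flip (the function name `flip_box_horizontal` is hypothetical, and this is not the mltools code):

```python
def flip_box_horizontal(box, image_width):
    """Mirror a Pascal VOC box (xmin, ymin, xmax, ymax) about the vertical center line.

    The old right edge becomes the new left edge, so xmin and xmax swap roles.
    Coordinates are treated as continuous; with zero-based pixel indices,
    use image_width - 1 instead of image_width.
    """
    xmin, ymin, xmax, ymax = box
    return (image_width - xmax, ymin, image_width - xmin, ymax)


box = (10, 20, 50, 60)
flipped = flip_box_horizontal(box, image_width=200)
print(flipped)

# Flipping twice returns the original box.
assert flip_box_horizontal(flipped, image_width=200) == box
```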

examples

Annotation type	Result
original image
flip
rotate
translation
zoom
noise
mosaic
resize
...

Augmentation for labelme annotations

from mltools.src.augmentation.aug_labelme import LabelmeAugementation
l = LabelmeAugementation(["3.png"],["3.json"],"3.yaml")

parameters

""" ...
 labels: List[str], paths to the annotation files; must correspond one-to-one with the images
 yamlPath: str, file that stores the label information, for example:
     label_names:
       _background_: 0
       eye: 1
       mouth: 3
       nose: 2
"""

codes

# flip
l.onlyFlip()
# rotate
l.onlyRotate()
# translation
l.onlyTrans()
# zoom
l.onlyZoom()
# noise
l.onlyNoise()
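
For labelme annotations, the polygon points stored in the JSON file have to follow the image. A minimal sketch of rotating points about the image center is shown below. `rotate_points` is a hypothetical helper, not the mltools API, and it ignores the canvas resizing or clipping that a real rotation augmentation may need.

```python
import math


def rotate_points(points, angle_deg, center):
    """Rotate (x, y) points about `center` by `angle_deg`.

    Coordinates follow the image convention (x right, y down), so a positive
    angle appears clockwise on screen.
    """
    cx, cy = center
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_a - dy * sin_a,
                        cy + dx * sin_a + dy * cos_a))
    return rotated


# A triangle in a 100x100 image, rotated 90 degrees about the center.
triangle = [(60, 50), (50, 40), (40, 50)]
for x, y in rotate_points(triangle, 90, center=(50, 50)):
    print(round(x, 3), round(y, 3))
```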

examples

Annotation type	Result
original image
flip
noise
rotate
translation
zoom
...

note:

The test images were generated by a GAN, so there is no copyright infringement.

This repo is a refactor of mask2json. The original code was written with Python 3.6 and relied on relatively old versions of packages such as numpy (newer versions caused many problems, especially numpy; sometimes the tests passed on version 3.8 but failed on version 3.9). It also used both opencv-python and scikit_image for image processing, which was redundant, and the visual interface I had long wanted to build was abandoned halfway. That is why I decided to refactor.

The UI tool may be built with Flutter, following the mobile annotation tool I made with Flutter, although it may not support editing (which is somewhat complex).

skimage is slow when saving PNGs and much faster when saving JPGs, so save as JPG where possible. Refer to this issue.

The inpaint augmentation is slow with skimage (the larger the image, the slower it gets, and it may run out of memory). As I recall, opencv-python was not this slow. Use inpaint sparingly.

Progress

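2022-08-15 Started considering Windows OpenCV + Flutter to do some image operations directly on the front end (opencv.js is incomplete and many methods are not implemented, so the web will probably not be supported for now); here is the repository
2022-07-29 Added a method to convert labelme and labelImg annotation files to .ml annotation files (v0.1.2)
2022-07-25 Added a method to convert files generated by the viewer to labelme and labelImg annotation files (v0.1.1)
2022-07-15 Refactoring complete; preparing to start UI tool development (v0.1.0)
2022-07-14 Added part of the labelme augmentation and some tools
2022-07-13 Added part of the labelImg augmentation
2022-07-12 Added part of the labelImg augmentation
2022-07-11 Largely finished image augmentation without label files; updated the readme
2022-07-04 Finished augmentation without labels (removed the perspective-transform augmentation from the original version)
2022-06-13 Started refactoring; copied and adapted some code (mainly to unify cv2 and skimage)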

guchengxi1994 / simple-tools-for-machine-learning
