PixelDoodle / MassImageRetrieval

This project is intended to solve the task of massive image retrieval.


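Model design guidelines

Revise the sampling scheme: sample from a candidate set that is refreshed every few epochs
- Once (x_a, x_p) has been chosen, how do we determine that the selected x_n is a point that can improve the result?
Refine the case-by-case analysis, redesign the loss function accordingly, and visualize the loss function
Model the vector relationship among x_a, x_p, and x_n, compute the angle between the directions, and redesign the loss function
Use larger batches to add the intra-class error and the inter-class error to the loss function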
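
The direction information among x_a, x_p, and x_n can be made concrete with a small sketch. The helper below, `angle_between` (a hypothetical name, not taken from this repository), computes the angle at the anchor between the anchor-to-positive and anchor-to-negative directions. It assumes embeddings are plain lists of floats and is only one possible way to define the angle.

```python
import math


def angle_between(anchor, positive, negative):
    """Angle in radians between (positive - anchor) and (negative - anchor)."""
    u = [p - a for p, a in zip(positive, anchor)]
    v = [n - a for n, a in zip(negative, anchor)]
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("positive and negative must differ from the anchor")
    cos = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos)))
```

A small angle means the positive and negative lie in roughly the same direction from the anchor; an angle near pi means they lie on opposite sides.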

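Problems and solutions

All training samples are selected at random, and some of the data is hard to select directly, which degrades the performance of the 10-class classifier
Improve the sample construction scheme so that every sample can enter the classifier for training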
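
One simple way to guarantee that every sample is seen is to draw without replacement within an epoch. The sketch below is an assumption about how this could be done, not the repository's actual sampler; `epoch_batches` and its parameters are hypothetical names.

```python
import random


def epoch_batches(num_samples, batch_size, seed=None):
    """Yield batches of sample indices so each index appears exactly once per epoch."""
    rng = random.Random(seed)
    remaining = list(range(num_samples))
    rng.shuffle(remaining)
    # Walk through the shuffled pool; the last batch may be smaller.
    for start in range(0, num_samples, batch_size):
        yield remaining[start:start + batch_size]
```

Calling `list(epoch_batches(10, 4, seed=0))` produces three batches of sizes 4, 4, and 2 that together cover all ten indices.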

Experimental results

TODOLIST

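Extract image features with Res50
Write a Siamese network and test it
Write a triplet-loss network and test it
Redesign how training samples are constructed for the triplet-loss network
- Added anchor selection based on cluster centers, with positive and negative samples selected from outside a given radius
- Added selection based on a direction condition among $(x_a, x_p, x_n)$ in the training samples
- Added a strategy that selects training samples from the candidate set of the query list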
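
The cluster-center anchor selection can be sketched as follows. This is a minimal illustration under stated assumptions: the anchor is the sample closest to its class centroid, the radius is measured from the anchor with Euclidean distance, and `centroid` and `select_triplet_candidates` are hypothetical names. The repository may define the center or the radius differently.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def select_triplet_candidates(embeddings, labels, target_label, radius):
    """Return (anchor index, positive indices, negative indices) for one class."""
    same = [i for i, y in enumerate(labels) if y == target_label]
    other = [i for i, y in enumerate(labels) if y != target_label]
    center = centroid([embeddings[i] for i in same])
    # Assumption: the anchor is the class member nearest to the class centroid.
    anchor = min(same, key=lambda i: math.dist(embeddings[i], center))
    # Keep only samples farther than `radius` from the anchor.
    positives = [i for i in same
                 if i != anchor and math.dist(embeddings[i], embeddings[anchor]) > radius]
    negatives = [i for i in other
                 if math.dist(embeddings[i], embeddings[anchor]) > radius]
    return anchor, positives, negatives
```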
The data fed to TripleModel can be recast as a pairwise ranking problem
Save the model results from each training run to a file for later analysis
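In the result plots, the clusters are not compact enough
- Changes to the data sampling strategy
  - Use a set during sampling so that a sample that has already been drawn cannot be drawn again; the training round ends once no samples remain
  - For each batch, record how many times each sample has been drawn; this yields a distribution, which is converted to a probability p, and the next draw samples according to (1-p)
  - The loss function is max(0, dist loss); during training, record the samples whose loss is 0, since they contribute no gradient to training, and use them to guide sampling
  - After each training round, compute the distance matrix over the full data set, convert it to a probability matrix, and use it to guide the sampler (MCMC)
- Changes to the loss function strategy
  - Pay attention to controlling the distance from x_p to x_a
  - Could the EM algorithm be introduced to fit a Gaussian mixture over the two-dimensional variable?
- A sampled triplet (x_a, x_p, x_n) is ineffective in the following cases (the objective is max(0.0, dist_p - dist_n + margin))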
  - dist_n too large, dist_p too small
  - margin too small
  - the categories of positive and negative samples are not close neighbors
  - the selection of positive and negative samples is not on the same side
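
The objective max(0.0, dist_p - dist_n + margin) and the notion of an ineffective triplet can be illustrated with a short sketch. It assumes Euclidean distance and embeddings given as lists of floats; `triplet_losses` and `inactive_indices` are hypothetical helper names, not functions from this repository.

```python
import math


def triplet_losses(anchors, positives, negatives, margin):
    """Per-triplet hinge loss max(0, dist_p - dist_n + margin)."""
    losses = []
    for a, p, n in zip(anchors, positives, negatives):
        dist_p = math.dist(a, p)  # anchor-to-positive distance
        dist_n = math.dist(a, n)  # anchor-to-negative distance
        losses.append(max(0.0, dist_p - dist_n + margin))
    return losses


def inactive_indices(losses):
    """Indices of triplets whose loss is zero and so contribute no gradient."""
    return [i for i, loss in enumerate(losses) if loss == 0.0]
```

A triplet with dist_p = 1, dist_n = 5, and margin = 2 has loss 0 and is inactive, which matches the "dist_n too large, dist_p too small" case; such indices could be fed back to the sampler so that they are drawn less often.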
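Add a hash loss function
Each training run uses 2000 triplet samples, and the predictions from two adjacent epochs differ greatly; how to keep the clustering result stable from one epoch to the next is an important open question
Train a model with SoftMax Loss + Center Loss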
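
As a rough illustration of how SoftMax Loss and Center Loss can be combined, the sketch below computes a joint loss for one batch in plain Python. It assumes the common formulation in which the center term is half the squared distance from each feature to its class center; averaging over the batch and the weight `lam` are assumptions, and all function names are hypothetical. An actual training setup would compute this inside a deep learning framework and also update the centers.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def center_loss(features, labels, centers):
    """Mean over the batch of 0.5 * squared distance to the class center."""
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return 0.5 * total / len(features)


def joint_loss(logits_batch, features, labels, centers, lam):
    """Mean softmax cross-entropy plus lam times the center loss."""
    ce = sum(softmax_cross_entropy(z, y)
             for z, y in zip(logits_batch, labels)) / len(labels)
    return ce + lam * center_loss(features, labels, centers)
```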

Reference List

Deep Learning of Binary Hash Codes for Fast Image Retrieval
Deep Relative Distance Learning: Tell the Difference Between Similar Vehicles
Deep Supervised Discrete Hashing
Deep Supervised Hashing for Fast Image Retrieval
FaceNet: A Unified Embedding for Face Recognition and Clustering
Fast Training of Triplet-based Deep Binary Embedding Networks
Hard-Aware Deeply Cascaded Embedding
HashNet: Deep Learning to Hash by Continuation
Fast Supervised Hashing with Decision Trees for High-Dimensional Data
Simultaneous Feature Learning and Hash Coding with Deep Neural Networks
Learning to Hash with Binary Reconstructive Embeddings

Apache License 2.0
