XuJin's repositories

AdvancedDataStructures

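C++ source code written while studying data structures at university. It includes C++ implementations of the AVL tree, Treap, merging of multiple sorted linked lists, binary search tree, binomial heap, red-black tree, splay tree, skip list, stacks and queues simulating each other (with minimum/maximum queries), and the persistent segment tree (主席树). Corrections and contributions are welcome.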

ChineseTextClassifier

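Chinese text classification for both files and raw text, based on a multinomial naive Bayes classifier. Because the real application at work is binary classification, and because building a map of word vectors for every class could cause memory problems, only binary classification is supported for now. Extending the same structure to multi-class classification would be straightforward. The main reason for writing it by hand is that I had not studied the mahout or weka code closely enough to use them flexibly for Chinese word segmentation, stop-word filtering, term-frequency counting, TF-IDF, and so on; in other words, their vectorization and feature extraction are less flexible than a hand-written version.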

Language: Java | Stargazers: 23 | Issues: 0 | Issues: 0
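
The multinomial naive Bayes approach described for ChineseTextClassifier can be illustrated with a minimal sketch. It is not the repository's Java code: the class name `BinaryMultinomialNB`, the Laplace smoothing constant `alpha`, and the assumption that documents arrive already segmented into tokens are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


class BinaryMultinomialNB:
    """Two-class multinomial naive Bayes over pre-tokenized documents (sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed, not from the repo)
        self.word_counts = {0: Counter(), 1: Counter()}  # per-class term frequencies
        self.doc_counts = {0: 0, 1: 0}                   # per-class document counts
        self.vocab = set()

    def fit(self, docs, labels):
        # Accumulate term frequencies and document counts for each class.
        for tokens, label in zip(docs, labels):
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def _log_score(self, tokens, label):
        # log P(class) + sum of log P(token | class), with Laplace smoothing.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocab)
        for token in tokens:
            if token in self.vocab:  # tokens never seen in training are ignored
                score += math.log((self.word_counts[label][token] + self.alpha) / denom)
        return score

    def predict(self, tokens):
        # Return the class (0 or 1) with the higher log score.
        return max((0, 1), key=lambda c: self._log_score(tokens, c))
```

Keeping one `Counter` per class is what makes the memory cost grow with the number of classes, which matches the concern the description gives for staying with two classes. Both classes need at least one training document, otherwise the prior term is undefined.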

The-Research-And-Implementation-Of-Data-Mining-For-Geological-Data

Data mining and knowledge discovery, the extraction of knowledge from large amounts of data, has broad application prospects. When applied to geological data, however, even relatively mature existing models fall short in performance and effectiveness. The main reason lies in the inherent characteristics of geological data: it is high-dimensional, unstructured, and strongly correlated, so its data models, index structures, knowledge representation, storage, and mining are far more complicated than those of traditional data. Geological data usually comes in raster and vector form, among others; this paper focuses on raster data. Tobler's first law of geography states that everything is related to everything else, but near things are more related than distant things. Building on this spatial correlation, and guided by spatial co-location pattern mining algorithms, this paper builds an R-tree spatial index and uses raster scanning to materialize the adjacency relationships between spatial objects as transactions. Spatial co-location pattern mining is thereby converted into traditional association rule mining, so that common association rule algorithms can be applied to a class of geological data to find rules of interest.
The experimental geological data was generated with a simulation program. The experiments show that R-tree indexing significantly speeds up generation of the spatial transaction set. The classic Apriori and FP-growth algorithms were also compared: FP-growth is much faster than Apriori, mainly because Apriori generates a large number of candidate itemsets. The main work of this paper is as follows:
(1) To speed up neighborhood search, an R-tree spatial index is built, after summarizing the common application scenarios of spatial indexing techniques and their advantages and disadvantages.
(2) Based on an analysis of traditional and spatial association rule mining algorithms, the co-location pattern mining algorithm of the event-centric model is described, and a co-location rule mining algorithm based on raster scanning is proposed. The algorithm takes each grid cell as a center and scans the cells in its R-neighborhood to form a transaction set, which converts the problem into one that traditional data mining algorithms can handle.
(3) During R-tree insertion, a split of a leaf node can propagate recursively upward and break the one-way (top-down) traversal. To prevent this, the search for the insertion position records whether each node on the path is full (holds M entries, where M is the maximum number of entries per node) and splits full nodes first. This avoids layer-by-layer recursive splitting afterward and improves R-tree insertion efficiency.
(4) On the preprocessed spatial transaction set, the two classic association rule mining algorithms, Apriori and FP-growth, are implemented and their performance is compared.

Language: C++ | Stargazers: 8 | Issues: 2 | Issues: 0
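
The raster-scanning idea in the abstract above, turning the neighborhood of each grid cell into a transaction so that ordinary association rule mining can be applied, can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's C++ code: the function names are hypothetical, the neighborhood is assumed to be a square window of `radius` cells, the scan is brute force rather than R-tree accelerated, and the pair counting is a simple stand-in for Apriori or FP-growth.

```python
from collections import Counter
from itertools import combinations


def raster_transactions(grid, radius=1):
    """Scan every cell and collect the feature labels found in its neighborhood.

    `grid` is a list of rows; each cell holds a feature label or None.
    The square window of `radius` cells is an assumed neighborhood shape.
    """
    rows, cols = len(grid), len(grid[0])
    transactions = []
    for r in range(rows):
        for c in range(cols):
            items = set()
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    if grid[rr][cc] is not None:
                        items.add(grid[rr][cc])
            if items:  # skip neighborhoods that contain no features
                transactions.append(frozenset(items))
    return transactions


def frequent_pairs(transactions, min_support):
    """Return the support of each label pair that meets min_support.

    A brute-force count over 2-itemsets only, standing in for a full
    frequent-itemset algorithm.
    """
    counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}
```

For example, `frequent_pairs(raster_transactions(grid), 0.5)` would report the label pairs that appear together in at least half of the non-empty neighborhoods. On large rasters the nested window scan is the part a spatial index would be expected to speed up.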

zookeeper

A hands-on introduction to ZooKeeper. The theory is kept simple and practical rather than overly general.

Language: Java | Stargazers: 6 | Issues: 3 | Issues: 0

five-in-a-row-game

A five-in-a-row game written in Java during my junior year, while studying Java.

Language: Java | Stargazers: 5 | Issues: 3 | Issues: 0

crawler

A crawler for Douban Movie, written in Python.

Language: Python | Stargazers: 3 | Issues: 2 | Issues: 0

phppack

A Java implementation of PHP's pack function.

Language: Java | Stargazers: 2 | Issues: 0 | Issues: 0

WebServiceServerDemo

WebService Server Demo

Language: Java | Stargazers: 2 | Issues: 0 | Issues: 0
Language: Java | Stargazers: 1 | Issues: 0 | Issues: 0

elasticsearch

While reading the book 《深入理解elasticsearch》, I set up a local ES server and wrote a fairly basic demo.

Language: Java | Stargazers: 1 | Issues: 2 | Issues: 0

netty

While reading the Netty code walkthroughs on 并发编程网 (the Concurrent Programming Network site), I wrote a fairly basic demo.

Language: Java | Stargazers: 1 | Issues: 2 | Issues: 0

programset

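A record of code from my university years, when I first got into programming: some ACM problem solutions, course projects, algorithm practice, and code written while learning new languages. Looking back now, I find it fairly crude myself. Corrections are welcome!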

WebServiceClientDemo

WebService Client Demo

Language: Java | Stargazers: 1 | Issues: 2 | Issues: 0

forum

A Django forum cloned from F2E.im, with support for SAE.

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0