fly2leoo / phpanalysis

A dependency-free Chinese word segmentation class for PHP (Chinese analysis by PHP scripts)

PHPAnalysis: a dependency-free Chinese word segmentation class for PHP

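I. Latest changes

1. The source file layout has been restructured to support Composer.
2. Segmentation and optimization no longer run together in one pass. They are now independent steps: coarse segmentation, deep segmentation, and optimization are fully separate.
3. Class methods now support self-reference, so calls can be chained in the form xx()->xx()->xx().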
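
The chaining pattern and the separation into independent steps can be sketched as follows. This is a minimal illustration, not the PhpAnalysis implementation: the class `ChainedPipeline`, its method names, and its toy splitting rules (whitespace, then hyphens, then removal of consecutive duplicates) are all hypothetical and were chosen only to show how returning `$this` makes each step callable on its own.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical illustration only: this is NOT the PhpAnalysis class.
 * Each step is an independent method, and every non-final method
 * returns $this so that calls can be chained.
 */
final class ChainedPipeline
{
    private string $source = '';

    /** @var string[] */
    private array $tokens = [];

    public function setSource(string $source): self
    {
        $this->source = $source;
        $this->tokens = [];
        return $this; // returning $this is what enables a()->b()->c()
    }

    // Step 1 (coarse pass): split the source on whitespace.
    public function coarse(): self
    {
        $parts = preg_split('/\s+/u', trim($this->source), -1, PREG_SPLIT_NO_EMPTY);
        $this->tokens = $parts === false ? [] : $parts;
        return $this;
    }

    // Step 2 (finer pass): split each coarse token on hyphens.
    public function deep(): self
    {
        $out = [];
        foreach ($this->tokens as $token) {
            foreach (explode('-', $token) as $piece) {
                if ($piece !== '') {
                    $out[] = $piece;
                }
            }
        }
        $this->tokens = $out;
        return $this;
    }

    // Step 3 (optimization): drop consecutive duplicates, then join.
    public function optimize(string $delimiter = ' '): string
    {
        $out = [];
        foreach ($this->tokens as $token) {
            if ($out === [] || end($out) !== $token) {
                $out[] = $token;
            }
        }
        return implode($delimiter, $out);
    }
}

$result = (new ChainedPipeline())
    ->setSource('word-segmentation is  is fun')
    ->coarse()
    ->deep()
    ->optimize('/');

echo $result, PHP_EOL; // prints "word/segmentation/is/fun"
```

Because the steps do not depend on being run together, one can be left out of the chain: in this sketch, calling `->coarse()->optimize('/')` without `->deep()` on the same input yields `word-segmentation/is/fun`.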

II. A basic segmentation example

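use Tutu\PhpAnalysis;
header('content-type:text/html;charset=utf-8');
// Run the three independent steps explicitly, one chained call per step.
$result_str = PhpAnalysis::Instance()
              ->SetSource("composer的出现真是让人们眼前一亮,web开发从此变成了一件很『好玩』的事情。")
              ->Delimiter(' ')
              ->ExecSimpleAnalysis()   // coarse segmentation
              ->ExecDeepAnalysis()     // deep segmentation
              ->Optimize( true );      // optimization
echo $result_str;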

With the default parameters, the example above can be simplified to:
$result_str = PhpAnalysis::Instance()
              ->SetSource("composer的出现真是让人们眼前一亮,web开发从此变成了一件很『好玩』的事情。")
              ->Exec();

III. Common settings and methods

  • Instance( $force_init = false )

  • SetOptions($unit_special_word=true, $unit_single_word=false, $max_split=false, $high_freq_priority=false, $optimize=true)

  • SetSource($source, $source_encoding = 'utf-8', $target_encoding='utf-8')

  • Delimiter( $str )

  • Exec( $return = true )

  • LoadDict( $main_dic_file = '' )

  • AssistBuildDict( $source_file, $target_file='' )

  • AssistExportDict( $target_file, $dicfile = '' )

  • AssistGetCompare()

  • AssistGetDeep()

  • AssistGetSimple( $string=true )

  • GetNewWords( $is_array=false )

  • GetResult()

  • GetResultProperty()

  • GetTags( $num = 10, $with_rank = false )

License: Apache License 2.0

Languages: TSQL 93.5%, PHP 6.4%, JavaScript 0.1%