There are 0 repositories under the tokenizer-parser topic.
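Data Annotation is a tool dedicated to processing and annotating text data. With a streamlined, fast annotation workflow and dynamic algorithmic feedback, it lets users quickly annotate keywords and uses algorithms to continuously reduce the cost and time of manual annotation. The process starts with manual annotation to build a baseline, then automatic annotation feeds back into manual annotation, and finally manual annotation corrects any deviations, which greatly improves annotation accuracy and efficiency. Data Annotation relies on an open-source digital foundation platform for personnel and role management.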
A toolkit that makes it easier to write recursive-descent parsers in Zig.
C language lexer & parser & virtual interpreter from scratch
:wrench: My studies on context-free grammar, using ANTLR4 (C++) to generate the parser files. Some basics are developed, such as token processing, recursion, variable definition, array processing, Abstract Syntax Tree (AST) manipulation, UNICODE support, and error handling.
Simple to use parser capable of parsing a usable time object from human input
This is a short and modern JIT compiler that transforms source text into LLVM IR, which is compiled to machine code and executed at runtime. This project was developed at the hths.hacks() hackathon against 250+ international participants and was placed as a winner. Among the winners, my project was the only one developed solo.
:wrench: My studies involving context-free grammar analysis. The analyzers were built using familiar tools such as YACC, Lex and Bison. Topics covered include token filtering, simple variable manipulation, and arrays.
A basic experiment in parsers, compilers, and stack VMs. A basic stack-based CPU with an assembly language and basic commands. A basic programming language is parsed to tokens, which are parsed to expressions, which are compiled to assembly code to be executed on the virtual CPU. Also to be used to parse English grammar into abstract syntax trees.
Machine Learning approach to Bengali Corpus POS Tagging using BNLTK. This is an experimenting project under the mentorship of Prof. Sandipan Ganguly, HIT-K.
A lightweight, simple and fast parser for OData V4 query options supporting standard query parameters. Provides helper functions to apply OData V4 query options to ORM/ODM queries such as SQLAlchemy and Beanie.
Write use-case specific parsers within minutes!
Oxide is a hybrid database and streaming messaging system (think Kafka + MySQL), supporting data access via REST and SQL.
A README for my private CS 2112 Critter World Project
A tiny and complete tool to supercharge static JSON strings with dynamic, user-defined expressions.
A very fast, low-memory C++ automaton tokenizer that breaks an input string into a list of tokens by looking at tabs, spaces, and new lines, and detects special tokens like numbers, prices, personal names, emails, lexemes, etc. It allows specifying delimiters and detecting special cases.
Tokeniser for math in C#
This repository contains a minimal parser
An interpreter for a custom-made, Pascal-Like Programming Language
:wrench: Demonstration of using ANTLR4 (with runtime for C++) in projects for context-free grammar processing. The ANTLR4 (Java) package is included, and the project is configured to compile on Linux.
An automatic UML generator for Java that *actually works*
Recreating JSON.parse
Mini programming language.
Software produced for the Formal Languages course that receives a LISP expression as input and returns the equivalent postfix expression.
An experimental database implementation written in pure Go.
Parse HTML in Go using Node Parser, Tokenizer, and tools like Goquery and Colly, with practical examples and efficient web scraping techniques
Dynamic array implementation in C with a modular, folder-based structure.