sonny's repositories
awesome-cto
A curated and opinionated list of resources for Chief Technology Officers, with an emphasis on startups
betterdev.link
Links to improve programming skills
bitcoin-notes
Personal notes about Bitcoin and Bitcoin Core
bucket4j
Java rate limiting library based on the token/leaky-bucket algorithm.
ccia_code_samples
Code samples for C++ Concurrency in Action
Clean-Code---Tieng-Viet
Clean Code in Vietnamese: a translation of the first 6 chapters of the book "Clean Code - A Handbook of Agile Software Craftsmanship" by Robert C. Martin et al.
d2l-vi
A book on deep learning that covers many popular frameworks, used at more than 300 universities in 55 countries, including MIT, Stanford, Harvard, and Cambridge.
DPED
Software and pre-trained models for automatic photo quality enhancement using Deep Convolutional Networks
easy-rules
The simple, stupid rules engine for Java
fileshelter
FileShelter is a “one-click” file sharing web application
guava-cache-redis
Implementation of the Guava Cache interface backed by Redis
Low-Level-Design
Useful Resources for Low Level System Design
LuceneTutorial
A simple tutorial of Lucene for LIS 501 Introduction to Text Mining students at the University of Wisconsin-Madison (Fall 2021).
news-crawler
News crawlers for some sites.
nlp-cheat-sheet-python
NLP Cheat Sheet, Python, spaCy, LexNLP, NLTK, tokenization, stemming, sentence detection, named entity recognition
numpy_exercises
NumPy exercises.
open-source-search-engine
Nov 20 2017 -- A distributed open source search engine and spider/crawler written in C/C++ for Linux on Intel/AMD. From gigablast dot com, which has binaries for download. See the project's README.md file for instructions.
pandas_exercises
Practice your pandas skills!
python-cheatsheet
Comprehensive Python Cheatsheet
system-design
Learn how to design systems at scale and prepare for system design interviews
text2vec
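text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.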
tiktok-downloader
Telegram bot for downloading videos from TikTok without a watermark.
tiktok-downloader-bot
A Telegram bot to download videos or images from TikTok without a watermark
tiktok-nowatermark
TikTok video downloader without a watermark
tiktok-scraper
TikTok scraper (mainly for hashtags)
timeshift
System restore tool for Linux. Creates filesystem snapshots using rsync+hardlinks, or BTRFS snapshots. Supports scheduled snapshots, multiple backup levels, and exclude filters. Snapshots can be restored while the system is running or from a Live CD/USB.
Vietnamese_LLMs
The project includes: 1. Building a Vietnamese instruction dataset (high-quality, large, and diverse). 2. LLM training, fine-tuning, evaluation, and testing on open-source language models: Bloomz, T5, UL2, LLaMA (1 & 2), OpenLLaMA, GPT-J, Pythia, etc. 3. Applications and user interface (UI).
web-scraping
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
WebCollector
WebCollector is an open source web crawler framework based on Java. It provides some simple interfaces for crawling the Web, so you can set up a multi-threaded web crawler in less than 5 minutes.