ThomasHedden / utf8

Programs and functions for processing UTF-8 Unicode encoded text in the C programming language, and tutorials about this

utf8

This repository contains programs and functions for processing UTF-8 encoded text in the C programming language, along with a brief tutorial. The tutorial makes no attempt to be comprehensive and assumes that the reader already knows the basics of UTF-8. If you are new to Unicode, I suggest that you first read an introduction to it elsewhere. If you are already somewhat familiar with UTF-8, I suggest that you read the tutorial in this repository.

Unfortunately, the Unicode Consortium website has so much information that it is overwhelming and hard to use. There are a few books about Unicode, but they tend to be long and to "get lost in the weeds", discouraging the reader before providing the key information needed to really understand Unicode. In fact, only a small amount of knowledge is needed to understand how UTF-8 works.

The only guide that I wholeheartedly recommend is the laminated reference chart entitled "Unicode Guide: The Ultimate Reference Guide to the Universal Character Encoding Standard", ISBN-13: 9781423201809, ISBN-10: 1423201809 (a QuickStudy: Computer guide from BarCharts, Inc.). This is not the kind of guide that students buy the afternoon before the final exam in a vain attempt to learn what they should have learned, but did not, over the course of the semester. Nor is it of the "for dummies" genre: you already have to understand a lot about computers before you can understand it. For example, if you don't understand what bytes, ints, chars, or bits are, you are not ready for it.

The functions are provided as source code that can be compiled to object code and then linked with another program. At a later date I will provide the same code in the form of header files that can be #included. No binaries of any kind are provided.
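To give a sense of how little is needed to understand how UTF-8 works, here is a minimal sketch that counts the code points in a UTF-8 string by inspecting the lead byte of each sequence. The function names utf8_seq_len and utf8_count are hypothetical and are not taken from this repository's source code; the sketch also assumes a NUL-terminated input and does not reject every ill-formed sequence (for example, overlong three-byte forms and encoded surrogates pass through).

```c
#include <stddef.h>

/* Hypothetical helper, not part of this repository.
   Returns the number of bytes in the UTF-8 sequence introduced by lead
   byte b, or 0 if b cannot begin a sequence (a continuation byte, or a
   value that never appears in well-formed UTF-8). */
size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80)              return 1;  /* 0xxxxxxx: ASCII           */
    if (b >= 0xC2 && b <= 0xDF) return 2; /* 110xxxxx: two-byte form   */
    if (b >= 0xE0 && b <= 0xEF) return 3; /* 1110xxxx: three-byte form */
    if (b >= 0xF0 && b <= 0xF4) return 4; /* 11110xxx: four-byte form  */
    return 0;
}

/* Hypothetical helper, not part of this repository.
   Counts the code points in the NUL-terminated string s.
   Returns (size_t)-1 if a lead byte is invalid or a sequence is
   truncated. Overlong forms and surrogates are NOT detected here. */
size_t utf8_count(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t count = 0;

    while (*p != 0) {
        size_t len = utf8_seq_len(*p);
        if (len == 0)
            return (size_t)-1;
        /* Every following byte must look like 10xxxxxx. A NUL byte
           fails this check, so we never read past the terminator. */
        for (size_t i = 1; i < len; i++) {
            if ((p[i] & 0xC0) != 0x80)
                return (size_t)-1;
        }
        p += len;
        count++;
    }
    return count;
}
```

For instance, passing the bytes E2 82 AC (the euro sign) to utf8_count yields one code point even though the string is three bytes long, which is exactly the distinction that strlen cannot make.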

About

License: MIT License


Languages

C 98.9%, Makefile 1.1%