sheredom / utf8.h

📚 single header utf8 string functions for C and C++

Character iterating?

codecat opened this issue · comments

What do you suggest is the best way of iterating over codepoints using this library?

Heyo @angelog!

So at present I've just done it manually when I've needed to - but I agree that is not the ideal approach for everyone.

I could foresee adding a function (something like):

void* some_utf8_str = ...;
long codepoint;
some_utf8_str = utf8codepoint(some_utf8_str, &codepoint);

And you could then iterate until codepoint was the null terminator ('\0'). Would that be of use to you?

That would be a helpful addition to the library, yeah.

I'm curious, how exactly were you doing it manually?

Basically the run length of a utf8 codepoint is encoded by the pattern of the leading bits of its first byte (each continuation byte is marked by its leading bits as well). I was creating a long codepoint by concatenating the payload bits of the bytes together.

I think having a function to do this makes a lot of sense though, I'll work on it!
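
For readers wondering what the manual approach described above might look like, here is a minimal sketch in plain C. It is not the code from utf8.h or from the pull request: `decode_codepoint` and `collect_codepoints` are hypothetical names made up for this illustration, and the calling convention (return a pointer to the next codepoint, write the decoded value through an out-parameter) only mirrors the shape proposed earlier in the thread. The sketch assumes a NUL-terminated string, substitutes U+FFFD for malformed bytes, and does not reject overlong encodings or surrogates.

```c
#include <stddef.h>

/* Hypothetical helper (not part of utf8.h): decode one codepoint starting
 * at s, store it in *out, and return a pointer to the next codepoint.
 * Malformed input yields U+FFFD and consumes the bytes inspected so far. */
const unsigned char *decode_codepoint(const unsigned char *s, long *out)
{
    int extra; /* number of continuation bytes the lead byte announces */
    long cp;
    int i;

    if (s[0] < 0x80)                { cp = s[0];        extra = 0; } /* 0xxxxxxx */
    else if ((s[0] & 0xE0) == 0xC0) { cp = s[0] & 0x1F; extra = 1; } /* 110xxxxx */
    else if ((s[0] & 0xF0) == 0xE0) { cp = s[0] & 0x0F; extra = 2; } /* 1110xxxx */
    else if ((s[0] & 0xF8) == 0xF0) { cp = s[0] & 0x07; extra = 3; } /* 11110xxx */
    else { *out = 0xFFFD; return s + 1; } /* stray continuation or invalid lead */

    for (i = 1; i <= extra; i++) {
        /* Every continuation byte must look like 10xxxxxx. Checking before
         * reading further also stops us at the NUL terminator. */
        if ((s[i] & 0xC0) != 0x80) { *out = 0xFFFD; return s + i; }
        cp = (cp << 6) | (s[i] & 0x3F); /* append the 6 payload bits */
    }

    *out = cp;
    return s + extra + 1;
}

/* Hypothetical helper: walk a NUL-terminated UTF-8 string and store up to
 * max codepoints in out. Returns the number of codepoints stored. */
size_t collect_codepoints(const char *str, long *out, size_t max)
{
    const unsigned char *p = (const unsigned char *)str;
    size_t n = 0;
    long cp;

    for (;;) {
        p = decode_codepoint(p, &cp);
        if (cp == 0 || n == max) break; /* stop at '\0' or when out is full */
        out[n++] = cp;
    }
    return n;
}
```

Calling `collect_codepoints` on a string and then looping over the returned count gives one `long` per codepoint, which is the same iterate-until-`'\0'` pattern suggested above, just with the decoding written out by hand.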

Ah, yeah it doesn't sound too practical to do it manually. Thanks! 👍

Hey @angelog can you check out pull request #21 for me please? I've included an example of how to use it in the pull request too 😄

I've merged #21, solving this issue.

I will play around with it later tonight. Thank you! 👍

Sorry I didn't reply to this earlier, I was pretty busy. Tried it last night, works wonderfully! Thank you :)