CharsetDetector / UTF-unknown

Character set detector built in C# - .NET 5+, .NET Core 2+, .NET Standard 1+ & .NET 4+

Detect encoding from string?

foxi69 opened this issue

Can I get the proper encoding from text like this? "Español"

Well, the encoding of a string is always UTF-16.

.NET uses UTF-16 to encode the text in a string. A char instance represents a 16-bit code unit.

From https://learn.microsoft.com/en-us/dotnet/standard/base-types/character-encoding-introduction
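
To make the quoted statement concrete, here is a minimal sketch that prints the UTF-16 code units of a string. The class and method names (`Utf16Demo`, `CodeUnitsHex`) are made up for illustration and are not part of UTF-unknown or the .NET API.

```csharp
using System;
using System.Text;

public static class Utf16Demo
{
    // Hypothetical helper: returns each UTF-16 code unit of the string
    // as a 4-digit hex value, separated by spaces.
    public static string CodeUnitsHex(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }
}
```

For the garbled title, "Ã" and "±" are two separate code units (00C3 and 00B1), while the intended "ñ" is the single code unit 00F1. The string itself is valid UTF-16 either way, so there is nothing left in it for a byte-level detector to detect.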

Okay, and can I convert this to the correct character set?
It should be "Español".

Btw, it came from the Xabe.FFmpeg SubtitleStream.Title property.
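
One possible way to repair such a string, sketched under an assumption that nothing in this thread confirms: the title was originally UTF-8, and its bytes were decoded as ISO-8859-1 (Latin-1) somewhere along the way. If that holds, encoding the garbled string back to Latin-1 recovers the original bytes, which can then be decoded as UTF-8. `MojibakeRepair` and `ReinterpretAsUtf8` are hypothetical names used only for this example.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Hypothetical helper. ASSUMPTION: the input is UTF-8 text that was
    // wrongly decoded as ISO-8859-1 (Latin-1).
    public static string ReinterpretAsUtf8(string garbled)
    {
        // Latin-1 maps U+0000..U+00FF one-to-one onto bytes 0x00..0xFF,
        // so this recovers the bytes that were originally misread.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        byte[] originalBytes = latin1.GetBytes(garbled);

        // Decode those bytes with the encoding they were actually written in.
        return Encoding.UTF8.GetString(originalBytes);
    }
}
```

Usage would look like `MojibakeRepair.ReinterpretAsUtf8(subtitleStream.Title)`, where `subtitleStream` stands in for whatever object exposes the title. If the wrong decoding was Windows-1252 rather than Latin-1, characters in the 0x80 to 0x9F range will not round-trip with this sketch; on .NET Core and .NET 5+ that code page needs `Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)` from the System.Text.Encoding.CodePages package before `Encoding.GetEncoding(1252)` can be used instead.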