CharsetDetector / UTF-unknown

Character set detector built in C# - .NET 5+, .NET Core 2+, .NET Standard 1+ & .NET 4+

Not working on .NET Core 3.0

pcfulife opened this issue · comments

Hello, I am handling many CJK files. On .NET Framework 4.7.2 it works well, but after I ported to .NET Core 3.0, CharsetDetector.DetectFromStream(fs).Detected returns null in most cases.

I don't think this is a bug in UTF.Unknown, because every package that uses or was ported from UDE has the same problem. However, UTF.Unknown seems to be the only one still under active development, so I am posting it here.
I have already installed System.Text.Encoding.Codepages (v4.6.0-preview.19073.11), and I am using .NET Core 3.0.100-preview4-010381.

Thanks.

Thanks for the info.

I also think this is probably not an issue in the library itself, as it is written against .NET Standard and is not specific to .NET Core.

Hello!

I think the library can support .NET Core. For that, we need to install the System.Text.Encoding.Codepages package (>= .NETStandard 1.3, >= .NETFramework 4.6, >= .NETCoreApp 2.0) and register its provider wherever it is needed, for example:

#if NETSTANDARD1_3
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
#endif
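To illustrate what the registration call changes, here is a minimal sketch that uses only the base class library. The class and method names (EncodingSupport, EnsureCodePages, TryGetEncoding) are hypothetical and are not part of UTF.Unknown. It assumes that the code-page encodings (for example shift_jis) are only resolvable on .NET Core once CodePagesEncodingProvider has been registered. Whether that is the actual cause of the null result reported above is not confirmed in this thread.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only; not part of UTF.Unknown.
public static class EncodingSupport
{
    // Registers the code-page provider so that legacy encodings can be
    // resolved by name. Registering the same provider again is harmless.
    public static void EnsureCodePages()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Returns the named encoding, or null when the runtime cannot resolve it.
    public static Encoding TryGetEncoding(string name)
    {
        try
        {
            return Encoding.GetEncoding(name);
        }
        catch (ArgumentException)
        {
            // Unknown or unavailable encoding name.
            return null;
        }
        catch (NotSupportedException)
        {
            // Encoding not supported on this platform.
            return null;
        }
    }
}
```

Calling EncodingSupport.TryGetEncoding("shift_jis") before and after EncodingSupport.EnsureCodePages() on the affected runtime would be one quick way to check whether missing code-page encodings are involved.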

If I understand correctly, we would first need to extract UtfUnknown.Core into a separate project, and then create separate projects for the different versions of .NET.

@304NotModified, what do you think about this?

If this is needed, we could also target the project at .NET Core 3. There is no need for a separate project.

The project is already multi-targeted, see:

<TargetFrameworks>netstandard1.0;netstandard1.3;net40</TargetFrameworks>

And we need something like this:

https://github.com/NLog/NLog.Extensions.Logging/blob/b284fb661bd570f1cbc8c6bc796881420ac4fd1e/src/NLog.Extensions.Logging/NLog.Extensions.Logging.csproj#L94-L96
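For readers who do not follow the link, the general pattern is a package reference that applies to only one target framework. The fragment below is a sketch of that pattern and not a copy of the linked lines. The condition on netstandard1.3 and the package version are assumptions chosen for illustration.

```xml
<PropertyGroup>
  <!-- Existing multi-target list, as quoted above -->
  <TargetFrameworks>netstandard1.0;netstandard1.3;net40</TargetFrameworks>
</PropertyGroup>

<!-- Assumed example: reference the code-pages package only for netstandard1.3.
     The version number is illustrative, not taken from the project. -->
<ItemGroup Condition=" '$(TargetFramework)' == 'netstandard1.3' ">
  <PackageReference Include="System.Text.Encoding.CodePages" Version="4.5.1" />
</ItemGroup>
```

With a conditional reference like this, the NETSTANDARD1_3 guard around Encoding.RegisterProvider compiles only for the target that actually has the package.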

@304NotModified ok, thanks!

As I understand it, with .NET Core 3.0 there is no need to add a separate reference to System.Text.Encoding.Codepages: dotnet/corefx#38357