UltraStar-Deluxe / Play

Free and open source singing game with song editor for desktop, mobile, and smart TV

Home Page:https://ultrastar-play.com

Problems with File scan on Swedish language letters like "ÅÄÖ"

jimmyhawkin opened this issue · comments

Issue type: Bug report

Actual behaviour

When scanning the song library, it says the files can't be found even though they are there.

Expected behaviour

The files are there and the config is correct, so they should be found. It looks like there is currently no support for file names that contain ÅÄÖ.

Steps to reproduce

1. Add a directory containing file names with ÅÄÖ.
2. The feature you added that reports problems will trigger and say it can't find the music file. I could see that what was missing was the åäö, which it did not understand.

Details

Provide some additional information:
The files are saved as ANSI. I have not had this problem since the early versions of the old UltraStar (the non-Play editions).
The Vocaluxe app also scans the files with no problems.

  • UltraStar Play version: 0.8.2
  • Windows 11 + 21H2
  • if applicable, add a log file

Make sure the txt file is saved with UTF-8 encoding.

I assume that the file currently uses something else, such that non-ASCII characters are messed up.
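To illustrate why the characters get messed up, here is a minimal Python sketch. It assumes the "ANSI" files are Windows-1252 (`cp1252`), which is a guess based on the Swedish letters and not something confirmed in this thread. The helper name `resave_as_utf8` is hypothetical and not part of UltraStar Play.

```python
from pathlib import Path

# The same three letters produce different bytes in each encoding.
ansi_bytes = "ÅÄÖ".encode("cp1252")  # one byte per letter
utf8_bytes = "ÅÄÖ".encode("utf-8")   # two bytes per letter

def resave_as_utf8(path: Path, legacy_encoding: str = "cp1252") -> bool:
    """Rewrite `path` as UTF-8 if its bytes are not valid UTF-8 already.

    Returns True if the file was converted, False if it was left untouched.
    `legacy_encoding` is an assumption about how the file was saved.
    """
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")  # strict: raises on invalid byte sequences
        return False
    except UnicodeDecodeError:
        # Raises again if a byte is undefined in the assumed legacy encoding.
        text = raw.decode(legacy_encoding)
        path.write_bytes(text.encode("utf-8"))
        return True
```

Calling `resave_as_utf8(Path("song.txt"))` on a copy of a song file would convert it in place. Keep a backup first, because a wrong guess about the legacy encoding silently produces wrong characters.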

So there is no planned support for this, even though other applications like yours that use these files fully support this format?
It's just ASCII files created with the UltraStar Creator app. Nothing funny has been done to them. And as I said, all the others, even former UltraStar editions, worked with this type.

Actually, there is still a method in UltraStar Play, TxtReader.GuessFileEncoding, which may just not be very good.

I would not mind to throw in some other algorithm as long as every properly encoded Unicode file will still be recognized as such.

There are plenty of attempts on StackOverflow:

But guessing the encoding will always have edge cases that are not guessed correctly. And it can increase load time considerably.
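One simple approach that keeps every properly encoded Unicode file recognized is to check for a byte order mark, then try strict UTF-8 decoding, and only then fall back to a legacy encoding. The sketch below is written in Python for brevity. It is not the code of TxtReader.GuessFileEncoding, and the `cp1252` fallback is an assumption.

```python
import codecs

# Byte order marks and the Python codec that handles each of them.
# UTF-32 is deliberately ignored in this sketch.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)

def guess_encoding(raw: bytes, legacy_encoding: str = "cp1252") -> str:
    """Guess the encoding of `raw` without any third-party detector."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    try:
        raw.decode("utf-8")  # strict decoding rejects invalid sequences
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume the legacy single-byte encoding.
        return legacy_encoding
```

The edge cases mentioned above remain: a legacy file whose bytes happen to form valid UTF-8 is reported as UTF-8, and the fallback cannot tell one single-byte code page from another.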

It's just ASCII files created with the UltraStar Creator app

That's a pity really. IMO, every app should use Unicode by default.

commented

So there is no planned support for this, even though other applications like yours that use these files fully support this format? It's just ASCII files created with the UltraStar Creator app. Nothing funny has been done to them. And as I said, all the others, even former UltraStar editions, worked with this type.

Hi, I have been the maintainer of UltraStar Deluxe for the past 7 years or so. The decision to slowly drop support for files that are not UTF-8 was partially my decision, and was made after a bunch of discussions with developers of other tools like Performous, previous UltraStar Deluxe developers, as well as the developers of UltraStar Manager and UltraStar Creator. After that, all of these tools were changed to prefer (and by default use) UTF-8. This will get rid of all those annoying encoding issues that people regularly ran into, and it is worth it for the whole community to do this one-time "cleanup effort". When you edit a file in current versions of UltraStar Deluxe, it will always be saved with UTF-8 encoding. If you use a current version of UltraStar Manager, it will automatically change all your UltraStar TXT files to UTF-8. If you use the current version of UltraStar Creator, it should (afaik) also save the file with UTF-8 encoding.

Just for the heck of it, I started UltraStar Creator now and tried a new file. And yes, you're correct, it does use UTF-8 by default. Hmm, I wonder where down the road it gets converted to ASCII. I'll see if it's Yass, which I use later, that does this, or if I changed it on some files for some dumb reason. Interesting, at least. But it would be nice if the application could see that a file contains åäö and then revert to using UTF-8 on the files or something. I mean, Vocaluxe has to do something like that at least.

Yes, it was Yass that did it. Well, then that clears up what the culprit was. I had to go into its preferences and change it so that it always saves in UTF. So thanks for the quick replies, so I could fix this and it does not continue :)

commented

Yass 2.1.1 or newer should by default use UTF-8, as far as I know. May I ask what version you have? (Maybe that still needs a change which we did not yet keep in mind)

I have Yass 2.3. I still had to go into Extra > Preferences > File types and select "always store as UTF".

I just integrated a Unity package for UTF-Unknown, which is based on the Mozilla Universal Charset Detector.
The project is under the Mozilla Public License, which should be compatible with the MIT license.

Still, UTF-8 will be the default fallback encoding if charset detection did not work with high enough confidence.

See b37fd6d
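The fallback rule described above can be sketched as follows. This is an illustration in Python, not the code from the commit: the `Detection` type, the `pick_encoding` name, and the 0.5 threshold are all hypothetical, and the real threshold used in UltraStar Play is not stated in this thread.

```python
from typing import NamedTuple, Optional

class Detection(NamedTuple):
    """Result of some charset detector (hypothetical shape)."""
    encoding: Optional[str]  # None if the detector found nothing
    confidence: float        # 0.0 (no idea) to 1.0 (certain)

def pick_encoding(detection: Detection,
                  min_confidence: float = 0.5,
                  fallback: str = "utf-8") -> str:
    """Use the detected encoding only if the detector is confident enough."""
    if detection.encoding is None or detection.confidence < min_confidence:
        return fallback
    return detection.encoding
```

With this rule, a low-confidence guess such as `Detection("windows-1252", 0.2)` is ignored and the file is read as UTF-8.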

commented

@achimmihca might be worth it to do a performance test comparison on a low-end device and with a few thousand song txt files.

performance test comparison

In my tests, the Universal Charset Detector is 25-50% slower. This is already significant.
I just tested it with 1000 files on my laptop; the duration was between 1000 and 1500 ms.

Thus, I added an option to disable Universal Charset Detector such that the previous approach is used.

  • Question: Should the option be enabled or disabled by default?
    • I'd say enabled because otherwise users will continue to complain that their files do not work. People with lots of songs may disable it if parsing songs takes too long.

BTW: UltraStar Play always saves files as UTF-8, no matter how they have been loaded.
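For anyone who wants to repeat a timing comparison like the one above, a minimal harness could look like this. It is a Python sketch with synthetic in-memory data. The two guessers are stand-ins, not the detectors used in UltraStar Play, so the numbers will not match the figures quoted above.

```python
import time
from typing import Callable, Iterable

def time_guesser(guess: Callable[[bytes], str], files: Iterable[bytes]) -> float:
    """Return the elapsed milliseconds for running `guess` over all files."""
    start = time.perf_counter()
    for raw in files:
        guess(raw)
    return (time.perf_counter() - start) * 1000.0

def always_utf8(raw: bytes) -> str:
    # Stand-in for "no detection at all".
    return "utf-8"

def strict_utf8_or_cp1252(raw: bytes) -> str:
    # Stand-in for a detector that has to inspect the bytes.
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "cp1252"

# 1000 synthetic "song files" saved in a legacy encoding (assumed cp1252).
sample = [("#TITLE:Sång %d\n" % i).encode("cp1252") * 200 for i in range(1000)]

baseline_ms = time_guesser(always_utf8, sample)
candidate_ms = time_guesser(strict_utf8_or_cp1252, sample)
print(f"baseline: {baseline_ms:.1f} ms, candidate: {candidate_ms:.1f} ms")
```

On a real library, the file list would come from disk, and the run should be repeated a few times on the low-end target device, since disk caching can dominate the first pass.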