gatheringhallstudios / MHWorldData

Generate a SQLite file from MHW data

Sharpness data

TanukiSharp opened this issue · comments

Hello guys.

I think I can help with sharpness data. I already wrote a tool in C# that extracts sharpness data from http://mhwg.org and it's very accurate. I actually double checked all the data against the game; there were a few mistakes that I reported to the site, and they fixed them pretty quickly.

The tool is here if you want to have a look at the source code.
https://github.com/TanukiSharp/MHWSharpnessExtractor

If you want me to modify it to generate output data in a different way, please let me know.
There is a lot of work done to output data for https://github.com/LartTyler/MHWDB-API but I can adjust / change for your convenience.

Let me know.

Thank you very much for hitting us up!

I looked through your readme, it's great that you were able to solve the translation problem! We were planning on doing a similar thing but were waiting until we had a proper English <-> Japanese name mapping, which obviously would be quite a bit of work. Would I be correct in assuming that you have the sharpness value for at least sharpness+5? If so we'd love the help!

We can work with any format that pairs the English name with sharpness values. The ideal would be an updated sharpness column in https://github.com/gatheringhallstudios/MHWorldData/blob/master/source_data/weapons/weapon_base.csv but I wouldn't mind working with other result types. JSON should be fine.

@TanukiSharp @CarlosFdez The MHWDB-API also now has sharpness values for every handicraft level, sourced from http://mhwg.org. The scraping process I'm using is based off of the work that @TanukiSharp did back in May, as well as code provided by @TrentWest7190. If you're looking for more than just sharpness at Handicraft +0 and +5, mhw-db.com might be a good source for that data as well.

Ah, if you've worked with it that works as well! Handicraft+5 values are enough to derive values for the other handicraft values. Although I do have a question: I looked at your API, and you only seem to have the handicraft +0 sharpness? Your units also seem to be out of 100 instead of 400 (400 is the in-game limit, which is easier to compute with).

The switch over to match the ingame values was a recent change, and since I didn't want to break backwards compatibility without warning, the true sharpness values actually live in the durability array on weapons. The sharpness field is deprecated, and will be phased out over the next couple of releases. Eventually, once people have had a chance to switch over to the new values, I'll change durability back to sharpness.

For example, if you take a look at the API results for Dazzling Flash 3, you'll see that the durability field contains the following values. The index of the array corresponds to the level of handicraft required to reach those sharpness values.

Edit: Switched from using Nergal Reaver as the example to Dazzling Flash 3, since I realized after I posted that Nergal Reaver does not change sharpness with extra Handicraft levels.

{
  "id": 103,
  "name": "Dazzling Flash 3",
  "durability": [
    {
      "red": 80,
      "orange": 50,
      "yellow": 60,
      "green": 120,
      "blue": 40,
      "white": 0
    },
    {
      "red": 80,
      "orange": 50,
      "yellow": 60,
      "green": 120,
      "blue": 50,
      "white": 0
    },
    {
      "red": 80,
      "orange": 50,
      "yellow": 60,
      "green": 120,
      "blue": 50,
      "white": 10
    },
    {
      "red": 80,
      "orange": 50,
      "yellow": 60,
      "green": 120,
      "blue": 50,
      "white": 20
    },
    {
      "red": 80,
      "orange": 50,
      "yellow": 60,
      "green": 120,
      "blue": 50,
      "white": 30
    },
    {
      "red": 80,
      "orange": 50,
      "yellow": 60,
      "green": 120,
      "blue": 50,
      "white": 40
    }
  ]
}
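To make the indexing concrete, here is a minimal Python sketch that reads a response shaped like the one above. The `sharpness_at` helper and the inline `weapon` dict are illustrative assumptions, not part of the MHWDB-API; only the `durability` field name and its values come from the example.

```python
# Color order used in this thread, from lowest to highest sharpness.
COLORS = ("red", "orange", "yellow", "green", "blue", "white")

# Inline copy of the Dazzling Flash 3 example (hypothetical stand-in for a
# parsed API response). The first four colors never change for this weapon.
fixed = {"red": 80, "orange": 50, "yellow": 60, "green": 120}
weapon = {
    "id": 103,
    "name": "Dazzling Flash 3",
    "durability": [
        {**fixed, "blue": 40, "white": 0},   # Handicraft +0
        {**fixed, "blue": 50, "white": 0},   # Handicraft +1
        {**fixed, "blue": 50, "white": 10},  # Handicraft +2
        {**fixed, "blue": 50, "white": 20},  # Handicraft +3
        {**fixed, "blue": 50, "white": 30},  # Handicraft +4
        {**fixed, "blue": 50, "white": 40},  # Handicraft +5
    ],
}


def sharpness_at(weapon, handicraft):
    """Return the bar as a red-to-white list for a Handicraft level (0-5)."""
    bar = weapon["durability"][handicraft]
    return [bar[color] for color in COLORS]


print(sharpness_at(weapon, 0))  # [80, 50, 60, 120, 40, 0]
print(sharpness_at(weapon, 5))  # [80, 50, 60, 120, 50, 40]
```

The +5 bar sums to 400, which matches the in-game limit mentioned earlier in the thread.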

I definitely wasn't aware! Thank you very much! I'll start digging into it soon for sure.

@CarlosFdez

[...] you were able to solve the translation problem!

Yes, actually that was 95% of the work. Parsing the data was not that hard, though parsing HTML is never fun. It was more a matter of adding tons and tons of checks to avoid (as much as possible) breaking the parser if the styling or DOM of the HTML changes in the future.


Would I be correct in assuming that you have the sharpness value for at least sharpness+5?

Yes, correct, and as @LartTyler mentioned, it should also be in mhw-db now.
If you need, the code can probably be adjusted to help you map your own IDs.


Handicraft+5 values are enough to derive values for the other handicraft values

Unfortunately not. The increment from a Handicraft level to another is indeed constant (10, or 40, depending on your scale), but you cannot deduce when transition from a sharpness rank to another will happen.

Let's see the Buster Sword 1 (GS):

Sharpness data:

Rank  Red  Orange  Yellow  Green  Blue  White
0     100  50      50      0      0     0
1     100  50      60      0      0     0
2     100  50      70      0      0     0
3     100  50      80      0      0     0
4     100  50      80      10     0     0
5     100  50      80      20     0     0

Let's assume you only know rank 2 (100, 50, 70, 0, 0, 0). It is not possible to know whether the next rank will be 100, 50, 70, 10, 0, 0 or 100, 50, 80, 0, 0, 0.

So we know base sharpness is 200 (100 + 50 + 50 + 0 + 0 + 0) and the rank 5 is 250 (100 + 50 + 80 + 20 + 0 + 0), but for rank 3 (230 here) it is not possible to determine if it will fall in the yellow or green rank.
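A small Python sketch of that ambiguity, using the Buster Sword 1 numbers. The `forward_candidates` helper is hypothetical and assumes a step of 10 points per Handicraft level and a bar with at least one non-zero color.

```python
def forward_candidates(bar, step=10):
    """List the bars that could follow `bar` after one more Handicraft level."""
    # Index of the highest color that currently has any sharpness.
    top = max(i for i, value in enumerate(bar) if value > 0)

    # Option 1: the current top color grows by one step.
    grown = list(bar)
    grown[top] += step
    candidates = [grown]

    # Option 2: the next color up (if there is one) starts with one step.
    if top + 1 < len(bar):
        started = list(bar)
        started[top + 1] = step
        candidates.append(started)
    return candidates


rank_2 = [100, 50, 70, 0, 0, 0]
for candidate in forward_candidates(rank_2):
    print(candidate, sum(candidate))
# [100, 50, 80, 0, 0, 0] 230
# [100, 50, 70, 10, 0, 0] 230
```

Both candidates have the same total, so the total alone cannot tell you which one the game uses.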

As for Nergigante weapons, yes, it is their particularity that they quickly max out their sharpness at blue. That's the drawback, but they still have a pretty long blue sharpness bar without needing Handicraft.

@TanukiSharp I do think you'd be able to interpolate between +5 and +0. All you need is the +5 bar and you can work backwards, right?

Yeah, interpolating backwards is what I meant. Forward is impossible for the reasons stated.

I would love the help, I just didn't wish to impose! This repo has a different approach than Lart's. This one is an accumulation of human editable data (as CSVs) that is merged into a final result (the DB). There is currently no scraping code committed to the repo, though utility merge scripts don't sound like a bad idea.

In terms of actual mapping, this one is mapped by the English name of the base object. A weapon_sharpness.csv starting with base_name_en and then sharpness_0, sharpness_1, ... sharpness_5 is actually all that's required. That, or just base_name_en + sharpness_5 and a flag to say if handicraft has an effect.

Funny enough, we chose to solve the inconsistent id problem by not keying with ids at all! The id field is only there to maintain a stable identifier in the app and in the generated SQLite Database.

@jaysondc You are right, sorry I didn't see that ^^ but yes, from the +5 it is possible to deduce all other ranks, even the -1, -2 etc... in case some day Capcom releases a malus (like in previous MH) that decreases the sharpness (though very unlikely to happen).
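Working backwards from +5 could look roughly like the Python sketch below. The `derive_lower_levels` function and its parameters are assumptions made for illustration: it removes 10 points per level from the highest non-empty color, and `levels` stands in for the "does handicraft have an effect" flag discussed above (a weapon whose bar is already full would need `levels=0`).

```python
def derive_lower_levels(plus5, levels=5, step=10):
    """Work backwards from the Handicraft +5 bar.

    Returns a list of red-to-white bars, lowest Handicraft level first,
    ending with the +5 bar that was passed in.
    """
    bars = [list(plus5)]
    for _ in range(levels):
        bar = list(bars[0])
        remaining = step
        # Strip points from the highest color first, moving down as needed.
        for i in range(len(bar) - 1, -1, -1):
            taken = min(bar[i], remaining)
            bar[i] -= taken
            remaining -= taken
            if remaining == 0:
                break
        bars.insert(0, bar)
    return bars


# Buster Sword 1, starting from the rank 5 row of the table above.
for level, bar in enumerate(derive_lower_levels([100, 50, 80, 20, 0, 0])):
    print(level, bar)
# 0 [100, 50, 50, 0, 0, 0]
# 1 [100, 50, 60, 0, 0, 0]
# 2 [100, 50, 70, 0, 0, 0]
# 3 [100, 50, 80, 0, 0, 0]
# 4 [100, 50, 80, 10, 0, 0]
# 5 [100, 50, 80, 20, 0, 0]
```

Continuing the loop past `levels=5` would give the hypothetical -1, -2 ranks in the same way.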

Heads up: I've been a bit busy with something for Generations Ultimate, so I haven't gotten around to this yet. I'll definitely do it once I find some time.

Not sure where we are now, so if you need my help, please let me know what to concretely do :)

Thank you! It'd be really appreciated.

I can work with a CSV table in source_data/weapons/weapon_sharpness.csv with the following fields:
base_name_en,sharpness_1,sharpness_2,sharpness_3,sharpness_4,sharpness_5

base_name_en is the English name of the weapon.
sharpness_X is a comma delimited list of sharpness values from red to white.
Weapons without sharpness (guns) would be omitted.
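For illustration, a file in that shape could be read with Python's standard csv module as sketched below. This assumes each sharpness_X cell is quoted so that its inner commas survive; the sample row reuses the Buster Sword 1 values from earlier in the thread, and `read_sharpness` is a hypothetical helper, not something in the repo.

```python
import csv
import io

# Hypothetical file contents; the quoting of each cell is an assumption.
SAMPLE = (
    "base_name_en,sharpness_1,sharpness_2,sharpness_3,sharpness_4,sharpness_5\n"
    'Buster Sword 1,"100,50,60,0,0,0","100,50,70,0,0,0","100,50,80,0,0,0",'
    '"100,50,80,10,0,0","100,50,80,20,0,0"\n'
)


def read_sharpness(text):
    """Map each weapon name to {column name: red-to-white list of ints}."""
    result = {}
    for row in csv.DictReader(io.StringIO(text)):
        name = row.pop("base_name_en")
        result[name] = {
            column: [int(value) for value in cell.split(",")]
            for column, cell in row.items()
        }
    return result


data = read_sharpness(SAMPLE)
print(data["Buster Sword 1"]["sharpness_5"])  # [100, 50, 80, 20, 0, 0]
```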

It shouldn't be a whole lot (could be done in an hour or two by parsing json data from @LartTyler), but I haven't been able to get around to it.

@CarlosFdez
I sent a PR to add the sharpness data, the sharpness_X is the level 5.

@LartTyler the following weapons have empty durability array:

  • Azure Star "Dragon Dance"
  • Sapphire Star Lance

@TanukiSharp Thanks for letting me know, I'll take a look.

Because I was curious, I was looking at your PR (#15), and it seems like it's missing information. I think what @CarlosFdez was looking for was something more along the lines of:

base_name_en     sharpness_1      sharpness_2      sharpness_3      sharpness_4       sharpness_5
Buster Sword I   100,50,60,0,0,0  100,50,70,0,0,0  100,50,80,0,0,0  100,50,80,10,0,0  100,50,80,20,0,0
Buster Sword II  ...

Sorry, I opened my PR a bit early and committed several fixes afterwards, but now it displays correctly on GitHub: https://github.com/TanukiSharp/MHWorldData/blob/master/source_data/weapons/weapon_sharpness.csv

@LartTyler That's what I thought in the beginning, but he said sharpness rank 5 is enough to deduce the others. Moreover, comma delimited values inside a comma separated values format sounded strange...

I can always fix it, but I need @CarlosFdez to be more specific and accurate.

Also, please tell me if you want me to commit the tool somewhere in your repo or if you only want the data. It's a few lines of C# code.

@TanukiSharp If you take a look at the existing sharpness values in their weapon_base.csv file, that's actually how they currently have it set up: the red to white sharpness values are a comma separated list in a single column. That's why I assumed that they were looking for something along the same lines, but I could be wrong.

@LartTyler Indeed, you are right.

I'm going to wait for input from @CarlosFdez
Also, names do not match, Buster Sword I against Buster Sword 1, are you going to handle the mapping on your side ?

Thank you for your time on this!

LartTyler is correct. I originally did think of deriving, but not deriving is significantly easier and would require less code, and I don't think it'd bloat up the size of the DB much at all (a few kilobytes max?)

In terms of the name inconsistency, that kind of thing is hard to handle at the import layer without hacks. I could try to mass convert them ahead-of-time though. That said, Buster Sword III is the canonical ingame representation. If you work with any text dumps you'd be working with roman numerals, not numbers.
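A mass conversion ahead of time could be sketched in Python along these lines. Both helpers are hypothetical, and the sketch assumes that every trailing number in a scraped name is an upgrade tier; a weapon whose real name ends in a digit would be rewritten incorrectly and would need an exception list.

```python
import re

# Enough numerals for small tier numbers (1 to 39).
_NUMERALS = [(10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]


def to_roman(number):
    """Convert a small positive integer to a Roman numeral."""
    parts = []
    for value, symbol in _NUMERALS:
        while number >= value:
            parts.append(symbol)
            number -= value
    return "".join(parts)


def canonical_name(name):
    """Rewrite a trailing Arabic number as a Roman numeral, if there is one."""
    match = re.fullmatch(r"(.*\S)\s+(\d+)", name)
    if not match or int(match.group(2)) == 0:
        return name  # nothing to convert
    return f"{match.group(1)} {to_roman(int(match.group(2)))}"


print(canonical_name("Buster Sword 1"))       # Buster Sword I
print(canonical_name("Buster Sword 3"))       # Buster Sword III
print(canonical_name("Sapphire Star Lance"))  # Sapphire Star Lance
```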

As far as the tool is concerned, this codebase is mostly hand assembled with external scraping support. I'm not against having a tools section for any support tools, but is it usual for a C# tool to be embedded in a python one? I'd love to link to your tool in the readme! The readme needs fleshing out anyways.

OK I will fix the CSV in your repo, since my PR is not merged yet, I will simply update it.
I will also push the tool under the WTF license to my repo for you to freely reference / fork / rip it, to your heart's content.
I will keep you posted :)

Updated, and I also pushed the conversion tool to my repo, linked in the PR.

There are 7 columns in the output file:

  • base_name_en: name
  • sharpness: base sharpness
  • sharpness_1: sharpness with Handicraft +1
  • sharpness_2: sharpness with Handicraft +2
  • sharpness_3: sharpness with Handicraft +3
  • sharpness_4: sharpness with Handicraft +4
  • sharpness_5: sharpness with Handicraft +5
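As a rough Python illustration of that layout (the actual conversion tool is written in C#, so this is only a sketch, and `to_row` / `write_csv` are hypothetical names), one row can be built from six red-to-white bars like this:

```python
import csv
import io

COLUMNS = ["base_name_en", "sharpness"] + [f"sharpness_{i}" for i in range(1, 6)]


def to_row(name, bars):
    """Build one 7-column row from six bars (Handicraft +0 to +5)."""
    if len(bars) != 6:
        raise ValueError("expected one bar per Handicraft level, 0 through 5")
    return [name] + [",".join(str(value) for value in bar) for bar in bars]


def write_csv(rows):
    """Serialize rows to CSV text; cells containing commas get quoted."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


# Buster Sword I, using the values quoted earlier in the thread.
bars = [
    [100, 50, 50, 0, 0, 0],
    [100, 50, 60, 0, 0, 0],
    [100, 50, 70, 0, 0, 0],
    [100, 50, 80, 0, 0, 0],
    [100, 50, 80, 10, 0, 0],
    [100, 50, 80, 20, 0, 0],
]
print(write_csv([to_row("Buster Sword I", bars)]))
```

The csv module quotes each sharpness cell automatically because it contains commas, which keeps the file readable by any standard CSV parser.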

Hey! Since I've more or less caught up with a separate project, I checked this out a bit more thoroughly, and now I feel bad. I thought that it was a modified version of the scraping tool that got the data in for mhwdb, not a transfer of data from mhwdb. Which is fine, but doing it in Python as a dedicated merge.py would make it easier to solve the name mismatch problem by comparing against existing data (the db requires 100% exact matching names at the import step by design).

You did the original work regarding Sharpness, so would it be fine if I work on the merge stuff and then credit you for the original sharpness work? It wouldn't be an inconvenience, as you and Lart already did the hard parts.

It's not that I'm against C# (C# is one of my favorites!), it's just that I already wrote a large amount of support tooling for working with the data in Python.

Funny enough, after starting work on it myself I realized just how messed up the scraped weapons were! I'll need to be doing some correcting on my end anyways.

I managed to pull the data in a format my data can agree with, though I've yet to merge it into the database. @TanukiSharp thank you very much for your work on sharpness data, and I'm sorry for wasting your time with my indecisiveness. Without your work it would have taken much much longer to get sharpness values.

I went with sharpness+5 and will calculate downwards, since it's inherently more secure.