AstraZeneca / KAZU

Fast, world class biomedical NER

Home Page: https://AstraZeneca.github.io/KAZU/

Complete Documentation - 'TBA' pages

jo780-full opened this issue

Hello, the documentation is incomplete. Also, is the code for the demo project available on GitHub?
Thank you

Hi,

Thanks for your interest in kazu!

What documentation in particular would be of help to you that isn't present - is it the pages that are currently marked TBA, or something else?

I'm also not clear what you mean by 'code for the demo project' - have you worked through the quickstart page of the documentation? Are you interested in seeing an example of batch processing a large number of documents? That's something that one of my colleagues intends to write in the 'scaling with Ray' page fairly soon, but I'm not sure if it's something slightly different that you're looking for.

I'm keen to improve the documentation, but it would also be good to make this issue more narrowly scoped. There's always more documentation that could be added, so I'd like to make this a specific task with a known point at which it's done, so that it can be picked up more easily.

Hello, and thanks for the response. Yes, I was referring to the pages marked TBA. The "code for the demo project" I was talking about is the code for the live web demo (Swagger UI) at http://kazu.korea.ac.kr/.

Hi - thanks for clarifying, I'll update the title of the issue.

We intend to cover running the web server in the 'Kazu as a WebService' page of the docs, but haven't gotten around to it yet - I can see if we can prioritise that. Unfortunately we can't open source exactly how we deploy it because it uses company-internal build processes, but we will be able to provide information about how to achieve the same thing.

For the time being, the key thing for the Swagger UI is just running the following (with the $KAZU_MODEL_PACK variable set as in the quickstart docs):

$ python -m kazu.web.server --config-path "${KAZU_MODEL_PACK}/conf/" hydra.run.dir="." +ray.init.num_cpus=1 +ray.init.object_store_memory=1000000000
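
The command above relies on $KAZU_MODEL_PACK pointing at an unpacked model pack with a conf directory inside it. A minimal sketch of a pre-flight check, assuming a POSIX-style shell; check_model_pack is a hypothetical helper name, not something shipped with Kazu:

```shell
# Hypothetical helper (not part of Kazu): verify the model pack location
# before starting the server, so a missing variable fails early and clearly.
check_model_pack() {
    # Fail if the variable is unset or empty.
    if [ -z "${KAZU_MODEL_PACK:-}" ]; then
        echo "KAZU_MODEL_PACK is not set" >&2
        return 1
    fi
    # Fail if the conf directory passed to --config-path does not exist.
    if [ ! -d "${KAZU_MODEL_PACK}/conf" ]; then
        echo "No conf directory under ${KAZU_MODEL_PACK}" >&2
        return 1
    fi
    return 0
}
```

Chaining it in front of the server command with `&&` would then stop the launch if the check fails.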

There's also a way of running it to set up JWT authentication - let me know if you're interested in that and I can provide details. But the above should be enough to get you up and running on your local machine.

Hope that helps!

Our Dockerfile may also be of interest to you. Note that it installs kazu from an internal 'artifactory' instance rather than the public PyPI, so that we can test it internally before releasing publicly. If you used it, you would need to modify this line to remove the --extra-index argument to pip install. You would also want to remove some of the ARG lines that relate to 'internal' stuff.

Sorry for the 'internal' difficulty here, we haven't fully mastered how we interface between what's 'internal' to AstraZeneca, and what's external here. We should be able to make some improvements in future (e.g. here, we could potentially just release public 'release candidate' versions and then use public PyPI).

Thank you so much for this information

Also, could you provide a few more usage examples to give a better understanding, especially of how to get "dbXRefs" and how to use the ontology parser?

Was the demo website taken down?

Hi, it seems to be down at the moment. I will look into this!

Thank you

Is it live yet?

Hi, I was informed that there was a power outage in KU last weekend. The server became active last night, but we are working to ensure its stability. Currently, the server isn't using the GPU, so I am addressing this issue. It should be back online by this afternoon.

Thank you so much

Hi @jo780-full ,

Our server is now online and updated to version v1.0.3. I apologize for any inconvenience caused.

Please note that the current version may produce results that slightly differ from the previous version (0.1.0).

Although the current server runs on a GPU, it seems somewhat unstable and may not deliver the expected speed. I suspect this might be a hardware issue, especially since the server is shared with other web services. I'll be looking into this performance issue, but I'm afraid I can't give a definite timeframe for a resolution at the moment.

I'll also begin documenting the web server and will add a few lines initially (I will share them in this thread). This should make it easier for me or our collaborators to enhance the documentation in the near future.

Once again, thank you so much for your patience and your interest in our work!
Best,
WonJin

Here is a draft of a webserver demo quickstart.

This is the CPU-only version.
We will update it for the GPU-enabled version shortly (I put a TODO on the relevant line).

python3.9 -m venv kazuenv
source kazuenv/bin/activate

# TODO: install torch first if you want GPU-enabled Kazu. However, the server will then take longer to start.

# Quote the requirement so that shells such as zsh do not expand the square brackets.
pip install "kazu[webserver]==1.0.3"
# If there is an error about bson, install pymongo: pip install pymongo
# You may also need to install diskcache: pip install diskcache

mkdir kazu
cd kazu

wget https://github.com/AstraZeneca/KAZU/releases/download/v1.0.3/kazu_model_pack_public-v1.0.3.zip
unzip kazu_model_pack_public-v1.0.3.zip

export KAZU_MODEL_PACK=${PWD}/kazu_model_pack_public-v1.0.3

python -m kazu.web.server --config-path "${KAZU_MODEL_PACK}/conf" hydra.run.dir="${PWD}" +ray.init.num_cpus=1 +ray.init.object_store_memory=1000000000
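
Since the server can take a while to initialise, it may help to poll it from a second terminal before opening the Swagger UI. A rough sketch, assuming curl is installed; wait_for_url is a hypothetical helper, and the URL is left as an argument because this thread does not state which host and port the server listens on:

```shell
# wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper (not part of Kazu): poll URL with curl until it
# answers successfully or the attempts run out.
wait_for_url() {
    url="$1"
    attempts="${2:-30}"   # default: 30 tries
    delay="${3:-2}"       # default: 2 seconds between tries
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # --fail makes curl exit non-zero on HTTP error responses (4xx/5xx).
        if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_for_url "$SERVER_URL" 60 5` would make up to 60 attempts with a five-second pause between them, where SERVER_URL is whatever address your server log reports.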