jhu-bids / fhir-zulip-nlp-analysis

Ad hoc NLP (Natural Language Processing) analysis of HL7 FHIR's online Zulip chat streams.

HL7 FHIR Zulip chat NLP analysis

Management GoogleSheet

The program reads directly from this Google Sheet, which is used to manage:

  • Category keywords
  • User roles

Setting up and running

  1. Prerequisite: Python 3
  2. Clone this repository.
  3. From the repository directory, install libraries: python3 -m pip install -r requirements.txt
  4. Get .zuliprc: Open chat.fhir.org, then follow the Zulip API key documentation. It walks you through fetching your API key; on the screen that shows the key, there is also a "Download .zuliprc" option. Click it to download the file. If the file is saved under a different filename, rename it to .zuliprc. Place the file in the root directory of the cloned repository.
  5. From the repository directory, run: python3 -m fhir_zulip_nlp
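Before running step 5, it can help to confirm that the .zuliprc from step 4 is in place and readable. The following is a minimal sketch, not part of this repository: it assumes the standard .zuliprc layout (an INI file with an [api] section containing email, key, and site), and the helper name check_zuliprc is hypothetical.

```python
import configparser
from pathlib import Path

# Fields expected in the [api] section of a standard .zuliprc (assumption).
REQUIRED_KEYS = ("email", "key", "site")


def check_zuliprc(path=".zuliprc"):
    """Return a list of problems found with the .zuliprc file (empty if none)."""
    path = Path(path)
    if not path.is_file():
        return [f"{path} not found; place it in the repository root"]
    # Disable interpolation so a '%' in the API key is not misread.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(path)
    except configparser.Error as err:
        return [f"{path} is not valid INI: {err}"]
    if not parser.has_section("api"):
        return ["missing [api] section"]
    return [
        f"missing or empty '{key}' in [api]"
        for key in REQUIRED_KEYS
        if not parser.get("api", key, fallback="").strip()
    ]


if __name__ == "__main__":
    problems = check_zuliprc()
    print("OK" if not problems else "\n".join(problems))
```

Run from the repository directory, this prints OK when the file looks complete, or one line per problem otherwise. It checks only the file's shape, not whether the key is valid.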

Results

After running, analysis reports will be generated and saved in the repository directory as CSV files with the name pattern fhir-zulip-nlp*.csv.
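As a quick way to see what a run produced, the sketch below lists every CSV matching the name pattern above, along with its column names and row count. It uses only the standard library; the function name summarize_reports is hypothetical, and UTF-8 encoding of the output files is an assumption.

```python
import csv
from pathlib import Path


def summarize_reports(directory=".", pattern="fhir-zulip-nlp*.csv"):
    """Map each matching CSV filename to (column names, number of data rows)."""
    summary = {}
    for path in sorted(Path(directory).glob(pattern)):
        # UTF-8 is assumed here; adjust if the files use another encoding.
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            header = next(reader, [])  # empty file -> no columns
            summary[path.name] = (header, sum(1 for _ in reader))
    return summary


if __name__ == "__main__":
    for name, (columns, n_rows) in summarize_reports().items():
        print(f"{name}: {n_rows} rows, columns: {', '.join(columns)}")
```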

Releases from previous runs have been uploaded to GitHub and Google Drive.

Codebook

zulip_raw_results.csv

TODO

zulip_raw_results_user_participation.csv

TODO

zulip_report_queries_with_no_results.csv

TODO

zulip_user_info.csv

TODO

zulip_report1_counts.csv

TODO

zulip_report2_thread_lengths.csv

TODO

zulip_report3_users.csv

  • user.id (integer): Zulip-assigned user ID. Look up this ID in zulip_user_info.csv for more information about the user.
  • user.full_name (string): Full name of user.
  • stream (string): Zulip stream.
  • category (string): Category as defined in GoogleSheet.
  • keyword (string): Keyword as defined in GoogleSheet.
  • role (string): Either 'author' or 'respondent'.
  • count (integer): The total for the combination of user x stream x category x keyword x role in the given row. If any of these cells is empty, the row represents a higher-level aggregation. E.g. if only keyword is empty, the count is for the combination of user x stream x category x role. If keyword and category are both empty, the count is for the combination of user x stream x role.
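Because rows at different aggregation levels sit in the same file, summing the count column directly would double-count. The sketch below shows one way to separate the levels, following the empty-cell rule described above. The function names are hypothetical, and it assumes the file is read with csv.DictReader so that empty cells arrive as empty strings.

```python
import csv

# Dimension columns of zulip_report3_users.csv, per the codebook above.
DIMENSIONS = ("user.id", "stream", "category", "keyword", "role")


def aggregation_level(row):
    """Return the dimensions that are filled in for this row, in column order."""
    return tuple(d for d in DIMENSIONS if (row.get(d) or "").strip())


def rows_at_level(rows, level):
    """Keep only the rows whose count covers exactly the given dimensions."""
    level = tuple(level)
    return [row for row in rows if aggregation_level(row) == level]


def load_report(path="zulip_report3_users.csv"):
    """Read the report into a list of dicts (empty cells become empty strings)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

For example, rows_at_level(load_report(), ("user.id", "stream", "role")) would return only the per-user, per-stream, per-role totals, which can then be summed without overlapping the keyword-level rows.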

License: MIT License

