roxana-lafuente / ResearchAnalyzer

Analysis tool for the logs generated with ResearchLogger

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

ResearchAnalyzer

Open-source python API under GNU license for automation of ResearchLoggers' logs analysis. It offers reports on:

  • Clicks
  • Keystrokes
  • Pauses
  • Translation phases (provided that some preconditions are fulfilled)

Also it analyzes the final document, providing information on the final result.

How to use

Dependencies

  • termcolor [sudo apt-get install python-termcolor]
  • matplotlib
  • colour [pip install colour]
  • Python Image Library (PIL) [sudo apt-get install python-pil]
  • nltk [sudo pip install nltk]

Classes

  • The LogInfo object provides information on the data by using descriptive statistics. It must be instantiated by providing:
    • A clicks log.
    • A keystroke log.
    • A system log.
  • The TextInfo object provides information on the final document. It must be instantiated by providing:
    • The final document.

Methods

Clicks

  • get_unique_pressed_clicks(): Returns a set with all unique types of clicks used during the logging session.
  • get_all_pressed_clicks(): Returns a list of pairs for each event in the session. Format:
(x coordinate, y coordinate, screen resolution, button type, click type, program name, window title)
  • get_click_info(): Returns a list of tuples. Each tuple has the following format:
(button type, [down times], up time, press duration, [down x], [down y], up x, 'up y)
  • print_click_summary(): It displays a summary of the click data:
    • Total amount of clicks.
    • Windows in which clicks are pressed.
    • Types of clicks used.
    • Mean click press time.
    • Click press time variance.
    • Click press time standard deviation.
  • plot_clicks_in_screenshot(): Marks in a screenshot where clicks were made. Visualization of clicks in the screen session

Keystrokes

  • get_unique_pressed_keys(): Returns a set with all unique types of keystrokes used during the logging session. Examples: period, comma, a, c, z, downarrow, etc.
  • get_all_pressed_keys(): Returns a list of pairs for each key event that occurred in the session. Format:
(key name, key type, window name, window title)
  • get_letter_info(): Returns a list of tuples. Each tuple has the following format:
(key type, [down times], up time, press time, [down x], [down y], up x, up y)
  • print_key_summary(): It displays a summary of the keystroke data:
    • Total amount of pressed keys.
    • Unique pressed keys.
    • Windows in which keys are pressed.
    • Mean key press time.
    • Key press time variance.
    • Key press time standard deviation.
    • Unique function keys used.
    • Unique erase keys used.
    • Unique movement keys used.
    • Unique key combos used.
  • plot_keystroke_progression_graph(bin_size): It plots the amount of pressed down keys against time. Each of the bin size can be specified in seconds.

Keystroke progression graph

  • plot_clicks_progression_graph(bin_size): It plots the amount of pressed down clicks against time. Each of the bin size can be specified in seconds.

Clicks progression graph

Resources

  • get_time_by_active_window(): It returns a dictionary whose key is the name of a window used and its value is the time spent in it (in milliseconds). A special entry "total" contains the total session time.
  • plot_window_distribution_pie_chart(): Plots a pie chart of the times spent in each window in the session.

Window time distribution pie chart

Phases

  • get_orientation_info(): Returns a tuple with the format (start time, start end).
  • get_drafting_info(): Returns a tuple with the format (start time, start end).
  • get_orientation_info(): Returns a tuple with the format (start time, start end).
  • get_total_session_time(): Returns an integer with the total session time.
  • print_phases_summary(): Prints a summary of the phases (orientation, drafting, revision and the whole session).
    • Start and end time.
    • Phase duration.
    • Percentage of the session dedicated to the phase.

Pauses

  • print_pauses(): Prints an enumeration of all the pauses in the session, including the start and end of each one.
  • print_pause_summary(begin, end): Prints a summary of the pauses in the interval [begin, end]. It includes:
    • Total amount of pauses in the session.
    • Amount of:
      • Short pauses.
      • Medium pauses.
      • Large pauses.
      • Non significant pauses.
    • Mean pause time.
    • Pause time variance.

Final document

  • get_words_by_part_of_speech(): Returns a dictionary, the key is a part of speech and the value is a list with all the words that were used that can be as that part of speech.
  • get_sentences(): Returns the sentences in the final document as a list of strings.
  • get_n_most_frequent_words(n): Returns a list of strings containing the n most frequent words in the final document.
  • plot_frequency_distribution(n): Plots the frequency distribution plot of the most common n words. n_most_frequent_words_plot

Example

An usage example of ResearchAnalyzer can be found in main.py; it uses the ResearchLogger's example log files on the "example_log" folder.

The keystrokes summary looks like: Keystrokes summary

The clicks summary looks like: Clicks summary

Note: You need to collect your own data with ResearchLogger to analyze your own datasets.

About

Analysis tool for the logs generated with ResearchLogger

License:GNU General Public License v3.0


Languages

Language:Python 100.0%