Luca Canali's repositories
sparkMeasure
This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination of Spark metrics, making it a practical choice for both developers and data engineers.
Miscellaneous
Notes on using Apache Spark in general and for physics, running TPC-DS on PySpark, and creating histograms with Spark, plus tools for CPU performance testing, Jupyter notebook examples for Spark, and examples for Oracle and other database systems.
Linux_tracing_scripts
Scripts and tools for troubleshooting and performance analysis in Linux. This includes dynamic tracing scripts with SystemTap both for system calls and for userspace function tracing.
Oracle_DBA_scripts
A collection of old-school CLI scripts for Oracle RDBMS monitoring and performance troubleshooting.
PyLatencyMap
PyLatencyMap is a tool for heat map visualization on the CLI. It is integrated with scripts to collect and visualize I/O latency heat maps from various sources, including SystemTap, DTrace, Oracle wait events, NetApp filers, and trace files.
PerfSheet4
PerfSheet4 is a tool for performance troubleshooting of Oracle databases. Query and visualize Oracle AWR data using pivot charts.
Stack_Profiling
Tools and scripts for stack profiling: Userspace, Kernel, OS state and optionally Oracle wait
PerfSheet.js
PerfSheet.js is a tool for Oracle RDBMS performance troubleshooting. Use it to extract and visualize Oracle AWR time series data in the browser using JavaScript and dynamic pivot charts.
ipython-sql
%%sql magic for IPython, hopefully evolving into a full SQL client
OraLatencyMap
OraLatencyMap is a performance widget running on SQL*plus (Oracle's CLI) to collect and visualize latency histograms for Oracle wait events using heat maps.
dist-keras
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
hbase-connectors
Apache HBase Connectors
jupyter-extensions
Jupyter extensions for SWAN
jupyterhub-extensions
Customized components of the JupyterHub server in SWAN (handlers, spawners, templates).
oci-hdfs-connector
HDFS Connector for Oracle Cloud Infrastructure
SLOB_2.5.4
Official SLOB distribution for version 2.5.4.0
SLOB_distribution
A Git repository used only for distributing the official SLOB release.
spark-root
Apache Spark Data Source for ROOT File Format
SparkDLTrigger
Notebooks with code and sample data for the blog article: "Machine Learning Pipelines for High Energy Physics Using Apache Spark with BigDL and Analytics Zoo"
sparkmonitor
Monitor Apache Spark from Jupyter Notebook
tf-spawner
Spawn workers for TensorFlow MultiWorkerMirroredStrategy