bigspark (itsbigspark)

Location: United Kingdom

Home Page: www.bigspark.dev

Twitter: @itsbigspark

bigspark's repositories

genai-presidio

Repository for the PII Anonymizer code package and a sample FastAPI API that uses it to talk to an LLM

Language: Jupyter Notebook, Stargazers: 0, Issues: 0
Language: JavaScript, Stargazers: 0, Issues: 0

aws-sso-sync

sso-sync tool to help with the SCIM setup for bigspark.

Language: Go, License: Apache-2.0, Stargazers: 0, Issues: 0

test_glue_

To test a glue job

Language: Python, Stargazers: 0, Issues: 0

test_glue

To test a glue job

Language: Python, Stargazers: 0, Issues: 0

ai-hackathon

General-purpose repo for the NW AI Hackathon 2023

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 0, Issues: 0

streamsets_json_schema_validator_processor

A sample StreamSets DC processor that validates records against a specified JSON schema

Language: Java, License: Apache-2.0, Stargazers: 1, Issues: 0
Language: Shell, Stargazers: 0, Issues: 0
Stargazers: 0, Issues: 0

incubator-livy

Mirror of Apache Livy (Incubating)

License: Apache-2.0, Stargazers: 0, Issues: 0
Stargazers: 0, Issues: 0
Language: Java, Stargazers: 0, Issues: 0
Language: Shell, License: MIT-0, Stargazers: 0, Issues: 0

snowpark-scala-data-profiling

Data profiling example using Snowflake sample datasets and Scala

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 0, Issues: 0

sparkMeasure

This is the development repository of SparkMeasure, a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task and stage metrics data.

Language: Scala, License: Apache-2.0, Stargazers: 0, Issues: 0
Language: Shell, Stargazers: 0, Issues: 0

jvm-profiler

JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter

Language: Java, License: NOASSERTION, Stargazers: 0, Issues: 0

emr-uber-profiler-notebooks

Stargazers: 0, Issues: 0
Language: Java, Stargazers: 0, Issues: 0

deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

Language: Scala, License: Apache-2.0, Stargazers: 0, Issues: 0

tutorials

StreamSets Tutorials

Language: Java, License: Apache-2.0, Stargazers: 0, Issues: 0
Language: Java, Stargazers: 0, Issues: 0
Language: JavaScript, License: NOASSERTION, Stargazers: 0, Issues: 0
Language: JavaScript, License: MIT, Stargazers: 1, Issues: 0
Language: Java, Stargazers: 1, Issues: 0

kafka-local

Basic single-broker Kafka cluster: Docker Compose setup using the Confluent image

Stargazers: 2, Issues: 0
Language: Java, Stargazers: 0, Issues: 0

kafdrop

Kafka Web UI

Language: Java, License: Apache-2.0, Stargazers: 0, Issues: 0
Language: TypeScript, Stargazers: 0, Issues: 0

CMAK

CMAK is a tool for managing Apache Kafka clusters

License: Apache-2.0, Stargazers: 0, Issues: 0