Beast code in Giters

The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore-compatible metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-the-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog as an external Hive Metastore. It serves as a reference implementation for building a Hive Metastore-compatible client that connects to the AWS Glue Data Catalog. It may be ported to other Hive Metastore-compatible platforms, such as other Hadoop and Apache Spark distributions.

Language: Java | License: Apache-2.0

boilerpipe

Work-in-progress migration from Google Code

Language:JavaNOASSERTION010

Chat-with-Github-Repo

This repository contains two Python scripts that demonstrate how to create a chatbot using Streamlit, OpenAI GPT-3.5-turbo, and Activeloop's Deep Lake.

Language: Python | License: MIT

classutil

Scala-friendly, fast class-finder library (using ASM under the covers)

Language: Scala | License: NOASSERTION

docker-spark-k8s-aws

Docker image for running Spark 3 on Kubernetes on AWS

document-api-python

Create and modify Tableau workbook and datasource files

Language: Python | License: MIT

experimental_spark-bigquery

Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.

Language: Scala | License: Apache-2.0

experimental_spark-bigquery-1

Google BigQuery support for Spark, SQL, and DataFrames

Language: Scala | License: Apache-2.0

generalized-kmeans-clustering

This project generalizes the Spark MLlib batch and streaming K-Means clusterers in every practical way.

Language: Scala | License: Apache-2.0

incubator-hivemall

Mirror of Apache Hivemall (incubating)

Language: Java | License: Apache-2.0

influxdb-java

Java client for InfluxDB

Language: Java | License: MIT

js-murmur3-128

A JavaScript implementation of the 128-bit variant of Murmur3 (compatible with Guava)

Language: JavaScript | License: Apache-2.0

nutch

Apache Nutch

Language: Java | License: Apache-2.0

okhttp

An HTTP+HTTP/2 client for Android and Java applications.

Language: Java | License: Apache-2.0

reactive-kafka

Reactive Streams API for Apache Kafka

Language: Scala | License: NOASSERTION

redshift-auto-schema

Redshift Auto Schema is a Python library that takes a delimited flat file or Parquet file as input, parses it, and provides functions for creating and validating tables in Amazon Redshift.

Language: Python | License: Apache-2.0