
Joyce's repositories

Medical_ChatBot

The objective of this project is to create a chatbot that communicates with users and answers questions about their health issues. It is a retrieval-augmented generation (RAG) implementation built on an open-source stack.

Language: HTML
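The retrieval step of a RAG chatbot can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the sample documents, the bag-of-words cosine scoring, and the function names `retrieve` and `build_prompt` are hypothetical and are not taken from the repository, which may use a vector store and an LLM instead.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base; a real system would index medical documents.
DOCUMENTS = [
    "Drink fluids and rest when you have a mild fever.",
    "A balanced diet and regular exercise help manage blood pressure.",
    "Wash a small cut with clean water and cover it with a bandage.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Assemble the text that would be sent to a language model."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Calling `build_prompt("What should I do for a fever?", DOCUMENTS)` returns a prompt whose context is the fever document; the generation step, which would pass that prompt to a model, is left out.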

evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

License: NOASSERTION

BookRecommendationSystem

The objective of the project is to create a recommendation engine using collaborative filtering to recommend books to users.

Language: Jupyter Notebook
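As a rough illustration of collaborative filtering, the sketch below uses a user-based variant: it scores unread books by the ratings of similar users. The ratings, user names, and the `similarity` and `recommend` helpers are hypothetical, and the repository may use a different variant (for example item-based or matrix factorization).

```python
import math

# Hypothetical ratings (user -> {book: rating}); not data from the repository.
RATINGS = {
    "ana": {"Dune": 5, "Emma": 1, "Neuromancer": 4},
    "ben": {"Dune": 4, "Emma": 2, "Neuromancer": 5, "Foundation": 5},
    "cat": {"Dune": 1, "Emma": 5, "Persuasion": 4},
}


def similarity(a, b):
    """Cosine similarity computed over the books both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(a[t] ** 2 for t in shared))
    norm_b = math.sqrt(sum(b[t] ** 2 for t in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=3):
    """Rank books the user has not rated by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for book, rating in other_ratings.items():
            if book not in ratings[user]:
                scores[book] = scores.get(book, 0.0) + sim * rating
                weights[book] = weights.get(book, 0.0) + sim
    predicted = {book: scores[book] / weights[book] for book in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]
```

With these sample ratings, `recommend("ana", RATINGS)` ranks "Foundation" ahead of "Persuasion", because each book has a single rater and the predicted score reduces to that rater's rating (5 versus 4).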

MovieSpider

This project crawls movie data from IMDb. The Scrapy framework is used to extract relevant information such as movie title, datePublished, summary, genres, and director.

Language: Python

US-immigrations-data-warehouse

A data warehouse to perform analytics on the immigration trends in the US.

Language: Jupyter Notebook

SQL-Data-with-Danny-Case-Studies

Case study solutions for #8WeekSQLChallenge at https://8weeksqlchallenge.com


Reddit_Data_Pipeline

The purpose of the project is to build a data pipeline that extracts data from the Reddit API and to create a dashboard for analysing it. The data comes from the subreddit r/Python; it is extracted daily, uploaded to S3 buckets, and copied to Redshift. The dashboard is built with Google Data Studio.

Language: Python

pyspark_bigdata

Getting started with PySpark for big data analysis.

Language: Jupyter Notebook, License: Apache-2.0

Data-Warehouse-AWS

A music streaming startup, Sparkify, has grown its user base and song database and wants to move its processes and data onto the cloud. The data resides in S3, in a directory of JSON logs on user activity on the app, as well as a directory with JSON metadata on the songs in the app. The objective of the project is to create an ETL pipeline to build a data warehouse. We extract data from S3, stage it in Redshift, and transform it into a set of dimensional tables so the analytics team can continue finding insights into what songs users are listening to.

Language: Jupyter Notebook
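The stage-then-transform flow described above can be sketched locally. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for Redshift, a list of JSON strings stands in for the S3 log files, and the table names, column names, and `run_etl` function are hypothetical rather than taken from the repository.

```python
import json
import sqlite3

# Hypothetical log lines; the field names are assumptions for this sketch.
LOG_LINES = [
    '{"user_id": 1, "name": "Ada", "song": "Blue", "ts": 100}',
    '{"user_id": 2, "name": "Lin", "song": "Red", "ts": 101}',
    '{"user_id": 1, "name": "Ada", "song": "Red", "ts": 102}',
]


def run_etl(lines):
    """Load raw JSON events into a staging table, then fill the analytic tables."""
    conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
    conn.executescript(
        """
        CREATE TABLE staging_events (user_id INTEGER, name TEXT, song TEXT, ts INTEGER);
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE songplays (ts INTEGER, user_id INTEGER, song TEXT);
        """
    )
    # Extract and stage: copy the raw records as-is.
    rows = [json.loads(line) for line in lines]
    conn.executemany(
        "INSERT INTO staging_events VALUES (:user_id, :name, :song, :ts)", rows
    )
    # Transform: deduplicate users into a dimension, keep every event as a fact.
    conn.execute("INSERT INTO users SELECT DISTINCT user_id, name FROM staging_events")
    conn.execute("INSERT INTO songplays SELECT ts, user_id, song FROM staging_events")
    conn.commit()
    return conn
```

Keeping a staging table separate from the dimensional tables means the raw load stays a plain copy, while deduplication and reshaping happen in SQL inside the database.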

Data-Modeling-With-Postgres

The main focus of the project is data modeling with Postgres and building an ETL pipeline using Python. The first step is to define fact and dimension tables for a star schema for a particular analytic focus. The second step is to write an ETL pipeline that transfers data from files in different directories into these tables in Postgres using Python and SQL.

Language: Python
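A star schema of the kind described here puts one fact table at the centre, with foreign keys pointing to the dimension tables. The sketch below is a hedged illustration only: it uses SQLite from the Python standard library in place of Postgres, and the table names, columns, sample rows, and helper functions are hypothetical.

```python
import sqlite3


def build_star_schema():
    """Create two dimension tables and one fact table, then load sample rows."""
    conn = sqlite3.connect(":memory:")  # stand-in for a Postgres database
    conn.executescript(
        """
        CREATE TABLE dim_user (user_id INTEGER PRIMARY KEY, level TEXT);
        CREATE TABLE dim_song (song_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE fact_play (
            play_id INTEGER PRIMARY KEY,
            user_id INTEGER REFERENCES dim_user(user_id),
            song_id INTEGER REFERENCES dim_song(song_id)
        );
        """
    )
    conn.executemany("INSERT INTO dim_user VALUES (?, ?)", [(1, "free"), (2, "paid")])
    conn.executemany("INSERT INTO dim_song VALUES (?, ?)", [(10, "Blue"), (11, "Red")])
    conn.executemany(
        "INSERT INTO fact_play (user_id, song_id) VALUES (?, ?)",
        [(1, 10), (2, 10), (2, 11)],
    )
    conn.commit()
    return conn


def plays_per_title(conn):
    """Analytic query: join the fact table to a dimension and aggregate."""
    query = """
        SELECT s.title, COUNT(*)
        FROM fact_play AS f
        JOIN dim_song AS s ON s.song_id = f.song_id
        GROUP BY s.title
        ORDER BY s.title
    """
    return conn.execute(query).fetchall()
```

With the sample rows, `plays_per_title(build_star_schema())` returns `[("Blue", 2), ("Red", 1)]`: the fact table holds only keys, and descriptive attributes are looked up through the join.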

joyceannie

Joyce's repositories

llm-zoomcamp

mlflow_experiments

CodingChallenge

DiabetesPrediction

Medical_ChatBot

Made-With-ML

joyceannie

AIPostGenerator

evals

Post_Generator

TextSummarizer

MovieRecommendationSystem

BookRecommendationSystem

Urban_Sound_Classification

mlops_zoomcamp

MovieSpider

US-immigrations-data-warehouse

SQL-Data-with-Danny-Case-Studies

Identify_Customer_Segments

image_classifier

neo4j-practice

leetcode

DataLakeWithSpark

Reddit_Data_Pipeline

pyspark_bigdata

Data-Warehouse-AWS

Data-Modeling-With-Postgres

outreachy

networkx

Kaggle