Julian Passebecq's starred repositories
crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
parquet-format
Apache Parquet Format
clickhouse-operator
Altinity Kubernetes Operator for ClickHouse creates, configures and manages ClickHouse clusters running on Kubernetes
pythondataanalysis
Python data repo: Jupyter notebooks, Python scripts, and data.
adventure-spark
The "Adventure Works - Spark" repository is a collection of code and resources for analyzing the Adventure Works dataset using Databricks, PySpark, Delta Lake, and Python. It provides examples and tools for ingesting, processing, and analyzing the data to gain insights.
Building-OLAP-Dimensional-Model-using-BigQuery-and-DBT
This project is about building a dimensional data warehouse in BigQuery by transforming an OLTP system to an OLAP system, using dbt as our data transformation tool.
streamlit-calendar
A Streamlit component to show a calendar view using FullCalendar
covid-19-data
Data on COVID-19 (coronavirus) cases, deaths, hospitalizations, tests • All countries • Updated daily by Our World in Data
PortfolioProjects
PortfolioProjects PY/ML/SQL/ANALYSIS/VISUALIZATION
D3FromShiny
This is the code from a tutorial on using D3 with R Shiny
spark-sklearn
Scikit-learn integration package for Apache Spark
PowerBI-IBCS
IBCS-styled data visualizations created using only core Power BI visuals (Matrix, Table, New Card)
sticky-notes-app
This interactive React app allows users to create sticky notes, as well as edit, search through, save and delete them.
TabularEditor-Scripts
Scripts for Tabular Editor 2 & 3. Community driven to make your Tabular Editor experience as fast as possible.
dbt-duckdb-tutorial
This is a simple analytic project using DuckDB & dbt with air quality data.
powerbi-client-react
Power BI for React, which provides components and services to enable developers to easily embed Power BI reports into their applications.
dbt_facebook_ads
Fivetran data models for Facebook Ads built using dbt.
llama-cpp-python-streamlit
A Streamlit app for using the llama-cpp-python high-level API
dbt-snowflake-query-tags
From the SELECT team, a dbt package to automatically tag dbt-issued queries with informative metadata.
mlops-coding-course
Learn how to create, develop, and maintain a state-of-the-art MLOps code base
dbldatagen
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for testing, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.