yingchi / py3-ds-toolkit

Useful data science and Python code snippets at Data Science Simplified

Home Page: https://khuyentran1401.github.io/Python-data-science-code-snippet/

Intro

Personal repository to store useful Python3 Data Science code snippets and useful little tools to assist a lazy Data Scientist with very poor memory. :-)

Python and Data Science Code Snippets

Source code of Python and data science snippets posted daily at Data Science Simplified. You can receive these daily tips in your mailbox for free by subscribing to the website.

To get access to these daily tips on the command line, install python-snippet.

Contents

Python Built-in Methods

Number

Title Explanation Code
Get Multiples of a Number Using Modulus link link
fractions: Get Numerical Results in Fractions instead of Decimals link link
How to Use Underscores to Format Large Numbers in Python link link
Confirm whether a variable is a number link link
Get a Division, Floor Division, And The Remainder of a Division in Python link link
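
The number snippets listed above can be illustrated with a short standard-library sketch. This is not the repository's own code; the sample values and variable names are assumptions chosen for illustration.

```python
from fractions import Fraction

# True division, then floor division and remainder in one call.
quotient = 7 / 2                            # true division keeps the fractional part
floor_quotient, remainder = divmod(7, 2)    # same as (7 // 2, 7 % 2)

# Multiples of 3 below 10, selected with the modulus operator.
multiples_of_three = [n for n in range(1, 10) if n % 3 == 0]

# Underscores make large literals easier to read; the value is unchanged.
one_million = 1_000_000

# Fractions keep results exact instead of converting to floating point.
five_sixths = Fraction(1, 2) + Fraction(1, 3)

# One possible check for "is this a number": accept int, float, and Fraction.
is_number = isinstance(five_sixths, (int, float, Fraction))
```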

Boolean

Title Explanation Code
Boolean Operators: Connect Two Boolean Expressions into One Expression link link

String

Title Explanation Code
__str__ and __repr__: Create a String Representation of a Python Object link link
String find: Find the Index of a Substring in a Python String link link
eval: Turn a Python String into a Variable or Function link link
re.sub: Replace One String with Another String Using Regular Expression link link

List

Title Explanation Code
any: Check if Any Element of an Iterable is True link link
Extended Iterable Unpacking: Ignore Multiple Values when Unpacking a Python Iterable link link
How to Unpack Iterables in Python link link
random.choice: Get a Randomly Selected Element from a Python List link link
random.sample: Get n Random Elements From a List link link
filter: Get the Elements of an Iterable that a Function Returns True link link
heapq: Find n Max Values of a Python List link link
join method: Turn an Iterable into a Python String link link
Zip: Associate Elements from Two Iterators based on the Order link link
collections.Counter: Count the Occurrences of Items in a List link link
Zip Function: Create Pairs of Elements from Two Lists in Python link link
Stop using = operator to create a copy of a Python list. Use copy method instead link link
itertools.combinations: A better way to iterate through a pair of values in a Python list link link
itertools.product: Nested For-Loops in a Generator Expression link link
itertools.islice: Get Items From an Iterable That are Within a Certain Range With a Specific Incrementation link link
Enumerate link link
set.intersection: Find the Intersection Between 2 Sets link link
Set Difference: Find the Difference Between 2 Sets link link
Difference between list append and list extend link link
map method: Apply a Function to Each Item of an Iterable link link
Why Should you Rewrite a For-Loop as a List Comprehension? link link
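
A few of the list techniques named above, sketched with the standard library only. The data is made up, and the repository's linked snippets may take a different approach.

```python
import heapq
from collections import Counter
from itertools import combinations

scores = [3, 9, 1, 7, 5]

# heapq.nlargest: the n largest values, in descending order.
top_two = heapq.nlargest(2, scores)

# Extended iterable unpacking: keep the ends, ignore everything in between.
first, *_, last = scores

# collections.Counter: count occurrences of each item.
letter_counts = Counter(["a", "b", "a", "c", "a"])
most_common = letter_counts.most_common(1)

# itertools.combinations: every unordered pair, without nested loops.
pairs = list(combinations(["x", "y", "z"], 2))

# "=" only creates a second name for the same list; copy() creates a new list.
original = [1, 2, 3]
alias = original
independent = original.copy()
original.append(4)   # visible through alias, not through independent
```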

Tuple

Title Explanation Code
namedtuple: A Lightweight Python Structure to Manage your Data link link
slice: Make your Indices more Readable by Naming your Slice link link
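
As a rough illustration of the two entries above, the sketch below builds a namedtuple and gives a slice a descriptive name. `Person`, `prices`, and `LAST_THREE` are hypothetical names, not taken from the repository.

```python
from collections import namedtuple

# A namedtuple behaves like a tuple but exposes its fields by name.
Person = namedtuple("Person", ["name", "age"])
ada = Person(name="Ada", age=36)
name_by_field = ada.name
name_by_index = ada[0]

# Naming a slice documents what the indices mean.
prices = [5, 8, 12, 20, 31]
LAST_THREE = slice(-3, None)
recent_prices = prices[LAST_THREE]
```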

Dictionary

Title Explanation Code
Defaultdict: Return a default value when a key is not available link link
Defaultdict: Create a Dictionary with Values that are List link link
Ordered dictionary in Python link link
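
The two defaultdict entries above can be sketched as follows. The sample data is invented for illustration.

```python
from collections import defaultdict

# defaultdict(list): group items under a key without checking if the key exists.
items = [("fruit", "apple"), ("veg", "kale"), ("fruit", "pear")]
groups = defaultdict(list)
for category, item in items:
    groups[category].append(item)

# defaultdict(int): a missing key yields the default value 0.
counts = defaultdict(int)
missing_value = counts["unknown"]

# Note: looking up a missing key also inserts it into the dictionary.
key_was_inserted = "unknown" in counts
```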

Datetime

Title Explanation Code
datetime + timedelta: Calculate End DateTime based on Start DateTime and Duration link link
Use Dates in a Month as the Feature link link
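
A minimal sketch of the datetime entries above, using an arbitrary start time and duration. Reading "dates in a month as the feature" as the day-of-month value is an assumption about what the linked snippet does.

```python
from datetime import datetime, timedelta

# End time = start time + duration.
start = datetime(2021, 1, 1, 9, 30)
duration = timedelta(hours=2, minutes=45)
end = start + duration

# Day of the month, usable as a numeric feature (assumed interpretation).
day_of_month = start.day
```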

Function

Title Explanation Code
*iterator: Pass Values of an Iterator to a Function link link
Use Python Built-in Functions to Speed Up your Code link link
**kwargs: Pass multiple arguments to a function in Python link link
Return Multiple Values from a Function Using Python Dictionary link link
Decorator in Python link link
functools.partial: Generate a New Function with Fewer Arguments link link
singledispatch: Call Another Function Based on the Type of the Current Function’s Argument link link
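
Several of the function entries above fit in one short sketch. The functions `power`, `describe`, and `add` are hypothetical examples, not the repository's code.

```python
from functools import partial, singledispatch


def power(base, exponent):
    return base ** exponent


# functools.partial: fix one argument to get a function that needs fewer.
square = partial(power, exponent=2)


# singledispatch: pick an implementation from the type of the first argument.
@singledispatch
def describe(value):
    return f"object: {value}"


@describe.register
def _(value: int):
    return f"integer: {value}"


@describe.register
def _(value: list):
    return f"list of {len(value)} items"


def add(a, b):
    return a + b


# *iterable and **mapping unpack values into positional and keyword arguments.
positional_result = add(*[1, 2])
keyword_result = add(**{"a": 1, "b": 2})
```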

Classes

Title Explanation Code
Abstract Classes: Declare Methods without Implementation link link
classmethod: What is it and When to Use it link link
getattr: a Better Way to Get the Attribute of a Class link link
__call__: You can Call your Class Instance like a Function. Here is how link link
Static method: use the function without adding the attributes required for a new instance link link
Property Decorator: A Pythonic Way to Use Getters and Setters link link
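
The class features listed above can be combined in a single hypothetical `Temperature` class. This is only a sketch; the class and its behavior are assumptions made for illustration.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius  # assignment goes through the property setter

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        # The setter validates without changing how callers assign the attribute.
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = value

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # Alternative constructor: receives the class, returns a new instance.
        return cls((fahrenheit - 32) * 5 / 9)

    @staticmethod
    def is_freezing(celsius):
        # Needs neither the instance nor the class.
        return celsius <= 0

    def __call__(self, delta):
        # Makes instances callable: returns a new, shifted Temperature.
        return Temperature(self.celsius + delta)


boiling = Temperature.from_fahrenheit(212)
warmer = boiling(5)

# getattr with a default avoids an AttributeError for a missing attribute.
humidity = getattr(boiling, "humidity", None)
```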

Files

Title Explanation Code
Shutil: Move Files in Python link link
pathlib.Path link link
pathlib: Create, Write, and Rename Files in One Line of Code link link
Pathlib: Iterate Over All Files that End with ‘.csv’ in a Directory link link
Path.parents: Get the Parent Directory of a File link link
How to Improve the Readability of your JSON file using Indent link link
__main__.py: Run a Directory like a Main Script link link
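
A small pathlib sketch covering the create, iterate, rename, and parent-directory entries above. It works inside a temporary directory so nothing on disk is touched; the file names are invented.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)

    # Create and write files in one line each.
    (folder / "a.csv").write_text("x,y\n1,2\n")
    (folder / "notes.txt").write_text("hello")

    # Iterate over all files that end with ".csv".
    csv_names = sorted(path.name for path in folder.glob("*.csv"))

    # Rename a file, then look up its parent directory.
    target = folder / "b.csv"
    (folder / "a.csv").rename(target)
    target_exists = target.exists()
    parent_is_folder = target.parent == folder
```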

Error handling

Title Explanation Code
Assert in Python: Output a Customized Message When the Assertion Fails link link
warnings: Ignore Warnings when Running Python Code link link
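
Both error-handling entries above can be sketched briefly. The `mean` function and its message are hypothetical. Keep in mind that assert statements are skipped when Python runs with the `-O` flag.

```python
import warnings


def mean(values):
    # The text after the comma becomes the AssertionError message.
    assert len(values) > 0, "values must not be empty"
    return sum(values) / len(values)


try:
    mean([])
except AssertionError as error:
    message = str(error)

# Suppress warnings only inside this block instead of globally.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    warnings.warn("this warning is suppressed")
    average = mean([1, 2, 3])
```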

Interact with Terminal

Title Explanation Code
How to Execute Shell Commands in a Python Script link link
argparse: Python Library to Parse Arguments from Command Line link link
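
A minimal argparse sketch for the entry above. The `--rate` and `--epochs` options are made-up examples, and the argument list is passed in directly so the snippet also runs outside a terminal.

```python
import argparse

parser = argparse.ArgumentParser(description="Hypothetical training script")
parser.add_argument("--rate", type=float, default=0.1, help="learning rate")
parser.add_argument("--epochs", type=int, default=5, help="number of epochs")

# In a real script, call parser.parse_args() with no arguments to read sys.argv.
args = parser.parse_args(["--epochs", "10"])
```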

Best Practices

Title Explanation Code
Stop Writing Code Comments. Use Meaningful Names Instead link
Underscore(_): Ignore values that will not be used link link
Underscore “_”: Ignore the index in Python for loops link link
Save Immediate Output when an Error Occurs link
Print error without stopping the for loop in Python link link
Python Pass Statement link link
Type hint in Python 3.9 link

Code Speed

Title Explanation Code
Concurrently execute tasks on separate CPUs link link
Compare the execution time between 2 functions link link

Pandas

Change Values

Title Explanation Code
pd.DataFrame.agg: Aggregate over Columns or Rows Using Multiple Operations link link
pandas.DataFrame.agg: Apply Different Aggregations to Different Columns link link
DataFrame.pipe: Increase the Readability of your Code when Applying Multiple Functions to a DataFrame link link
pd.Series.map: Change Values of a Pandas Series Using a Dictionary link link
pd.Series.str: Manipulate Text Data in a pandas Series link link
set_categories in pandas: Sort Categorical Column by a Specific Ordering link link
parse_dates: Convert Columns into Datetime When Using Pandas to Read CSV Files link link
Filter Rows only if Column Contains Values from another List link link
Specify suffixes when using df.merge() link link
Specify the datatype to speed up your code and reduce memory link
Highlight your pandas DataFrame link link
Assign Values to Multiple New Columns link link
Reduce pd.DataFrame’s Memory link link
pd.DataFrame.explode: Transform Each Element in an Iterable to a Row link link
pandas.cut: Bin a DataFrame’s values into Discrete Intervals link link
Forward Fill in Pandas: Use the Previous Value to Fill the Current Missing Value link link
pandas.pivot_table: Turn Your DataFrame Into a Pivot Table link link

Get Values

Title Explanation Code
df.columns.str.startswith: Find DataFrame’s Columns that Start with a Pattern link link
pandas.DataFrame.iterrows: Iterate over Rows of a DataFrame link link
pandas.Series.dt: Access Datetime Properties of pandas Series link link
pd.Series.between: Select Rows in a pandas Series Containing Values between 2 Numbers link link
DataFrame rolling: Find the average of the previous n datapoints using Pandas link link
select_dtypes: Return a subset of a DataFrame including/excluding columns based on their dtype link link
pct_change: Find the percentage change between the current and a prior element in a pandas Series link link
DataFrame.diff and DataFrame.shift: Take the Difference between Rows within a Column in Pandas link link
Pandas DataFrame: How to select all columns that start with a word link link
Exclude Outliers link link
Pandas DataFrame Get Data in a Year Range link link
pd.reindex: Replace the Values of the Missing Dates with 0 link link
Select DataFrame Rows Before or After a Specific Date link link
DataFrame.groupby.sample: Get a Random Sample of Items from Each Category in a Column link link

Testing

Title Explanation Code
assert_frame_equal: Test whether Two DataFrames are Similar link link

Numpy

Title Explanation Code
np.ravel: Flatten a Numpy Array link link
Use List to Change the Positions of Rows or Columns in a Numpy Array link link
Key Parameter in Max(): Find the Key with the Largest Value link link
Difference between Numpy’s All and Any Methods link link
Double np.argsort: Get Rank of Values in an Array link link
Get the index of the max value in a Numpy array link link
np.all: Test Whether All Elements along a Given Axis of a NumPy Array Evaluate to True link link
np.where: Replace Elements of a NumPy Array Based on a Condition link link
array-to-latex: Turn a Numpy Array into Latex link link
Numpy Comparison Operators link link
NumPy.linspace: Get Evenly Spaced Numbers Over a Specific Interval link link
NumPy.testing.assert_almost_equal: Check If Two Arrays Are Equal up to a Certain Precision link link

Data Science Tools

Testing

Title Explanation Code
snoop : Smart Print to Debug your Python Function link link
pytest benchmark: A Pytest Fixture to Benchmark your Code link link
pytest.mark.parametrize: Test your Functions with Multiple Inputs link link
Pytest: Shows only Failed Tests link
Pytest Fixtures: Use the same data for different tests link link
Pytest repeat link link
Pandera: a Python Library to Validate Your Pandas DataFrame link link

Data

Title Explanation Code
faker: Create Fake Data in One Line of Code link link
DVC: A Data Version Control Tool for your Data Science Projects link link
fetch_openml: Get OpenML’s Dataset in One Line of Code link link
github-to-sqlite: Download the Data of your Starred GitHub Repositories in One Command Line link
Autoscraper link link
Extract series data from various Internet sources directly into a pandas DataFrame link link
Compare the similar features between 2 different datasets link link
newspaper3k: Extract Meaningful Information From an Article in 2 Lines of Code link link
distfit: Find The Best Theoretical Distribution For Your Data in Python link link

Feature extraction

Title Explanation Code
datefinder: Automatically Find Dates and Time in a Python String link link
dill’s getname: Get the Name of a Python Object link link
pytrends: Get the Trend of a Keyword on Google Search Over Time link link
add_datepart: Add Relevant DateTime Features in One Line of Code link link
Geopy: Extract Location Based on Python String link link
Maya: Convert the string to datetime automatically link link
Select the features by their relevance link
Extract holiday from date column link link
fastai’s cont_cat_split: Get a DataFrame’s Continuous and Categorical Variables Based on Their Cardinality link link

Visualization

Title Explanation Code
D-Tale: A Python Library to Visualize and Analyze your Data Without Code link
Graphviz: Create a Flowchart to Capture your Ideas in Python link link
Create an interactive map in Python link link
dtreeviz: Visualize and Interpret a Decision Tree Model link link

Sharing and Downloading

Title Explanation Code
Datapane: Publish your Python Objects on the Web in 2 Lines of Code link link
gdown: Download a File from Google Drive in Python link link

Natural Language Processing

Title Explanation Code
TextBlob: Processing Text in One Line of Code link link
sumy: Summarize Text in One Line of Code link
Spacy_streamlit: Create a Web App to Visualize your Text in 3 Lines of Code link link
Extract a contiguous sequence of 2 words link link
Detect the “almost similar” articles link link
Convert number to words link link
texthero.clean: Preprocess Text in One Line of Code link link
wordfreq: Estimate the Frequency of a Word in 36 Languages link link

Tools for Best Python Practices

Title Explanation Code
Don’t Hard-Code. Use Hydra Instead link link
python-dotenv: How to Load the Secret Information from .env File link link
kedro pipeline: Create Pipeline for your Data Science Projects in Python link link
docopt: Create Beautiful Command-line Interfaces for Documentation in Python link link

Speed Up Code

Title Explanation Code
fastai’s df_shrink: Shrink DataFrame’s Memory Usage in One Line of Code link link
Swifter: Add One Word to Make your Pandas Apply 23 Times Faster link link

Better Pandas

Title Explanation Code
rich-dataframe: Create Animated and Colorful Pandas Dataframe link
tqdm: Add Progress Bar to your Pandas Apply link link
tqdm.set_description: Set a Description for Your Progress Bar link link

Machine Learning

Title Explanation Code
causalimpact: Find Causal Relation of an Event and a Variable in Python link link
Pipeline + GridSearchCV: Prevent Data Leakage when Scaling the Data link link
Decompose high dimensional data into two or three dimensions link link
Cross Validation with Time Series link
squared=False: Get RMSE from Sklearn’s mean_squared_error method link link

Terminal

Text

Title Explanation Code
tr Command: Translate Characters to Improve Readability In Unix/Linux link link
Sed Command: Replace a string with another string on the command line link link

Files

Title Explanation Code
fd: a Simple Tool to Search for Files or Directories Fast link
ln -s: Create Symbolic Link Between 2 Files link link
tee: Save Command Output to a File link link
Make Important Files Impossible to be Deleted link link
View tree structure of your file link

Tracking

Title Explanation Code
timeit on the Command Line: Measure Execution Time of Small Code Snippets link link
Time Command: Track the Time it Takes to Execute a File in Linux link
htop link

Python

Title Explanation Code
Python Shell as a Calculator: Grab the Last Output Using “_” link
Find version of a Python library using pip list and grep link
Conda rollback to the last revision link link
How to Check Whether a Library is Installed link link
pydash.chunk: Split Elements in a List into Groups of n Items link link

Prettify Terminal

Title Explanation Code
colorls: Beautify your ls Command with Color and Icons link
Colorama: Produce a colored terminal text in Python link

Sharing

Title Explanation Code
terminalizer: Record and Share your Terminal Sessions link

Productive Hacks

Title Explanation Code
Bash For Loop: Stop Staring at your Screen. Write a Bash For Loop instead link link
Environment Variables: Save Private Information in your Local Machine link link
Pet: A Command-line Snippet Tool That Allows you to Store your Favorite Commands link
Loop through a list of data on your terminal link
Multi-run command link
Run multiple commands in one line of code link

Cool Tools

Better Output

Title Explanation Code
How to Strip Outputs and Execute Interactive Code in a Python Script link link
rich.inspect: Produce a Beautiful Report on any Python Object link link
Rich’s Console: Debug your Python Function in One Line of Code link link
loguru: Print Readable Traceback in Python link link
Icecream: Adding a Datetime Stamp to Python print link link
Icecream: Never use print() to debug again link link
Pyfiglet: Make large and unique letters out of ordinary text in Python link link
heartrate: Visualize the Execution of a Python Program in Real-Time link link

Tracking

Title Explanation Code
Stacer: Visualize the History of your CPU and Memory Usage link

Data

Title Explanation Code
sherlock: Search for a Username Across 298 Popular Websites link
getme forecast: Get the Weather Forecast Through your Terminal link link

Automation

Title Explanation Code
notion-py: Access and Edit your Notion App Using Python link link
organize: Automate Organizing Files with Command Line link
Schedule: Schedule your Python Functions to Run At a Specific Time link link
notify-send: Send a Desktop Notification after Finishing Executing a File link link
isort: Automatically Sort your Python Imports in 1 Line of Code link link
knockknock: Receive an email when your code finishes executing link link
snscrape: Scrape Social Networking Services in Python link link
Typer: Build a Command-Line Interface in a Few Lines of Code link link
yarl: Create and Extract Elements from a URL Using Python link link
interrogate: Check your Python Code for Missing Docstrings link link
mypy: Static Type Checker for Python link link

Git and GitHub

Title Explanation Code
Github CLI: Brings GitHub to your Terminal link link
Pull one file from another branch using git link
Download a file on Github using wget link link
github1s: Read GitHub Code with VS Code on your Browser in One Second link
PyGithub: Manage your Github resources using Python link link
Astral: Organize your Github stars with ease link
pip install -e: Install Forked GitHub Repository using Pip link link

Alternative Approach

Title Explanation Code
Box: Using Dot Notation to Access Keys in a Python Dictionary link link
decorator module: Write Shorter Python Decorators without Nested Functions link link
virtualenv-clone: Create a Copy of a Virtual Environment link link

Jupyter Notebook

Title Explanation Code
nbdime: Better Version Control for Jupyter Notebook link
display in IPython: Display math equations in Jupyter Notebook link link
Reuse the notebook to run the same code across different data link
ngrok: Create a Public Server for your Jupyter Notebook in 1 Line of Code link link
watermark: Get Information About Your Hardware and the Packages Being Used within Your Notebook link link

Languages

Jupyter Notebook 80.8%, HTML 17.9%, Python 1.2%, Shell 0.1%