shrysr / streamlit-exploration

This repo will contain my exploration of using streamlit.io

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Introduction

I learned about Streamlit from the Data Science Weekly newsletter, which led to a Techcrunch article on Streamlit launching an open source ML application development framework. On first glance, this appears to be quite similar to the shiny library in R, and considering my current foray into python this piqued my interest to explore how it works.

The notes below are my observations, interspersed with references from the documentation, which I have tried to duly reference. This has been constructed in Org mode markup and exported to jupyter notebooks. I’m a long time fan of using Scimax for everything.

I’m always open to hearing constructive feedback.

Overall plan

  1. Start by implementing all the official tutorials
  2. Use Literate programming to make notes in parallel on the exploration procedure followed as well as my notes.
  3. Neatly export each app into a separate folder, so that each app is self-contained.
  4. Figure out how streamlit can be run on a VPS to host my own apps. Possibly deploy with Docker?
  5. Develop functions to streamline the deployment of an app, right from folder creation to other things.
  6. Construct code snippets in Emacs to make it easier to construct a streamlit app quickly.

Quick Links

Streamlit’s official links are placed here for convenient access

General notes

[2019-10-04 Fri] Appears that streamlit is not available in standard conda channels.

Streamlit can be installed using pip.

pip install streamlit

If you are familiar with Hugo / Jupyter / Shiny server - running the streamlit server is similar in concept, in this case the streamlit server runs in the background watching the current working directory and re-executes the entire script each time there is a change. This enables viewing what the script does interactively on a web browser.

# To start the streamlit server:
streamlit hello

For larger or more complex functions, it is not always feasible (i.e fast or efficient) to frequently re-execute the entire script. One way to reduce the frequency is to implement a cache or a store of data related to the function.

  • The actual bytecode that makes up the body of the function
  • Code, variables, and files that the function depends on
  • The input parameters that you called the function with

https://streamlit.io/docs/main_concepts.html#caching

This is similar to using reactive functions in shiny, which control when the script is re-executed or the app refreshed - in streamlit, data is reused using a cache mechanism (store) called st.cache. A function needs to be marked with st.cache’s annotation. If so - on a first run - the above will be stored in the cache and every time the function is run - the same will be checked, and the data will be fetched only if there is a change.

Since the streamlit server is watching only the current working directory and the cache is built by running the function for the first time - a change in an external data source will not be detected by the cache.

Running these apps

After installing the streamlit library, and cloning this repository, each app below can be run by using the streamlit run command pointing towards the python script.

streamlit run <path to repository>/streamlit-exploration/app01/app01.py

Apps

This section will contain all the apps created using streamlit. It starts with implementing the official code snippets and tutorials, along with my own explorations and notes, and will eventually graduate to my own apps.

App 01 - Hello world

The very first app. “Hello World” !

import streamlit as st
import numpy
import pandas

st.title('First app using streamlit')
st.write('This explores creating a table and adding a title')
st.write(" ## Hello World")

# Creating a table

st.write("Creating a table")
st.write(pandas.DataFrame({
    'col 1': [100,200,456,678],
    'col 2': ['a', 'b', 'c', 'd']
}))

App 02 - magic and literal values

This app tries out some simple python commands, as well as the ‘magic’ portion (works only with python3) wherein it is not necessary to explicitly use st.write().

Any time that Streamlit sees a variable or a literal value on its own line, it automatically writes that to your app using st.write()

  • Streamlit docs

For example, a print command will not be translated. However, a literal value - that includes strings will be piped to st.write(). String literals have to be enclosed within quotes. As such, it may not be a good practice to use such literals if the app needs to be compatible across different python versions.

import streamlit as st
import numpy as np
import pandas as pd
import sys

# Preamble
st.title("App No. 2")
st.write("This app tests simple commands in python")

# Simple commands
st.write("Printing the python and system version")
st.write(sys.version_info)
st.write(sys.version)

# Using magic , similar to the example in the docs

st.write("Trying out magic. This essentially means that any variable or literal value on it's own line is passed to st.write()")
df = pd.DataFrame({
    'col 1':['AA', 'BB', 'CC'],
    'col 2':[1 , 2, 3],
    'col 3':["this", "is", "col3"]
})

df

# Testing literal values

13
"The above was a literal value. i.e 13 was just written."
"Both this and the above are literal strings placed in the script! This is actually quite handy, but it does not appear to be good programming practice, i.e more useful for quick demonstrations and notes. Particularly, this works with Python 3 only atm."

App 03 - charts and maps[1/1]

This app explores basic plotting functions. Eg: line, bar, area charts and plotting a map.

  • Notice how the charts are somewhat interactive out of the box! i.e - you can zoom, pan etc on the line chart. However, the documentation show a different set of popups.
  • Notice how literal strings can be input directly as markdown. While not entirely convenient to interject that in code for a long document (It’s not easy to shift everything under a single heading for example) - it is however extremely helpful for quick notes and headings in a demonstration.
  • [X] About 10 rows of a table seem to be shown by default and the remaining have to accessed with scroll bars.
    • The height and width can be set as shown below using the st.dataframe(<data>, width = , height = =). The values need to be in pixels. This applies to charts as well.
    • in the case of width in data frames - higher widths do not seem to work once the data has been ‘fit’, i.e a table of 2 columns cannot
import streamlit as st
import numpy as np
import pandas as pd

st.title("App no. 3")
"## Playing with charts"

chart_data = pd.DataFrame(
    np.random.randn(20,3),
    columns = ['col1', 'col2', 'col3']
)

"### Line chart "
"Printing the chart table"
st.dataframe(chart_data, width = 500)

st.line_chart(chart_data, height = 0)
st.area_chart(chart_data)
st.bar_chart(chart_data)

# Using st.map()
map_data = pd.DataFrame(
    np.random.randn(1000, 2) / [50, 50] + [27.76, -122.4],
    columns  = ['lat', 'lon']
)

" ### Checking out a map"
"Using the st.dataframe function. Changing the default height."
st.dataframe(map_data, height = 1000)
st.map(map_data)

App 04

The tutorial script has been copied verbatim in this app. This will explored and expanded later.

import streamlit as st
import pandas as pd
import numpy as np

st.title('Uber pickups in NYC')

DATE_COLUMN = 'date/time'
DATA_URL = ('https://s3-us-west-2.amazonaws.com/'
            'streamlit-demo-data/uber-raw-data-sep14.csv.gz')

@st.cache
def load_data(nrows):
    data = pd.read_csv(DATA_URL, nrows=nrows)
    lowercase = lambda x: str(x).lower()
    data.rename(lowercase, axis='columns', inplace=True)
    data[DATE_COLUMN] = pd.to_datetime(data[DATE_COLUMN])
    return data

data_load_state = st.text('Loading data...')
data = load_data(10000)
data_load_state.text('Loading data... done!')

if st.checkbox('Show raw data'):
    st.subheader('Raw data')
    st.write(data)

st.subheader('Number of pickups by hour')
hist_values = np.histogram(data[DATE_COLUMN].dt.hour, bins=24, range=(0,24))[0]
st.bar_chart(hist_values)

# Some number in the range 0-23
hour_to_filter = st.slider('hour', 0, 23, 15)
filtered_data = data[data[DATE_COLUMN].dt.hour == hour_to_filter]

st.subheader('Map of all pickups at %s:00' % hour_to_filter)
st.map(filtered_data)

App 05

Overview: This app will implement some simple numpy graphs and calculations and explore user interaction.

import streamlit as st
import pandas as pd
import numpy.random  as npr

st.title('Implementing simple User interactions and numpy calcs')

About

This repo will contain my exploration of using streamlit.io


Languages

Language:Python 100.0%