Python Streamlit - I Made This Stock Price Prediction App

No doubt, we all want to know the future. For a financial asset such as a company stock traded on an exchange, that means knowing its future value. This is what we will attempt in this tutorial.

Predicting the price of a company's stock is very difficult. Many factors influence the price, so past performance alone does not tell the whole story. Also, some stocks are highly volatile.

Many traders ready to take risks have, with the help of trading indicators, capitalized on the fluctuations much to their gain.

**DALL-E**: An oil painting of a stock trader in the style of Monet

So, in this tutorial, I’m going to show you how I designed a stock price prediction app and hosted it on Streamlit Cloud. We will use Python and machine learning technologies that many trading firms use to analyze the stock market.

Getting started

I expect you to have background knowledge of Python and Streamlit because I won’t explain everything.

🛑 Disclaimer: This tutorial is purely educational and should not be taken as financial advice. Trading stocks is a risky venture that should be done with full knowledge of the financial market. Still, do not risk money that you cannot afford to lose.

Our stock price prediction app is going to do several things, including visualization and prediction. In the visualization part, we will show some technical indicators investors use to analyze the market. In the prediction part, we will try several machine learning algorithms to predict the price.

This tutorial will show you the capabilities of Streamlit and what you can build using it. Remember, no model is perfect. So, see this tutorial as nothing more than a way to improve your Python skills.

**DALL-E**: An oil painting of a stock chart hitting an asteroid in the style of Monet

I started designing this prediction app by creating a main() function that runs when we open the app.

import streamlit as st
st.title('Stock Price Predictions')
st.sidebar.info('Welcome to the Stock Price Prediction App. Choose your options below')

def main():
    option = st.sidebar.selectbox('Make a choice', ['Visualize','Recent Data', 'Predict'])
    if option == 'Visualize':
        tech_indicators()
    elif option == 'Recent Data':
        dataframe()
    else:
        predict()

if __name__ == '__main__':
    main()

This works because of the special variable __name__, which is checked at the very end of the Python script. Make sure this check stays at the bottom, after all the functions that main() calls have been defined.

if __name__ == '__main__':
    main()
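To see why the guard belongs at the bottom, here is a minimal sketch. The greet() helper is made up for illustration and is not part of the app. main() can refer to a function defined later in the file because the name is only looked up when main() actually runs.

```python
def main():
    # greet() is defined further down; that is fine because the name is
    # only resolved when main() is called, not when it is defined.
    return greet('Streamlit')


def greet(name):
    # Hypothetical helper used only for this illustration
    return f'Hello, {name}!'


if __name__ == '__main__':
    # By the time this line runs, both functions exist
    print(main())
```

Moving the guard above greet() would raise a NameError when the script is run, which is why it stays last.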

If you have been following my tutorials, everything above should be self-explanatory. Everything except the title is displayed in the sidebar.

The ‘Recent Data’ option shows a dataframe with the last few rows. Whichever option the user selects calls the matching function. Before we code those functions, let me show you one important part of the script.

Downloading the Stock Data

**DALL-E**: An oil painting of stream of zeros (0) and ones (1) downloaded from the cloud in the style of Monet

import yfinance as yf

@st.cache_data
def download_data(op, start_date, end_date):
    # Fetch price data for the symbol between the two dates
    df = yf.download(op, start=start_date, end=end_date, progress=False)
    return df

This is the function that will download the data from Yahoo! Finance. We add the @st.cache_data decorator, which Streamlit recommends for serializable return values such as dataframes, to avoid running this expensive function repeatedly.
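To get a feel for what caching buys us, here is a rough sketch that uses functools.lru_cache from the standard library as a stand-in for Streamlit's caching decorator. The two are not the same thing, and slow_download() is a hypothetical placeholder for the real download, but the idea is similar: the function body runs once per unique set of arguments.

```python
from functools import lru_cache

calls = []  # records every time the function body really runs


@lru_cache(maxsize=None)
def slow_download(symbol, start, end):
    """Hypothetical stand-in for an expensive download."""
    calls.append((symbol, start, end))
    return f'{symbol} data from {start} to {end}'


first = slow_download('SPY', '2020-01-01', '2021-01-01')
second = slow_download('SPY', '2020-01-01', '2021-01-01')  # served from the cache
third = slow_download('AAPL', '2020-01-01', '2021-01-01')  # new arguments, runs again
```

After the three calls, the body has run only twice: once for each distinct symbol.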

🧑‍💻 Recommended: Decorators in Python – A Simple Guide

As we don’t want to display the progress bar, we set progress=False. The next snippet gets inputs from the user.

import datetime
import pandas as pd


option = st.sidebar.text_input('Enter a Stock Symbol', value='SPY')
option = option.upper()
today = datetime.date.today()
duration = st.sidebar.number_input('Enter the duration', value=3000)
before = today - datetime.timedelta(days=duration)
start_date = st.sidebar.date_input('Enter Start Date', value=before)
end_date = st.sidebar.date_input('End date', today)


if st.sidebar.button('Send'):
    if start_date < end_date:
        st.sidebar.success('Start date: `%s`\n\nEnd date: `%s`' %(start_date, end_date))
        download_data(option, start_date, end_date)
    else:
        st.sidebar.error('Error: End date must fall after start date')

The yfinance module we imported earlier fetches historical stock data starting from a selected date.

In the option variable, we store the stock symbol, converted to uppercase. A valid stock symbol has to be used, otherwise the download will not work. We then use the timedelta type of the datetime module to set the default start date to the user's chosen number of days before today.

Let me explain this in more detail.

The st.sidebar.date_input widget displays a calendar. It is used to pick the start and end dates for the stock data (the end date defaults to the current date).

So if the date you want is not displayed in the calendar, it is because of the duration passed to timedelta(). Perhaps you want stock data for the year 2000 and it’s not shown in the calendar. All you have to do is increase the duration.

Note that you can also increase the duration and still pick a start date later than the default, or set the dates manually without touching the duration. Once the button is clicked, the inner if statement checks that the start date falls before the end date.
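The date logic can be tried on its own, without Streamlit. This is a minimal sketch using only the datetime module; the helper names default_window() and is_valid_range() are made up for illustration and do not appear in the app.

```python
import datetime


def default_window(today, duration):
    """Return the default (start, end) dates for a look-back of `duration` days."""
    start = today - datetime.timedelta(days=duration)
    return start, today


def is_valid_range(start_date, end_date):
    """Mirror the app's check: the start date must fall before the end date."""
    return start_date < end_date


# A fixed date is used instead of date.today() so the result is repeatable
start, end = default_window(datetime.date(2024, 1, 1), 365)
```

With a duration of 365 days, the default start date lands exactly one calendar year earlier because 2023 was not a leap year.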

Finally, we download the data from Yahoo! Finance by calling the download_data() function.

Data visualization

Next, we call the download_data() function. The function returns a dataframe. This makes it easy to perform whatever operation we want with the data.

from sklearn.preprocessing import StandardScaler

data = download_data(option, start_date, end_date)
scaler = StandardScaler()

We will also scale the data using the StandardScaler class from Scikit-learn.
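If you are wondering what standard scaling does to the numbers, here is a small pure-Python sketch of the idea: subtract the mean and divide by the population standard deviation. The standardize() function is a hypothetical helper for illustration, not Scikit-learn's implementation.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Scale values to zero mean and unit variance.

    Assumes the values are not all identical, otherwise the
    standard deviation is zero and the division fails.
    """
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation
    return [(v - mean) / std for v in values]


scaled = standardize([10.0, 20.0, 30.0])
```

The middle value sits exactly on the mean, so it becomes 0, and the other two end up the same distance on either side of it.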

Back to the main() function, if the user selects ‘Visualize’, the tech_indicators() function will execute.

from ta.volatility import BollingerBands
from ta.trend import MACD, EMAIndicator, SMAIndicator
from ta.momentum import RSIIndicator 

def tech_indicators():
    st.header('Technical Indicators')
    option = st.radio('Choose a Technical Indicator to Visualize', ['Close', 'BB', 'MACD', 'RSI', 'SMA', 'EMA'])

    # Bollinger bands
    bb_indicator = BollingerBands(data.Close)
    bb = data.copy()  # copy so the band columns are not added to data itself
    bb['bb_h'] = bb_indicator.bollinger_hband()
    bb['bb_l'] = bb_indicator.bollinger_lband()
    # Creating a new dataframe
    bb = bb[['Close', 'bb_h', 'bb_l']]
    # MACD
    macd = MACD(data.Close).macd()
    # RSI
    rsi = RSIIndicator(data.Close).rsi()
    # SMA
    sma = SMAIndicator(data.Close, window=14).sma_indicator()
    # EMA
    ema = EMAIndicator(data.Close).ema_indicator()

    if option == 'Close':
        st.write('Close Price')
        st.line_chart(data.Close)
    elif option == 'BB':
        st.write('BollingerBands')
        st.line_chart(bb)
    elif option == 'MACD':
        st.write('Moving Average Convergence Divergence')
        st.line_chart(macd)
    elif option == 'RSI':
        st.write('Relative Strength Index')
        st.line_chart(rsi)
    elif option == 'SMA':
        st.write('Simple Moving Average')
        st.line_chart(sma)
    else:
        st.write('Exponential Moving Average')
        st.line_chart(ema)

A lot of things are going on here. Let’s see how far we can go explaining them. The radio buttons display the close price and a list of technical indicators.

💡 Recommended: I expect you to have basic knowledge of the technical indicators. If not, please check this tutorial.
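As a quick refresher, here is a rough pure-Python sketch of two of the indicators: the simple moving average and Bollinger Bands. It is only meant to show the arithmetic. The function names are made up, window=20 and num_std=2 are the conventional Bollinger settings, and the ta library's handling of the early rows may differ from this sketch.

```python
from statistics import fmean, pstdev


def sma(prices, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(fmean(prices[i + 1 - window:i + 1]))
    return out


def bollinger(prices, window=20, num_std=2):
    """Upper and lower bands: moving average plus/minus num_std deviations."""
    upper, lower = [], []
    for i in range(len(prices)):
        if i + 1 < window:
            upper.append(None)
            lower.append(None)
            continue
        chunk = prices[i + 1 - window:i + 1]
        mid, dev = fmean(chunk), pstdev(chunk)
        upper.append(mid + num_std * dev)
        lower.append(mid - num_std * dev)
    return upper, lower


closes = [1.0, 2.0, 3.0, 4.0, 5.0]
averages = sma(closes, window=3)
upper, lower = bollinger(closes, window=3)
```

The bands widen when prices inside the window spread out and narrow when they cluster, which is why traders read them as a volatility gauge.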

As always, each selected option runs its own branch and draws the matching chart.

Can you figure out what makes the data visualization as well as the overall code snippets work irrespective of the stock symbol used? Can you? That’s because all stock data pulled from Yahoo! Finance has the same column names.

So, by selecting the Close column, we are able to do everything we want, including data visualization and prediction.

This is something I love about the way I designed this app: it kills several birds with one stone. I could have created a separate app for a single company's stock, for example Tesla, and called it Tesla Stock Prediction App. Although that has some advantages, not everyone buys Tesla stock.

So, this app can accommodate any stock on Yahoo! Finance that follows the same pattern. Once you input the stock symbol, you get your results. I also did something similar in the forecasting app tutorial.

✅ Recommended: How I Created a Forecasting App Using Streamlit

Please do not kill beautiful birds for fun. I seriously detest it. I was only speaking figuratively. Alright, let’s continue.

The Stock Data

The next option in the main() function is ‘Recent Data.’ The callback function simply prints the last ten rows of the stock data.

def dataframe():
    st.header('Recent Data')
    st.dataframe(data.tail(10))

Notice how we use st.dataframe to display the data. Using only data.tail() in this case will not work.

See how Streamlit accommodated everything, including the ones in the sidebar, in just a few lines of code!

Making Predictions

The last option in the main() function is ‘Predict.’ Once it is selected, the predict() function is called.

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from xgboost import XGBRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import r2_score, mean_absolute_error

def predict():
    model = st.radio('Choose a model', ['LinearRegression', 'RandomForestRegressor', 'ExtraTreesRegressor', 'KNeighborsRegressor', 'XGBoostRegressor'])
    num = st.number_input('How many days forecast?', value=5)
    num = int(num)
    if st.button('Predict'):
        if model == 'LinearRegression':
            engine = LinearRegression()
            model_engine(engine, num)
        elif model == 'RandomForestRegressor':
            engine = RandomForestRegressor()
            model_engine(engine, num)
        elif model == 'ExtraTreesRegressor':
            engine = ExtraTreesRegressor()
            model_engine(engine, num)
        elif model == 'KNeighborsRegressor':
            engine = KNeighborsRegressor()
            model_engine(engine, num)
        else:
            engine = XGBRegressor()
            model_engine(engine, num)

This code snippet looks a little bit repetitive. If you have a better way to implement these features in a few lines of code, please go ahead. We added five machine learning algorithms for our users to compare and make a choice. In each option, an instance of the model is created and passed to another callback function.
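One possible way to trim the repetition is a dictionary that maps each label to its model class. The sketch below is only a suggestion: build_engine() is a hypothetical helper, and the two placeholder classes stand in for the real regressors so that the example runs on its own.

```python
def build_engine(name, registry):
    """Look up the chosen model name and return a fresh instance."""
    try:
        factory = registry[name]
    except KeyError:
        raise ValueError(f'Unknown model: {name}') from None
    return factory()


# Placeholder classes so the sketch is self-contained; in the app these
# would be the scikit-learn and XGBoost regressors imported above.
class FakeLinear:
    pass


class FakeForest:
    pass


registry = {
    'LinearRegression': FakeLinear,
    'RandomForestRegressor': FakeForest,
}
engine = build_engine('RandomForestRegressor', registry)
```

In predict(), the registry would hold the five radio labels, and the if/elif chain would shrink to engine = build_engine(model, registry) followed by model_engine(engine, num).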

Notice that the model_engine() function has another parameter, num, which the user provides. This tells the function how many days ahead to predict. You can provide as many as you want.

def model_engine(model, num):
    # getting only the closing price (a copy, so adding a column does not trigger a pandas warning)
    df = data[['Close']].copy()
    # shifting the closing price based on the number of days forecast
    df['preds'] = data.Close.shift(-num)
    # scaling the data
    x = df.drop(['preds'], axis=1).values
    x = scaler.fit_transform(x)
    # storing the last num_days data
    x_forecast = x[-num:]
    # selecting the required values for training
    x = x[:-num]
    # getting the preds column
    y = df.preds.values
    # selecting the required values for training
    y = y[:-num]

    # splitting the data
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=.2, random_state=7)
    # training the model
    model.fit(x_train, y_train)
    preds = model.predict(x_test)
    st.text(f'r2_score: {r2_score(y_test, preds)}\n'
            f'MAE: {mean_absolute_error(y_test, preds)}')
    # predicting stock price based on the number of days
    forecast_pred = model.predict(x_forecast)
    for day, value in enumerate(forecast_pred, start=1):
        st.text(f'Day {day}: {value}')

We select the closing price and shift it backward (indicated by the negative sign) by the number of periods passed by the user, so each row's preds value is the closing price num days later. The rest is self-explanatory, with comments to guide you.

After scaling the data, we store the last num rows in the x_forecast variable. We then select only the remaining rows to train the model. The training rows must exclude x_forecast because those rows have no known future price yet; they are the unseen data the model will predict.
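The shifting and slicing are easier to follow on a tiny list. The sketch below leaves out the scaling and uses a plain Python list instead of a dataframe; make_supervised() is a made-up name for illustration, and it assumes num is at least 1.

```python
def make_supervised(closes, num):
    """Pair each close with the close `num` rows later and hold out the tail."""
    x = closes[:-num]            # rows that have a known future price
    y = closes[num:]             # the price `num` rows after each x row
    x_forecast = closes[-num:]   # latest rows, whose future price is unknown
    return x, y, x_forecast


closes = [10, 11, 12, 13, 14, 15]
x, y, x_forecast = make_supervised(closes, num=2)
# x          -> [10, 11, 12, 13]
# y          -> [12, 13, 14, 15]
# x_forecast -> [14, 15]
```

Each x value is paired with the close two rows later, and the last two closes are kept aside as the inputs for the forecast.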

Two metrics are used to check the performance of the model on the held-out test data. But don’t get carried away by the results. The larger num is, the worse the metrics tend to get. This is one of the reasons we include other machine learning algorithms.
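For reference, here is a small pure-Python sketch of what the two metrics measure. The function names are made up for illustration; in the app, the numbers come from Scikit-learn's r2_score and mean_absolute_error.

```python
from statistics import fmean


def mean_abs_error(y_true, y_pred):
    """Average size of the prediction errors, in the same unit as the price."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Share of the variance in y_true that the predictions explain."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 8.0]
mae = mean_abs_error(y_true, y_pred)
score = r2(y_true, y_pred)
```

An r2 of 1 means a perfect fit and lower values are worse, while for MAE lower is better.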

By comparing the results of the models, our users will get an indication of whether the stock price is likely to increase or decrease.

Conclusion

We have successfully come to the end of the tutorial. The full code is on my GitHub page. You will notice we didn’t perform feature engineering on the data. We wanted to keep things simple. You will also observe that I didn’t wrap all the code in functions. Try to figure out the reason yourself. That’s part of the learning process.

Once you sign up on Streamlit Cloud, you can easily create an app using your GitHub repo. Mine is already running on the Cloud. Check my GitHub page to see other things I included that enabled the app to run on Streamlit Cloud.

There’s no doubting the fact that you have learned something from this tutorial. So, go ahead and use that knowledge to create awesome apps. You are welcome!