Portfolio Optimization Using Python [Part 1/2]

Aman Gupta
13 min read · Feb 4, 2021


I recently came across applications of analytics in the investment domain and thought about collating the information and code for these techniques into a simple explanation that may be helpful to others. This part deals with Modern Portfolio Theory [MPT] and its implementation in Python.

What is Modern Portfolio Theory?

Modern Portfolio Theory is Markowitz’s theory about maximizing the return investors can expect from an investment portfolio for the amount of risk they are willing to take. MPT asks the investor to consider how much the risk of one investment can impact the entire portfolio.

Harry Max Markowitz first proposed the theory in a 1952 paper published in The Journal of Finance, as a means to construct a portfolio of assets that maximizes returns within a given level of risk, or that achieves a desired expected level of return with the least amount of risk.

Markowitz theorized that investors could design a portfolio to maximize returns by accepting a quantifiable amount of risk. In other words, investors could reduce risk by diversifying their assets and their asset allocation using a quantitative method. With a well-balanced and calculated portfolio, if some assets fall due to market conditions, others should rise to compensate, according to MPT. Markowitz demonstrated that a portfolio, taken as a whole, is less volatile than the sum of its parts.

The three main ideas that Modern Portfolio Theory talks about are:

1. A stock with better returns is a more convincing investment than a stock with inferior returns. [Mean expectation of returns]

2. A stock with wild swings is a less sure investment than a stock with more stable earnings growth. [Variance is bad]

3. Two stocks whose businesses are basically the same offer less diversification than two stocks with vastly different revenue models. [Correlation is bad]

4. A hidden fourth idea is that it is always better to have more people working for you: diversifying across stocks means many companies are working to bring you returns. The formulas below make the first three ideas precise.
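In symbols, for portfolio weights w_i on assets with expected returns E(R_i), volatilities σ_i and pairwise correlations ρ_ij, the textbook MPT quantities are:

Expected portfolio return: E(R_p) = Σ_i w_i · E(R_i)
Portfolio variance: σ_p² = Σ_i Σ_j w_i · w_j · σ_i · σ_j · ρ_ij = wᵀ Σ w

The double sum is where correlation enters: the lower the pairwise correlations, the lower the portfolio variance for the same individual volatilities, which is the mathematical content of ideas 3 and 4.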

Practical Implementation using Python

Here we will use this theory to find the optimum portfolio under four distinct cases:

Given the list of securities or assets to be evaluated -

1. An Investor wants the portfolio with the lowest level of risk.

2. An Investor wants the optimum portfolio giving the optimum combination of risk and returns.

3. An Investor wants a portfolio giving a return not less than the minimum acceptable return fixed by the investor.

4. A portfolio that maximizes the Investor’s utility.

Our goal is to determine the portfolio weights of the securities, or the proportion in which the budget has to be distributed among the securities. We will also compute the expected future return and volatility for our portfolio. We have selected the below-mentioned stocks for our analysis:

· HSBC Holdings plc (HSBC)

· JPMorgan Chase & Co. (JPM)

· Tesla Inc. (TSLA)

· Walmart Inc. (WMT)

· Amazon.com Inc. (AMZN)

· Costco Wholesale Corporation (COST)

Let’s first import the necessary libraries.

import numpy as np
import pandas as pd
from pandas_datareader import data as wb
import matplotlib.pyplot as plt
%matplotlib inline
import scipy.optimize as sco

We will need the Adjusted Daily Closing prices of the stocks. We will pull this information from Yahoo Finance. The data will be taken from 01–01–2008 to date.

# Creating a list of Stock Tickers
stocks = ['HSBC','JPM','TSLA','WMT','AMZN','COST']
pf_data = pd.DataFrame()
# Pulling closing price
for stock in stocks:
pf_data[stock] = wb.DataReader(stock, data_source = 'yahoo', start = '2008-1-1')['Adj Close']
num_stocks = len(stocks)

Let’s see what we got here.

# Viewing data
pf_data

Looks Great! So, we have 3297 rows of daily closing prices for the six stocks. Let us now plot these on a line chart and see how the stocks have performed in the past. To compare them, we need to normalize them.

# Plotting Normalized returns
(pf_data / pf_data.iloc[0] * 100).plot(figsize=(10, 5))
plt.show()

We see that Amazon has clearly performed better than the other stocks, and the growth of its stock price has accelerated markedly since 2015. The stock with the lowest growth in stock price is HSBC.

Before we start with the calculations, let us go through the process we need to follow to get the portfolio weights and measures (the key quantities are written out right after this list):

  1. Assume random weights for the stocks, ensuring that the weights sum to 1.
  2. Generate a number of portfolios with random weights as per the condition above.
  3. Calculate the covariance matrix of the daily returns.
  4. Calculate the annualised average/mean return and volatility of each portfolio, assuming 252 trading days in a year.
  5. Calculate the Sharpe ratio: [Return − Risk-free Return] / Volatility.
  6. Iterate over the randomly created portfolios and their weights to get to the optimum portfolio.
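In symbols, with daily mean returns r̄, daily return covariance matrix Σ, weights w, risk-free rate R_f and 252 trading days:

Annualised return: R_p = (wᵀ r̄) × 252
Annualised volatility: σ_p = √(wᵀ Σ w) × √252
Sharpe ratio: S = (R_p − R_f) / σ_p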

Let us consider each case and define functions for our analysis.

def portfolio_annualised_performance(weights, mean_returns, cov_matrix):
    # Annualised portfolio return and volatility, assuming 252 trading days
    returns = np.sum(mean_returns * weights) * 252
    std = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights))) * np.sqrt(252)
    return std, returns

def random_portfolios(num_portfolios, mean_returns, cov_matrix, risk_free_rate):
    # Simulate portfolios with random weights; record volatility, return and Sharpe ratio
    results = np.zeros((3, num_portfolios))
    weights_record = []
    for i in range(num_portfolios):
        weights = np.random.random(len(stocks))
        weights /= np.sum(weights)
        weights_record.append(weights)
        portfolio_std_dev, portfolio_return = portfolio_annualised_performance(weights, mean_returns, cov_matrix)
        results[0, i] = portfolio_std_dev
        results[1, i] = portfolio_return
        results[2, i] = (portfolio_return - risk_free_rate) / portfolio_std_dev
    return results, weights_record

def neg_sharpe_ratio(weights, mean_returns, cov_matrix, risk_free_rate):
    p_var, p_ret = portfolio_annualised_performance(weights, mean_returns, cov_matrix)
    return -(p_ret - risk_free_rate) / p_var

def max_sharpe_ratio(mean_returns, cov_matrix, risk_free_rate):
    # Maximise the Sharpe ratio by minimising its negative with SLSQP
    num_assets = len(mean_returns)
    args = (mean_returns, cov_matrix, risk_free_rate)
    constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bound = (0.0, 1.0)
    bounds = tuple(bound for asset in range(num_assets))
    result = sco.minimize(neg_sharpe_ratio, num_assets * [1. / num_assets, ], args=args,
                          method='SLSQP', bounds=bounds, constraints=constraints)
    return result

def portfolio_volatility(weights, mean_returns, cov_matrix):
    return portfolio_annualised_performance(weights, mean_returns, cov_matrix)[0]

def min_variance(mean_returns, cov_matrix):
    # Minimise portfolio volatility subject to the weights summing to 1
    num_assets = len(mean_returns)
    args = (mean_returns, cov_matrix)
    constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bound = (0.0, 1.0)
    bounds = tuple(bound for asset in range(num_assets))
    result = sco.minimize(portfolio_volatility, num_assets * [1. / num_assets, ], args=args,
                          method='SLSQP', bounds=bounds, constraints=constraints)
    return result

def efficient_return(mean_returns, cov_matrix, target):
    # Minimise volatility for a given target return
    num_assets = len(mean_returns)
    args = (mean_returns, cov_matrix)

    def portfolio_return(weights):
        return portfolio_annualised_performance(weights, mean_returns, cov_matrix)[1]

    constraints = ({'type': 'eq', 'fun': lambda x: portfolio_return(x) - target},
                   {'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bounds = tuple((0, 1) for asset in range(num_assets))
    result = sco.minimize(portfolio_volatility, num_assets * [1. / num_assets, ], args=args,
                          method='SLSQP', bounds=bounds, constraints=constraints)
    return result

def efficient_frontier(mean_returns, cov_matrix, returns_range):
    # Trace the frontier by solving efficient_return for a range of target returns
    efficients = []
    for ret in returns_range:
        efficients.append(efficient_return(mean_returns, cov_matrix, ret))
    return efficients
def display_simulated_ef_with_random(mean_returns, cov_matrix, num_portfolios, risk_free_rate):
    results, weights = random_portfolios(num_portfolios, mean_returns, cov_matrix, risk_free_rate)

    # Portfolio with the maximum Sharpe ratio among the simulated portfolios
    max_sharpe_idx = np.argmax(results[2])
    sdp, rp = results[0, max_sharpe_idx], results[1, max_sharpe_idx]
    max_sharpe_allocation = pd.DataFrame(weights[max_sharpe_idx], index=pf_data.columns, columns=['allocation'])
    max_sharpe_allocation.allocation = [round(i * 100, 2) for i in max_sharpe_allocation.allocation]
    max_sharpe_allocation = max_sharpe_allocation.T

    # Portfolio with the minimum volatility among the simulated portfolios
    min_vol_idx = np.argmin(results[0])
    sdp_min, rp_min = results[0, min_vol_idx], results[1, min_vol_idx]
    min_vol_allocation = pd.DataFrame(weights[min_vol_idx], index=pf_data.columns, columns=['allocation'])
    min_vol_allocation.allocation = [round(i * 100, 2) for i in min_vol_allocation.allocation]
    min_vol_allocation = min_vol_allocation.T

    print("-" * 80)
    print("Maximum Sharpe Ratio Portfolio Allocation\n")
    print("Annualised Return:", round(rp, 2))
    print("Annualised Volatility:", round(sdp, 2))
    print("\n")
    print(max_sharpe_allocation)
    print("-" * 80)
    print("Minimum Volatility Portfolio Allocation\n")
    print("Annualised Return:", round(rp_min, 2))
    print("Annualised Volatility:", round(sdp_min, 2))
    print("\n")
    print(min_vol_allocation)

    plt.figure(figsize=(10, 7))
    plt.scatter(results[0, :], results[1, :], c=results[2, :], cmap='YlGnBu', marker='o', s=10, alpha=0.3)
    plt.colorbar()
    plt.scatter(sdp, rp, marker='*', color='r', s=500, label='Maximum Sharpe ratio')
    plt.scatter(sdp_min, rp_min, marker='*', color='g', s=500, label='Minimum volatility')
    plt.title('Simulated Portfolio Optimization based on Efficient Frontier')
    plt.xlabel('Annualised Volatility')
    plt.ylabel('Annualised Returns')
    plt.legend(labelspacing=0.8)

def display_calculated_ef_with_random(mean_returns, cov_matrix, num_portfolios, risk_free_rate):
    results, _ = random_portfolios(num_portfolios, mean_returns, cov_matrix, risk_free_rate)

    # Optimised maximum Sharpe ratio portfolio
    max_sharpe = max_sharpe_ratio(mean_returns, cov_matrix, risk_free_rate)
    sdp, rp = portfolio_annualised_performance(max_sharpe['x'], mean_returns, cov_matrix)
    max_sharpe_allocation = pd.DataFrame(max_sharpe.x, index=pf_data.columns, columns=['allocation'])
    max_sharpe_allocation.allocation = [round(i * 100, 2) for i in max_sharpe_allocation.allocation]
    max_sharpe_allocation = max_sharpe_allocation.T

    # Optimised minimum volatility portfolio
    min_vol = min_variance(mean_returns, cov_matrix)
    sdp_min, rp_min = portfolio_annualised_performance(min_vol['x'], mean_returns, cov_matrix)
    min_vol_allocation = pd.DataFrame(min_vol.x, index=pf_data.columns, columns=['allocation'])
    min_vol_allocation.allocation = [round(i * 100, 2) for i in min_vol_allocation.allocation]
    min_vol_allocation = min_vol_allocation.T

    print("-" * 80)
    print("Maximum Sharpe Ratio Portfolio Allocation\n")
    print("Annualised Return:", round(rp, 2))
    print("Annualised Volatility:", round(sdp, 2))
    print("\n")
    print(max_sharpe_allocation)
    print("-" * 80)
    print("Minimum Volatility Portfolio Allocation\n")
    print("Annualised Return:", round(rp_min, 2))
    print("Annualised Volatility:", round(sdp_min, 2))
    print("\n")
    print(min_vol_allocation)

    plt.figure(figsize=(10, 7))
    plt.scatter(results[0, :], results[1, :], c=results[2, :], cmap='YlGnBu', marker='o', s=10, alpha=0.3)
    plt.colorbar()
    plt.scatter(sdp, rp, marker='*', color='r', s=500, label='Maximum Sharpe ratio')
    plt.scatter(sdp_min, rp_min, marker='*', color='g', s=500, label='Minimum volatility')
    target = np.linspace(rp_min, 0.40, 50)
    efficient_portfolios = efficient_frontier(mean_returns, cov_matrix, target)
    plt.plot([p['fun'] for p in efficient_portfolios], target, linestyle='-.', color='black', label='efficient frontier')
    plt.title('Calculated Portfolio Optimization based on Efficient Frontier')
    plt.xlabel('Annualised Volatility')
    plt.ylabel('Annualised Returns')
    plt.legend(labelspacing=0.8)

Case 1: An Investor wants the portfolio with the lowest level of risk.
&
Case 2: An Investor wants the optimum portfolio giving the optimum combination of risk and returns.

We will first simulate the portfolio measures using a loop and then see if optimization improves our results. For the optimization we will use Sequential Least Squares Programming (SLSQP), which minimizes a function of several variables subject to any combination of bounds, equality and inequality constraints.
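To see the SLSQP call pattern in isolation, here is a minimal, self-contained sketch (a toy objective, not part of the portfolio code) using the same "weights sum to 1" equality constraint and [0, 1] bounds that the functions above use:

import numpy as np
import scipy.optimize as sco

# Toy objective for illustration only: a simple convex function of the weights
def toy_objective(x):
    return np.sum((x - 0.1) ** 2)

n = 4
constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})   # weights must sum to 1
bounds = tuple((0.0, 1.0) for _ in range(n))                     # no shorting, no leverage
result = sco.minimize(toy_objective, n * [1. / n], method='SLSQP',
                      bounds=bounds, constraints=constraints)
print(result.x)    # the optimal weights
print(result.fun)  # the objective value at the optimum

Our portfolio functions do exactly this, with portfolio volatility or the negative Sharpe ratio in place of the toy objective.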

Let us now define the variables needed for the functions.

returns = pf_data.pct_change()
mean_returns = returns.mean()
cov_matrix = returns.cov()
num_portfolios = 100000
risk_free_rate = 0.01136

The risk-free rate of return is taken as the 10-year US Treasury bond yield as of 04–02–2021 at 03:12 hours EST, which was 1.136%. This is the return that an investor can earn at almost no risk. We are considering 100,000 (0.1 million) random portfolio weight combinations to arrive at our desired portfolio.

Simulating Portfolios based on Efficient Frontier

# Simulating Portfolios based on Efficient Frontier
display_simulated_ef_with_random(mean_returns, cov_matrix, num_portfolios, risk_free_rate)

We get the following result:

The green star represents the least volatile portfolio, i.e. the portfolio of Case 1. The red star is the optimum portfolio where the Sharpe ratio is maximised (Case 2). From the plot of the randomly simulated portfolios, we can see that they form an arch-shaped line on top of the cluster of blue dots. This line is called the efficient frontier.

The efficient frontier is the set of optimal portfolios that offer the highest expected return for a defined level of risk or the lowest risk for a given level of expected return. Portfolios that lie below the efficient frontier are sub-optimal because they do not provide enough return for the level of risk. Portfolios that cluster to the right of the efficient frontier are sub-optimal because they have a higher level of risk for the defined rate of return. The efficient frontier is curved because there is a diminishing marginal return to risk. Each unit of risk added to a portfolio results in a relatively smaller amount of additional returns.
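In optimization terms, each point on the frontier is the solution of a constrained problem. With weights w, annualised expected returns μ, annualised covariance matrix Σ (as computed in portfolio_annualised_performance) and risk-free rate R_f, the three problems solved by the functions above are:

Case 1 (min_variance): minimize wᵀ Σ w subject to Σ_i w_i = 1 and 0 ≤ w_i ≤ 1
Case 2 (max_sharpe_ratio): maximize (wᵀ μ − R_f) / √(wᵀ Σ w) subject to the same constraints
Frontier line (efficient_return): minimize wᵀ Σ w subject to wᵀ μ = target return, Σ_i w_i = 1 and 0 ≤ w_i ≤ 1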

Optimizing Portfolios based on Efficient Frontier

# Calculating Portfolios based on Efficient Frontier
display_calculated_ef_with_random(mean_returns, cov_matrix, num_portfolios, risk_free_rate)

We see that running the optimizer improves on the randomly simulated results for both the Case 1 (minimum volatility) and Case 2 (maximum Sharpe ratio) portfolios.

Case 3: An Investor wants a portfolio giving a return not less than the minimum acceptable return fixed by the investor.

Here we will use Roy’s Safety-First (SF) criterion to determine the optimal portfolio weights.

Roy’s safety-first criterion is a risk management technique used by investors to compare and choose a portfolio based on the criterion that the probability of a portfolio’s return dropping below a threshold level return is reduced.

In Roy’s safety-first criterion, the optimal portfolio is one that minimizes the probability of the portfolio’s return falling below a threshold return level. So, the portfolio with the highest SF ratio is the optimum portfolio.
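Formally, if returns are assumed to be approximately normal, minimizing this shortfall probability amounts to maximizing the SF ratio:

SF ratio = [E(R_p) − R_threshold] / σ_p

where R_threshold is the minimum acceptable return fixed by the investor.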

Let’s define the functions for Roy’s Safety-First (SF) criterion. We will use modified versions of the ‘display_simulated_ef_with_random’ and ‘display_calculated_ef_with_random’ functions, renamed with an ‘_sfr’ suffix. Note that here the ‘risk-free return’ is actually the threshold return, so the ‘Sharpe ratio’ becomes the SF ratio.

def display_simulated_ef_with_random_sfr(mean_returns, cov_matrix, num_portfolios, risk_free_rate):
    # Same as display_simulated_ef_with_random, but risk_free_rate carries the threshold return,
    # so results[2] holds the SF ratio instead of the Sharpe ratio
    results, weights = random_portfolios(num_portfolios, mean_returns, cov_matrix, risk_free_rate)

    max_sf_idx = np.argmax(results[2])
    sdp, rp = results[0, max_sf_idx], results[1, max_sf_idx]
    max_sf_allocation = pd.DataFrame(weights[max_sf_idx], index=pf_data.columns, columns=['allocation'])
    max_sf_allocation.allocation = [round(i * 100, 2) for i in max_sf_allocation.allocation]
    max_sf_allocation = max_sf_allocation.T

    print("-" * 80)
    print("Maximum SF Ratio Portfolio Allocation\n")
    print("Annualised Return:", round(rp, 2))
    print("Annualised Volatility:", round(sdp, 2))
    print("\n")
    print(max_sf_allocation)
    print("-" * 80)

    plt.figure(figsize=(10, 7))
    plt.scatter(results[0, :], results[1, :], c=results[2, :], cmap='YlGnBu', marker='o', s=10, alpha=0.3)
    plt.colorbar()
    plt.scatter(sdp, rp, marker='*', color='r', s=500, label='Maximum SF ratio')
    plt.title('Simulated Portfolio Optimization based on Efficient Frontier')
    plt.xlabel('Annualised Volatility')
    plt.ylabel('Annualised Returns')
    plt.legend(labelspacing=0.8)

def display_calculated_ef_with_random_sfr(mean_returns, cov_matrix, num_portfolios, risk_free_rate):
    # Same as display_calculated_ef_with_random, but the optimiser maximises the SF ratio
    # because risk_free_rate is passed the threshold return
    results, _ = random_portfolios(num_portfolios, mean_returns, cov_matrix, risk_free_rate)

    max_sf = max_sharpe_ratio(mean_returns, cov_matrix, risk_free_rate)
    sdp, rp = portfolio_annualised_performance(max_sf['x'], mean_returns, cov_matrix)
    max_sf_allocation = pd.DataFrame(max_sf.x, index=pf_data.columns, columns=['allocation'])
    max_sf_allocation.allocation = [round(i * 100, 2) for i in max_sf_allocation.allocation]
    max_sf_allocation = max_sf_allocation.T

    # The minimum volatility portfolio is still computed: its return anchors the frontier's target range
    min_vol = min_variance(mean_returns, cov_matrix)
    sdp_min, rp_min = portfolio_annualised_performance(min_vol['x'], mean_returns, cov_matrix)

    print("-" * 80)
    print("Maximum SF Ratio Portfolio Allocation\n")
    print("Annualised Return:", round(rp, 2))
    print("Annualised Volatility:", round(sdp, 2))
    print("\n")
    print(max_sf_allocation)

    plt.figure(figsize=(10, 7))
    plt.scatter(results[0, :], results[1, :], c=results[2, :], cmap='YlGnBu', marker='o', s=10, alpha=0.3)
    plt.colorbar()
    plt.scatter(sdp, rp, marker='*', color='r', s=500, label='Maximum SF ratio')
    target = np.linspace(rp_min, 0.60, 50)
    efficient_portfolios = efficient_frontier(mean_returns, cov_matrix, target)
    plt.plot([p['fun'] for p in efficient_portfolios], target, linestyle='-.', color='black', label='efficient frontier')
    plt.title('Calculated Portfolio Optimization based on Efficient Frontier')
    plt.xlabel('Annualised Volatility')
    plt.ylabel('Annualised Returns')
    plt.legend(labelspacing=0.8)

Assuming a minimum acceptable return requirement of 20%, we will find the optimum portfolio using the SF ratio.

returns = pf_data.pct_change()
mean_returns = returns.mean()
cov_matrix = returns.cov()
num_portfolios = 100000
risk_free_rate = 0.20  # here this argument carries the minimum acceptable (threshold) return of 20%

Simulating Portfolios based on Efficient Frontier

# Simulating Portfolios based on Efficient Frontier
display_simulated_ef_with_random_sfr(mean_returns, cov_matrix, num_portfolios, risk_free_rate)

Optimizing Portfolios based on Efficient Frontier

# Calculating Portfolios based on Efficient Frontier
display_calculated_ef_with_random_sfr(mean_returns, cov_matrix, num_portfolios, risk_free_rate)

Case 4: A portfolio that maximizes the Investor’s utility.

One of the approaches to finding the optimal point on the efficient frontier for a given investor is to maximize the investor's utility. In the field of economics, utility (u) is a measure of how much benefit consumers derive from certain goods or services. From a finance standpoint, it refers to how much benefit investors obtain from portfolio performance. Utility is a measure of relative satisfaction that an investor derives from different portfolios. We can generate a mathematical function to represent this utility that is a function of the portfolio expected return, the portfolio variance and a measure of risk aversion.

U = E(r) − ½ A σ²

where U = utility, E(r) = portfolio expected return, A = risk aversion coefficient and σ² = portfolio variance.

In determining the risk aversion coefficient (A), we measure the marginal reward an investor needs in order to take on more risk. A risk-averse investor will need a high marginal reward for taking on more risk. The utility equation shows the following:

  • Utility can be positive or negative — it is unbounded.
  • High returns add to utility.
  • High variance reduces utility.
  • Utility does not measure satisfaction but can be used to rank portfolios.

The risk aversion coefficient, A, practically ranges between 1 and 10. It is around 1 for aggressive investors, 4 for moderate investors, and 10 for risk-averse investors.
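As a quick illustration of how A changes the ranking, here is a toy sketch with made-up return and volatility numbers (not output from our portfolios) that scores two hypothetical portfolios by the utility formula above:

# Toy example: rank two hypothetical portfolios by quadratic utility U = E(r) - 0.5 * A * sigma^2
# The (expected return, volatility) pairs below are made up for illustration
portfolios = {'low_risk': (0.10, 0.12), 'high_risk': (0.20, 0.35)}

for A in [1, 4, 10]:
    utilities = {name: r - 0.5 * A * vol ** 2 for name, (r, vol) in portfolios.items()}
    best = max(utilities, key=utilities.get)
    print("A =", A, "| utilities:", {k: round(v, 3) for k, v in utilities.items()}, "| preferred:", best)

With A = 1 the riskier portfolio wins; with A = 4 or 10 the low-risk portfolio is preferred, which is exactly the behaviour we expect from the three investor types.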

We will define a function to calculate the optimal portfolio based on Utility to Investor using the PyPortfolioOpt library.

def utility_optimal_portfolio(data, risk_aversion_coeff):
    # Importing the PyPortfolioOpt modules
    from pypfopt import EfficientFrontier
    from pypfopt import risk_models
    from pypfopt import expected_returns
    from pypfopt import objective_functions

    # Expected returns
    mu = expected_returns.mean_historical_return(data)
    # Sample covariance of returns
    Sigma = risk_models.sample_cov(data)
    ef = EfficientFrontier(mu, Sigma)  # set up the optimiser
    ef.add_objective(objective_functions.L2_reg)  # add a secondary objective to spread the weights
    # Find the portfolio that maximises quadratic utility for the given risk aversion
    weights = ef.max_quadratic_utility(risk_aversion=risk_aversion_coeff, market_neutral=False)
    ret, vol, sharpe_r = ef.portfolio_performance(risk_free_rate=0.01125)

    # Round the weights for readability
    res = dict()
    for key in weights:
        res[key] = round(weights[key], 2)

    return ("Allocation " + str(res),
            'Annualised Return ' + str(round(ret, 2)),
            'Annualised Volatility ' + str(round(vol, 2)),
            'Sharpe Ratio ' + str(round(sharpe_r, 2)))

We will find the optimal portfolios for the three types of investors mentioned above.

# Aggressive Investor
utility_optimal_portfolio(pf_data,1)
# Moderate Investor
utility_optimal_portfolio(pf_data,4)
# Risk-Averse Investor
utility_optimal_portfolio(pf_data,10)

Looks good!
Now that we have seen the practical demonstration of the MPT, let us now see the issues with this theory.

Issues With Modern Portfolio Theory

Critics contend that MPT doesn’t deal with the real world, because all the measures used by MPT are based on projected values, i.e. mathematical statements about what is expected rather than what is real or existing. Investors have to plug predictions based on historical measurements of asset returns and volatility into the equations, which means the results are subject to variables that are not known or considered at the time of estimation.

Investors have to estimate from past market data because MPT tries to model risk in terms of the likelihood of losses, without a rationale for why those losses could occur. That makes the risk assessment probabilistic, but not structural.

In other words, the mathematical model of MPT makes investing appear orderly when its reality is far less so. As an example, contrary to what the theory predicts, a researcher in the late 1970s, Sanjoy Basu, demonstrated that low price-to-earnings ratio (P/E) stocks outperformed high P/E stocks. And in the early 1980s, another researcher, Rolf Banz, demonstrated that small-capitalization stocks outperformed large-cap stocks.

The Bottom Line

The gist of MPT is that the market is hard to beat and that the people who beat the market are those who take on above-average risk. It is also implied that these risk-takers will get their comeuppance when markets turn down.

Then again, investors such as Warren Buffett remind us that portfolio theory is just that — theory. At the end of the day, a portfolio’s success rests on the investor’s skills and the time he or she devotes to it. Sometimes it is better to pick a small number of out-of-favour investments and wait for the market to turn in your favour than to rely on market averages alone.

In this article, I tried to collate the theory and practical implementation of the Modern Portfolio Theory. In the next part, we will go through the other portfolio optimization techniques.

References:
https://www.investopedia.com
https://www.analystprep.com
https://www.corporatefinanceinstitute.com
https://www.thestreet.com
https://www.unsplash.com

A special mention for Ricky Kim, a data lover, whose work motivated me to explore this topic.
