Fundamentals of finance with numerical examples in Python

Target audience

This notebook was prepared for (undergraduate) students who are learning finance and who are also interested in solving finance problem sets with Python. I hope these notes can be of use to others as well.


  1. Time value of money
  2. Introduction to bonds and the yield curve
  3. Introduction to stocks and portfolios

Notebook config

For expositional purposes, I kept the configuration of the packages separate and you can see the notebook config here. We are going to rely on a couple of common packages to exemplify the concepts. These are, together with the versions of them that I used:

In [1]:
%run notebook_config.ipynb
Pandas:     2.1.4
Matplotlib: 3.8.2
Cycler:     0.12.1
Numpy:      1.26.3
Scipy:      1.11.4

Notebook version: 2024-01-14 18:35:17.126617

1. Time value of money


One of the core concepts in finance is that of time value of money. I.e. money today has a different value to money in the future. In many cases, money today is worth more than the same amount would be in the future but it is not necessarily always true as we have witnessed in the recent past with negative interest rates.

The exemplify the time value of money, consider $\$100$ today. Assume that you can invest it risk-free at $10\%$ per year. This means that one year from now that $\$100$ is worth $\$110$. Now, assume that cash today is denoted $PV$ (i.e. the Present Value) and that cash in one year is denoted $FV$ (i.e. the Future Value) and that we invest for $n$ periods. The interest we earn per year is $r$, then we have that $FV_n = PV_0 \times (1+r)^n$ or equivalently the standard formula:

$$ PV_0 = \frac{FV_n}{(1+r)^n} $$

This is the key relationship that underpins most of finance and more generally, we use discount rates rather than interest rates, as not all cashflows are associated with a particular interest rate and even if they are, it is not necessarily the right discount rate due to various factors such as risk. From now on, $r$ will in most cases denote the discount rate.

We now move on to see what this actually means the value of money over time. We are going to compute the future equivalent of a single cashflow of $\$100$ today at different interest rates for different horizons. For example, 30 years from now, how much is the equivalent of $\$100$ today with a $7%$ discount rate? The answer, as is visible in the figure below exceeds $\$700$.

To do this, I setup a pandas dataframe with three columns: Horizon, discount rate and the future value. Then compute and plot the different values:

In [2]:
horizons = np.arange(0,31)
discountrates = np.arange(0,0.07,0.01)
present_value = 100

def future_value_single_cashflow(pv, n, r):
    return pv * (1 + r)**n

future_values = pd.DataFrame(columns=['horizon','discountrate', 'futurevalue'])

for rate in discountrates:
    for horizon in horizons:
        fv = future_value_single_cashflow(present_value, horizon, rate)
        future_values.loc[len(future_values)] = [horizon, rate, fv]

    plot_yaxis = future_values.loc[future_values['discountrate'] == rate, 'futurevalue']
    plot_xaxis = future_values.loc[future_values['discountrate'] == rate, 'horizon']
    plt.plot(plot_xaxis, plot_yaxis, label = str(round(rate*100))+'%')

plt.legend(title='Discount rate')
plt.title('Future equivalent of $100 today at different discount rates')
plt.xlabel('Years into the future')
No description has been provided for this image

The curvature that is clearly visible is getting more and more pronounced with higher discount rates and this is referred to as compounding - commonly referred to as earning interest on interest. This is a very powerful effect that significantly can affect values of cashflows over time. The higher the discount rate the larger the effect of compounding.


Often we are interested in streams of cashflows, not just single ones. When we receive multiple cashflows over time that are linked together we can often use shorthand formulas to compute their present values. One of these types of contracts are annuities: identical cashflows that repeat with a certain interval for a specific number of years. For example, large lottery winnings are often paid out like this.

The present value of an ordinary annuity (i.e. the first payment is made at the end of the current period, or t=1) is computed as follows at t=0 (today):

$$ PV_0 = PMT_1 \times \frac{1 - \frac{1}{(1+r)^n}}{r} $$

This formula assumes that the amount paid each period, $PMT$, is constant. It is fairly common also to have contracts where the amount paid grows each period with a specific factor. Such contracts are called ordinary growing annuities and the present value can be computed as:

$$ PV_0 = PMT_1 \times \frac{1 - \left(\frac{(1+g)}{(1+r)}\right)^n}{r-g} $$

There are also annuity dues these are annuities where the payment is made at the beginning of each period. Since these annuities have cashflows that 'arrive' one period earlier than an ordinary annuity, the present value of an annuity due is the present value of the ordinary annuity multiplied by $(1+r)$, i.e. it is precisely one period more valuable than an ordinary annuity.

Let us compare two almost identical ordinary annuities that pay $\$1000$ per year for 30 years. One of them grows at 0% and the other one has cashflows that grows at 2% per year. Assume that the appropriate discount rate is 4%. We can compute the present values as follows:

In [3]:
def ordinary_annuity(pmt, r, n, g = 0):
    # Default growth set to zero
    return pmt * ((1-((1+g)/(1+r))**n)/(r-g))
In [4]:
ordinary_annuity(pmt = 1000, r = 0.04, n = 30)
In [5]:
ordinary_annuity(pmt = 1000, r = 0.04, n = 30, g = 0.02)

As expected the growing annuity has a higher present value.


Let's move on to a real-life example: large lottery cash prizes, see for instance this example of Powerball).

In these types of lotteries winners are often presented with two options: Take a lower amount as a lump sum today or get your total prize paid in installments (i.e. as an annuity). In the example here, the jackpot is $\$1.2$ billion and the winner has to chose one of the following options:

  1. $\$551.7$ million lump sum payment today
  2. $\$1.2$ billion paid as one immediate payment today and 29 annual payments thereafter, each growing at $5\%$ annually. In other words: A growing annuity due.

Ignoring taxes and other individual preferences, these options are vastly different and require some analysis to compare. Using the insights from the discussion on time value of money and annuities allows us to compute the present day equivalent (under some fairly strict assumptions) of the second option. We can then use the outcome of that procedure to compare it to the immediate payment.

First we need to compute what the annual payment is that adds up to $\$1.2$ billion over the next 29 years with an initial payment today. To do so, we need to find the initial payment that adds up to a present value of the total prize at the pre-defined growth rate:

$$ 1.2B = \sum_{n=0}^{29}\left( PMT \times (1+g)^n \right) $$

We complete this procedure by iterating different initial payments using a function that returns the square-root of the squared difference between the sum of some guess and the total that we are looking for (i.e. the 1.2 billion). We start with a guess of 10 million:

In [6]:
lump_sum_payment = 551700000
total_prize = 1200000000
growth_rate = 0.05
time_periods = 30

def find_payment(pmt, total_cashflow, g, n, payment_stream = False):
    time_periods = range(0,n)

    if payment_stream:
        payments = []
        sum = 0
        for t in time_periods:
            sum += pmt*(1+g)**t
        return payments
        sum = 0
        for t in time_periods:
            sum += pmt*(1+g)**t    
        return np.sqrt((sum - total_cashflow)**2)

out = sp.optimize.minimize(find_payment, 10000000, args = (total_prize, growth_rate, time_periods), method = 'Nelder-Mead')
initial_payment = out.x[0]
print('Initial payment:', initial_payment)
Initial payment: 18061722.09631768

Our optimization procedure concludes that an initial cash payment of $18,061,722.09 satisfies our restriction. We can verify the outcome by setting the initial payment to this value to see how large of an error we get:

In [7]:
find_payment(initial_payment, total_prize, growth_rate, time_periods)

We can now compare the two scenarios graphically. In the left panel, we plot the annuity payments only and in the right panel we add the lump sum payment (using a log scale) since it is a magnitude larger than the annuity payments.

In [8]:
powerball_annuity_payments = find_payment(initial_payment, total_prize, growth_rate, time_periods, payment_stream = True)
fig, ax = plt.subplots(1,2, figsize=(14,4))
ax[0].bar(np.arange(0,time_periods),powerball_annuity_payments, color = blue, label = 'Annuity')
ax[0].set_xlabel('Years from today')
ax[0].set_ylabel('Payout ($)')

ax[1].bar(np.arange(0,time_periods),powerball_annuity_payments, color = blue, label = 'Annuity')
# Note, I moved the lumpsum bar slightly to the left to avoid 100% overlap.
ax[1].bar([-0.25],[lump_sum_payment], color = green,  label = 'Lump sum')
ax[1].set_xlabel('Years from today')
ax[1].set_ylabel('Payout ($, log scale)')
No description has been provided for this image

The next step to better understand the two options is to compute the discount rate at which the two options are equal. We can do this by writing a new function that we can use in an optimization procedure. The idea here is to discount all annuity payments with a rate that gives us a present value equal to the lump sum payment.

In [9]:
def find_discount_rate(r, payments, lump_sum):
    sum = 0
    for t, pmt in enumerate(payments):
        sum += pmt/(1+r)**t
    return np.sqrt((sum - lump_sum)**2)

out = sp.optimize.minimize(find_discount_rate, 0.03, args = (powerball_annuity_payments, lump_sum_payment), method = 'Nelder-Mead')
discount_rate = out.x[0]
print('Discount rate at which both options are equal:', discount_rate)
Discount rate at which both options are equal: 0.04870063615291294

The annuity and the lump sum payment appears to be equal at a discount rate of 4.87%. This means that if the 'true' discount rate is higher than that, then the lump sum is a better choice and vice versa. In other words, a rational investor in a tax free world may be better off with the lump sum if they can invest the proceeds at a rate above 4.87% at a risk-level comparable to that of the counter-party risk of the annuity issuer.

Objectively it may not be possible to determine which of the two options is economically better as there are many risks and considerable uncertainty about the future that we have abstracted away from but the break-even discount rate of 4.87% is nevertheless a good starting point for further analysis.


As an extension of annuities there are also perpetual cashflows, known as perpetuities. An example of a perpetual cashflows are consols. In the past, the UK government issued such perpetual bonds that paid a regular fixed coupon, for instance 3.5% but with no maturity date.

The present value of a perpetuity can be computed as follows: $$ PV_0 = \frac{PMT_1}{r} $$

Note here that just as with the ordinary annuity the first payment comes in one period from now (t=1) but the present value is computed for t=0.

There are also growing perpetuities, contracts whose payments grow by a specific growth rate over time. The present value of a growing perpetuity can be computed as: $$ PV_0 = \frac{PMT_1}{r-g} $$

We can easily compare a perpetuity to an annuity and see that as we increase the number of periods for the annuity payments, the two will converge in value:

In [10]:
pmt = 1000
r = 0.08
periods = range(1,300)
annuity_pv = []

for p in periods:
    annuity_pv.append(ordinary_annuity(pmt, r, p))

plt.plot(annuity_pv[:140], color = blue, label = 'Annuity PV')
plt.axhline(y = pmt/r, color = green, label = 'Perpetuity reference PV')
plt.xlabel('Number of cashflow periods')
No description has been provided for this image

It is clearly visible that the annuity and perpetuity converge with as few as about 100 cashflow periods. Let's take a closer look at the convergence:

In [11]:
difference_in_value = []
for pv in annuity_pv:
    difference_in_value.append(pmt/r - pv)

fig, ax = plt.subplots(1,2, figsize=(14,4))
ax[0].plot(range(100,140), difference_in_value[100:140])
ax[0].axhline(y=1, color = green)
ax[0].set_ylabel('Difference in present value')
ax[0].set_xlabel('Number of cashflow periods')
ax[0].set_title('Difference to 1 USD reference (green line)')

ax[1].plot(range(140,180), difference_in_value[140:180])
ax[1].axhline(y=0.05, color = green)
ax[1].set_ylabel('Difference in present value')
ax[1].set_xlabel('Number of cashflow periods')
ax[1].set_title('Difference to 5 cent reference (green line)')
No description has been provided for this image

As is clearly visible, the difference in value goes down with the number of paying periods of the annuity. Using our example, after a few more than 120 periods, the difference is less than a dollar. After about 160 periods, the difference is less than 5 cents. Thus, the vast majority of the value coming from a perpetuity (in today's dollars) are from near-term term cashflows. Cashflows very far into the future are almost not worth anything today.

We can also compare the convergence of annuities and perpetuities at different discount rates. We use the same annual payment as in the previous example and now vary the discount rate. At each rate level, we compute how many periods it takes for the difference between the contracts to be less than 1% of the perpetuity value.

In [12]:
step = 0.005
lower_bound = 0.005
upper_bound = 0.20 + step
tolerance = 0.01
max_iterations = 1000

discount_rate = np.arange(lower_bound, upper_bound, step)
periods_until_diff_less_1pc = []

for d in discount_rate:
    perpetuity_pv = pmt / d
    num_periods = 1
    while True:
        annuity_pv = ordinary_annuity(pmt, d, num_periods)
        if num_periods == max_iterations:
        if annuity_pv > (1 - tolerance) * perpetuity_pv:
            num_periods += 1

print(f'Number of periods required at a discount rate of {lower_bound * 100}%: {periods_until_diff_less_1pc[0]}')
print(f'Number of periods required at a discount rate of {upper_bound * 100}%: {periods_until_diff_less_1pc[-1]}')
Number of periods required at a discount rate of 0.5%: 924
Number of periods required at a discount rate of 20.5%: 26
In [13]:
plt.ylabel('# of periods until diff. <1% of the perp. PV')
plt.xlabel('Discount rate')
No description has been provided for this image

2. Introduction to bonds


Bonds are debt contracts that are usually issued by corporations or governments. In general, these contracts come with predefined cashflows (we abstract away from other types of bonds for now) that are paid at pre-defined dates. There are two common types: zero-coupon bonds and coupon bonds. As the names imply, the difference is whether or not they pay a non-zero coupon. Since these are in all other respects similar, the focus here is going to be on coupon bonds.

The components of a coupon bond:

  • The coupon: The coupon is the recurring payment and is set at a specific coupon rate that is a percentage of the principal.
  • The princial: The final lump-sum payment also known as the face value of the bond.
  • The yield-to-maturity (YTM): The discount rate that equates all future payments to the current market price of the bond.

A text-book like example of a coupon-bond is a 10 year bond with a principal of $\$1000$ and a coupon rate of 5% and annual coupons. This means that the bond will pay a $\$50$ coupon once per year for 10 years and at the end also pay back the principal itself. Assume that the bond is issued today and that the first coupon is paid one year from now. The price today of this bond is a combination of an annuity and a single cashflow:

$$ Price = 50 \times \frac{1 - \frac{1}{(1+r)^{10}}}{r} + \frac{1000}{(1+r)^{10}} $$

The only unknown variable here is the discount rate (YTM) and the current market price. However, for any publicly traded bond we often know the current market price allowing us to solve for the YTM that the market price implies.

Let's assume that the current price is $\$830$. If that is the case, then the YTM can be solved for using:

In [14]:
price = 830
coupon = 50
principal = 1000
periods = 10

def bond_price(r, pmt, fv, n):
    return ordinary_annuity(pmt, r, n) + fv/(1+r)**n

def bond_price_difference(r, pmt, fv, n, current_price):
    return np.sqrt((current_price - bond_price(r, pmt, fv, n))**2)

result = sp.optimize.minimize(bond_price_difference, 0.03, args = (coupon, principal, periods, price), method = 'Nelder-Mead')
print(f'The YTM is {result.x[0]*100:.3f}%')
The YTM is 7.474%

We can easily verify this using the bond price function:

In [15]:
print(bond_price(result.x[0], coupon, principal, periods))

Government bonds and the yield curve

Governments around the world issue bonds to pay for expenditures in excess of tax revenues. These bonds are issued with varying maturity dates ranging from very short-term bonds maturing in less than a year to long-term bonds of more than 10 years to maturity. Most countries will have many bonds that are publicly traded and often issue new ones. Since maturities and coupons vary across bond issues, we often track what are known as constant maturity bonds (or in the U.S.: CMTs: Constant Maturity Treasuries, I will use CMTs to denote these types of bonds from now on). These bonds have fixed maturities, are zero-coupon and generally do not exist. Their prices are computed using existing bonds that are traded on the market.

We need to do two things to convert traded bonds to CMTs: First, we need to compute the zero-coupon yield for each maturity and second, we need to compute the CMT yield at each constant maturity (e.g. 3 months, 1 year, 5 years etc...). How this is precisely done depends on the country and publisher of these rates.

If we put all the CMTs into a figure with the rate on the y-axis and the maturity on the x-axis, we have what is known as the yield curve.

Assume that we observe the following CMTs:

Maturity: 3 months 6 months 1 year 2 years 5 years 10 years
Rate: 1.0% 1.2% 1.6% 2.1% 2.6% 3.0%

The Yield curve can then be represented using function that fits these data points. There are various models in the literature and a commonly used one is the Nelson-Siegel approach published in 1987. This approach uses the following functional form and we need to fit four parameters to solve it: $(a, b, c, \tau)$: $$ y(m) = a + b\times\frac{1-\exp^{-m/\tau}}{m/\tau} + c\times\exp^{-m/\tau} $$ While it is possible to fix $\tau$ at different values and then use least squares to find a good fit for the other parameters, we can also implement a procedure to solve for all parameters at once using scipy's minimize:

In [16]:
### Observed yields on the market:
maturity = [0.25, 0.5, 1, 2, 5, 10]
yields = [0.01, 0.012, 0.016, 0.021, 0.026, 0.03]

# Nelson-Siegel
def nelson_siegel(parameters, m):
    a = parameters[0]
    b = parameters[1]
    c = parameters[2]
    tau  = parameters[3]
    return a + b * ((1-np.exp(-m/tau))/(m/tau)) + c * np.exp(-m/tau)

def nelson_siegel_error(parameters, maturity = maturity, yields = yields):
    error = 0
    for m,y in zip(maturity,yields):
        error += (y - nelson_siegel(parameters, m))**2
    return error

parameters = sp.optimize.minimize(nelson_siegel_error, [0.0, 0.0, 0.0, 1.0])

First we check that the optimization succeeded:

In [17]:

These are the estimated model parameters:

In [18]:
array([ 0.032, -0.028,  0.003,  1.001])

Note here, that there are many (better) ways to solve for the parameters of the Nelson-Siegel model depending on how one computes the errors in the minimization procedure. A paper with more details on this, and with an overview on central bank-level choices with respect to methods, is available from the Bank of International Settlements as BIS Paper 25 from 2005.

Next we can compute the yield curve at any maturity based on the parameters we just estimated:

In [19]:
time = np.arange(0.25,10.25,0.25)
estimated_yields = []
for m in time:
    estimated_yields.append(nelson_siegel(parameters.x, m))
In [20]:
plt.plot(time, estimated_yields, label = 'Estimated yield curve')
plt.scatter(maturity, yields, label = 'Observed yields')
No description has been provided for this image

3. Introduction to stocks and portfolios

Dividend discount model

One of the simplest models to value a stock is the Dividend Discount Model (see e.g. wikipedia for its history and the Gordon and Shapiro (1956) paper).

This model exists in many variations and can easily be extended and adapted to suit different situations but at the core the idea is always: The fair value of a stock depends on the cashflows an investor can expect to receive in the future. The underlying rationale is that a stockholder of a firm has a claim on the residual cashflows in the event of a bankruptcy and therefore if we value all the intermediate cashflows and the terminal one, we should be able to assess the fair value of the stock.

If we assume that the firm will exist indefinitely, we can compute the current fair value of the stock as the present value of a perpetual flow of dividends. To make it a bit more realistic, we can also assume that dividends grow at a constant rate. We therefore have that the price today ($P_0$) can be computed as,

$$ P_0 = \frac{D_1}{r-g} $$

The difficulty in valuing stocks comes from the fact that we neither know future dividends nor the appropriate discount rate for these dividends. We also do not know for how long a firm is going to be around. These factors make it very difficult to accurately price stocks.

However, we can still do some simple numerical analysis to get a feeling for how the growth rate of dividends (i.e. $g$) affects the present value (i.e. $P_0$). We can do this for two different 'stocks', one which is assumed to exist into infinity (i.e. modelled with a perpetuity) and that we assume only to exist for another 10 years (i.e. modelled with an annuity). The difference is present values will tell us what the importance is of all the cashflows beyond the next 10 years.

In [21]:
d1 = 10
r = 0.1
g_step = 0.005
g_lower_bound = 0
g_upper_bound = 0.08 + g_step
annuity_periods = 50

growth_rates = np.arange(g_lower_bound, g_upper_bound, g_step)
p0_perpetuity = []
p0_annuity = []

for g in growth_rates:
    p0_annuity.append(ordinary_annuity(d1, r, annuity_periods, g))

fig, ax = plt.subplots(1,2, figsize=(14,4))
ax[0].scatter(growth_rates, p0_perpetuity, color = blue, label = 'Perpetuity')
ax[0].scatter(growth_rates, p0_annuity, color = green, label = f'Annuity ({annuity_periods}y)')
ax[0].set_xlabel('Growth rate')
ax[0].set_ylabel('Stock price (today)')
ax[0].set_title('Present values today')

ax[1].plot(growth_rates, (np.array(p0_annuity)/np.array(p0_perpetuity))*100)
ax[1].set_xlabel('Growth rate')
ax[1].set_ylabel(f'% value from first {annuity_periods}y')
ax[1].set_title('% of value derived in the first 10 years.')
No description has been provided for this image

The two figures show that as the growth rate of dividends go up, the value of the stock goes up (left panel) and at the same time the share of the stockprice that comes from the first 50 years of cashflows goes down (right panel). At a growth rate of 0%, almost 100% of the share price is generated from (relatively more) near-term cashflows whereas this is only approximately 60% at a growth rate of 8% (using the parameters of this example).

Risk, return and portfolios

One of the fundamental assumptions underlying much of finance is that investors who take on higher levels of risk, will be compensated with higher expected returns. This is a fundamental relationship that defines much of how we think about the pricing of stocks and to put some structure on this, imagine a world with the following two stocks (annual percentages): Both have 7% return while one has a 12% and the other a 15% standard deviation associated with those returns.

If a rational investor can only pick one of them, and given that they both have the same expected return, they will pick the one that comes with less risk. In fact, thinking about the trade-off between risk and return is closely related to a core measure for the price of risk in finance, the Sharpe ratio, developed by William Sharpe and published in 1966 in the Journal of Business:

$$ \frac{E(r_i) - r_f}{\sigma_i} $$ This ratio expresses the return of stock i in excess of the risk-free rate (commonly referred to as the excess return) per unit of risk (in this case the standard deviation of the returns or volatility of stock i). In our case, assuming that the risk-free rate is 2%, the Sharpe ratios are (rounded): 1) 0.42; 2) 0.33. This simply indicates that one percentage point of volatility delivers in expectation 0.42 and 0.33 percentage points of return per year.

Sharpe-ratios are not the only aspects that matter when selecting stocks as a rational investor. also the correlations between different stocks' returns matter. This is why creating portfolios of different stocks often can help offset some risk. To exemplify this, let's setup three artificial stocks with the following properties:

In [22]:
# Three assets: a, b, and c:
return_vector = [0.0227, 0.1016, 0.0593]
vol_vector = [0.1612, 0.2294, 0.2116]
correl_vector = [-0.13, -0.22, 0.18] # correlation pairs: ab, ac, bc

The first step is to setup a covariance matrix between these three assets, where the assets a, b, and c are assigned indices 0, 1, and 2.

In [23]:
# Compute covariance matrix
covar_ab = correl_vector[0] * vol_vector[0] * vol_vector[1]
covar_ac = correl_vector[1] * vol_vector[0] * vol_vector[2]
covar_bc = correl_vector[2] * vol_vector[1] * vol_vector[2]
var_a = vol_vector[0]**2
var_b = vol_vector[1]**2
var_c = vol_vector[2]**2

covar_matrix = np.matrix([[var_a, covar_ab, covar_ac], 
                          [covar_ab, var_b, covar_bc], 
                          [covar_ac, covar_bc, var_c]])

We then define three functions that take as inputs among other things a weight vector for the assets. Using these weights, we can compute the portfolio standard deviation, return and Sharpe ratio respectively. The function for the Sharpe ratio has some additional features that we need later on.

In [24]:
def portfolio_sd(weights, covar_matrix):
    weight_matrix = np.outer(weights,weights)
    weighted_covar_matrix = np.multiply(covar_matrix,weight_matrix)
    portfolio_variance = np.sum(weighted_covar_matrix)
    return np.sqrt(portfolio_variance)

def portfolio_return(weights, return_vector):

def portfolio_sharpe(weights, return_vector, covar_matrix, number_of_assets, risk_free_rate = 0.0, negative = False):
    sharpe = (portfolio_return(weights[:number_of_assets], return_vector) - risk_free_rate) / portfolio_sd(weights[:number_of_assets], covar_matrix)
    if negative:
        return -sharpe
    return sharpe

By assigning a weight of 1 to asset b, we can naively verify that the functions work:

In [25]:
portfolio_sd([0,1,0], covar_matrix)
In [26]:
portfolio_return([0,1,0], return_vector)
In [27]:
portfolio_sharpe([0,1,0], return_vector, covar_matrix, len(return_vector))

We can now solve for the optimal portfolio, i.e. the portfolio that combines our three assets so as to maximize the Sharpe ratio. We solve for this portfolio numerically and under the assumption that all weights need to add up to one, and that all weights need to be between zero and one.

For this procedure we again use scipy's minimize:

In [28]:
# Optimal portfolio
# Constraints
def constraint_sum_weights_one(w):
    return np.sum(w) - 1
constraints = {'type':'eq', 'fun': constraint_sum_weights_one}

# Optimization
number_of_assets = len(return_vector)
initial_values = np.random.random(number_of_assets)
output = sp.optimize.minimize(portfolio_sharpe, 
                              args = (return_vector, covar_matrix, number_of_assets, 0.0, True), 
                              constraints = constraints, 

We now verify the output of the procedure and it is clear that it worked: success is True.

In [29]:
 message: Optimization terminated successfully
 success: True
  status: 0
     fun: -0.5462103192110038
       x: [ 3.366e-01  4.009e-01  2.625e-01]
     nit: 6
     jac: [-6.541e-04 -1.705e-05  8.647e-04]
    nfev: 25
    njev: 6

The weights are stored in x and we can use these weights to compute the expected return and volatility of the optimal portfolio (i.e. the market portfolio):

In [30]:
risk_free_rate = 0.02
market_return = portfolio_return(output.x, return_vector)
market_volatility = portfolio_sd(output.x, covar_matrix)
market_sharpe = (market_return - risk_free_rate) / market_volatility

The market Sharpe ratio looks good and we can verify that it is higher than any of the individual assets:

In [31]:
print(portfolio_sharpe([1,0,0], return_vector, covar_matrix, len(return_vector)) < market_sharpe,
      portfolio_sharpe([0,1,0], return_vector, covar_matrix, len(return_vector)) < market_sharpe,
      portfolio_sharpe([0,0,1], return_vector, covar_matrix, len(return_vector)) < market_sharpe)
True False True

Good news: All comparisons return True. The next step is to setup the efficient frontier, that is for each level of expected return, what is the highest possible Sharpe ratio that we can attain?

To solve for this we essentially repeat the exercise to solve for the market portfolio but now we add an additional constraint: We maximize the Sharpe ratio for a specific expected return. We need to repeat this exercise at each return level:

In [32]:
# Efficient frontier
stepsize = 0.0001
number_of_assets = len(return_vector)
lower_bound_return = np.min(return_vector)
upper_bound_return = np.max(return_vector)

frontier_return_vector = np.arange(lower_bound_return, upper_bound_return + stepsize, stepsize)
frontier_risk_vector = []
frontier_weights_vector = []

# Contraints
def constraint_sum_weights_one(parameters):
    weights = parameters[:-1]
    return np.sum(weights) - 1
def constraint_return_value(parameters, return_vec = return_vector):
    weights = parameters[:-1]
    fixed_return = parameters[-1]
    return portfolio_return(weights, return_vec) - fixed_return
constraints = ({'type':'eq', 'fun': constraint_sum_weights_one},
               {'type':'eq', 'fun': constraint_return_value})

# Main loop
for index, return_value in enumerate(frontier_return_vector):
    initial_values = np.insert(np.random.random(number_of_assets), number_of_assets, return_value)
    bounds = ([(0,1)]*number_of_assets)
    output = sp.optimize.minimize(portfolio_sharpe, 
                                  args = (return_vector, covar_matrix, number_of_assets, 0.0, True), 
                                  constraints = constraints, 
                                  bounds = bounds,
                                  tol = 0.000001)
    # Store values if optimization was successful
    if output.success:
        frontier_risk_vector.append(portfolio_sd(output.x[:number_of_assets], covar_matrix))

We proceed by plotting the frontier and the three assets:

In [33]:
# Plot frontier
plt.plot(frontier_risk_vector,frontier_return_vector, label = 'Efficient frontier')
plt.scatter(market_volatility, market_return, color = red, label = 'Market portfolio')

# Add assets
for ret, vol, num in zip(return_vector, vol_vector, range(3)):
    if num == 0:
        plt.scatter(vol, ret, color = orange, label = 'Individual assets')
        plt.scatter(vol, ret, color = orange)

plt.ylabel('Expected return')
No description has been provided for this image

The plot above reveals that we can assemble portfolios that are superior to each individual asset in terms of their sharpe ratios. This happens because we can diversify away some of the risk associated with individual assets. If we think of the volatility as the total risk of an asset, we can 'split' it up into two components: Risk that we can diversify away (often called idiosyncratic risk) and risk that we cannot diversify away (often called systematic risk).

It turns out that only systematic risk is priced by the market. An early, and one of the simplest models that captures only systematic risk and return is the capital asset pricing model (CAPM) developed in the early 1960s. It states that there is a linear relationship between an asset's systematic risk and its expected return. Formalized, the relationship between the expected return of asset $i$ and its exposure to market risk, captured by $\beta_i$ can be stated as follows: $$ E(r_i) = r_f + \beta_i(E(r_m) - r_f) $$

Knowing the 'expected' returns of the market portfolio, our three assets and the risk-free rate (let's still assume it is 2% per annum), we can solve for the betas of the three assets using the CAPM equation above (under the assumption that CAPM is indeed the correct model for asset pricing - please see the empirical example with real assets below for further elaboration on this).

We extract the betas from the model:

In [34]:
risk_free_rate = 0.02
beta_vector = list(map(lambda x: (x - risk_free_rate) / (market_return - risk_free_rate), return_vector))

We now define the security market line (SML):

In [35]:
def security_market_line(beta, rf, rm):
   return list(map(lambda x: rf + x*(rm-rf), beta))
beta_linspace = np.linspace(0, 2, 20)
In [36]:
# Plot the Security market line
plt.plot(beta_linspace, security_market_line(beta_linspace, risk_free_rate, market_return), label = 'SML')

# Plot individual assets
plt.scatter(beta_vector, return_vector, color = blue)
plt.scatter(0, risk_free_rate, color = green)
plt.scatter(1, market_return, color = red)

plt.ylabel('Expected return')
No description has been provided for this image

The plot of the security market line (SML), as imposed by the CAPM, shows a linear relationship between risk (in this case the CAPM beta) and expected returns - which given our assumptions is by design! The red dot shows the market portfolio with a beta of 1 and the the green dot in the lower left corner shows that risk-free asset with a beta of 0.

CAPM with real-life data

Let's turn to a real-life example. To do this, we will use some data that is publicly available through the website of Kenneth R. French at Dartmouth College. In particular, we will use the CSV formatted 10 Industry Portfolios. The tricky thing with this file is that the data is formatted as follows:

  • First lines contain some description.
  • Around line 5 or so, the variable names (for the industry portfolios) are presented.
  • After that, there are monthly returns that are value-weighted within each industry (i.e. companies receive a weight relative to their market capitalization).
  • Finally, the returns are repeated but this time the firms are equally-weighted (an alternative to value-weighting).

This data is updated regularly so hardcoding any line numbers to extract the data we are interested in is not feasible. Instead, I write a short (not very robust) algorithm to let the computer determine the correct rows to extract:

In [37]:
# Read the compressed data file with directly with pandas:
# This is the source URL to the data:
data = pd.read_csv("", 
                   names = range(0,11))

# Extract row numbers with the data we are interested in
first_data_row = -1
last_data_row = -1

for index, row in data.iterrows():
        if first_data_row != -1:
            first_data_row = index
        if first_data_row != -1:
                dtype_nan = np.isnan(row[0])
                if dtype_nan:
                    last_data_row = index - 1

We now know that the data we are interested in is located between rows:

In [38]:
print(first_data_row, last_data_row)
12 1180

This means that we can subset the data to those rows and also set the column names to the correct ones for the different portfolios. Finally, we divide the returns by 100 to set the scale as: 1% = 0.01:

In [39]:
# Subset data and reset index
portfolios = data[first_data_row:last_data_row+1].copy()
portfolios.reset_index(inplace = True, drop = True)

# Set correct column names
column_names = list(data.loc[first_data_row-1])
column_names[0] = 'Date'
portfolios.columns = column_names

# Set data formats and scale data
portfolios['Date'] = pd.to_datetime(portfolios['Date'], format = '%Y%m')

portfolio_names = column_names
portfolio_names.pop(0) # Remove 'Date' from list
for p in portfolio_names:
    portfolios[p] = portfolios[p].astype('float')/100

# Remove original data from memory
del data

We can view the dataframe head and tail to verify that all looks good:

In [40]:
Date NoDur Durbl Manuf Enrgy HiTec Telcm Shops Hlth Utils Other
0 1926-07-01 0.0145 0.1555 0.0469 -0.0118 0.0290 0.0083 0.0011 0.0177 0.0704 0.0213
1 1926-08-01 0.0397 0.0368 0.0281 0.0347 0.0266 0.0217 -0.0071 0.0425 -0.0169 0.0435
In [41]:
Date NoDur Durbl Manuf Enrgy HiTec Telcm Shops Hlth Utils Other
1167 2023-10-01 -0.0353 -0.1788 -0.0276 -0.0624 -0.0178 -0.0018 0.0047 -0.0458 0.0112 -0.0253
1168 2023-11-01 0.0502 0.1576 0.0868 -0.0129 0.1196 0.0697 0.0718 0.0580 0.0508 0.1041

In addtion to the industry portfolios, we need a market benchmark. We could compute a 'market portfolio' consisting of these 10 assets but as an alternative we can also use some larger market index for this purpose. Important to remember is that both solutions are just approximations of the true unobserved market portfolio.

In any case, Kenneth R. French publishes some market data as well and we can download that from his website. This data is embedded in the factor returns that he publishes (see for example for more information on one such model).

Again, the file stacks two datasets on top of each other, so we reuse the algorithm from before to extract the first of the two datasets (the second dataset contains annual data and not compatible with our monthly industry returns).

In [42]:
# We use this data file:
data = pd.read_csv("", 
                   names = range(0,5))

# Extract row numbers with the data we are interested in
first_data_row = -1
last_data_row = -1

for index, row in data.iterrows():
        if first_data_row != -1:
            first_data_row = index
        if first_data_row != -1:
                dtype_nan = np.isnan(row[0])
                if dtype_nan:
                    last_data_row = index - 1

# Subset data and reset index
market = data[first_data_row:last_data_row+1].copy()
market.reset_index(inplace = True, drop = True)

# Set correct column names
column_names = list(data.loc[first_data_row-1])
column_names[0] = 'Date'
market.columns = column_names

# Set data formats and scale data
market['Date'] = pd.to_datetime(market['Date'], format = '%Y%m')

market_names = column_names
market_names.pop(0) # Remove 'Date' from list
for p in market_names:
    market[p] = market[p].astype('float')/100

# Remove original data from memory
del data

We check the output from this procedure:

In [43]:
0 1926-07-01 0.0296 -0.0256 -0.0243 0.0022
1 1926-08-01 0.0264 -0.0117 0.0382 0.0025
In [44]:
1167 2023-10-01 -0.0319 -0.0387 0.0019 0.0047
1168 2023-11-01 0.0884 0.0000 0.0165 0.0044

From this data we are interested in the Mkt-RF variable (the market risk premium and slope of the SML) as well as the risk free-rate RF. The other factors (SMB and HML) come from the Three-factor model mentioned earlier.

In [45]:
# Remove unused variables
market = market[['Date', 'Mkt-RF', 'RF']]

We join the portfolio and market data together on the Date variable:

In [46]:
portfolios = portfolios.merge(market, on='Date')
Date NoDur Durbl Manuf Enrgy HiTec Telcm Shops Hlth Utils Other Mkt-RF RF
0 1926-07-01 0.0145 0.1555 0.0469 -0.0118 0.0290 0.0083 0.0011 0.0177 0.0704 0.0213 0.0296 0.0022
1 1926-08-01 0.0397 0.0368 0.0281 0.0347 0.0266 0.0217 -0.0071 0.0425 -0.0169 0.0435 0.0264 0.0025
2 1926-09-01 0.0114 0.0480 0.0115 -0.0339 -0.0038 0.0241 0.0021 0.0069 0.0204 0.0029 0.0036 0.0023
3 1926-10-01 -0.0124 -0.0823 -0.0363 -0.0078 -0.0458 -0.0011 -0.0229 -0.0057 -0.0263 -0.0284 -0.0324 0.0032
4 1926-11-01 0.0520 -0.0019 0.0410 0.0001 0.0471 0.0163 0.0643 0.0542 0.0371 0.0211 0.0253 0.0031

The CAPM beta for each industry portfolio captures the sensitivity of the returns relative to the market. Assuming that past returns are representative of the future and therefore also expected return, we can compute the beta using past data by running the following linear regression model: $$ r_{i,t} - r_{f,t} = \alpha + \beta_i(r_{m,t} - r_{f,t}) + \varepsilon_t $$ where $r_i$ represents the return series of our industry portfolio and $r_m$ represents the return series of the market and $r_f$ the risk-free rate. As the error term shows, this is a time-series regression for each asset $i$ at a time.

I use scipy to run the regression for each portfolio separately:

In [47]:
portfolio_betas = []
portfolio_returns = []

for industry in portfolio_names:
    excess_returns = portfolios[industry]-portfolios['RF']
    beta = sp.stats.linregress(portfolios['Mkt-RF'], excess_returns).slope
    portfolio_returns.append((1+portfolios[industry].mean())**12 - 1)

We can now plot the betas against the annualized mean monthly past returns (i.e. our proxy for the true expected return) against the beta and compare this to the theoretical example we computed earlier. For simplicity,I use the annualized average market return to proxy for the expected market return and the latest available risk-free rate as the current risk free rate.

In [48]:
# Compute past average market and risk-free rate
avg_market_return = (1+(portfolios['Mkt-RF']+portfolios['RF']).mean())**12-1
last_risk_free = (1+portfolios.tail(1)['RF'])**12-1

# Add scatter plots
plt.scatter(portfolio_betas, portfolio_returns)
plt.scatter(1, avg_market_return, color = red, label = 'Market portfolio')
plt.scatter(0, last_risk_free, color = green, label = 'Risk-free rate')

# Plot the SML
plt.plot(beta_linspace, security_market_line(beta_linspace, last_risk_free, avg_market_return), label = 'SML')

# Formatting
plt.xlabel('CAPM Beta')
plt.ylabel('Annualized mean monthly returns')
No description has been provided for this image

It is quite clear that there is a decent fit for the CAPM although not perfect in any way. All portfolios lie close to the SML which is a good sign. However, the deviations from the SML indicate that the model cannot price all the risk coming from the portfolios. Portfolios above the SML are underpriced - i.e. they earn a too high expected return given their beta. Similarly, assets under the SML are overpriced. Using more advanced models may likely improve the fit.

End of notebook