# 39. Net Income Method Example with Numpy, Matplotlib and Scipy

The Net Income Method (in Germany known as Einnahmeüberschussrechnung, EÜR) is a simplified profit determination method. Under German law, self-employed persons such as doctors, lawyers, architects and others have the choice of submitting annual accounts or using the simple net income method. What we present in this chapter of our Pandas tutorial will not only be interesting for small companies, especially in Germany and EU countries, but also for other countries in the world as well.

We are mainly interested in introducing the different ways in which pandas and python can solve such problems. We show what it is all about to monitor the flow of money and in this way to better understand the financial situation. The algorithms can also be used for tax purposes. But be warned, this is a very general treatment of the matter and needs to be adjusted to the actual tax situation in your country. Although it is mainly based on German law, we cannot guarantee its accuracy! So, before you actually use it, you have to make sure that it is suitable for your situation and the tax laws of your country.

Very often a Microsoft Excel file is used for mainting a journal file of the financial transactions. In the data1 folder there is a file that contains an Excel file net_income_method_2020.xlsx with the accounting data of a fictitious company. We could have used simple texts files like csv as well.

## Journal File

This excel document contains two data sheets: One with the actual data "journal" and one with the name "account numbers", which contains the mapping from the account numbers to the description.

We will read this excel file into two DataFrame objects:

import pandas as pd

with pd.ExcelFile("/data1/net_income_method_2020.xlsx") as xl:
accounts2descr = xl.parse("account numbers",
index_col=0)
journal = xl.parse("journal",
index_col=0,
)

journal.index = pd.to_datetime(journal.index)
journal.index


### OUTPUT:

DatetimeIndex(['2020-04-02', '2020-04-02', '2020-04-02', '2020-04-02',
'2020-04-02', '2020-04-02', '2020-04-05', '2020-04-05',
'2020-04-05', '2020-04-05', '2020-04-09', '2020-04-09',
'2020-04-10', '2020-04-10', '2020-04-10', '2020-04-10',
'2020-04-10', '2020-04-10', '2020-04-13', '2020-04-13',
'2020-04-13', '2020-04-26', '2020-04-26', '2020-04-26',
'2020-04-26', '2020-04-27', '2020-05-03', '2020-05-03',
'2020-05-03', '2020-05-03', '2020-05-05', '2020-05-05',
'2020-05-08', '2020-05-09', '2020-05-10', '2020-05-11',
'2020-05-11', '2020-05-11', '2020-05-11', '2020-05-11',
'2020-05-13', '2020-05-18', '2020-05-25', '2020-05-25',
'2020-06-01', '2020-06-02', '2020-06-03', '2020-06-03',
'2020-06-04', '2020-06-04', '2020-06-09', '2020-06-10',
'2020-06-10', '2020-06-11', '2020-06-11', '2020-06-11',
'2020-06-11', '2020-06-11', '2020-06-12', '2020-06-13',
'2020-06-13', '2020-06-26', '2020-06-26', '2020-06-27',
'2020-07-02', '2020-07-03', '2020-07-05', '2020-07-05',
'2020-07-08', '2020-07-09', '2020-07-10', '2020-07-10',
'2020-07-10', '2020-07-10', '2020-07-10', '2020-07-10',
'2020-07-11', '2020-07-11', '2020-07-13', '2020-07-18',
'2020-07-23', '2020-07-23', '2020-07-25', '2020-07-25',
'2020-07-27', '2020-07-26', '2020-07-28'],
dtype='datetime64[ns]', name='date', freq=None)

The first one is the tab "account numbers" which contains the mapping from the account numbers to the description of the accounts:

accounts2descr

description
account
4400 revenue plant Munich
4401 revenue plant Frankfurt
4402 revenue plant Berlin
2010 souvenirs
2020 clothes
2030 other articles
2050 books
2100 insurances
2200 wages
2300 loans
2400 hotels
2500 petrol
2600 telecommunication
2610 internet

The second data sheet "journal" contains the actual journal entries:

journal[:10]

account number document number description tax rate gross amount
date
2020-04-02 4402 8983233038 Zurkan, Köln 19 4105.98
2020-04-02 2010 57550799 Birmann, Souvenirs 19 -1890.00
2020-04-02 2200 14989004 wages 0 -17478.23
2020-04-02 2500 12766279 Filling Station, Petrol 19 -89.40
2020-04-02 4400 3733462359 EnergyCom, Hamburg 19 4663.54
2020-04-02 4402 7526058231 Enoigo, Strasbourg 19 2412.82
2020-04-05 4402 1157284466 Qbooks, Frankfurt 7 2631.42
2020-04-05 4402 7009463592 Qbooks, Köln 7 3628.45
2020-04-05 2020 68433353 Jamdon, Clothes 19 -1900.00
2020-04-05 2010 53353169 Outleg, Souvenirs 19 -2200.00

There are many ways to analyze this data. We can for example sum up all the accounts:

account_sums = journal[["account number", "gross amount"]].groupby("account number").sum()
account_sums

gross amount
account number
2010 -4090.00
2020 -10500.80
2030 -1350.00
2050 -900.00
2100 -612.00
2200 -69912.92
2300 -18791.92
2400 -1597.10
2500 -89.40
2600 -492.48
2610 -561.00
4400 37771.84
4401 69610.35
4402 61593.99

Live Python training

Enjoying this page? We offer live Python training courses covering the content of this site.

Enrol here

## Account Charts

What about showing a pie chart of these sums? We encounter one problem: Pie charts cannot contain negative values. However this is not a real problem. We can split the accounts into income and expense accounts. Of course, this corresponds more to what we really want to see.

### Charts for the Income Accounts

We create a DataFrame with the income accounts:

income_accounts = account_sums[account_sums["gross amount"] > 0]
income_accounts

gross amount
account number
4400 37771.84
4401 69610.35
4402 61593.99

We can now visualize these values in a pie chart.

plot = income_accounts.plot(y='gross amount', figsize=(5, 5), kind="pie")


You probably don't like the position of the legend? With the parameter bbox_to_anchor we can position it on a desired position. So, if we want to, we can even move it outside of the plot. With the relative coordinate values (0.5, 0.5) we position the legend in the center of the plot. The legend is a box structure. So the question is what does the postion (0.5, 0.5) mean? We can define this by using the parameter loc additionally:

loc value meaning
upper left bbox_to_anchor describes the position of the left upper corner of the legend box
upper right correspondingly the upper right corner
lower left the lower left corner
lower left the lower right corner

We use this to position the legend with its left upper corner positioned in the middle of the plot:

plot = income_accounts.plot(y='gross amount',
figsize=(5, 5),
kind="pie")
plot.legend(bbox_to_anchor=(0.5, 0.5),
loc="upper left")


### OUTPUT:




<matplotlib.legend.Legend at 0x7f172ca03a90>

Now we position the lower right corner of the legend into the center of the plot:

plot = income_accounts.plot(y='gross amount', figsize=(5, 5), kind="pie")
plot.legend(bbox_to_anchor=(0.5, 0.5),
loc="lower right")


### OUTPUT:




<matplotlib.legend.Legend at 0x7f172c9c5590>

There is another thing we can improve. We see the labels 4400, 4401, and 4402 beside of each pie segment. In addition, we see them in the legend. This is ugly and redundant information. In the following we will turn the labels off, i.e. set them to an empty string, in the plot and we explicitly set them in the legend method:

plot = income_accounts.plot(y='gross amount',
figsize=(5, 5),
kind="pie",
labels=['', '', ''])
plot.legend(bbox_to_anchor=(0.5, 0.5),
labels=income_accounts.index)


### OUTPUT:




<matplotlib.legend.Legend at 0x7f172c956550>

Now, we are close to perfection. Just one more tiny thing. Some might prefer to see the actual description text rather than an account number. We will cut out this information from the DataFrame accounts2descr by using loc and the list of desired numbers [4400, 4401, 4402]. The result of this operation will be the argument of the set_index method. (Atention: reindex is not giving the wanted results!)

descriptions = accounts2descr["description"].loc[[4400, 4401, 4402]]
plot = income_accounts.plot(kind="pie",
y='gross amount',
figsize=(5, 5),
labels=['', '', ''])

plot.legend(bbox_to_anchor=(0.5, 0.5),
loc="lower left",
labels=descriptions)


### OUTPUT:




<matplotlib.legend.Legend at 0x7f172c8ba8d0>

Would you prefer a bar chart? No problem, we just have to set the parameter kind to bar instead of pie:

plot = income_accounts.plot(y='gross amount',
figsize=(5, 5),
kind="bar",
legend=False)


For bar charts we have to set kind to barh. So that it doesn't get too boring, we can also color the bars by passing a list of colors to the color parameter:

plot = income_accounts.plot(y='gross amount',
figsize=(5, 5),
kind="barh",
legend=False,
color=['green', 'orange', 'blue'])


### Charts for the Expenses Accounts

we can do the same now with our debitors (expenses accounts):

expenses_accounts = account_sums[account_sums["gross amount"] < 0]
expenses_accounts

gross amount
account number
2010 -4090.00
2020 -10500.80
2030 -1350.00
2050 -900.00
2100 -612.00
2200 -69912.92
2300 -18791.92
2400 -1597.10
2500 -89.40
2600 -492.48
2610 -561.00
acc2descr_expenses = accounts2descr["description"].loc[expenses_accounts.index]
acc2descr_expenses


### OUTPUT:

account number
2010            souvenirs
2020              clothes
2030       other articles
2050                books
2100           insurances
2200                wages
2300                loans
2400               hotels
2500               petrol
2600    telecommunication
2610             internet
Name: description, dtype: object
expenses_accounts.set_index(acc2descr_expenses.values, inplace=True)

expenses_accounts *= -1

labels = [''] * len(expenses_accounts)
plot = expenses_accounts.plot(kind="pie",
y='gross amount',
figsize=(5, 5),
labels=labels)
plot.legend(bbox_to_anchor=(0.5, 0.5),
labels=expenses_accounts.index)


### OUTPUT:




<matplotlib.legend.Legend at 0x7f172868f0d0>

## Tax Sums

We will sum up the amount according to their tax rate.

journal.drop(columns=["account number"])

document number description tax rate gross amount
date
2020-04-02 8983233038 Zurkan, Köln 19 4105.98
2020-04-02 57550799 Birmann, Souvenirs 19 -1890.00
2020-04-02 14989004 wages 0 -17478.23
2020-04-02 12766279 Filling Station, Petrol 19 -89.40
2020-04-02 3733462359 EnergyCom, Hamburg 19 4663.54
... ... ... ... ...
2020-07-25 5204418668 BoKoData, Bodensee, Konstanz 19 3678.38
2020-07-25 85241331 Hotel, Konstanz 7 -583.00
2020-07-27 26865618 Hotel, Franfurt 7 -450.00
2020-07-26 5892708524 Oscar Zopvar KG, Luxemburg 19 2589.80
2020-07-28 1633775505 Oscar Zopvar KG, Luxemburg 19 3578.46

87 rows × 4 columns

In the following we will define a function tax_sums that calculates the VAT sums according to tax rates from a journal DataFrame:

def tax_sums(journal_df, months=None):
""" Returns a DataFrame with sales and tax rates -
If a number or list is passed to 'months', only the sales
of the corresponding months will be used.
Example: tax_sums(df, months=[3, 6]) will only use the months
3 (March) and 6 (June)"""
if months:
if isinstance(months, int):
month_cond = journal_df.index.month == months
elif isinstance(months, (list, tuple)):
month_cond = journal_df.index.month.isin(months)
positive = journal_df["gross amount"] > 0
# sales_taxes eq. umsatzsteuer
sales_taxes = journal_df[positive & month_cond]
negative = journal_df["gross amount"] < 0
# input_taxes equivalent to German Vorsteuer
input_taxes = journal_df[negative & month_cond]
else:
sales_taxes = journal_df[journal_df["gross amount"] > 0]
input_taxes = journal_df[journal_df["gross amount"] < 0]

sales_taxes = sales_taxes[["tax rate", "gross amount"]].groupby("tax rate").sum()
sales_taxes.rename(columns={"gross amount": "Sales Gross"},
inplace=True)
sales_taxes.index.name = 'Tax Rate'

input_taxes = input_taxes[["tax rate", "gross amount"]].groupby("tax rate").sum()
input_taxes.rename(columns={"gross amount": "Expenses Gross"},
inplace=True)
input_taxes.index.name = 'Tax Rate'

taxes = pd.concat([input_taxes, sales_taxes], axis=1)
taxes.insert(1,
column="Input Taxes",
value=(taxes["Sales Gross"] * taxes.index / 100).round(2))
taxes.insert(3,
column="Sales Taxes",
value=(taxes["Expenses Gross"] * taxes.index / 100).round(2))

return taxes.fillna(0)

tax_sums(journal)

Expenses Gross Input Taxes Sales Gross Sales Taxes
Tax Rate
0 -90102.20 0.00 8334.43 -0.00
7 -3847.10 786.85 11240.71 -269.30
19 -14948.32 28386.20 149401.04 -2840.18
stsum_5 = tax_sums(journal, months=5)
stsum_6 = tax_sums(journal, months=6)
stsum_5

Expenses Gross Input Taxes Sales Gross Sales Taxes
Tax Rate
0 -22411.53 0.00 0.00 -0.00
7 -900.00 0.00 0.00 -63.00
19 -145.00 5952.51 31328.98 -27.55
tax_sums(journal, months=[5, 6])

Expenses Gross Input Taxes Sales Gross Sales Taxes
Tax Rate
0 -44812.14 0.00 6479.47 -0.00
7 -900.00 348.66 4980.84 -63.00
19 -268.12 12520.85 65899.23 -50.94

Live Python training

Enjoying this page? We offer live Python training courses covering the content of this site.

Upcoming online Courses

Enrol here