40. Net Income Method Example with Numpy, Matplotlib and Scipy

By Bernd Klein. Last modified: 03 Feb 2025.

On this page ➤

The Net Income Method (in Germany known as Einnahmeüberschussrechnung, EÜR) is a simplified profit determination method. Under German law, self-employed persons such as doctors, lawyers, architects and others have the choice of submitting annual accounts or using the simple net income method. What we present in this chapter of our Pandas tutorial will not only be interesting for small companies, especially in Germany and EU countries, but also for other countries in the world as well.

money flow

We are mainly interested in introducing the different ways in which pandas and python can solve such problems. We show what it is all about to monitor the flow of money and in this way to better understand the financial situation. The algorithms can also be used for tax purposes. But be warned, this is a very general treatment of the matter and needs to be adjusted to the actual tax situation in your country. Although it is mainly based on German law, we cannot guarantee its accuracy! So, before you actually use it, you have to make sure that it is suitable for your situation and the tax laws of your country.

Very often a Microsoft Excel file is used for mainting a journal file of the financial transactions. In the data1 folder there is a file that contains an Excel file net_income_method_2020.xlsx with the accounting data of a fictitious company. We could have used simple texts files like csv as well.

Journal File

This excel document contains two data sheets: One with the actual data "journal" and one with the name "account numbers", which contains the mapping from the account numbers to the description.

Screenshot of journal sheet

Screenshot of column account numbers

We will read this excel file into two DataFrame objects:

import pandas as pd

with pd.ExcelFile("/data1/net_income_method_2020.xlsx") as xl:
    accounts2descr = xl.parse("account numbers", 
                              index_col=0)
    journal = xl.parse("journal", 
                       index_col=0,
                      )
    
journal.index = pd.to_datetime(journal.index)
journal.index

OUTPUT:

DatetimeIndex(['2020-04-02', '2020-04-02', '2020-04-02', '2020-04-02',
               '2020-04-02', '2020-04-02', '2020-04-05', '2020-04-05',
               '2020-04-05', '2020-04-05', '2020-04-09', '2020-04-09',
               '2020-04-10', '2020-04-10', '2020-04-10', '2020-04-10',
               '2020-04-10', '2020-04-10', '2020-04-13', '2020-04-13',
               '2020-04-13', '2020-04-26', '2020-04-26', '2020-04-26',
               '2020-04-26', '2020-04-27', '2020-05-03', '2020-05-03',
               '2020-05-03', '2020-05-03', '2020-05-05', '2020-05-05',
               '2020-05-08', '2020-05-09', '2020-05-10', '2020-05-11',
               '2020-05-11', '2020-05-11', '2020-05-11', '2020-05-11',
               '2020-05-13', '2020-05-18', '2020-05-25', '2020-05-25',
               '2020-06-01', '2020-06-02', '2020-06-03', '2020-06-03',
               '2020-06-04', '2020-06-04', '2020-06-09', '2020-06-10',
               '2020-06-10', '2020-06-11', '2020-06-11', '2020-06-11',
               '2020-06-11', '2020-06-11', '2020-06-12', '2020-06-13',
               '2020-06-13', '2020-06-26', '2020-06-26', '2020-06-27',
               '2020-07-02', '2020-07-03', '2020-07-05', '2020-07-05',
               '2020-07-08', '2020-07-09', '2020-07-10', '2020-07-10',
               '2020-07-10', '2020-07-10', '2020-07-10', '2020-07-10',
               '2020-07-11', '2020-07-11', '2020-07-13', '2020-07-18',
               '2020-07-23', '2020-07-23', '2020-07-25', '2020-07-25',
               '2020-07-27', '2020-07-26', '2020-07-28'],
              dtype='datetime64[ns]', name='date', freq=None)

The first one is the tab "account numbers" which contains the mapping from the account numbers to the description of the accounts:

accounts2descr

	description
account
4400	revenue plant Munich
4401	revenue plant Frankfurt
4402	revenue plant Berlin
2010	souvenirs
2020	clothes
2030	other articles
2050	books
2100	insurances
2200	wages
2300	loans
2400	hotels
2500	petrol
2600	telecommunication
2610	internet

The second data sheet "journal" contains the actual journal entries:

journal[:10]

	account number	document number	description	tax rate	gross amount
date
2020-04-02	4402	8983233038	Zurkan, Köln	19	4105.98
2020-04-02	2010	57550799	Birmann, Souvenirs	19	-1890.00
2020-04-02	2200	14989004	wages	0	-17478.23
2020-04-02	2500	12766279	Filling Station, Petrol	19	-89.40
2020-04-02	4400	3733462359	EnergyCom, Hamburg	19	4663.54
2020-04-02	4402	7526058231	Enoigo, Strasbourg	19	2412.82
2020-04-05	4402	1157284466	Qbooks, Frankfurt	7	2631.42
2020-04-05	4402	7009463592	Qbooks, Köln	7	3628.45
2020-04-05	2020	68433353	Jamdon, Clothes	19	-1900.00
2020-04-05	2010	53353169	Outleg, Souvenirs	19	-2200.00

There are many ways to analyze this data. We can for example sum up all the accounts:

account_sums = journal[["account number", "gross amount"]].groupby("account number").sum()
account_sums

	gross amount
account number
2010	-4090.00
2020	-10500.80
2030	-1350.00
2050	-900.00
2100	-612.00
2200	-69912.92
2300	-18791.92
2400	-1597.10
2500	-89.40
2600	-492.48
2610	-561.00
4400	37771.84
4401	69610.35
4402	61593.99

Live Python training

Enjoying this page? We offer live Python training courses covering the content of this site.

See our Python training courses

See our Machine Learning with Python training courses

Account Charts

What about showing a pie chart of these sums? We encounter one problem: Pie charts cannot contain negative values. However this is not a real problem. We can split the accounts into income and expense accounts. Of course, this corresponds more to what we really want to see.

Charts for the Income Accounts

We create a DataFrame with the income accounts:

income_accounts = account_sums[account_sums["gross amount"] > 0]
income_accounts

	gross amount
account number
4400	37771.84
4401	69610.35
4402	61593.99

We can now visualize these values in a pie chart.

plot = income_accounts.plot(y='gross amount', figsize=(5, 5), kind="pie")

data image

You probably don't like the position of the legend? With the parameter bbox_to_anchor we can position it on a desired position. So, if we want to, we can even move it outside of the plot. With the relative coordinate values (0.5, 0.5) we position the legend in the center of the plot. The legend is a box structure. So the question is what does the postion (0.5, 0.5) mean? We can define this by using the parameter loc additionally:

loc value	meaning
upper left	bbox_to_anchor describes the position of the left upper corner of the legend box
upper right	correspondingly the upper right corner
lower left	the lower left corner
lower left	the lower right corner

We use this to position the legend with its left upper corner positioned in the middle of the plot:

plot = income_accounts.plot(y='gross amount', 
                            figsize=(5, 5), 
                            kind="pie")
plot.legend(bbox_to_anchor=(0.5, 0.5),
            loc="upper left")

OUTPUT:

data image

<matplotlib.legend.Legend at 0x7f172ca03a90>

Now we position the lower right corner of the legend into the center of the plot:

plot = income_accounts.plot(y='gross amount', figsize=(5, 5), kind="pie")
plot.legend(bbox_to_anchor=(0.5, 0.5),
            loc="lower right")

OUTPUT:

data image

<matplotlib.legend.Legend at 0x7f172c9c5590>

There is another thing we can improve. We see the labels 4400, 4401, and 4402 beside of each pie segment. In addition, we see them in the legend. This is ugly and redundant information. In the following we will turn the labels off, i.e. set them to an empty string, in the plot and we explicitly set them in the legend method:

plot = income_accounts.plot(y='gross amount', 
                            figsize=(5, 5), 
                            kind="pie",
                            labels=['', '', ''])
plot.legend(bbox_to_anchor=(0.5, 0.5),
            labels=income_accounts.index)

OUTPUT:

data image

<matplotlib.legend.Legend at 0x7f172c956550>

Now, we are close to perfection. Just one more tiny thing. Some might prefer to see the actual description text rather than an account number. We will cut out this information from the DataFrame accounts2descr by using loc and the list of desired numbers [4400, 4401, 4402]. The result of this operation will be the argument of the set_index method. (Atention: reindex is not giving the wanted results!)

descriptions = accounts2descr["description"].loc[[4400, 4401, 4402]]
plot = income_accounts.plot(kind="pie",
                            y='gross amount', 
                            figsize=(5, 5),
                            labels=['', '', ''])

plot.legend(bbox_to_anchor=(0.5, 0.5), 
            loc="lower left",
            labels=descriptions)

OUTPUT:

data image

<matplotlib.legend.Legend at 0x7f172c8ba8d0>

Would you prefer a bar chart? No problem, we just have to set the parameter kind to bar instead of pie:

plot = income_accounts.plot(y='gross amount', 
                            figsize=(5, 5), 
                            kind="bar",
                            legend=False)

data image

For bar charts we have to set kind to barh. So that it doesn't get too boring, we can also color the bars by passing a list of colors to the color parameter:

plot = income_accounts.plot(y='gross amount', 
                            figsize=(5, 5), 
                            kind="barh",
                            legend=False,
                            color=['green', 'orange', 'blue'])

data image

Charts for the Expenses Accounts

we can do the same now with our debitors (expenses accounts):

expenses_accounts = account_sums[account_sums["gross amount"] < 0]
expenses_accounts

	gross amount
account number
2010	-4090.00
2020	-10500.80
2030	-1350.00
2050	-900.00
2100	-612.00
2200	-69912.92
2300	-18791.92
2400	-1597.10
2500	-89.40
2600	-492.48
2610	-561.00

acc2descr_expenses = accounts2descr["description"].loc[expenses_accounts.index]
acc2descr_expenses

OUTPUT:

account number
2010            souvenirs
2020              clothes
2030       other articles
2050                books
2100           insurances
2200                wages
2300                loans
2400               hotels
2500               petrol
2600    telecommunication
2610             internet
Name: description, dtype: object

expenses_accounts.set_index(acc2descr_expenses.values, inplace=True)

expenses_accounts *= -1

labels = [''] * len(expenses_accounts)
plot = expenses_accounts.plot(kind="pie",
                            y='gross amount', 
                            figsize=(5, 5),
                            labels=labels)
plot.legend(bbox_to_anchor=(0.5, 0.5), 
            labels=expenses_accounts.index)

OUTPUT:

data image

<matplotlib.legend.Legend at 0x7f172868f0d0>

Tax Sums

We will sum up the amount according to their tax rate.

journal.drop(columns=["account number"])

	document number	description	tax rate	gross amount
date
2020-04-02	8983233038	Zurkan, Köln	19	4105.98
2020-04-02	57550799	Birmann, Souvenirs	19	-1890.00
2020-04-02	14989004	wages	0	-17478.23
2020-04-02	12766279	Filling Station, Petrol	19	-89.40
2020-04-02	3733462359	EnergyCom, Hamburg	19	4663.54
...	...	...	...	...
2020-07-25	5204418668	BoKoData, Bodensee, Konstanz	19	3678.38
2020-07-25	85241331	Hotel, Konstanz	7	-583.00
2020-07-27	26865618	Hotel, Franfurt	7	-450.00
2020-07-26	5892708524	Oscar Zopvar KG, Luxemburg	19	2589.80
2020-07-28	1633775505	Oscar Zopvar KG, Luxemburg	19	3578.46

87 rows × 4 columns

In the following we will define a function tax_sums that calculates the VAT sums according to tax rates from a journal DataFrame:

def tax_sums(journal_df, months=None):
    """ Returns a DataFrame with sales and tax rates -
         If a number or list is passed to 'months', only the sales
         of the corresponding months will be used.
         Example: tax_sums(df, months=[3, 6]) will only use the months
         3 (March) and 6 (June)"""
    if months:
        if isinstance(months, int):
            month_cond = journal_df.index.month == months
        elif isinstance(months, (list, tuple)):
            month_cond = journal_df.index.month.isin(months)
        positive = journal_df["gross amount"] > 0
        # sales_taxes eq. umsatzsteuer
        sales_taxes = journal_df[positive & month_cond]
        negative = journal_df["gross amount"] < 0
        # input_taxes equivalent to German Vorsteuer
        input_taxes = journal_df[negative & month_cond]   
    else:
        sales_taxes = journal_df[journal_df["gross amount"] > 0]
        input_taxes = journal_df[journal_df["gross amount"] < 0]
    
    sales_taxes = sales_taxes[["tax rate", "gross amount"]].groupby("tax rate").sum()
    sales_taxes.rename(columns={"gross amount": "Sales Gross"},
                       inplace=True)
    sales_taxes.index.name = 'Tax Rate'
    
    input_taxes = input_taxes[["tax rate", "gross amount"]].groupby("tax rate").sum()
    input_taxes.rename(columns={"gross amount": "Expenses Gross"},
                      inplace=True)
    input_taxes.index.name = 'Tax Rate'
    
    taxes = pd.concat([input_taxes, sales_taxes], axis=1)
    taxes.insert(1, 
                 column="Input Taxes", 
                 value=(taxes["Sales Gross"] * taxes.index / 100).round(2))
    taxes.insert(3, 
                 column="Sales Taxes", 
                 value=(taxes["Expenses Gross"] * taxes.index / 100).round(2))

    return taxes.fillna(0)

tax_sums(journal)

	Expenses Gross	Input Taxes	Sales Gross	Sales Taxes
Tax Rate
0	-90102.20	0.00	8334.43	-0.00
7	-3847.10	786.85	11240.71	-269.30
19	-14948.32	28386.20	149401.04	-2840.18

stsum_5 = tax_sums(journal, months=5)
stsum_6 = tax_sums(journal, months=6)
stsum_5

	Expenses Gross	Input Taxes	Sales Gross	Sales Taxes
Tax Rate
0	-22411.53	0.00	0.00	-0.00
7	-900.00	0.00	0.00	-63.00
19	-145.00	5952.51	31328.98	-27.55

tax_sums(journal, months=[5, 6])

	Expenses Gross	Input Taxes	Sales Gross	Sales Taxes
Tax Rate
0	-44812.14	0.00	6479.47	-0.00
7	-900.00	348.66	4980.84	-63.00
19	-268.12	12520.85	65899.23	-50.94

Live Python training

Enjoying this page? We offer live Python training courses covering the content of this site.

Upcoming online Courses

Python Intensive Course

Data Analysis with Python

Efficient Data Analysis with Pandas

Python Text Processing Course

See our Python training courses

See our Machine Learning with Python training courses