How do you Unit Test Python DataFrames

How do I unit test Python DataFrames?

Almost every function I have takes a DataFrame as input and returns a DataFrame as output. If I want to unit test these functions, what is the best way to do it? It seems like a lot of effort to create a new DataFrame (with values populated) for every function.

Are there any materials you can refer me to? Should you write unit tests for these functions?

Solution 1:

While Pandas' test functions are primarily used for internal testing, NumPy includes a very useful set of testing functions that are documented here: NumPy Test Support.

These functions compare NumPy arrays, but you can get the array that underlies a Pandas DataFrame using the values property. You can define a simple DataFrame and compare what your function returns to what you expect.
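As a minimal sketch of that approach: `add_total` below is a hypothetical function standing in for your own code, and the column names and numbers are made up for illustration. The test builds a two-row DataFrame, runs the function, and compares the underlying array with the expected one.

```python
import numpy as np
import pandas as pd


def add_total(df):
    """Hypothetical function under test: returns a copy with a 'total' column."""
    result = df.copy()
    result["total"] = result["price"] * result["quantity"]
    return result


def test_add_total():
    df = pd.DataFrame({"price": [1.5, 2.0], "quantity": [2, 3]})
    # One row per input row; columns are price, quantity, total
    expected = np.array([[1.5, 2.0, 3.0],
                         [2.0, 3.0, 6.0]])
    actual = add_total(df).values
    # assert_allclose tolerates small floating-point differences
    np.testing.assert_allclose(actual, expected)
```

Note that `.values` drops the column names and the index, so a test like this only checks the cell values. If the labels matter, they need a separate assertion.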

One technique you can use is to define one set of test data for a number of functions. That way, you can use Pytest Fixtures to define that DataFrame once, and use it in multiple tests.
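A rough sketch of the fixture idea, assuming pytest is installed. The data, the `orders` fixture name, and the two functions under test (`total_by_customer`, `drop_small_orders`) are all hypothetical and only there to show the shape of the tests.

```python
import pandas as pd
import pytest


def make_orders():
    """Build the small DataFrame shared by the tests below (made-up data)."""
    return pd.DataFrame({
        "customer": ["a", "b", "a"],
        "amount": [10.0, 20.0, 5.0],
    })


@pytest.fixture
def orders():
    # pytest runs this for every test that asks for `orders`,
    # so each test gets its own fresh DataFrame
    return make_orders()


def total_by_customer(df):
    """Hypothetical function under test: sum of amounts per customer."""
    return df.groupby("customer", as_index=False)["amount"].sum()


def drop_small_orders(df, minimum):
    """Hypothetical function under test: keep orders of at least `minimum`."""
    return df[df["amount"] >= minimum].reset_index(drop=True)


def test_total_by_customer(orders):
    result = total_by_customer(orders)
    assert result["amount"].tolist() == [15.0, 20.0]


def test_drop_small_orders(orders):
    result = drop_small_orders(orders, minimum=10.0)
    assert result["customer"].tolist() == ["a", "b"]
```

Keeping the data in a plain builder function (`make_orders`) and wrapping it in the fixture is a design choice, not a requirement. It lets you reuse the same data outside pytest, for example in a notebook.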

In terms of resources, I found this article on Testing with NumPy and Pandas to be very useful. I also did a short presentation about data analysis testing at PyCon Canada 2016: Automate Your Data Analysis Testing.

Solution 2:

You can use the pandas testing functions.

They give you more flexibility to compare your actual result with the expected result in different ways.

For example:

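import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({'a': [6, 7, 8, 9, 10]})

# Expected element-wise sum of the two 'a' columns
expected_res = pd.Series([7, 9, 11, 13, 15])
# check_names=False: the summed Series is named 'a', the expected one is unnamed
pd.testing.assert_series_equal(df1['a'] + df2['a'], expected_res, check_names=False)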

For more details, refer to this link.
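Since the question is about functions that return whole DataFrames, here is a small sketch using `pd.testing.assert_frame_equal` instead of the Series version. `normalise` is a hypothetical function under test and the data is made up.

```python
import pandas as pd


def normalise(df):
    """Hypothetical function under test: scales column 'a' by its maximum."""
    result = df.copy()
    result["a"] = result["a"] / result["a"].max()
    return result


def test_normalise():
    df = pd.DataFrame({"a": [1, 2, 4], "label": ["x", "y", "z"]})
    # Columns are deliberately listed in a different order than in the input
    expected = pd.DataFrame({"label": ["x", "y", "z"], "a": [0.25, 0.5, 1.0]})
    # check_like=True ignores the order of columns and index labels
    pd.testing.assert_frame_equal(normalise(df), expected, check_like=True)
```

Unlike comparing raw arrays, this also checks column names, the index, and dtypes. If exact dtypes do not matter for your test, passing `check_dtype=False` relaxes that part of the comparison.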

Solution 3:

I don't think it's hard to create small DataFrames for unit testing:

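import pandas as pd

# `some`, `other`, `values` and `[...]` are placeholders for your own test data.
input_df = pd.DataFrame.from_dict({
    'field_1': [some, values],
    'field_2': [other, values]
})
expected = {
    'result': [...]
}
# orient='list' yields {column: [values]}, matching the shape of `expected`;
# the default orient would yield {column: {index: value}} instead.
# A plain assert avoids depending on nose, which is no longer maintained.
assert my_func(input_df).to_dict(orient='list') == expected, "oops, there's a bug..."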
