R summary() equivalent in numpy
Is there an equivalent of R's summary() function in numpy?
numpy has std, mean, and average functions separately, but does it have a function that summarizes everything at once, like summary does in R?
I found this question, which relates to pandas, and this article with R-to-numpy equivalents, but neither has what I'm looking for.
1. Import pandas in the console and load the CSV data file
import pandas as pd
data = pd.read_csv("data.csv", sep = ",")
2. Examine first few rows of data
data.head()
3. Calculate summary statistics
summary = data.describe()
4. Transpose statistics to get similar format as R summary() function
summary = summary.transpose()
5. Visualize summary statistics in console
summary.head()
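The steps above can be tried without a file on disk. The following is a minimal sketch that assumes a small made-up CSV (the column names height and weight and their values are purely illustrative stand-ins for data.csv) read from an in-memory string:

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv; columns and values are made up.
csv_text = """height,weight
150,50
160,60
170,70
180,80
"""

# Read the CSV text exactly as read_csv would read a file.
data = pd.read_csv(io.StringIO(csv_text))

# describe() puts statistics in rows; transpose() gives one row per variable,
# which is closer to how R prints summary().
summary = data.describe().transpose()
print(summary)
```

Each row of the result is one column of the original data, with count, mean, std, min, the three quartiles, and max as columns.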
No. You'll need to use pandas.
R is a language for statistics, so much of the basic functionality you need, like summary() and lm(), is loaded when you start it up. Python is a general-purpose language, so you need to install and import the appropriate statistical packages. numpy isn't a statistics package; it's for numerical computation more generally, so you need packages like pandas, scipy and statsmodels to let Python do what R can do out of the box.
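If pandas is not an option, the individual numpy functions can be combined by hand. The sketch below is not a numpy API: the helper name summary and the labels in the returned dict are assumptions chosen to mimic R's output for a numeric vector. It does not handle NaN values or non-numeric data.

```python
import numpy as np


def summary(values):
    """Hypothetical helper: R-style six-number summary of a 1-D numeric array."""
    arr = np.asarray(values, dtype=float)
    # np.percentile uses linear interpolation by default.
    q1, median, q3 = np.percentile(arr, [25, 50, 75])
    return {
        "Min.": arr.min(),
        "1st Qu.": q1,
        "Median": median,
        "Mean": arr.mean(),
        "3rd Qu.": q3,
        "Max.": arr.max(),
    }


result = summary([1, 2, 3, 4, 5])
for label, value in result.items():
    print(f"{label:>8}: {value}")
```

For anything beyond a single numeric array (mixed column types, missing values, grouped data), the pandas route is considerably less work.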
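If you are looking for details like those from summary() in R, i.e.
- a five-number summary for numeric variables
- the frequency of occurrence of each class for categorical variables
you can use df.describe(include='all') in Python. Note that for categorical columns it reports only count, unique, top, and freq; for the count of every class, use value_counts() on the column.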
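As a rough illustration of describe with include='all', here is a sketch on a tiny made-up DataFrame (the columns age and city and their values are assumptions for the example only). It also shows value_counts() for per-class frequencies, which describe does not list in full:

```python
import pandas as pd

# Made-up data: one numeric column and one categorical (string) column.
df = pd.DataFrame({
    "age": [23, 35, 31, 46],
    "city": ["Oslo", "Lima", "Oslo", "Oslo"],
})

# include="all" summarizes every column; statistics that do not apply
# to a column's type are filled with NaN.
full = df.describe(include="all")
print(full)

# Per-class frequencies for the categorical column.
counts = df["city"].value_counts()
print(counts)
```

In the describe output, the city column gets unique, top, and freq (the most common value and how often it occurs), while age gets mean, std, and the quartiles.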