Convert summary to data.frame
I have this admission_table containing ADMIT, GRE, GPA and RANK.
> head(admission_table)
  ADMIT GRE  GPA RANK
1     0 380 3.61    3
2     1 660 3.67    3
3     1 800 4.00    1
4     1 640 3.19    4
5     0 520 2.93    4
6     1 760 3.00    2
I'm trying to convert the summary of this table into a data.frame. I want to have ADMIT, GRE, GPA and RANK as my column headers.
> summary(admission_table)
     ADMIT             GRE             GPA             RANK
 Min.   :0.0000   Min.   :220.0   Min.   :2.260   Min.   :1.000
 1st Qu.:0.0000   1st Qu.:520.0   1st Qu.:3.130   1st Qu.:2.000
 Median :0.0000   Median :580.0   Median :3.395   Median :2.000
 Mean   :0.3175   Mean   :587.7   Mean   :3.390   Mean   :2.485
 3rd Qu.:1.0000   3rd Qu.:660.0   3rd Qu.:3.670   3rd Qu.:3.000
 Max.   :1.0000   Max.   :800.0   Max.   :4.000   Max.   :4.000
> as.data.frame(summary(admission_table))
   Var1  Var2 Freq
 1      ADMIT Min.   :0.0000
 2      ADMIT 1st Qu.:0.0000
 3      ADMIT Median :0.0000
 4      ADMIT Mean   :0.3175
 5      ADMIT 3rd Qu.:1.0000
 6      ADMIT Max.   :1.0000
 7        GRE Min.   :220.0
 8        GRE 1st Qu.:520.0
 9        GRE Median :580.0
10        GRE Mean   :587.7
11        GRE 3rd Qu.:660.0
12        GRE Max.   :800.0
13        GPA Min.   :2.260
14        GPA 1st Qu.:3.130
15        GPA Median :3.395
16        GPA Mean   :3.390
17        GPA 3rd Qu.:3.670
18        GPA Max.   :4.000
19       RANK Min.   :1.000
20       RANK 1st Qu.:2.000
21       RANK Median :2.000
22       RANK Mean   :2.485
23       RANK 3rd Qu.:3.000
24       RANK Max.   :4.000
When I try to convert it into a data.frame, this is the only result I get. I want the data frame to have exactly the same layout as the summary table, because I then want to insert it into an Oracle database using this line of code:
dbWriteTable(connection,name="SUM_ADMISSION_TABLE",value=as.data.frame(summary(admission_table)),row.names = FALSE, overwrite = TRUE ,append = FALSE)
Is there any way to do so?
You can consider unclass, I suppose:
data.frame(unclass(summary(mydf)), check.names = FALSE, stringsAsFactors = FALSE)
# ADMIT GRE GPA RANK
# 1 Min. :0.0000 Min. :380.0 Min. :2.930 Min. :1.000
# 2 1st Qu.:0.2500 1st Qu.:550.0 1st Qu.:3.047 1st Qu.:2.250
# 3 Median :1.0000 Median :650.0 Median :3.400 Median :3.000
# 4 Mean :0.6667 Mean :626.7 Mean :3.400 Mean :2.833
# 5 3rd Qu.:1.0000 3rd Qu.:735.0 3rd Qu.:3.655 3rd Qu.:3.750
# 6 Max. :1.0000 Max. :800.0 Max. :4.000 Max. :4.000
str(.Last.value)
# 'data.frame': 6 obs. of 4 variables:
# $ ADMIT: chr "Min. :0.0000 " "1st Qu.:0.2500 " "Median :1.0000 " "Mean :0.6667 " ...
# $ GRE : chr "Min. :380.0 " "1st Qu.:550.0 " "Median :650.0 " "Mean :626.7 " ...
# $ GPA : chr "Min. :2.930 " "1st Qu.:3.047 " "Median :3.400 " "Mean :3.400 " ...
# $ RANK: chr "Min. :1.000 " "1st Qu.:2.250 " "Median :3.000 " "Mean :2.833 " ...
Note that there is a lot of excessive whitespace there, in both the names and the values.
However, it might be sufficient to do something like:
do.call(cbind, lapply(mydf, summary))
#          ADMIT   GRE   GPA  RANK
# Min.    0.0000 380.0 2.930 1.000
# 1st Qu. 0.2500 550.0 3.048 2.250
# Median  1.0000 650.0 3.400 3.000
# Mean    0.6667 626.7 3.400 2.833
# 3rd Qu. 1.0000 735.0 3.655 3.750
# Max.    1.0000 800.0 4.000 4.000
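Since the goal in the question is to write the result with dbWriteTable(..., row.names = FALSE), note that the cbind approach returns a matrix whose statistic labels live in the row names, which would be dropped on write. A minimal sketch of one way around that, assuming mydf holds the six rows shown in the question; the STATISTIC column name and the summary_df variable are made up for illustration:

```r
# Sample data copied from the head() output in the question
mydf <- data.frame(
  ADMIT = c(0, 1, 1, 1, 0, 1),
  GRE   = c(380, 660, 800, 640, 520, 760),
  GPA   = c(3.61, 3.67, 4.00, 3.19, 2.93, 3.00),
  RANK  = c(3, 3, 1, 4, 4, 2)
)

# One summary per column, bound side by side into a numeric matrix
summary_matrix <- do.call(cbind, lapply(mydf, summary))

# Move the statistic labels from the row names into a regular column
# (STATISTIC is a hypothetical name) so they survive row.names = FALSE
summary_df <- data.frame(
  STATISTIC = rownames(summary_matrix),
  summary_matrix,
  row.names = NULL,
  check.names = FALSE,
  stringsAsFactors = FALSE
)

print(summary_df)
```

The resulting summary_df could then be passed as the value argument in the dbWriteTable() call from the question. The values stay numeric instead of being stored as formatted strings.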
Another way to output a dataframe is:
as.data.frame(apply(mydf, 2, summary))
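This works only if all selected columns are numeric. It may also throw an Error in dimnames(x) if some columns contain NAs, because summary() then adds an extra NA's entry for those columns and the per-column results no longer have the same length. It's worth checking the result without the as.data.frame() call first.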
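To illustrate the NA caveat, here is a rough sketch that first checks the per-column summary lengths and then computes a fixed-length set of statistics, so every column yields the same number of rows. The mydf_na data and the seven_stats helper are made up for this example and are not part of the answer above:

```r
# Hypothetical data: GRE contains one NA, GPA does not
mydf_na <- data.frame(
  GRE = c(380, 660, NA, 640, 520, 760),
  GPA = c(3.61, 3.67, 4.00, 3.19, 2.93, 3.00)
)

# Columns with NAs get an extra "NA's" entry, so the lengths differ
lens <- lengths(lapply(mydf_na, summary))
print(lens)

# Keep only numeric columns before summarising
num_cols <- mydf_na[vapply(mydf_na, is.numeric, logical(1))]

# Hypothetical helper: always returns the same seven statistics
seven_stats <- function(x) {
  q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1),
                na.rm = TRUE, names = FALSE)
  c(Min = q[1], Q1 = q[2], Median = q[3],
    Mean = mean(x, na.rm = TRUE),
    Q3 = q[4], Max = q[5],
    NAs = sum(is.na(x)))
}

# vapply() enforces the fixed length and returns a 7-row matrix
summary_df <- as.data.frame(vapply(num_cols, seven_stats, numeric(7)))
print(summary_df)
```

Using vapply() with a declared result length means a mismatch fails loudly at the summarising step instead of later inside as.data.frame().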
None of these solutions actually captures the printed output of the summary function. The tidy() function (from the broom package) extracts the elements of a summary object into a plain data.frame, so it does not preserve other features or formatting.
If you want the exact output of the summary function in a data frame, you can do:
output <- capture.output(summary(thisModel), file = NULL, append = FALSE)
output_df <- as.data.frame(output)
This retains all of the new lines and is suitable for writing to XLSX, etc., which will result in the output appropriately spaced across rows.
If you want this output collapsed into a single cell, you can do:
output_collapsed <- paste0(output, collapse = "\n")
output_df <-as.data.frame(output_collapsed)
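For a self-contained illustration of the same idea applied to the question's data rather than a model object, the following sketch captures the printed summary of a data frame. It assumes mydf holds the six sample rows from the question; collapsed_df is a name chosen here for illustration:

```r
# Sample data copied from the head() output in the question
mydf <- data.frame(
  ADMIT = c(0, 1, 1, 1, 0, 1),
  GRE   = c(380, 660, 800, 640, 520, 760),
  GPA   = c(3.61, 3.67, 4.00, 3.19, 2.93, 3.00),
  RANK  = c(3, 3, 1, 4, 4, 2)
)

# Capture the printed summary, one character string per printed line
output <- capture.output(summary(mydf))

# One row per printed line
output_df <- data.frame(output = output, stringsAsFactors = FALSE)

# Everything in a single cell, with lines separated by newlines
output_collapsed <- paste(output, collapse = "\n")
collapsed_df <- data.frame(output = output_collapsed,
                           stringsAsFactors = FALSE)

cat(collapsed_df$output, "\n")
```

Keep in mind that this stores the summary as display text, so the individual statistics are not available as separate numeric columns.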