How to export a dataframe to a json file? [duplicate]

I want to export this DataFrame:

Data Frame

to a json file with this format:

     {"index":["Home", "CAT", 2015],"value":3.0},
     {"index":["Home", "CAT", 2016],"value":4.0},
     {"index":["Home", "CAT", 2017],"value":5.0},
     {"index":["Home", "CAT", 2018],"value":6.0},
     {"index":["Home", "CAT", 2019],"value":3.0},
     {"index":["Home", "CAT", 2020],"value":2.0},
     {"index":["Home", "DOG", 2015],"value":5.0},
     {"index":["Home", "DOG", 2016],"value":1.0},
     {"index":["Home", "DOG", 2017],"value":3.0},
     {"index":["Home", "DOG", 2018],"value":2.0},
     {"index":["Home", "DOG", 2019],"value":4.0},
     {"index":["Home", "DOG", 2020],"value":6.0}
     ...

I solve the problem with these lines:

df = df.to_json(orient="index")
df = df.replace("'","\"")
df = df.replace('(','index":[')
df = df.replace(')"','],"value"')
df = df.replace(',"i','},"i')
df = df.replace('"index"','{"index"')
df = df.replace('{{','{') 

Is there another way to do it more efficiently?


You can use stack which will create a new level in your multiindex with each of the years from your columns in this level (which will by default be named "level_2" since you already have two levels there) and a single row of values (which by default will be named "0"). Then you can reset the index and rename all of the columns as desired. For example:

import numpy as np
import pandas as pd

np.random.seed(42)
df = pd.DataFrame(
    np.random.randn(4, 4),
    index=pd.MultiIndex.from_arrays(
        [["HOME"] * 4, ["CAT", "DOG", "PARROT", "SNAKE"]],
        names=["REGION", "Animal"]
    ), columns=[2015, 2016, 2017, 2018]
)

Starting df:

                   2015      2016      2017      2018
REGION Animal                                        
HOME   CAT     0.496714 -0.138264  0.647689  1.523030
       DOG    -0.234153 -0.234137  1.579213  0.767435
       PARROT -0.469474  0.542560 -0.463418 -0.465730
       SNAKE   0.241962 -1.913280 -1.724918 -0.562288

Run this line::

new_df = df.stack().reset_index().rename(columns={"level_2": "YEAR",0:"value"})

Result:

   REGION  Animal  YEAR     value
0    HOME     CAT  2015  0.496714
1    HOME     CAT  2016 -0.138264
2    HOME     CAT  2017  0.647689
3    HOME     CAT  2018  1.523030
4    HOME     DOG  2015 -0.234153
5    HOME     DOG  2016 -0.234137
6    HOME     DOG  2017  1.579213
7    HOME     DOG  2018  0.767435
8    HOME  PARROT  2015 -0.469474
9    HOME  PARROT  2016  0.542560
10   HOME  PARROT  2017 -0.463418
11   HOME  PARROT  2018 -0.465730
12   HOME   SNAKE  2015  0.241962
13   HOME   SNAKE  2016 -1.913280
14   HOME   SNAKE  2017 -1.724918
15   HOME   SNAKE  2018 -0.562288