How to flatten a column of nested json objects in an already flattened dataframe

I have a json file with nested objects which is flattened in a pandas dataframe. There is a single column with nested json objects that I find hard the flatten.

I have tried many approaches, this is the approach that got me the furthest.

Help would be greatly appreciated, thank you.

Unfortunately I was unable to find a jsfiddle like alternative for python to provide a working example.

I understand that with the meta parameter of json_normalize I can add columns to my dataframe. But that approach does not work on the unflat column because I only got json_normalize to work well in my setup by setting record_path to 'markets' which is the main json object in my file. So in this setup I am unable to record_path to 'marketStats' and add any relevant columns via the meta parameter.

Goal

The goal is to transform one or all json objects in the marketStats object in to columns of the dataframe.

Code

with open('Data/20012022.json') as file:
data = json.loads(file.read())

# Flatten data
df0 = pd.json_normalize(
      data, 
      record_path =['markets']
)

df0.head(3)

Screenshot

This is a screenshot of what the table currently looks like, the marketStats column contains nested json.

enter image description here

Data

This a snippet from the json file. `

{
  "markets": [
    {
      "id": 335,
      "baseCurrency": "eth",
      "quoteCurrency": "btc",
      "exchangeName": "Binance",
      "exchangeCode": "BINA",
      "longName": "BTC-ETH",
      "marketName": "btc-eth",
      "symbol": "ETHBTC",
      "volume": "40624.5823",
      "quoteVolume": "3026.13646935",
      "btcVolume": "3026.13646935",
      "usdVolume": "127009429.050524367",
      "currentPrice": 0.074681,
      "latestBase": {
        "id": 161774475,
        "time": 1639576800,
        "date": "2021-12-15T14:00:00.000+00:00",
        "price": "0.077653",
        "lowestPrice": "0.0729",
        "bounce": "6.283",
        "currentDrop": "-3.8272829124438206",
        "crackedAt": "2022-01-07T03:00:00.000Z",
        "respectedAt": "2022-01-15T15:00:00.000Z",
        "isLowest": false
      },
      "marketStats": [
        {
          "algorithm": "original",
          "ratio": "50.0",
          "medianDrop": "-4.08",
          "medianBounce": "5.51",
          "hoursToRespected": 106,
          "crackedCount": 2,
          "respectedCount": 1
        },
        {
          "algorithm": "day_trade",
          "ratio": "100.0",
          "medianDrop": "-6.12",
          "medianBounce": "6.28",
          "hoursToRespected": 204,
          "crackedCount": 1,
          "respectedCount": 1
        },
        {
          "algorithm": "conservative",
          "ratio": "100.0",
          "medianDrop": "-6.12",
          "medianBounce": "8.38",
          "hoursToRespected": 204,
          "crackedCount": 1,
          "respectedCount": 1
        },
        {
          "algorithm": "position",
          "ratio": "50.0",
          "medianDrop": "-6.12",
          "medianBounce": "6.19",
          "hoursToRespected": 204,
          "crackedCount": 2,
          "respectedCount": 1
        },
        {
          "algorithm": "hodloo",
          "ratio": "50.0",
          "medianDrop": "-3.29",
          "medianBounce": "0.0",
          "hoursToRespected": 225,
          "crackedCount": 4,
          "respectedCount": 2
        }
      ]
    },
    {
      "id": 337,
      "baseCurrency": "ltc",
      "quoteCurrency": "btc",
      "exchangeName": "Binance",
      "exchangeCode": "BINA",
      "longName": "BTC-LTC",
      "marketName": "btc-ltc",
      "symbol": "LTCBTC",
      "volume": "68309.637",
      "quoteVolume": "223.79294524",
      "btcVolume": "223.79294524",
      "usdVolume": "9392773.4219378968",
      "currentPrice": 0.003275,
      "latestBase": {
        "id": 163982984,
        "time": 1642374000,
        "date": "2022-01-16T23:00:00.000+00:00",
        "price": "0.003346",
        "lowestPrice": "0.00322",
        "bounce": "3.839",
        "currentDrop": "-2.1219366407650926",
        "crackedAt": "2022-01-18T23:00:00.000Z",
        "respectedAt": null,
        "isLowest": false
      },
      "marketStats": [
        {
          "algorithm": "original",
          "ratio": "57.14",
          "medianDrop": "-3.28",
          "medianBounce": "3.84",
          "hoursToRespected": 186,
          "crackedCount": 7,
          "respectedCount": 4
        },
        {
          "algorithm": "day_trade",
          "ratio": "0.0",
          "medianDrop": "0.0",
          "medianBounce": "5.68",
          "hoursToRespected": 0,
          "crackedCount": 1,
          "respectedCount": 0
        },
        {
          "algorithm": "conservative",
          "ratio": "0.0",
          "medianDrop": "0.0",
          "medianBounce": "5.68",
          "hoursToRespected": 0,
          "crackedCount": 1,
          "respectedCount": 0
        },
        {
          "algorithm": "position",
          "ratio": "0.0",
          "medianDrop": "0.0",
          "medianBounce": "8.16",
          "hoursToRespected": 0,
          "crackedCount": 1,
          "respectedCount": 0
        },
        {
          "algorithm": "hodloo",
          "ratio": "75.0",
          "medianDrop": "-3.7",
          "medianBounce": "0.0",
          "hoursToRespected": 35,
          "crackedCount": 4,
          "respectedCount": 3
        }
      ]
    },
    {
      "id": 339,
      "baseCurrency": "bnb",
      "quoteCurrency": "btc",
      "exchangeName": "Binance",
      "exchangeCode": "BINA",
      "longName": "BTC-BNB",
      "marketName": "btc-bnb",
      "symbol": "BNBBTC",
      "volume": "154576.177",
      "quoteVolume": "1724.66664804",
      "btcVolume": "1724.66664804",
      "usdVolume": "72385673.4448901928",
      "currentPrice": 0.01099,
      "latestBase": {
        "id": 163753765,
        "time": 1642068000,
        "date": "2022-01-13T10:00:00.000+00:00",
        "price": "0.01093",
        "lowestPrice": "0.01093",
        "bounce": "3.102",
        "currentDrop": "0.5489478499542543",
        "crackedAt": null,
        "respectedAt": null,
        "isLowest": false
      },
      "marketStats": [
        {
          "algorithm": "original",
          "ratio": "100.0",
          "medianDrop": "-7.18",
          "medianBounce": "4.34",
          "hoursToRespected": 62,
          "crackedCount": 2,
          "respectedCount": 2
        },
        {
          "algorithm": "day_trade",
          "ratio": "100.0",
          "medianDrop": "-6.19",
          "medianBounce": "4.3",
          "hoursToRespected": 63,
          "crackedCount": 1,
          "respectedCount": 1
        },
        {
          "algorithm": "conservative",
          "ratio": "66.67",
          "medianDrop": "-3.15",
          "medianBounce": "4.05",
          "hoursToRespected": 62,
          "crackedCount": 3,
          "respectedCount": 2
        },
        {
          "algorithm": "position",
          "ratio": "100.0",
          "medianDrop": "-3.15",
          "medianBounce": "4.46",
          "hoursToRespected": 60,
          "crackedCount": 2,
          "respectedCount": 2
        },
        {
          "algorithm": "hodloo",
          "ratio": "100.0",
          "medianDrop": "-7.46",
          "medianBounce": "0.0",
          "hoursToRespected": 62,
          "crackedCount": 5,
          "respectedCount": 5
        }
      ]
    }
  ]
}

You can apply some post-processing to df0 to achieve what you want. Here you can apply explode followed by apply(pf.Series) applied to 'marketStats' column:

df1 = df0.explode('marketStats')['marketStats'].apply(pd.Series)

df1 looks like this:

    algorithm       ratio    medianDrop    medianBounce    hoursToRespected    crackedCount    respectedCount
--  ------------  -------  ------------  --------------  ------------------  --------------  ----------------
 0  original        50            -4.08            5.51                 106               2                 1
 0  day_trade      100            -6.12            6.28                 204               1                 1
 0  conservative   100            -6.12            8.38                 204               1                 1
 0  position        50            -6.12            6.19                 204               2                 1
 0  hodloo          50            -3.29            0                    225               4                 2
 1  original        57.14         -3.28            3.84                 186               7                 4
 1  day_trade        0             0               5.68                   0               1                 0
 1  conservative     0             0               5.68                   0               1                 0
 1  position         0             0               8.16                   0               1                 0
 1  hodloo          75            -3.7             0                     35               4                 3
 2  original       100            -7.18            4.34                  62               2                 2
 2  day_trade      100            -6.19            4.3                   63               1                 1
 2  conservative    66.67         -3.15            4.05                  62               3                 2
 2  position       100            -3.15            4.46                  60               2                 2
 2  hodloo         100            -7.46            0                     62               5                 5

if you want it combined with all the other columns you can use join:

df0.join(df1)

I am not going to post the output of this command as it is rather large