Append data to CSV using a nested loop
I am trying to append data from the list json_response, which contains Twitter data, to a CSV file using the function append_to_csv.
I understand the structure of json_response. It contains data on users who follow two politicians: 5 and 13 users respectively. 1) author_id, created_at, tweet_id and text are in data. 2) description/bio is in ['includes']['users']. 3) url/image_url is in ['includes']['media']. However, my nested loop does not append any data to sample_data.csv, and it throws no error. Does it have something to do with my indentation?
print(json.dumps(json_response, indent=4, sort_keys=True)) # look at json_response object.
[
{
"data": [
{
"author_id": "2877379617",
"created_at": "2021-03-25T12:11:14.000Z",
"id": "1375057688355336195",
"text": "@prettynobodyco She blocked me in 2015 - for pointing out that Tim Kaine enables sexual assault in the military and the evidence was his killing of the MJIA and publicly stated that Military commanders should remain in charge of military rape cases. She's Tanden level awful. Congrats!"
},
{
"author_id": "1265018154444562440",
"created_at": "2021-03-22T19:48:59.000Z",
"id": "1374085719472361474",
"text": "@MehcatCat @AlasscanIsBack @PattyArquette @timkaine Funny, they blocked me. \ud83e\udd23\ud83e\udd23"
},
{
"author_id": "2378324935",
"created_at": "2021-03-07T21:32:13.000Z",
"id": "1368675879312887810",
"text": "@DrWinarick @KatieOGrady4 I apologize for any drama. Katie O Grady blocked me because we had a disagreement about Tim Kaine on one of your older posts. I guess I can't please everyone haha. :/"
},
{
"author_id": "821870502943817729",
"created_at": "2021-02-12T23:53:59.000Z",
"id": "1360376637385244673",
"text": "She blocked me a long ass time ago when I asked her why we shoulf care about Tim Kaine's personal view on abortion if it didn't impact legislation"
},
{
"attachments": {
"media_keys": [
"16_1341045032732770306"
]
},
"author_id": "17232340",
"created_at": "2020-12-21T15:37:07.000Z",
"id": "1341045038420275205",
"text": "@DSingh4Biden @moomintroll8 @timkaine @GovernorVA That's why I replied to you. She blocked me previously, for what silliness I can't remember. Tough being a troll AND a snowflake!"
}
],
"includes": {
"media": [
{
"media_key": "16_1341045032732770306",
"type": "animated_gif"
}
],
"users": [
{
"created_at": "2014-11-15T02:23:57.000Z",
"description": "",
"id": "2877379617",
"name": "Laura Saylor",
"username": "lauraleesaylor"
},
{
"created_at": "2020-05-25T20:33:36.000Z",
"description": "Weird Writer & Lunatic Linguist\nWicked Witch of the East\nshe/her",
"id": "1265018154444562440",
"name": "Zauberkind",
"username": "Zauberkind2"
},
{
"created_at": "2014-03-08T07:22:31.000Z",
"description": "#Resist, #BLM, #Vaxxed, liberal, autistic, kidney transplant survivor, political nerd, mental health advocate, fighter for equality, truth, justice, etc.",
"id": "2378324935",
"name": "Trevor \"Trev\" McKee Achilles",
"username": "MrTAchilles"
},
{
"created_at": "2017-01-19T00:02:52.000Z",
"description": "statist / Progressive Gun Nut/ Single and hating it\n\n / \n\nstraight????? /\n\npronouns / brain worm survivor\n\n",
"id": "821870502943817729",
"name": "Puppet Enthusiast",
"username": "nihilisticpillo"
},
{
"created_at": "2008-11-07T15:09:46.000Z",
"description": "Liberal-Veteran-Dog Lover | Taste for irony, but in moderation | Humor is reason gone mad. ~Groucho Marx | I follow & unfollow back #VeteransResist #Resist",
"id": "17232340",
"name": "anti-Fascist Jim",
"username": "JimnBL"
}
]
},
"meta": {
"newest_id": "1375057688355336195",
"next_token": "b26v89c19zqg8o3fos5vyedr54ngvtx3nuqvnx6pglrb1",
"oldest_id": "1341045038420275205",
"result_count": 5
}
},
{
"data": [
{
"author_id": "737885223858384896",
"created_at": "2021-03-26T21:56:02.000Z",
"id": "1375567243082338314",
"text": "@hogan_1969 @LindseyGrahamSC LOL She Blocked me.. could not admit the truth could she now. okay so where is her source for the shirts? and that is what he said. I (quote) We immediately surge the border all those seeking asylum. What about his lie about the cages? no Answer lol."
},
{
"author_id": "847612931487416323",
"created_at": "2021-03-26T21:55:24.000Z",
"id": "1375567083791073283",
"text": "@hogan_1969 @TeichTerry @thehill @LindseyGrahamSC @hogan_1969 just blocked me for showing her the actual numbers \ud83e\udd23\n\n#LiberalsHateFacts"
},
{
"author_id": "18634205",
"created_at": "2021-03-08T12:29:00.000Z",
"id": "1368901564363051010",
"text": "Huh. Made me think if @LeaderMcConnell @LindseyGrahamSC @marcorubio @SenTedCruz feel trapped under the thumb of Trumpy. And who else? @IvankaTrump? @MELANIATRUMP ? @DonaldJTrumpJr ? I\u2019d say Eric, but he blocked me."
},
{
"author_id": "27327319",
"created_at": "2021-03-02T11:53:16.000Z",
"id": "1366718245521211393",
"text": "@fedupinNHtoo @LindseyGrahamSC Exactly. I asked that question of a Republican on Facebook last night and she blocked me"
},
{
"author_id": "917634626247647232",
"created_at": "2021-02-28T18:16:45.000Z",
"id": "1366089974907432961",
"text": "@gop this is for you! @tedcruz @LindseyGrahamSC @MittRomney @mikepompeo\n#BitchyMcC blocked me!\ud83d\udc4d\nWatch \"Jack Off Jill - Hypocrite + lyrics\" on YouTube"
},
{
"author_id": "1231059979844456448",
"created_at": "2021-02-26T04:25:49.000Z",
"id": "1365156089554067459",
"text": "@KelleyALynch1 @marwilliamson @therecount @LindseyGrahamSC She's fine with that just as she's fine with Biden's Nazis in Ukraine. She wants war with Russia, too. She blocked me for this tweet because she couldn't even condemn Biden's Nazis in Ukraine. She's a fauxgressive warmonger, a wolf in sheep's clothing. \n"
},
{
"author_id": "1315477593303310336",
"created_at": "2021-02-23T00:00:41.000Z",
"id": "1364002202843451399",
"text": "@MistyKitty3 @BlairMurray83 @FrankAmari2 @LindseyGrahamSC \ud83e\udd23 Someone didn\u2019t like what I said and blocked me."
},
{
"author_id": "1069115263671562240",
"created_at": "2021-02-22T04:36:06.000Z",
"id": "1363709124891070467",
"text": "@trinkity88 @LindseyGrahamSC Apparently, @Trinkitty88 blocked me because FACTS are TOO HARD to handle!\ud83e\udd23\ud83e\udd23\ud83e\udd23\ud83e\udd23\ud83e\udd23\ud83e\udd23"
},
{
"author_id": "1303321972227690496",
"created_at": "2021-02-20T19:38:49.000Z",
"id": "1363211526316969985",
"text": "@horsin64 @GovMurphy @LindseyGrahamSC You blocked me because you\u2019re a nifkin. It\u2019s not cyber tough you Nancy I\u2019d say it to your face. American lives matter before anyone else. America first and you don\u2019t like it because you have trump derangement. You\u2019re a psycho"
},
{
"author_id": "27943005",
"created_at": "2021-02-19T20:00:38.000Z",
"id": "1362854626924650497",
"text": "@TonyRom31334975 @staceyabrams @AnnaForFlorida @LindseyGrahamSC The guy blocked me on Twitter and had to unblock me after the Knight First Amendment Institute sued him and won> I am certain It won't talk to me, but imagine..hehe?!"
},
{
"attachments": {
"media_keys": [
"3_1361344652264280068"
]
},
"author_id": "1126249378279297027",
"created_at": "2021-02-15T16:00:32.000Z",
"id": "1361344654395011079",
"text": "@Jamie1074 @Breaking911 You know what\n\nIt's funny that they blocked me because I actually did agree with them on Lindsey Graham...\n\nCome on, man !"
},
{
"author_id": "1207432044390699008",
"created_at": "2021-02-14T07:58:21.000Z",
"id": "1360860918687559681",
"text": "@LindseyGrahamSC I really don't know why you haven't blocked me yet. Pile of human shit. I just read a letter that John McCain wrote me and for some reason it made me think about you and what he would think about your behavior. I guarantee you'd be in for an ass whippin'. Dick."
},
{
"author_id": "926909484",
"created_at": "2021-02-13T20:53:03.000Z",
"id": "1360693490880032770",
"text": "@LadyReverbs @themariefonseca @styvanswift @LindseyGrahamSC Lady, you might be able to see Marie\u2019s tweets. She blocked me. She may call this a victory for Trump. The reality is that seven members of the @GOP voted to convict. They are the true patriots of the Republican Party."
}
],
"includes": {
"media": [
{
"media_key": "3_1361344652264280068",
"type": "photo",
"url": ""
}
],
"users": [
{
"created_at": "2016-06-01T05:55:21.000Z",
"description": "Biden Inflation the worst in 30 years. His Handlers trying to Rebrand Brandon is Hilarious.",
"id": "737885223858384896",
"name": "Biden is a complete mess and you know it.",
"username": "zelda3024"
},
{
"created_at": "2017-03-31T00:54:05.000Z",
"description": "Love God, Love Family, Love Country, Love Freedom - if we put those things first everything else will be great. MAGA",
"id": "847612931487416323",
"name": "Joey Bagadonuts",
"username": "AmericanGr8ness"
},
{
"created_at": "2009-01-05T15:25:55.000Z",
"description": "small & local garlic farmer; independent American; old surfer dude; working to find and speak truth to power; \ud83c\uddfa\ud83c\uddf8; mahalo and Maluhia",
"id": "18634205",
"name": "MacGregorGarlic",
"username": "MacGregorGarlic"
},
{
"created_at": "2009-03-28T22:53:28.000Z",
"description": "Let's Go Darwin!",
"id": "27327319",
"name": "Karen Kennedy",
"username": "KayKay68"
},
{
"created_at": "2017-10-10T06:15:18.000Z",
"description": "Mom\ud83d\udc95Cannactivist\ud83c\udf3fSecularHumanist\ud83c\udf10 BLM\u270a\ud83c\udfff\ud83c\udf08Ally\ud83e\udd8bCPTSD\u2695\ufe0f FTD\ud83e\udd14MeToo\ud83c\udf38ProChoice\ud83d\udc93CRPS\ud83d\ude23ClimateChange\ud83c\udf0e DACA\ud83c\uddfa\ud83c\uddf2AdoptDontShop\ud83d\udc3e#Steelers \ud83d\udda4\ud83d\udc9b #Vaxxed2TheMax\u270a\ud83d\udc9a",
"id": "917634626247647232",
"name": "Raven The Hemptress #LegalizeGlobally\ud83d\udc9a\ud83c\udf3f\u267f",
"username": "Kraven_Raven24"
},
{
"created_at": "2020-02-22T03:35:56.000Z",
"description": "Monetarism is the underlying cause of our disease; human progress and peace through development is the cure. Eurasian integration will benefit all of humanity!",
"id": "1231059979844456448",
"name": "\ud83c\udd70pocalypsis \ud83c\udd70pocalypseos \u2014 BRI Is The Future",
"username": "apocalypseos"
},
{
"created_at": "2020-10-12T02:21:21.000Z",
"description": "Father of two beautiful boys. Believer in the Constitution of the United States. Protector of my own rights. #Meatatarian",
"id": "1315477593303310336",
"name": "\ud83e\udd85 Steven Duggin \u2665\ufe0f \ud83c\uddfa\ud83c\uddf8\ud83d\uddfd",
"username": "itsStevenDuggin"
},
{
"created_at": "2018-12-02T06:25:16.000Z",
"description": "",
"id": "1069115263671562240",
"name": "Barhag",
"username": "TheBarhag"
},
{
"created_at": "2020-09-08T13:19:17.000Z",
"description": "Not the liberals cup of tea",
"id": "1303321972227690496",
"name": "Christy",
"username": "Christy54177764"
},
{
"created_at": "2009-03-31T19:34:24.000Z",
"description": "NY-grown, FL-tanned, scribe, word nerd, TV junkie, game show champ, yenta, wife, twin mama, hot sauce collector, Bloody Mary maven &, says @NYPost, savvy gadfly",
"id": "27943005",
"name": "Lesley Abravanel",
"username": "lesleyabravanel"
},
{
"created_at": "2019-05-08T22:15:51.000Z",
"description": "\u2600\ufe0f I post Yuuko Aioi pictures daily \u2600\ufe0f\n\nI also like being wholesome, making new friends, posting about games, my everyday life, cats, NASCAR, good vibes, fumos!",
"id": "1126249378279297027",
"name": "Vaxen #DailyYuuko \u2603\ufe0f",
"username": "YuukoEnjoyer"
},
{
"created_at": "2019-12-18T22:47:10.000Z",
"description": "The Republican party is bad for America. The Conservatives are Trump bootlickers who are afraid to stand up to him. This great nation is in serious trouble.",
"id": "1207432044390699008",
"name": "Angry Patriot",
"username": "AngryPatriot20"
},
{
"created_at": "2012-11-05T05:19:37.000Z",
"description": "Employment lawyer. Represent employers and employees. 30 years ago, my mentor told me to seek the truth as a lawyer. Still do that. Tweets are not legal advice.",
"id": "926909484",
"name": "Alfred Southerland",
"username": "TexasEEOLaw"
}
]
},
"meta": {
"newest_id": "1375567243082338314",
"next_token": "b26v89c19zqg8o3fosnr8q7zstmzppg3jgd1cvynkb919",
"oldest_id": "1360693490880032770",
"result_count": 13
}
}
]
import csv
import dateutil.parser

# Create file
csvFile = open("sample_data.csv", "a", newline="", encoding='utf-8')
csvWriter = csv.writer(csvFile)
# Create headers for the data I want to save. I only want to save these columns in my dataset
csvWriter.writerow(
    ["author_id", "created_at", "tweet_id", "text", "bio", "image_url"])
csvFile.close()
def append_to_csv(json_response, csvFile):
    # counter variable
    global author_id, created_at, tweet_id, text, bio, image_url
    # open CSV file
    csvFile = open(csvFile, "a", newline="", encoding='utf-8')
    csvWriter = csv.writer(csvFile)
    # loop through each tweet
    for each_dict in json_response:
        # loop 1. author ID, time created, tweet ID tweet text
        for tweet in each_dict['data']:
            # 1. Author ID
            author_id = tweet['author_id']
            # 2. Time created
            created_at = dateutil.parser.parse(tweet['created_at'])
            # 3. Tweet ID
            tweet_id = tweet['id']
            # 4. Tweet text
            text = tweet['text']
            # loop 2. description/bio loop
            for dic in each_dict['includes']['users']:
                # 5. description
                if 'description' in dic:
                    bio = dic['description']
                else:
                    bio = " "
                    # loop 3. image_url/url loop
                    for element in each_dict['includes']['media']:
                        # 6. image url
                        if 'url' in element:
                            image_url = element['url']
                        else:
                            image_url = " "
                        # assemble all data in a list
                        res = [author_id, created_at, tweet_id, text, bio, image_url]
                        csvWriter.writerow(res)
                    # close CSV file
                    csvFile.close()
append_to_csv(json_response, "sample_data.csv")
As can be seen, df only contains the predefined column names.
# import sample_data.csv as df
df = pd.read_csv(r'path...\sample_data.csv')
print(df)
Empty DataFrame
Columns: [author_id, created_at, tweet_id, text, bio, image_url]
Index: []
EDITED: Changed indentation in # 3 loop and csvFile.close().
def append_to_csv(json_response, csvFile):
    # counter variable
    global author_id, created_at, tweet_id, text, bio, image_url
    # open CSV file
    csvFile = open(csvFile, "a", newline="", encoding='utf-8')
    csvWriter = csv.writer(csvFile)
    # loop through each tweet
    for each_dict in json_response:
        # loop 1. author ID, time created, tweet ID tweet text
        for tweet in each_dict['data']:
            # 1. Author ID
            author_id = tweet['author_id']
            # 2. Time created
            created_at = dateutil.parser.parse(tweet['created_at'])
            # 3. Tweet ID
            tweet_id = tweet['id']
            # 4. Tweet text
            text = tweet['text']
            # loop 2. description/bio loop
            for dic in each_dict['includes']['users']:
                # 5. description
                if 'description' in dic:
                    bio = dic['description']
                else:
                    bio = " "
                # loop 3. image_url/url loop
                for element in each_dict['includes']['media']:
                    # 6. image url
                    if 'url' in element:
                        image_url = element['url']
                    else:
                        image_url = " "
                    # assemble all data in a list
                    res = [author_id, created_at, tweet_id, text, bio, image_url]
                    csvWriter.writerow(res)
    # close CSV file
    csvFile.close()
The issue now is that append_to_csv appends each tweet 5 times for the 5 users following the first politician and 13 times for the 13 users following the second politician, because every tweet is written once per user in the same response. This results in a df with 194 rows (5 × 5 + 13 × 13) instead of 18 rows.
Solution 1:
It looks like the else branch of if 'description' in dic: is never executed, because every user in the sample response has a description key, even when its value is an empty string. If your code is indented as posted, the csvWriter.writerow call sits inside that else branch, so it is never executed either.
As a result, nothing is written to your file.
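To illustrate the point, here is a small sketch using made-up user records (not the real response). It assumes the write sits inside the else branch, as described above, and shows that an empty description still counts as a present key:

```python
# Hypothetical records shaped like each_dict['includes']['users'].
users = [
    {"id": "1", "description": ""},          # empty bio, but the key exists
    {"id": "2", "description": "some bio"},
]

written = []
for user in users:
    if "description" in user:
        bio = user["description"]
    else:
        bio = " "
        # Only reachable when the key is missing entirely.
        written.append(bio)

print(len(written))  # 0: the else branch never ran, so nothing was "written"
```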
A comment on code style:
- Use with open(file) as file_variable: instead of calling open and close manually. That can save you some trouble, e.g. the trouble you would get if the else branch were executed: the file would be closed inside the loop, and the next write to it would raise an error :)
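A minimal sketch of the with-statement approach follows. It also shows one possible way to avoid the duplicated rows mentioned in the question's edit: instead of looping over every user for every tweet, it looks up the bio by author_id and the image URL by media key. The lookups rest on two assumptions that are not confirmed in the question: that a tweet's author_id corresponds to a user's id, and that a tweet's attachments.media_keys correspond to media_key in includes. The names csv_path, bios and urls are illustrative, and created_at is written as the raw string to keep the sketch dependency-free.

```python
import csv


def append_to_csv(json_response, csv_path):
    """Append one row per tweet to csv_path."""
    # The with block closes the file exactly once, even if an error occurs.
    with open(csv_path, "a", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        for page in json_response:
            includes = page.get("includes", {})
            # Assumption: a tweet's author_id matches a user's id.
            bios = {u["id"]: u.get("description", " ")
                    for u in includes.get("users", [])}
            # Assumption: a tweet's media_keys match media_key in includes.
            urls = {m["media_key"]: m.get("url", " ")
                    for m in includes.get("media", [])}
            for tweet in page.get("data", []):
                keys = tweet.get("attachments", {}).get("media_keys", [])
                # Use the first attached media item, if there is one.
                image_url = urls.get(keys[0], " ") if keys else " "
                writer.writerow([
                    tweet["author_id"],
                    tweet["created_at"],
                    tweet["id"],
                    tweet["text"],
                    bios.get(tweet["author_id"], " "),
                    image_url,
                ])
```

Called as append_to_csv(json_response, "sample_data.csv") on the response shown in the question, this should write one row per tweet, i.e. 18 rows (5 + 13), after the header.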