Have data appear in first row only in one column of dataframe
I have a column of ticker symbols. From that column I built a comma-delimited string of the symbols and stored it in a new column called v1 in the same dataframe, DF. I also copied the comma-delimited string into a new dataframe, DF1. In both cases, I want the string to appear only in the first row, not repeated in every row. Is there a way, in either dataframe, to do this? If so, could someone explain how? Thanks.
Comma-Delimited String Code
v1 = df['Ticker'].tolist()
v1 = ",".join(map(str,v1))
df['v1'] = v1
df1 = df[['v1']]
print(df)
print(df1)
Current DF Output
No. Ticker ... AH Change v1
0 1 AAPL ... - AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
1 2 MSFT ... - AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
2 3 TSLA ... - AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
3 4 FB ... - AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
Current DF1 Output
0 AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
1 AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
2 AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
3 AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
Desired DF Output
No. Ticker ... AH Change v1
0 1 AAPL ... - AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
1 2 MSFT ... -
2 3 TSLA ... -
3 4 FB ... -
Desired DF1 Output
0 AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
Full Code
import pandas as pd
import requests
import bs4
import time
import random
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
def testDf(version):
    url = 'https://finviz.com/screener.ashx?v={version}&r={page}&f=sh_outstanding_o1000&c=0,1,2,3,4,5,6,7,71,72&f=ind_stocksonly&o=-marketcap'
    page = 1
    screen = requests.get(url.format(version=version, page=page), headers=headers)
    soup = bs4.BeautifulSoup(screen.text, features='lxml')
    pages = int(soup.find_all('a', {'class': 'screener-pages'})[-1].text)
    data = []
    for page in range(1, 1 * pages, 20):
        print(version, page)
        screen = requests.get(url.format(version=version, page=page), headers=headers).text
        tables = pd.read_html(screen)
        tables = tables[-2]
        tables.columns = tables.iloc[0]
        tables = tables[1:]
        data.append(tables)
        time.sleep(random.random())
    return pd.concat(data).reset_index(drop=True).rename_axis(columns=None)
df = testDf('152').copy()
v1 = df['Ticker'].tolist()
v1 = ",".join(map(str,v1))
df['v1'] = v1
df1 = df[['v1']]
print(df)
print(df1)
grouping = df.groupby('v1')
indices = []
for x in grouping.groups.values():
    indices.extend(x[1:])
df.loc[indices, 'v1'] = ''
df1 = pd.DataFrame(grouping.groups.keys())
Note: This modifies df in place and cannot be undone, so work on a copy of df if you still need the original values.
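If the goal is only to show the string in the first row, a simpler alternative may be to skip the broadcast assignment altogether: create v1 as an empty column, then write the string into the first row only. The sketch below is not the approach from the code above. It uses a small made-up frame in place of the scraped table, and the variable name tickers is an assumption introduced for illustration.

```python
import pandas as pd

# Small stand-in for the scraped table (hypothetical sample data).
df = pd.DataFrame({"No.": [1, 2, 3, 4],
                   "Ticker": ["AAPL", "MSFT", "TSLA", "FB"]})

# Build the comma-delimited string once.
tickers = ",".join(df["Ticker"].astype(str))

# Put an empty string in every row, then fill only the first row.
df["v1"] = ""
df.loc[df.index[0], "v1"] = tickers

# A separate one-row DataFrame that holds just the string.
df1 = pd.DataFrame({"v1": [tickers]})

print(df)
print(df1)
```

Assigning a plain string with df['v1'] = v1 broadcasts it to every row, which is why it repeats in the original output. Wrapping the string in a one-element list when building df1 gives a frame with exactly one row.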