python动态生成数据库表,Python Pandas动态创建数据框
The code below will generate the desired output in ONE dataframe, however, I would like to dynamically create data frames in a FOR loop then assign the shifted value to that data frame.Example, data..
The code below will generate the desired output in ONE dataframe, however, I would like to dynamically create data frames in a FOR loop then assign the shifted value to that data frame. Example, data frame df_lag_12 would only contain column1_t12 and column2_12. Any ideas would be greatly appreciated. I attempted to dynamically create 12 dataframes using the EXEC statement, google searching seems to state this is poor practice.
import pandas as pd
list1=list(range(0,20))
list2=list(range(19,-1,-1))
d={'column1':list(range(0,20)),
'column2':list(range(19,-1,-1))}
df=pd.DataFrame(d)
df_lags=pd.DataFrame()
for col in df.columns:
for i in range(12,0,-1):
df_lags[col+'_t'+str(i)]=df[col].shift(i)
df_lags[col]=df[col].values
print(df_lags)
for df in (range(12,0,-1)):
exec('model_data_lag_'+str(df)+'=pd.DataFrame()')
Desired output for dymanically created dataframe DF_LAGS_12:
var_list=['column1_t12','column2_t12']
df_lags_12=df_lags[var_list]
print(df_lags_12)
解决方案
I think the best is create dictionary of DataFrames:
d = {}
for i in range(12,0,-1):
d['t' + str(i)] = df.shift(i).add_suffix('_t' + str(i))
If need specify columns first:
d = {}
cols = ['column1','column2']
for i in range(12,0,-1):
d['t' + str(i)] = df[cols].shift(i).add_suffix('_t' + str(i))
dict comprehension solution:
d = {'t' + str(i): df.shift(i).add_suffix('_t' + str(i)) for i in range(12,0,-1)}
print (d['t10'])
column1_t10 column2_t10
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
5 NaN NaN
6 NaN NaN
7 NaN NaN
8 NaN NaN
9 NaN NaN
10 0.0 19.0
11 1.0 18.0
12 2.0 17.0
13 3.0 16.0
14 4.0 15.0
15 5.0 14.0
16 6.0 13.0
17 7.0 12.0
18 8.0 11.0
19 9.0 10.0
EDIT: Is it possible by globals, but much better is dictionary:
d = {}
cols = ['column1','column2']
for i in range(12,0,-1):
globals()['df' + str(i)] = df[cols].shift(i).add_suffix('_t' + str(i))
print (df10)
column1_t10 column2_t10
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
5 NaN NaN
6 NaN NaN
7 NaN NaN
8 NaN NaN
9 NaN NaN
10 0.0 19.0
11 1.0 18.0
12 2.0 17.0
13 3.0 16.0
14 4.0 15.0
15 5.0 14.0
16 6.0 13.0
17 7.0 12.0
18 8.0 11.0
19 9.0 10.0
更多推荐
所有评论(0)