python - Iterating over columns and rows in pandas dataframe
I am trying to iterate through a dataframe that I have and use the values inside the cells, but I also need to use the names of the columns and rows that the cells come from. Because of that, I am currently doing something like the following:
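import pandas

df = pandas.DataFrame(data={"c1": [1, 2, 3, 4, 5], "c2": [1, 2, 3, 4, 5]}, index=["r1", "r2", "r3", "r4", "r5"])
# df2 is the second dataframe that holds the related information
for row in df.index.values:
    for column in df.columns.values:
        # select the column first, then the row label within it
        if df[column][row] > 3:
            if row in df2[column]:
                print("data present")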
I need to use the row and column names because I am using them to look up values in another data frame that has related information. I know for loops take forever in pandas, but I haven't been able to find any examples of how to iterate over both the rows and the columns at the same time. This:
df.applymap()
won't work because it only gives the value in the cell, without keeping a reference to which row and column the cell is in, and this:
df.apply(lambda row: row["column"])
won't work because I need the name of the column without knowing it beforehand. This:
df.apply(lambda row: somefunction(row))
won't work because apply passes a Series object that only has the row name, rather than both the row and column names.
Any insight would be helpful! I am currently running the for-loop version, but it takes forever and hogs the CPU cores.
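As a side note on the apply attempt above: when apply is called with axis=1, each row arrives as a Series whose name is the row label and whose index holds the column labels, so both names are reachable inside the function. The sketch below illustrates that under the sample data from the question; the helper labels_above and its threshold parameter are hypothetical names made up for this example, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame(
    data={"c1": [1, 2, 3, 4, 5], "c2": [1, 2, 3, 4, 5]},
    index=["r1", "r2", "r3", "r4", "r5"],
)


def labels_above(row, threshold=3):
    # Hypothetical helper: row.name is the row label, and iterating
    # row.items() yields (column label, cell value) pairs.
    return [(row.name, column) for column, value in row.items() if value > threshold]


# axis=1 passes one row at a time; the result is a Series of lists.
pairs_per_row = df.apply(labels_above, axis=1)

# Flatten into a single list of (row label, column label) pairs.
pairs = [pair for row_pairs in pairs_per_row for pair in row_pairs]
print(pairs)
```

This still calls a Python function once per row, so it is unlikely to be much faster than the explicit loop on a large frame.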
import pandas as pd

df = pd.DataFrame(data={"c1": [1, 2, 3, 4, 5], "c2": [1, 2, 3, 4, 5]}, index=["r1", "r2", "r3", "r4", "r5"])
df2 = pd.DataFrame({'r3': [1], 'r5': [1], 'r6': [1]})
To get all of the corresponding columns from df2 whose rows in df have a value greater than 3, you can use a conditional list comprehension:
>>> [idx for idx in df[df.gt(3).any(axis=1)].index if idx in df2]
['r5']
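The same result can probably be reached without a Python-level loop by intersecting the two sets of labels. This is only a sketch of an alternative, assuming the df and df2 defined above; the variable names rows_above and matches are made up for the example.

```python
import pandas as pd

df = pd.DataFrame(
    data={"c1": [1, 2, 3, 4, 5], "c2": [1, 2, 3, 4, 5]},
    index=["r1", "r2", "r3", "r4", "r5"],
)
df2 = pd.DataFrame({"r3": [1], "r5": [1], "r6": [1]})

# Row labels of df where at least one cell is greater than 3.
rows_above = df.index[df.gt(3).any(axis=1)]

# Keep only the labels that also appear as column names in df2.
matches = rows_above.intersection(df2.columns)
print(list(matches))
```

Index.intersection returns an Index, so it can be passed straight to df2[matches] if the matching columns themselves are needed.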
To see how this works:
>>> df.gt(3)
       c1     c2
r1  False  False
r2  False  False
r3  False  False
r4   True   True
r5   True   True
Then you want the index of every row that has any value greater than three:
>>> df.gt(3).any(axis=1)
r1    False
r2    False
r3    False
r4     True
r5     True
dtype: bool
>>> df[df.gt(3).any(axis=1)]
    c1  c2
r4   4   4
r5   5   5
>>> [i for i in df[df.gt(3).any(axis=1)].index]
['r4', 'r5']
>>> [i for i in df[df.gt(3).any(axis=1)].index if i in df2]
['r5']
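If the column name and the cell value are needed alongside the row name, as in the original loop, one possible approach is to find the positions of the matching cells with NumPy and map them back to labels. This is a rough sketch under the same sample data, not the method used in the answer above; triples and present are names invented for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data={"c1": [1, 2, 3, 4, 5], "c2": [1, 2, 3, 4, 5]},
    index=["r1", "r2", "r3", "r4", "r5"],
)
df2 = pd.DataFrame({"r3": [1], "r5": [1], "r6": [1]})

values = df.to_numpy()

# Integer positions (row, column) of every cell greater than 3,
# returned in row-major order.
row_pos, col_pos = np.nonzero(values > 3)

# Translate positions back into (row label, column label, value).
triples = [
    (df.index[r], df.columns[c], int(values[r, c]))
    for r, c in zip(row_pos, col_pos)
]

# Keep the cells whose row label is also a column name in df2.
present = [t for t in triples if t[0] in df2.columns]
print(present)
```

Only the cells that pass the comparison are visited in Python, so the remaining loop is short when few values exceed the threshold.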