I have a DataFrame:
   sepallength  sepalwidth  petallength  petalwidth        class   cluster
0          5.1         3.5          1.4         0.2  iris-setosa  cluster1
1          4.9           3          1.4         0.2  iris-setosa  cluster1
2          4.7         3.2          1.3         0.2  iris-setosa  cluster1
3          4.6         3.1          1.5         0.2  iris-setosa  cluster1
4            5         3.6          1.4         0.2  iris-setosa  cluster1
5          5.4         3.9          1.7         0.4  iris-setosa  cluster1
6          4.6         3.4          1.4         0.3  iris-setosa  cluster1
7            5         3.4          1.5         0.2  iris-setosa  cluster1
8          4.4         2.9          1.4         0.2  iris-setosa  cluster1
9          4.9         3.1          1.5         0.1  iris-setosa  cluster1
and a dictionary:
{'cluster2': 'iris-virginica', 'cluster0': 'iris-versicolor', 'cluster1': 'iris-setosa'}
I need to add a column whose value in each row is the dictionary value for the key that matches df['cluster'].
I have tried using np.where:
import numpy as np

def counttruth(df):
    # dictionary mapping each cluster to its most frequent class
    clustersclass = df.groupby(['cluster'])['class'].agg(lambda x: x.value_counts().index[0]).to_dict()
    for eachkey in clustersclass:
        newv = clustersclass[eachkey]
        print(df)
        # this call fails: np.where is given a condition and x, but no y
        df['new'] = np.where(df['cluster'] == eachkey, newv)
It crashes with the error "either both or neither of x and y should be given".
My ultimate goal is to count true positives, true negatives, false positives (FP), and false negatives (FN) based on the cluster and the class label. This is a step towards that.
Call map on the cluster column and pass it the dict:
In [326]:
d = {'cluster2': 'iris-virginica', 'cluster0': 'iris-versicolor', 'cluster1': 'iris-setosa'}
df['key'] = df['cluster'].map(d)
df

Out[326]:
   sepallength  sepalwidth  petallength  petalwidth        class   cluster  \
0          5.1         3.5          1.4         0.2  iris-setosa  cluster1
1          4.9         3.0          1.4         0.2  iris-setosa  cluster1
2          4.7         3.2          1.3         0.2  iris-setosa  cluster1
3          4.6         3.1          1.5         0.2  iris-setosa  cluster1
4          5.0         3.6          1.4         0.2  iris-setosa  cluster1
5          5.4         3.9          1.7         0.4  iris-setosa  cluster1
6          4.6         3.4          1.4         0.3  iris-setosa  cluster1
7          5.0         3.4          1.5         0.2  iris-setosa  cluster1
8          4.4         2.9          1.4         0.2  iris-setosa  cluster1
9          4.9         3.1          1.5         0.1  iris-setosa  cluster1

           key
0  iris-setosa
1  iris-setosa
2  iris-setosa
3  iris-setosa
4  iris-setosa
5  iris-setosa
6  iris-setosa
7  iris-setosa
8  iris-setosa
9  iris-setosa
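
For the stated end goal of counting TP, FP, FN and TN, here is a minimal sketch of one way to continue from the mapped column. It is an assumption about what is wanted, not something from the original answer: the six-row frame is made-up stand-in data, the column name predicted and the variable names are hypothetical, and the counts are computed one-vs-rest per class from a pd.crosstab confusion table.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for the question's frame (not the real iris rows).
df = pd.DataFrame({
    "class": ["iris-setosa", "iris-setosa", "iris-versicolor",
              "iris-versicolor", "iris-virginica", "iris-virginica"],
    "cluster": ["cluster1", "cluster1", "cluster0",
                "cluster0", "cluster2", "cluster0"],
})

# Map each cluster to its most frequent class, as in the question.
clusters_class = (
    df.groupby("cluster")["class"]
      .agg(lambda s: s.value_counts().index[0])
      .to_dict()
)

# One vectorised lookup replaces the loop over np.where.
df["predicted"] = df["cluster"].map(clusters_class)

# Square confusion table: rows are true classes, columns are predicted classes.
labels = sorted(set(df["class"]) | set(df["predicted"]))
confusion = (
    pd.crosstab(df["class"], df["predicted"])
      .reindex(index=labels, columns=labels, fill_value=0)
)

# One-vs-rest counts per class.
tp = pd.Series(np.diag(confusion.values), index=labels)
fp = confusion.sum(axis=0) - tp   # predicted as this class, but actually another
fn = confusion.sum(axis=1) - tp   # actually this class, but predicted as another
tn = confusion.values.sum() - tp - fp - fn
counts = pd.DataFrame({"tp": tp, "fp": fp, "fn": fn, "tn": tn})

print(counts)
```

Note that map returns NaN for any cluster label missing from the dict, so with a hand-written dict it may be worth checking df["predicted"].isna().any() before building the table.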