csv - ValueError in random forest (Python)

I am trying to perform a random forest analysis in Python. Everything seems OK, but when I try to run the code, I get the following error message:

Did any of you get this ValueError?

Cheers

dataset: https://www.dropbox.com/s/ehyccl8kubazs8x/test.csv?dl=0&preview=test.csv

Code:

    from sklearn.ensemble import RandomForestRegressor as rf
    import numpy as np
    import pylab as pl

    # read the column names from the first line, dropping the first column
    headers = open("test.csv").readline().strip().split('\r')[0].split(',')[1:]

    data = np.loadtxt("test.csv", delimiter=',', skiprows=1, usecols=range(1,14))

    # yellow==par, green==vpd, blue==tsoil, orange==tair
    par   = data[:,headers.index("par")]
    vpd   = data[:,headers.index("vpd")]
    tsoil = data[:,headers.index("tsoil")]
    tair  = data[:,headers.index("tair")]

    drivers = np.column_stack([par,vpd,tsoil,tair])

    hour = data[:,-1].astype("int")

    # performs a random forest hour-wise to explain each of the nee, gpp and reco fluxes
    importances = np.zeros([24,2,3,4])

    for ff,flux in enumerate(["nee_f","gpp_f","reco"]):
        fid = headers.index(flux)
        obs = data[:,fid]

        # store the importances: dims are average/std; obs var; expl var
        for hh in range(24):
            mask = hour == hh
            forest = rf(n_estimators=1000)
            forest.fit(drivers[mask],obs[mask])

            importances[hh,0,ff] = forest.feature_importances_
            importances[hh,1,ff] = np.std([tree.feature_importances_ for tree in forest.estimators_],axis=0)

    fig = pl.figure('importances',figsize=(15,5));fig.clf()
    xx = range(24)

    colors = ["#f0e442","#009e73","#56b4e9","#e69f00"];labels = ['par','vpd','tsoil','tair']
    for ff,flux in enumerate(["nee_f","gpp_f","reco"]):
        ax = fig.add_subplot(1,3,ff+1)
        for vv in range(drivers.shape[1]):
            ax.fill_between(xx,importances[:,0,ff,vv]+importances[:,1,ff,vv],importances[:,0,ff,vv]-importances[:,1,ff,vv],color=colors[vv],alpha=.35,edgecolor="none")
            ax.plot(xx,importances[:,0,ff,vv],color=colors[vv],ls='-',lw=2,label=labels[vv])
            ax.set_title(flux);ax.set_xlim(0,23)
            if ff == 0:
                ax.legend(ncol=2,fontsize='medium',loc='upper center')
    fig.show()
    fig.savefig('importance-hourly.png')

The problem is that you selected the column in which the years are stored, not the one where the hours are. Therefore the RF is trained on empty arrays.
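A minimal sketch of why the arrays end up empty, using a small made-up array rather than the real test.csv (the column layout and the helper name hourly_counts are assumptions for illustration only). If the "hour" column actually holds years, no value matches any hh in 0..23, so every mask is all False and the forest is fitted on zero rows:

```python
import numpy as np

# Hypothetical stand-in for the CSV: columns are [driver, year, hour].
data = np.array([
    [0.5, 2010, 0],
    [0.7, 2010, 1],
    [0.9, 2010, 2],
])

def hourly_counts(hour_column):
    """Count how many rows fall into each hour 0..23."""
    hours = hour_column.astype("int")
    return np.array([np.sum(hours == hh) for hh in range(24)])

wrong = hourly_counts(data[:, 1])   # year column: no value lies in 0..23
right = hourly_counts(data[:, 2])   # hour column

print(wrong.sum())  # 0
print(right[:3])    # [1 1 1]
```

Printing the per-hour counts before the fitting loop is a quick way to confirm that the intended column was picked; a sum of zero means the wrong column was selected.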
