machine learning - Python scikit svm "ValueError: X has 62 features per sample; expecting 337"

I'm playing around with Python's scikit-learn SVM linear support vector classification, and I'm running into an error when I attempt to make predictions:

    from sklearn import metrics, svm
    from sklearn.feature_extraction import DictVectorizer

    # integer division so the result can be used as a slice index
    ten_percent = len(raw_routes_data) // 10

    # training
    training_label = all_labels[ten_percent:]
    training_raw_data = raw_routes_data[ten_percent:]
    training_data = DictVectorizer().fit_transform(training_raw_data).toarray()

    learner = svm.LinearSVC()
    learner.fit(training_data, training_label)

    # predicting
    testing_label = all_labels[:ten_percent]
    testing_raw_data = raw_routes_data[:ten_percent]
    testing_data = DictVectorizer().fit_transform(testing_raw_data).toarray()

    testing_predictions = learner.predict(testing_data)

    m = metrics.classification_report(testing_label, testing_predictions)

The raw data is represented as a Python dictionary with categories of arrival times for various travel options and categories for weather data:

{'72_bus': '6.0 11.0', 'uber_eta': '2.0 3.5', 'tweet_delay': '0', 'c_train': '1.0 4.0', 'weather': 'overcast', '52_bus': '16.0 21.0', 'uber_surging': '1.0 1.15', 'd_train': '17.6666666667 21.8333333333', 'feels_like': '27.6666666667 32.5'}

When I train and fit the training data, I use a dictionary vectorizer on 90% of the data, turning it into an array.

The provided testing labels are represented as:

[1,2,3,3,1,2,3, ... ]

It's when I attempt to use LinearSVC to predict that I'm informed:

ValueError: X has 27 features per sample; expecting 46

What am I missing here? Is it the way I fit and transform the data?

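The problem is that you are creating and fitting a different DictVectorizer for train and test. Each vectorizer builds its columns from the values it sees during fitting, so the two resulting arrays can end up with different numbers of features.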
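To see why the feature counts disagree, here is a minimal sketch using made-up toy samples (not the original routes data). It assumes string values are one-hot encoded by DictVectorizer as "key=value" columns, so a vectorizer fitted on fewer distinct values produces fewer columns:

```python
from sklearn.feature_extraction import DictVectorizer

# Toy samples, invented for illustration only.
train_samples = [
    {"weather": "overcast", "tweet_delay": "0"},
    {"weather": "sunny", "tweet_delay": "1"},
    {"weather": "rain", "tweet_delay": "0"},
]
test_samples = [
    {"weather": "overcast", "tweet_delay": "0"},
]

# Each call creates and fits a separate vectorizer, as in the question.
train_matrix = DictVectorizer().fit_transform(train_samples).toarray()
test_matrix = DictVectorizer().fit_transform(test_samples).toarray()

# The column counts differ because each vectorizer only knows
# the key=value pairs it saw while fitting.
print(train_matrix.shape)
print(test_matrix.shape)
```

A model fitted on `train_matrix` would reject `test_matrix` at prediction time, since the number of columns no longer matches.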

You should create and fit only one DictVectorizer using the train data, then use the transform method of that object on the testing data to create the feature representation of your test data.
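A minimal sketch of that approach, again with made-up toy data and labels standing in for the original dataset (the names `train_samples`, `train_labels`, and `test_samples` are placeholders, not from the question):

```python
from sklearn import svm
from sklearn.feature_extraction import DictVectorizer

# Toy data, invented for illustration only.
train_samples = [
    {"weather": "overcast", "tweet_delay": "0"},
    {"weather": "sunny", "tweet_delay": "1"},
    {"weather": "rain", "tweet_delay": "0"},
]
train_labels = [1, 2, 3]
test_samples = [
    {"weather": "overcast", "tweet_delay": "1"},
    {"weather": "snow", "tweet_delay": "0"},  # "snow" was never seen in training
]

# Fit ONE vectorizer on the training data only.
vectorizer = DictVectorizer()
training_data = vectorizer.fit_transform(train_samples).toarray()

learner = svm.LinearSVC()
learner.fit(training_data, train_labels)

# Reuse the same fitted vectorizer: transform, not fit_transform.
testing_data = vectorizer.transform(test_samples).toarray()
testing_predictions = learner.predict(testing_data)

print(training_data.shape[1], testing_data.shape[1])
```

Because the test data goes through the vectorizer that was fitted on the training data, both arrays share the same columns. Values that did not appear during fitting (such as "snow" here) are ignored by `transform` rather than adding new features.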
