python - Load data from CSV into scikit-learn SVM


I want to train an SVM to perform classification of samples. I have a CSV file with 3 columns (feature 1, feature 2, class label) and 20 rows (= number of samples).

To quote the scikit-learn documentation: "As other classifiers, SVC, NuSVC and LinearSVC take as input two arrays: an array X of size [n_samples, n_features] holding the training samples, and an array y of class labels (strings or integers), size [n_samples]:"

I understand that I need to obtain two arrays (one 2D and one 1D) in order to feed the data to the SVM. However, I am unable to understand how to obtain the required arrays from the CSV file. I have tried the following code:

    import numpy as np
    data = np.loadtxt('test.csv', delimiter=',')
    print(data)

However, it shows the error "ValueError: could not convert string to float: ��ࡱ�"

There are no column headers in the CSV. Am I making a mistake in calling np.loadtxt, or should something else be used?

Update: here's what the .csv file looks like.

    12     122  34
    12234  54   23
    23     34   23

You passed the param delimiter=',' but your CSV is not comma separated.

So the following works:

    In [378]:
    data = np.loadtxt(path_to_data)
    data

    Out[378]:
    array([[  1.20000000e+01,   1.22000000e+02,   3.40000000e+01],
           [  1.22340000e+04,   5.40000000e+01,   2.30000000e+01],
           [  2.30000000e+01,   3.40000000e+01,   2.30000000e+01]])

The docs show that the default delimiter is None, which treats whitespace as the delimiter:

    delimiter : str, optional
        The string used to separate values. By default, this is any whitespace.
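
To get from the loaded array to the two arrays the SVM expects, something along these lines should work. This is only a sketch: it assumes the last column holds the class label and the other columns are features, as described in the question, and it uses an in-memory buffer holding the three sample rows from the update in place of the real file.

```python
import io

import numpy as np
from sklearn.svm import SVC

# Stand-in for the file on disk: the three whitespace-separated sample rows.
# With a real file, pass its path to np.loadtxt instead of this buffer.
sample = io.StringIO("12 122 34\n12234 54 23\n23 34 23\n")

# No delimiter argument: np.loadtxt splits each line on any whitespace.
data = np.loadtxt(sample)

# Assumption: the last column is the class label, the rest are features.
X = data[:, :-1]              # shape (n_samples, n_features)
y = data[:, -1].astype(int)   # shape (n_samples,)

clf = SVC()
clf.fit(X, y)
print(clf.predict(X))
```

If the file does have a header row, passing skiprows=1 to np.loadtxt skips it. For a file that really is comma separated, keep delimiter=','.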

