python - Deleting rows from numpy array not working


I am trying to split a numpy array of data points into test and training sets. To do that, I'm randomly selecting rows from the array to use as the training set, and using the remaining rows as the test set.

This is my code:

import numpy
matrix = numpy.loadtxt("matrix_vals.data", delimiter=',', dtype=float)
matrix_rows, matrix_cols = matrix.shape
# training set: 50 row indices drawn at random
randvals = numpy.random.randint(matrix_rows, size=50)
train = matrix[randvals,:]
# test set: whatever is left after deleting those rows
test = numpy.delete(matrix, randvals, 0)
print(matrix.shape)
print(train.shape)
print(test.shape)

But the output is:

matrix.shape: (130, 14)
train.shape: (50, 14)
test.shape: (89, 14)

This is wrong, since the number of rows in train and test should add up to the total number of rows in matrix, but here it's more (50 + 89 = 139 rather than 130). Can anyone help me figure out what's going wrong?

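Because you are generating random integers with replacement, randvals will almost certainly contain repeated indices.

Indexing with repeated indices returns the same row multiple times, so matrix[randvals, :] is guaranteed to give you an output with exactly 50 rows, regardless of whether any of them are repeated.

In contrast, np.delete(matrix, randvals, 0) only removes each unique row index once, so it reduces the number of rows only by the number of unique values in randvals. In your output, 130 - 89 = 41, so only 41 of the 50 indices were distinct.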

Try comparing:

print(np.unique(randvals).shape[0] == matrix_rows - test.shape[0])  # True
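The same effect can be reproduced on a tiny array with hand-picked indices. This is only a sketch: the 5-row array and the index values below are made up for illustration and are not from the question's data.

```python
import numpy as np

# Hypothetical 5-row array; the values are for illustration only.
matrix = np.arange(10).reshape(5, 2)

# Index 1 appears twice: 3 indices in total, but only 2 unique ones.
randvals = np.array([1, 3, 1])

train = matrix[randvals, :]            # fancy indexing keeps the duplicate row
test = np.delete(matrix, randvals, 0)  # delete removes each row at most once

n_unique = np.unique(randvals).shape[0]
print(train.shape)  # (3, 2)
print(test.shape)   # (3, 2)
print(n_unique)     # 2
```

Here train and test have 3 + 3 = 6 rows between them even though matrix only has 5, which is the same mismatch as in the question.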

To generate a vector of unique random indices between 0 and matrix_rows - 1, use np.random.choice with replace=False:

uidx = np.random.choice(matrix_rows, size=50, replace=False)

Then matrix[uidx].shape[0] + np.delete(matrix, uidx, 0).shape[0] == matrix_rows holds, because every index in uidx is distinct.
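Putting this together, one possible way to wrap the split in a helper is sketched below. The function name split_rows, its seed parameter, and the demo array are assumptions made for this example (the demo array only mimics the 130 x 14 shape from the question); they do not appear in the original code.

```python
import numpy as np

def split_rows(matrix, n_train, seed=None):
    """Return (train, test) so that no row of matrix lands in both sets."""
    rng = np.random.RandomState(seed)  # seed is optional, for repeatability
    n_rows = matrix.shape[0]
    # Sample row indices without replacement, so they are all distinct.
    uidx = rng.choice(n_rows, size=n_train, replace=False)
    train = matrix[uidx]
    test = np.delete(matrix, uidx, 0)
    return train, test

# Hypothetical stand-in for the data loaded from matrix_vals.data.
demo = np.arange(130 * 14, dtype=float).reshape(130, 14)
train, test = split_rows(demo, 50, seed=0)
print(train.shape, test.shape)  # (50, 14) (80, 14)
```

Since the indices are unique, the two row counts always add up to the number of rows in the input.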

