Partitioning a matrix into train and test in MATLAB in an efficient manner
I am working with a user-rating matrix r from real data, with many users u in rows and many items i in columns, where r(u,i) is the rating of user u for item i. Ratings range from 1 to 5. I want to partition the data into 2 sets, train (80%) and test (20%), with the following requirements:
- Each user has a minimum of 6 positive ratings (positive ratings are 4 and 5).
- In test, I need a minimum of 5 positive ratings for each user (because they are used for calculating recall in the evaluation section).
- In train, a minimum of 1 positive rating is required for each user.
- While maintaining conditions 2 and 3, the 80-20 partitioning proportion needs to be upheld.
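The four requirements can be written as checks on a candidate split. The following is only a minimal sketch: the toy matrix, the use of 0 to mark a missing rating, and the names testMask, trainMask and testShare are assumptions for illustration, not part of the real data.

```matlab
% Toy data (hypothetical): 2 users x 8 items, 0 = no rating.
r = [5 4 5 4 5 4 3 0;
     4 4 5 5 4 5 0 2];

% A candidate split: rated entries in the first five columns go to test,
% all other rated entries go to train.
testMask = false(size(r));
testMask(:, 1:5) = true;
testMask  = testMask & (r > 0);
trainMask = (r > 0) & ~testMask;

isPos = r >= 4;                               % positive ratings are 4 and 5

cond1 = all(sum(isPos, 2) >= 6);              % at least 6 positives per user
cond2 = all(sum(isPos & testMask, 2) >= 5);   % at least 5 positives in test
cond3 = all(sum(isPos & trainMask, 2) >= 1);  % at least 1 positive in train
testShare = nnz(testMask) / nnz(r);           % fraction of ratings in test, target 0.20
```

On this toy input the first three conditions hold, but testShare is far above 0.20, because each user has only 7 ratings and 5 of them are forced into test.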
I would start the program as follows:
- For condition 1, remove the users with fewer than 6 positive ratings.
- For conditions 2 and 3, I can start with a single user: find the items the user has interacted with and find the positive ratings among them. Randomly put 5 of those ratings in test and the rest (1 or more) in train. Repeat this process for all users.
- This process can work, but it has 2 problems:
i. How do I make sure that the 80-20 partitioning scheme is kept? Probably it is not.
ii. How can I make this process fast, so that I do not need to loop over users (for example, by using a function such as @bsxfun)?
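One possible loop-free version of the steps above is sketched below. It makes several assumptions that are not stated in the question: r is a dense numeric matrix with 0 marking a missing rating, the 80-20 proportion is meant over all ratings rather than per user, and a dense users-by-items matrix of random keys fits in memory. The random matrix used as input and the names key, order, reserved, pool and needed are purely illustrative. The idea is to sort random keys within each row so that the first 5 sorted positions are 5 random positive items per user (test), keep the 6th as a guaranteed train positive, and then top up test with random remaining ratings until it holds about 20% of all ratings.

```matlab
rng(0);                                   % fixed seed, only for repeatability

% Hypothetical stand-in for the real data: 200 users x 300 items, 0 = no rating.
r = randi([0 5], 200, 300) .* (rand(200, 300) < 0.3);

% Condition 1: keep only users with at least 6 positive ratings.
r = r(sum(r >= 4, 2) >= 6, :);
isPos = r >= 4;
[nU, nI] = size(r);

% Random order of each user's positive items. Non-positive entries get
% key Inf, so they sort after every positive entry in the row.
key = rand(nU, nI);
key(~isPos) = Inf;
[~, order] = sort(key, 2);

% Condition 2: the first 5 positions per row are 5 random positives -> test.
rows = repmat((1:nU)', 1, 5);
testMask = false(nU, nI);
testMask(sub2ind([nU nI], rows, order(:, 1:5))) = true;

% Condition 3: reserve the 6th position (still a positive) for train.
reserved = false(nU, nI);
reserved(sub2ind([nU nI], (1:nU)', order(:, 6))) = true;

% Condition 4: top up test with random remaining ratings to reach 20% overall.
pool   = find(r > 0 & ~testMask & ~reserved);
needed = round(0.2 * nnz(r)) - nnz(testMask);
if needed > 0
    pick = pool(randperm(numel(pool), min(needed, numel(pool))));
    testMask(pick) = true;
end

test  = r .* testMask;                    % test ratings, 0 elsewhere
train = r .* ~testMask;                   % train ratings, 0 elsewhere
```

If needed comes out zero or negative, the forced 5 positives per user already take up 20% or more of the data, and the 80-20 proportion cannot be met together with condition 2. Under these assumptions that happens whenever the average number of ratings per user is below 25, since 5 ratings per user must not exceed 20% of the total.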
This process is done in MATLAB, and I do not have any memory problem.
Thanks in advance for your opinions.