machine learning - Features' range in logistic regression


I use logistic regression. I know it is a supervised method and needs calculated feature values in both the training and the test data. There are 6 features. Although the functions that produce these features' values differ, and the maximum possible value is 1, 4 of the features (in both training and test data) have very low values: they range between 0 and 0.1, and are never 1 or even more than 0.1. These features' values are close to each other. The other features are well distributed (they range between 0 and 0.9). The difference between these two kinds of features is high, and I think it causes trouble in the learning process of logistic regression. Am I right? Do I need to transform/normalize these features? Any help is highly appreciated.

In short: yes, you should normalize your features prior to training. Typically each feature is either scaled to a fixed range (like [0, 1]) or whitened (mean 0 and standard deviation 1).
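A minimal sketch of both options, assuming scikit-learn is available (the array shapes and value ranges below are made up to mimic the "small" features from the question). The key detail is that the scaler is fitted on the training data only and then applied to both sets, so no test-set statistics leak into training:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

rng = np.random.default_rng(42)
# Hypothetical "small" features living in [0, 0.1], as described in the question
X_train = rng.uniform(0.0, 0.1, size=(100, 4))
X_test = rng.uniform(0.0, 0.1, size=(20, 4))

# Option 1: scale each feature to [0, 1] (use StandardScaler() instead
# for the mean-0 / std-1 variant)
scaler = MinMaxScaler()
X_train_s = scaler.fit_transform(X_train)   # learn per-feature min/max on train
X_test_s = scaler.transform(X_test)         # reuse those statistics on test

print(X_train_s.min(), X_train_s.max())     # training features now span [0, 1]
```

Note that `X_test_s` may fall slightly outside [0, 1] if the test data contains values beyond the training range; that is expected and harmless.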

Why is this important? In order to make the "small" features matter, logistic regression needs to assign high weights to those dimensions. However, if you use regularized LR (typically L2-regularized), it is hard to assign high values to those weights: the regularization penalty forces the model to choose more equally distributed weights instead, so you should use normalization. If, on the other hand, you fit LR without regularization, there is no point in scaling (up to numerical errors), since LR does not depend on the choice of scaling (the solution should be the same).

