@ajayram198 wrote:
I have a question about Practice Problem 1 in the Datahack section (http://datahack.analyticsvidhya.com/contest/practice-problem-1). The problem comes with a tutorial solution in Python. In it, a group-by is done for every variable, whether categorical or numeric. For some categorical variables, any level whose frequency is below 5% is merged into a separate category, 'Others'. I don't understand why this is not done for all variables: the grouping into 'Others' is applied to every variable except Price, Rating, Size, and Season.
Why have these 4 variables been excluded?
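
To make the step concrete, here is a minimal sketch of the grouping I am describing. It is not the tutorial's code: the function name `group_rare_levels`, the `min_share` parameter, and the sample column are all made up for illustration, and it uses plain Python so it runs without any extra libraries.

```python
from collections import Counter


def group_rare_levels(values, min_share=0.05, other_label="Others"):
    """Replace levels whose relative frequency is below min_share with other_label.

    Hypothetical helper for illustration; the name and parameters are assumptions.
    """
    counts = Counter(values)
    total = len(values)
    # Levels that make up less than min_share of all rows count as rare.
    rare = {level for level, n in counts.items() if n / total < min_share}
    return [other_label if v in rare else v for v in values]


# Made-up column: 'A' and 'B' are common; 'C' and 'D' each appear once in 40 rows (2.5%).
column = ["A"] * 20 + ["B"] * 18 + ["C", "D"]
grouped = group_rare_levels(column)
print(Counter(grouped))
```

In this made-up column, 'C' and 'D' each fall below the 5% cutoff and are merged into 'Others', while 'A' and 'B' are left unchanged.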