Dummy variables: is it necessary to standardize them?


@bgarcial wrote:

I have the following dataset, represented as a NumPy array:

direccion_viento_pos

    Out[32]:

    array([['S'],
           ['S'],
           ['S'],
           ...,
           ['SO'],
           ['NO'],
           ['SO']], dtype=object)

The shape of this array is:

direccion_viento_pos.shape
(17249, 8)

I am using Python and scikit-learn to encode these categorical variables as follows:

from __future__ import unicode_literals
import pandas as pd
import numpy as np
# from sklearn import preprocessing
# from matplotlib import pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

Then I create a label encoder object:

labelencoder_direccion_viento_pos = LabelEncoder()

I take column position 0 (the only column) of direccion_viento_pos and apply the fit_transform() method to all of its rows:

direccion_viento_pos[:, 0] = labelencoder_direccion_viento_pos.fit_transform(direccion_viento_pos[:, 0])

My direccion_viento_pos now looks like this:

direccion_viento_pos[:, 0]
array([5, 5, 5, ..., 7, 3, 7], dtype=object)
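
To double-check which integer corresponds to which direction, the fitted encoder's classes_ attribute can be inspected (a quick sketch; LabelEncoder assigns the integers in alphabetical order of the labels):

# classes_ lists the categories in the order LabelEncoder assigned them,
# so the position of each label is the integer it was mapped to.
list(enumerate(labelencoder_direccion_viento_pos.classes_))
# e.g. [(0, 'E'), (1, 'N'), (2, 'NE'), (3, 'NO'), (4, 'O'), (5, 'S'), (6, 'SE'), (7, 'SO')]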

Up to this point, each row/observation of direccion_viento_pos has a numeric value, but I want to avoid the implicit weighting this introduces: some rows get a higher value than others even though the categories have no natural order.

Because of this, I create dummy variables, which according to this reference are:

A Dummy variable or Indicator Variable is an artificial variable created to represent an attribute with two or more distinct categories/levels

Then, in my direccion_viento_pos context, I have 8 values:

  • SO - Southwest (Sur oeste)
  • SE - Southeast (Sur este)
  • S - South (Sur)
  • N - North (Norte)
  • NO - Northwest (Nor oeste)
  • NE - Northeast (Nor este)
  • O - West (Oeste)
  • E - East (Este)

That means 8 categories.
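
(As an aside, an equivalent way to build one indicator column per category is pd.get_dummies; this is only a sketch with a small literal Series, not the code I used above.)

import pandas as pd

# Sketch: one 0/1 column per distinct category in the Series.
viento = pd.Series(['S', 'S', 'SO', 'NO', 'SO'])
pd.get_dummies(viento)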
Next, I create a OneHotEncoder object with the categorical_features parameter, which specifies which features will be treated as categorical variables:

onehotencoder = OneHotEncoder(categorical_features = [0])

And I apply this onehotencoder to the direccion_viento_pos matrix:

direccion_viento_pos = onehotencoder.fit_transform(direccion_viento_pos).toarray()
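
(Note: in recent scikit-learn versions the categorical_features parameter has been removed, and OneHotEncoder can be fitted directly on the string column, without the LabelEncoder step. The sketch below assumes a hypothetical direccion_viento_strings holding the original (n, 1) array of strings.)

from sklearn.preprocessing import OneHotEncoder

# Sketch for newer scikit-learn (>= 0.22): encode the string column directly.
onehotencoder_direct = OneHotEncoder()
dummies = onehotencoder_direct.fit_transform(direccion_viento_strings).toarray()  # hypothetical input, shape (n, 8)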

My direccion_viento_pos with its dummy-encoded variables now looks like this:

direccion_viento_pos

array([[0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 0., ..., 1., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 1.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 1.]])
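
A quick check (just a sketch) confirms that this matrix already contains only 0.0 and 1.0:

import numpy as np

# Every entry of the one-hot matrix is an indicator value.
np.unique(direccion_viento_pos)
# array([0., 1.])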

So, up to here, I have created dummy variables for each category.

I wanted to walk through this process to arrive at my question.

If these dummy-encoded variables are already in the 0–1 range, is it necessary to apply MinMaxScaler feature scaling?

Some say it is not necessary to scale these dummy variables. Others say it is necessary because we want accurate predictions.

I ask this question because, when I apply MinMaxScaler with feature_range=(0, 1), my values change in some positions … even though they still stay within this range.
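
For reference, a minimal sketch of applying the scaler to the dummy matrix alone (how I actually applied it may differ). For a column whose minimum is 0 and maximum is 1, the transform is (x - 0) / (1 - 0) = x, so such columns should come back unchanged:

from sklearn.preprocessing import MinMaxScaler
import numpy as np

scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(direccion_viento_pos)
np.allclose(scaled, direccion_viento_pos)   # True if the dummy columns are left untouched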

What is the best option to choose with respect to my direccion_viento_pos dataset?
