# Scale the continuous features and one-hot encode the categoricals, leaving the target column intact
import pandas as pd
from sklearn.preprocessing import StandardScaler, OneHotEncoder

# Standardize the continuous (float) columns
continuous_columns = data.select_dtypes(include=['float64']).columns.tolist()
scaler = StandardScaler()
scaled_features = scaler.fit_transform(data[continuous_columns])
scaled_df = pd.DataFrame(scaled_features, columns=scaler.get_feature_names_out(continuous_columns))
scaled_data = pd.concat([data.drop(columns=continuous_columns), scaled_df], axis=1)

# One-hot encode the categorical columns, excluding the target
categorical_columns = scaled_data.select_dtypes(include=['object']).columns.tolist()
categorical_columns.remove('NObeyesdad')  # Exclude target column
encoder = OneHotEncoder(sparse_output=False, drop='first')
encoded_features = encoder.fit_transform(scaled_data[categorical_columns])
encoded_df = pd.DataFrame(encoded_features, columns=encoder.get_feature_names_out(categorical_columns))
prepped_data = pd.concat([scaled_data.drop(columns=categorical_columns), encoded_df], axis=1)

# Encode the target labels as integer codes
prepped_data['NObeyesdad'] = prepped_data['NObeyesdad'].astype('category').cat.codes
prepped_data.head()
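To feed `prepped_data` into a model, a minimal sketch of splitting it into features, target, and train/test sets could look like the following. The column name `NObeyesdad` comes from the code above; using scikit-learn's `train_test_split`, the 80/20 split, and stratification are assumptions, not part of the original snippet.

```python
# Minimal sketch (assumptions: train_test_split, 80/20 split, stratification)
from sklearn.model_selection import train_test_split

X = prepped_data.drop(columns=['NObeyesdad'])  # all scaled/encoded features
y = prepped_data['NObeyesdad']                 # integer-coded target labels

# Hold out 20% of the rows for evaluation; stratify to keep class proportions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```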
- I use these code snippets a lot, and this repository works like a library.
- Code that is essential for training neural networks and working in Google Colab is listed; a minimal training sketch follows this list.
- Code used for outlier treatment (see the outlier sketch below).
- Basic Python data visualization code (see the plotting sketch below).
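The repository's actual network is not reproduced here. As an illustration, a minimal Keras/TensorFlow sketch for training a classifier on the prepped data could look like this; the framework choice, layer sizes, and epoch count are assumptions, and it reuses `X_train`/`y_train` from the split sketch above. Colab ships with TensorFlow preinstalled, which is why it is a convenient default there.

```python
# Minimal sketch (assumptions: Keras/TensorFlow, layer sizes, epochs)
import tensorflow as tf

num_classes = int(y.nunique())  # y from the train/test split sketch above

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X_train.shape[1],)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(num_classes, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',  # integer-coded labels
              metrics=['accuracy'])
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=20, batch_size=32)
```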
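For outlier treatment, one common approach is IQR-based capping. The sketch below is hedged: the 1.5×IQR rule, the helper name `cap_outliers_iqr`, and applying it to the continuous columns are assumptions, not necessarily what the repository does.

```python
# Minimal sketch (assumption: 1.5*IQR capping) for treating outliers in the float columns
def cap_outliers_iqr(df, columns, factor=1.5):
    """Clip values outside [Q1 - factor*IQR, Q3 + factor*IQR] for the given columns."""
    capped = df.copy()
    for col in columns:
        q1, q3 = capped[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        capped[col] = capped[col].clip(lower=q1 - factor * iqr, upper=q3 + factor * iqr)
    return capped

# Example: cap outliers in the continuous columns before scaling
data = cap_outliers_iqr(data, continuous_columns)
```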
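For the basic visualization code, a short matplotlib/seaborn sketch could look like the following; the specific plot types are assumptions, and it only uses names that appear in the preprocessing code above (`data`, `continuous_columns`, `NObeyesdad`).

```python
# Minimal sketch (assumptions: matplotlib/seaborn and the chosen plot types) for quick EDA
import matplotlib.pyplot as plt
import seaborn as sns

# Histograms of the first few continuous columns
for col in continuous_columns[:3]:
    sns.histplot(data[col], kde=True)
    plt.title(f'Distribution of {col}')
    plt.show()

# Class balance of the target column
sns.countplot(x='NObeyesdad', data=data)
plt.xticks(rotation=45)
plt.title('Target class counts')
plt.show()
```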
