XGBoost | Machine Learning
XGBoost in Python Step 1: First, install the XGBoost library. We then set up a classification problem: each bank customer is assigned to one of two classes, those who will leave the bank and those who will stay. We import the libraries and load the Churn_Modelling.csv dataset, then preprocess the data for this churn modeling problem. XGBoost is a gradient boosting model built on decision trees, so feature scaling is not required. It is popular for qualities such as high performance and fast execution speed. Finally, we split the dataset into a training set and a test set. The Python code is also available in Google Colab.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('Churn_Modelling.csv')
X = dataset.iloc[:, 3:13].values
y = dataset.iloc[:, 13].values
# Encoding categorical data
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder
from sklearn.compose import ColumnTransformer
# One-hot encode the Country column (index 1) and ordinal-encode the Gender column (index 2)
ct = ColumnTransformer([("Country", OneHotEncoder(), [1]), ("Gender", OrdinalEncoder(), [2])], remainder = 'passthrough')
X = ct.fit_transform(X)
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
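To make the encoding step above more concrete, here is a minimal pure-Python sketch of what one-hot and ordinal encoding do to a categorical column. The helper names `one_hot` and `ordinal` and the four toy rows are hypothetical illustrations, not part of scikit-learn or the churn dataset. The alphabetical category ordering is an assumption chosen to mirror the encoders' default behavior.

```python
def one_hot(values):
    """Return the sorted categories and one 0/1 indicator row per value."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, rows


def ordinal(values):
    """Return the sorted categories and one numeric code per value."""
    categories = sorted(set(values))
    lookup = {c: float(i) for i, c in enumerate(categories)}
    return categories, [lookup[v] for v in values]


# Toy values for illustration only (not rows taken from Churn_Modelling.csv)
countries = ["France", "Spain", "Germany", "France"]
genders = ["Female", "Male", "Female", "Male"]

country_cats, country_rows = one_hot(countries)
gender_cats, gender_codes = ordinal(genders)

print(country_cats)   # one indicator column per country
for row in country_rows:
    print(row)
print(gender_cats, gender_codes)
```

One-hot encoding gives the country column one indicator column per category, so no artificial ordering is implied between countries. Ordinal encoding is enough for a column with only two categories, since it collapses into a single 0/1 column.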
XGBoost in Python Step 2: In this step, we fit XGBoost to the training set. We import XGBClassifier from the xgboost library, create a classifier object, and call its fit method on the training data. We then predict the test set results, build the confusion matrix, and apply k-fold cross validation. The confusion matrix shows 1521 + 208 correct predictions and 197 + 74 incorrect predictions, which gives a test accuracy of about 86%. The mean of the k-fold cross validation accuracies is also about 86%.
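As a quick check of the figures quoted above, the accuracy can be recomputed from the confusion matrix counts alone. This is just the arithmetic on the reported numbers. It makes no assumption about which cell of the matrix each count sits in.

```python
# Counts reported in the text for the test set
correct = 1521 + 208     # predictions that matched the true class
incorrect = 197 + 74     # predictions that did not match

total = correct + incorrect
accuracy = correct / total

print(f"Test rows: {total}")          # 2000
print(f"Accuracy:  {accuracy:.4f}")   # 0.8645
```

The 2,000 test rows are consistent with a 20% test split of a 10,000-row dataset.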
# Fitting XGBoost to the Training set
from xgboost import XGBClassifier
classifier = XGBClassifier()
classifier.fit(X_train, y_train)
# Predicting the Test set results
y_pred = classifier.predict(X_test)
# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
# Applying k-Fold Cross Validation
from sklearn.model_selection import cross_val_score
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10)
# Mean and standard deviation of the 10 fold accuracies
print(accuracies.mean())
print(accuracies.std())
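The sketch below illustrates the idea behind the k-fold step, assuming the 8,000-row training set implied by the 80/20 split and the same 10 folds. The function `k_fold_indices` is a hypothetical helper written for this illustration. It uses plain contiguous folds, which is a simplification: for a classifier, cross_val_score with an integer cv uses stratified folds that preserve the class proportions.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each
        size = base + (1 if fold < extra else 0)
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, val_idx
        start = stop


# Assumed sizes: 8,000 training rows, 10 folds
folds = list(k_fold_indices(8000, 10))
for i, (train_idx, val_idx) in enumerate(folds, start=1):
    print(f"Fold {i}: train on {len(train_idx)} rows, validate on {len(val_idx)} rows")
```

Each row is used for validation exactly once, so the model is trained and scored 10 times. The mean of those 10 scores is the cross-validated accuracy, and their standard deviation shows how much the score varies between folds.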