Cricket Matches Prediction using Data Science

Objective

To predict the outcome of the cricket match result using data science based on the historical and current data.

Project Overview

In recent times, data science predictive modeling plays a crucial role in the sports. Cricket is one of the famous sports in India. On the given day, any team can win the match with its performance. This makes the challenge in predicting the accurate outcome of the cricket match.
The cricket game involves 3 formats – namely, Test Matches, ODIs andT20s. This project concentrates on the latest format of the game T20. To predict the result of the T20 game, we analyze the type of ground, teams past performance, batting and bowling potentials of the 11 players of both teams using their past performance. Another important parameter considered for prediction is toss decision factor.

Proposed System

With the advanced technology in today’s world, we are in need of predicting the outcome of the match. This paper focuses on predicting the outcome of the T20 matches. Supervised learning algorithms are used to predict the outcome of the match. The proposed system architecture is shown in the figure.

Cricket Matches Prediction Modules

Module 1:Data Selection

The required data is collected from the cricket website. The data should consist of player details with all features.

Batting records

Runs scored
Strike rate
Batting average
Highest score
Home/ Away
Opposite team

Bowling records

Balls bowled
Wickets taken
Economy rate
Best bowling
Number of 4 wickets haul
Home/ Away
Opposite team

Module 2: Data Preparation
Data preparation is an important step in any data science project. It consists of data cleaning, integration, normalization, transformation, reduction, feature extraction, and selection, etc.

Module 3: Correlation
In a T20 match, toss is the crucial factor in deciding the outcome of the match. Most of the toss-winning the captain choose to field first. It’s because of the perception is that, the team fielding first winning the most matches in the T20 match. To identify this relation, correlation techniques are used. Here, the correlation between toss winner and match winners is analyzed.

Module 3 : Implementation of Supervised Learning
The required supervised learning algorithm is applied to the given data set. This algorithm is applied to the data set to analyze the player performance and the accuracy is calculated. The interesting relationships between the player performances are identified using association rules. Predictive analytical techniques are used to predict the outcome of the T20 match using previous historical data and current data.

Module 4 : Predicting the Outcome of the Match
Prediction is a data mining function that discovers the future behaviors. Using the predictive analytics method, the outcome of the cricket match is predicted. Here, supervised learning approach is used.

Software Requirements

Weka 3.8
Netbeans
SQL Server

Hardware Requirements

Hard Disk – 1 TB or Above
RAM required – 4 GB or Above
Processor – Core i3 or Above