Trending YouTube Video Statistics
Objective
To extract useful information from trending YouTube videos using statistical modelling techniques.
Project Overview
YouTube is one of the best-known video-sharing platforms in the world. The website maintains lists of its most-viewed and trending videos. Most trending videos are music videos, comedy, celebrity performances, sports incidents and the latest viral videos. To identify trending videos, it uses factors such as:
- Number of Views
- Shares
- Comments
- Likes
System Design
The proposed YouTube Video Statistics system concentrates on visualizing, comparing and analyzing the data. The required outcome is obtained using the following:
- Basic Statistics
- Business Intelligence
- Data Analytics
- Machine Learning
- Predictive Analytics
The modules are:
- Data set collection
- Data preparation
- Data analytics
- Data visualization
Module 1: Data Set Collection
The required data set was collected from YouTube using the YouTube API. It consists of the daily trending videos of the last few years and covers the regions USA, Canada, France and Great Britain. The data includes factors such as channel title, video title, time published, number of tags, number of views, number of likes, number of dislikes, description of the video and comment count.
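As an illustration of the collection step, here is a minimal Python sketch that builds one request URL per region. It assumes the YouTube Data API v3 `videos` endpoint with `chart=mostPopular`; the write-up does not say which endpoint or client was actually used, and `trending_request_url` and `YOUR_API_KEY` are hypothetical names. The sketch only builds the URLs and does not send any request.

```python
from urllib.parse import urlencode

# YouTube Data API v3 endpoint for listing videos (assumed, see note above).
API_URL = "https://www.googleapis.com/youtube/v3/videos"

# Two-letter country codes for the four regions named in this module.
REGIONS = ["US", "CA", "FR", "GB"]


def trending_request_url(region_code, api_key, max_results=50):
    """Build the URL for one page of a region's most popular videos."""
    params = {
        "part": "snippet,statistics",  # titles and tags plus view/like/comment counts
        "chart": "mostPopular",
        "regionCode": region_code,
        "maxResults": max_results,
        "key": api_key,
    }
    return API_URL + "?" + urlencode(params)


# "YOUR_API_KEY" is a placeholder, not a real credential.
urls = {region: trending_request_url(region, "YOUR_API_KEY") for region in REGIONS}
```

Running such requests once a day and storing the responses would give a daily trending data set of the kind described here.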
List of Attributes
The YouTube Video Statistics data set consists of two parts: a comments file and a video statistics file. video_id is used as the unique field for a video and appears in both files.
The attributes of the video file:
- video_id
- video_title
- date
- channel_title
- category_id
- views
- likes
- dislikes
- thumbnail_link
- tags
The attributes of the comments file:
- video_id
- comment_text
- likes
- replies
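Because both files share video_id, comments can be attached to their video by joining on that field. The following is a minimal Python sketch of such a join. The sample rows are invented for illustration and use only a subset of the attributes listed above.

```python
from collections import defaultdict

# Invented sample rows using the attribute names listed above.
videos = [
    {"video_id": "a1", "video_title": "Sample clip", "views": 1000, "likes": 90, "dislikes": 10},
    {"video_id": "b2", "video_title": "Another clip", "views": 500, "likes": 20, "dislikes": 5},
]
comments = [
    {"video_id": "a1", "comment_text": "Great video", "likes": 3, "replies": 1},
    {"video_id": "a1", "comment_text": "Nice", "likes": 0, "replies": 0},
    {"video_id": "b2", "comment_text": "Not for me", "likes": 1, "replies": 0},
]

# Group comments by video_id so each video can look up its own comments.
comments_by_video = defaultdict(list)
for comment in comments:
    comments_by_video[comment["video_id"]].append(comment)

# Attach the matching comment list to each video record.
joined = [
    {**video, "comments": comments_by_video.get(video["video_id"], [])}
    for video in videos
]
```

Note that `likes` means video likes in the video file and comment likes in the comments file, so the two should be kept apart after joining, as the nested structure above does.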
Module 2: Data Preparation
Data preparation is one of the most important and time-consuming steps in the data mining process. Data pre-processing includes cleaning, transformation and attribute selection.
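The three pre-processing steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the raw values arrive as strings, rows with a missing video_id or non-numeric counts are dropped, and `like_ratio` is a hypothetical derived attribute that is not part of the original data set.

```python
def prepare(rows, keep=("video_id", "category_id", "views", "likes", "dislikes")):
    """Clean, transform and select attributes from raw string-valued rows."""
    prepared = []
    for row in rows:
        # Cleaning: skip rows with a missing id or non-numeric counts.
        if not row.get("video_id"):
            continue
        try:
            counts = {name: int(row[name]) for name in ("views", "likes", "dislikes")}
        except (KeyError, ValueError, TypeError):
            continue
        # Attribute selection: keep only the chosen columns.
        record = {name: row.get(name) for name in keep}
        # Transformation: store the counts as integers ...
        record.update(counts)
        # ... and derive the share of ratings that are likes.
        ratings = counts["likes"] + counts["dislikes"]
        record["like_ratio"] = counts["likes"] / ratings if ratings else 0.0
        prepared.append(record)
    return prepared


# Invented raw rows: one valid, one without an id, one with a bad view count.
raw_rows = [
    {"video_id": "a1", "category_id": "10", "views": "1000", "likes": "90",
     "dislikes": "10", "thumbnail_link": "..."},
    {"video_id": "", "category_id": "24", "views": "50", "likes": "5", "dislikes": "0"},
    {"video_id": "c3", "category_id": "24", "views": "n/a", "likes": "7", "dislikes": "1"},
]
clean_rows = prepare(raw_rows)
```

Here only the first row survives cleaning, and its `thumbnail_link` is dropped by attribute selection.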
Module 3: Data Analytics
The collected data set supports the following analyses, which lead to better understanding and better decision making in the future:
- Categorising the videos: comments play a large role here.
- Different forms of sentiment analysis: supervised learning algorithms are used here.
- Auto-generation of comments: machine learning algorithms play a large role here.
- Predicting the popularity of a video in advance: predictive analytics is used here.
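As one possible illustration of the popularity prediction, here is a minimal Python sketch that fits a straight line to views as a function of likes using ordinary least squares. The write-up does not name the model actually used, so the choice of linear regression is an assumption, and the numbers are invented toy data rather than results from the data set.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented toy data: likes observed so far and the views each video reached.
likes = [10, 20, 30, 40]
views = [200, 300, 400, 500]

slope, intercept = fit_line(likes, views)

# Predicted views for a new video that currently has 50 likes.
predicted_views = slope * 50 + intercept
```

A real model would draw on more attributes (category, tags, comment count) and would be checked against held-out videos before its predictions were trusted.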
Module 4: Data Visualization
Hidden patterns in the extracted results are identified and then displayed as bar charts for easy analysis and better understanding. Each resulting chart gives a relative comparison between a combination of two features of the data set. The analyzed result shows the relationships between the items in visual form.
More attributes can be added to the data set during collection, which would give a much better understanding of the data. Deeper analysis would make the project's results more efficient and effective.
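A bar chart comparing two features can be sketched as follows. The project lists Tableau and R for this step; this minimal Python sketch only illustrates the idea by averaging one feature (likes) per value of another (category_id) and drawing text bars. The function names and sample rows are hypothetical.

```python
from collections import defaultdict


def mean_by_category(rows, feature):
    """Average a numeric feature for each category_id."""
    totals = defaultdict(lambda: [0, 0])  # category_id -> [sum, count]
    for row in rows:
        entry = totals[row["category_id"]]
        entry[0] += row[feature]
        entry[1] += 1
    return {category: total / count for category, (total, count) in totals.items()}


def text_bar_chart(values, width=40):
    """Render a non-empty label -> value mapping as horizontal text bars."""
    top = max(values.values())
    lines = []
    for label, value in sorted(values.items()):
        # Scale each bar relative to the largest value.
        bar = "#" * round(width * value / top) if top else ""
        lines.append(f"{label:>4} | {bar} {value:.1f}")
    return "\n".join(lines)


# Invented sample rows.
rows = [
    {"category_id": 10, "likes": 90},
    {"category_id": 10, "likes": 30},
    {"category_id": 24, "likes": 20},
]
averages = mean_by_category(rows, "likes")
chart = text_bar_chart(averages)
print(chart)
```

The same aggregation, average of one feature grouped by another, is what a bar chart in a tool such as Tableau would plot.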
Software Requirements
- Tableau
- R
Hardware Requirements
- Hard Disk – 1 TB or Above
- RAM required – 8 GB or Above
- Processor – Core i3 or Above