To Implement a Web Crawler in Java: BE(IT) CLP-II Practical


Aim: To implement a web crawler in the Java language.

A web crawler is a program, or piece of code, that a search engine uses to index pages across the web. It fetches an HTML page and scans it for keywords, which the search engine then uses to index that page.
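The keyword scan described above can be sketched on an in-memory HTML string, with no network access. This is only an illustration under simple assumptions: the class name `KeywordCounter` and the method `countKeywords` are hypothetical, tags are stripped with a crude regular expression, and every remaining word counts as a keyword. A real search engine does far more (stop-word removal, ranking, proper HTML parsing).

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the practical's original program.
class KeywordCounter {
    // Counts how often each word appears in the text of an HTML string.
    static Map<String, Integer> countKeywords(String html) {
        // Crude tag stripping: replace everything between '<' and '>' with a space.
        String text = html.replaceAll("<[^>]*>", " ").toLowerCase(Locale.ROOT);
        Map<String, Integer> counts = new TreeMap<>();
        // Split on runs of characters that are not ASCII letters or digits.
        for (String word : text.split("[^a-z0-9]+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>Java crawler</h1><p>A crawler in Java.</p></body></html>";
        // prints {a=1, crawler=2, in=1, java=2}
        System.out.println(countKeywords(html));
    }
}
```

A `TreeMap` is used here only so that the printed keywords come out in alphabetical order.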

The Java program below fetches the page at “google.com” and counts the links it contains to other pages.

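 import java.io.BufferedReader;
 import java.io.InputStreamReader;
 import java.net.URL;
 import java.net.URLConnection;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 public class Crawler {
      public static void main(String[] args) {
           String sourceUrl = "https://google.com";
           try
           {
                // Open a connection to the page to be crawled
                URL url = new URL(sourceUrl);
                URLConnection connection = url.openConnection();

                // Read the whole response body into memory;
                // try-with-resources closes the reader even if reading fails
                StringBuilder data = new StringBuilder();
                try (BufferedReader in = new BufferedReader(
                          new InputStreamReader(connection.getInputStream())))
                {
                     String inputLine;
                     while ((inputLine = in.readLine()) != null)
                          data.append(inputLine).append('\n');
                }

                // Match every anchor element that has a double-quoted href attribute
                Pattern pattern = Pattern.compile("<a\\b[^>]*href=\"[^>]*>(.*?)</a>",
                          Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
                Matcher matcher = pattern.matcher(data);
                int count = 0;
                while (matcher.find())
                {
                     count++;
                     System.out.println(count + ". " + matcher.group());
                }
                System.out.println("TOTAL LINKS: " + count);
           }
           catch (Exception e)
           {
                System.out.println(e);
           }
      }
 }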
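
The program above prints the text that the pattern matched, not the address each link points to. A crawler that goes on to visit the linked pages needs those target URLs. The following is a minimal sketch of that step, assuming double-quoted `href` attributes; the class name `LinkExtractor`, the method `extractLinks`, and the `example.com` addresses are hypothetical. Regular expressions are fragile on real-world HTML, so a dedicated HTML parser is usually the safer choice outside a lab exercise.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the practical's original program.
class LinkExtractor {
    // Matches an <a ...> tag and captures the double-quoted href value in group 1.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\shref\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns the distinct link targets in html, resolved against baseUrl.
    static Set<String> extractLinks(String baseUrl, String html) {
        URI base = URI.create(baseUrl);
        Set<String> links = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            String href = matcher.group(1).trim();
            if (href.isEmpty() || href.startsWith("#")) {
                continue; // skip empty targets and same-page fragments
            }
            try {
                // Turns relative targets such as "/about" into absolute URLs.
                links.add(base.resolve(href).toString());
            } catch (IllegalArgumentException e) {
                // Malformed href value: ignore it in this sketch.
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/about\">About</a> <a href=\"page.html\">Page</a>";
        for (String link : extractLinks("https://example.com/docs/", html)) {
            System.out.println(link);
        }
    }
}
```

Collecting the results in a set means a page that links to the same address several times contributes that address only once, which matters when the links are later queued for crawling.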
