Cloning an HTML document

In this jsoup post, I will show with an example how to clone an entire HTML document.

For our example, we will use the following HTML document:

Input1.html


<html>
    <head>
        <title>Input1</title>
    </head>
    <body>
        <p>Input1</p>
    </body>
</html>

Below is the main code that contains the logic for cloning the entire document.

Main Class


1  import java.io.File;
2  import java.io.IOException;
3  
4  import org.jsoup.Jsoup;
5  import org.jsoup.nodes.Document;
6  
7  public class JsoupDemo7 {
8      public static void main(String[] args) throws IOException {
9          File file = new File("Input1.html");
10         Document sourceDocument = Jsoup.parse(file, "UTF-8");
11         Document destinationDocument = sourceDocument.clone();
12         System.out.println(destinationDocument);
13     }
14 }

At line 9, we create a File object, passing the name of the file that contains the HTML data as a constructor argument.

At line 10, we parse the file using Jsoup’s static method “parse”, passing “UTF-8” as the character set. This method returns a “Document” instance, which we store in “sourceDocument”; it represents the Input1.html file.

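Next, at line 11, we call the “clone” method on the sourceDocument instance. It returns another “Document” instance, which we store in “destinationDocument”. This is a deep copy of “sourceDocument”: the clone gets its own copy of every element, so it can be modified without changing the original.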

In this way, we can clone an entire HTML document.

Below is the output

Output


<html> 
 <head> 
  <title>Input1</title> 
 </head> 
 <body> 
  <p>Input1</p>  
 </body>
</html>
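
To check that the clone really is a separate copy, the sketch below parses the same markup from a string, clones it, and then changes the paragraph text only in the clone. The class name CloneIndependenceDemo and the helper method editClone are made up for this illustration and are not part of jsoup; the jsoup calls used (Jsoup.parse, clone, select, first, text) are the library's own. Treat it as a minimal sketch rather than the only way to verify this.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class CloneIndependenceDemo {

    // Hypothetical helper (not part of jsoup): parses the HTML, clones the
    // document, edits the first <p> in the clone only, and returns the <p>
    // text of the source (index 0) and of the clone (index 1).
    static String[] editClone(String html) {
        Document source = Jsoup.parse(html);
        Document copy = source.clone();

        // Change the paragraph text in the clone only.
        copy.select("p").first().text("Changed in clone");

        return new String[] {
            source.select("p").first().text(),
            copy.select("p").first().text()
        };
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Input1</title></head>"
                + "<body><p>Input1</p></body></html>";

        String[] texts = editClone(html);
        System.out.println("source: " + texts[0]); // prints "source: Input1"
        System.out.println("clone:  " + texts[1]); // prints "clone:  Changed in clone"
    }
}
```

The source document still reports “Input1” after the edit, which shows that the change stayed inside the clone.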
