Sunday, July 27, 2014

Why close external resources after using them, if the GC can take care of it?


It is strongly recommended that you always close a connection when you have finished using it, so that it is returned to the pool. Connections are a limited and relatively expensive resource. Any new connection you request with exactly the same connection string can reuse a connection from the pool. If we don't close a DB connection, it is never returned to the pool; this is called a "connection leak", and once the pool's limit is reached there is no connection left for new requests.


Now come to the GC. The GC gives no guarantee about when it will run, and you cannot force the JVM to run it for your method (in case you don't close the connection); even System.gc() is only a request. GC work runs on daemon threads that the JVM schedules in response to memory pressure, not in response to a shortage of connections or file handles, so an unclosed resource may stay open for a long time.

So not closing connections and leaving them to the GC may result in many problems like web pages hanging, slow page loads, and more.


Also, you should be careful when closing connections, streams, etc., because even though you close the resource, it can still leak if the close is not handled properly.

Let's take an example:

public static Properties loadProperties(String fileName)
        throws IOException {
    FileInputStream fis = new FileInputStream(fileName);
    Properties props = new Properties();
    props.load(fis); // if this throws, the close() below is never reached
    fis.close();
    return props;
}

In the above example, 'fis' is closed, but the code is still prone to a resource leak.

How?
>> Because if an exception occurs in the call to the "load" method, the method exits by throwing that exception and never reaches fis.close(), so the resource is not released.
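To see the leak concretely, here is a minimal sketch. FailingStream and loadWithoutFinally are hypothetical names made up for this illustration; the stream simulates a read failure and records whether close() was ever called.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

class LeakDemo {
    /** Hypothetical stream: every read fails, and close() only records that it was called. */
    static class FailingStream extends InputStream {
        boolean closed = false;

        @Override
        public int read() throws IOException {
            throw new IOException("simulated read failure");
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Same shape as the example above: close() is reached only if load() succeeds.
    static Properties loadWithoutFinally(InputStream in) throws IOException {
        Properties props = new Properties();
        props.load(in);
        in.close();
        return props;
    }

    public static void main(String[] args) {
        FailingStream in = new FailingStream();
        try {
            loadWithoutFinally(in);
        } catch (IOException e) {
            System.out.println("load failed: " + e.getMessage());
        }
        System.out.println("stream closed? " + in.closed);
    }
}
```

Running main prints "load failed: simulated read failure" followed by "stream closed? false": the exception left the method before close() was reached.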

Correct way:

public static Properties loadProperties(String fileName)
        throws IOException {
    FileInputStream fis = new FileInputStream(fileName);
    try {
        Properties props = new Properties();
        props.load(fis);
        return props;
    } finally {
        // runs whether load() returns normally or throws
        fis.close();
    }
}

From JDK 7 onwards, you don't have to write the close call yourself. You can use the try-with-resources statement, which closes the resource automatically when the block exits; catch and finally clauses are optional.

static String readFirstLineFromFile(String path) throws IOException {
    // br is closed automatically when the try block exits, normally or by exception
    try (BufferedReader br =
            new BufferedReader(new FileReader(path))) {
        return br.readLine();
    }
}

And one last point... 

Suppose you are using a JDBC Connection, Statement and ResultSet. Then you should close all three resources, not only the Connection.

And a safe way to handle this with plain try/finally is>>

public void someOperation() throws SQLException {
    Statement stmt = null;
    ResultSet rs = null;
    Connection conn = getConnection();
    try {
        stmt = conn.createStatement();
        rs = stmt.executeQuery("SELECT * FROM EMP");
        // Use rs
    } finally {
        try {
            if (rs != null)
                rs.close();
        } finally {
            try {
                if (stmt != null)
                    stmt.close();
            } finally {
                // always reached, even if closing rs or stmt throws
                conn.close();
            }
        }
    }
}
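
On JDK 7 and later, the nested finally blocks can be replaced by a single try-with-resources statement, which closes the resources in the reverse order of their declaration (ResultSet, then Statement, then Connection). The following is only a sketch: the class name EmpDao, the method firstColumnOfEmp and the DataSource parameter are assumptions made for illustration, standing in for the getConnection() call used above.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

class EmpDao {
    // Hypothetical helper: reads the first column of every EMP row as a String.
    static List<String> firstColumnOfEmp(DataSource ds) throws SQLException {
        List<String> values = new ArrayList<String>();
        // All three resources are closed automatically, last-declared first,
        // whether the body completes normally or throws.
        try (Connection conn = ds.getConnection();
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM EMP")) {
            while (rs.next()) {
                values.add(rs.getString(1));
            }
        }
        return values;
    }
}
```

If closing one resource throws, the remaining resources are still closed, which is the same guarantee the nested finally blocks give by hand.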


Now one more question: why close the ResultSet and Statement?
You must have heard about bad programming practices. Well, it is a good programming practice to close the ResultSet and Statement along with the Connection.

How?

If you are using a JDBC connection and do not close the ResultSet yourself, it is closed automatically when its Statement is closed or re-executed. Closing it in a finally block releases the ResultSet immediately instead of waiting for that to happen. Closing the ResultSet explicitly also lets the garbage collector reclaim its memory as early as possible, because a ResultSet may hold a lot of memory depending on the query.

As per the Javadoc of ResultSet:

"A ResultSet object is automatically closed when the Statement object that generated it is closed, re-executed, or used to retrieve the next result from a sequence of multiple results."

Also note that you can have only one open ResultSet at a time per Statement. So all execution methods in the Statement interface implicitly close the statement's current ResultSet if an open one exists.

So now the question is: why should the Statement be closed?

As mentioned above, the ResultSet is closed automatically only when its Statement is closed or re-executed. If you never close the Statement, both objects stay open, which leads to a resource leak.

This is the reason why the JDBC API provides a close() method for both ResultSet and Statement. These methods are not there just for show, are they?


Now it totally depends on you whether you write good quality code or not.

Saturday, July 26, 2014

Ways to implement Threads in Java.

You can create a thread in 3 ways:

1- Using anonymous class implementation of Thread. 
2- By Implementing Runnable Interface. 
3- By extending the Thread class itself. 

1- Using anonymous class implementation of Thread:

         Using Anonymous Thread>>> 

         public class CreateMultiThread {
            public static void main(String[] args) {
              Thread t1 = new Thread ("First"){
                     @Override
                     public void run() {
                           for (int i = 0; i < 4; i++) {
                                  System.out.println(getName()+" : "+i);
                           }
                     }
              };
             
              Thread t2 = new Thread ("second"){
                     @Override
                     public void run() {
                           for (int i = 0; i < 4; i++) {
                                  System.out.println(getName()+" : "+i);
                           }
                     }
              };
             
              t1.start();
              t2.start();
              System.out.println("In Main()");
         }
    }

    OUTPUT (one possible run; the order of the lines can differ between runs):
      In Main()
      First : 0
      second : 0
      First : 1
      second : 1
      First : 2
      second : 2
      First : 3
      second : 3

Using Anonymous Runnable>>> 
    
 public class CreateMultiThread {
     public static void main(String[] args) {
         Runnable runnable = new Runnable() {
             @Override
             public void run() {
                 for (int i = 0; i < 4; i++) {
                     System.out.println(i);
                 }
             }
         };

         Thread t1 = new Thread(runnable, "First");
         Thread t2 = new Thread(runnable, "Second");
         t1.start();
         System.out.println("Thread: " + t1.getName());
         t2.start();
         System.out.println("Thread: " + t2.getName());
         System.out.println("In Main()");
     }
 }

OUTPUT (one possible run; the order of the lines can differ between runs)>>
      Thread: First
      0
      1
      2
      3
      0
      1
      2
      3
      Thread: Second
      In Main()
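
The output above does not show which thread printed each number, because a Runnable shared by two threads has no getName() of its own. A minimal sketch of one way around this, using Thread.currentThread() to ask the executing thread for its name and join() to wait for both threads; the class name NamedRunnableDemo and the method runBoth are made up for this illustration:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class NamedRunnableDemo {
    // Runs one Runnable on two named threads and returns the lines they produced.
    static List<String> runBoth() {
        final List<String> lines = new CopyOnWriteArrayList<String>();
        Runnable runnable = new Runnable() {
            @Override
            public void run() {
                // Ask the thread that is executing this Runnable for its name.
                String name = Thread.currentThread().getName();
                for (int i = 0; i < 4; i++) {
                    lines.add(name + " : " + i);
                }
            }
        };
        Thread t1 = new Thread(runnable, "First");
        Thread t2 = new Thread(runnable, "Second");
        t1.start();
        t2.start();
        try {
            // join() makes the caller wait until each thread has finished.
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : runBoth()) {
            System.out.println(line);
        }
        System.out.println("In Main()");
    }
}
```

Because of the join() calls, "In Main()" is always printed last here; the order of the "First" and "Second" lines relative to each other can still vary between runs.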


    2- By Implementing Runnable Interface:
      
    class ThreadUsingRunnable implements Runnable {
        private Thread t;
        private String name;

        ThreadUsingRunnable(String name) {
            this.name = name;
            t = new Thread(this, name);
            System.out.println("Child Thread:: " + t);
            t.start();
        }

        public void run() {
            for (int i = 0; i < 4; i++) {
                System.out.println(name + ": " + i);
            }
        }
    }

    // Testing the thread
    public class CreateMultiThread {
        public static void main(String[] args) {
            new ThreadUsingRunnable("first");
            new ThreadUsingRunnable("second");
            new ThreadUsingRunnable("third");
            try {
                Thread.sleep(1000);
            } catch (InterruptedException ee) {
                System.out.println("Main Interrupted!!");
            }
            System.out.println("Main Exiting!!");
        }
    }
   
    OUTPUT>>
      Child Thread:: Thread[first,5,main]
      Child Thread:: Thread[second,5,main]
      first: 0
      first: 1
      Child Thread:: Thread[third,5,main]
      first: 2
      second: 0
      first: 3
      second: 1
      second: 2
      second: 3
      third: 0
      third: 1
      third: 2
      third: 3
      Main Exiting!!
   
    
    3- By extending the Thread class itself:

    class ThreadUsingThreadClass extends Thread {
        private String name;

        ThreadUsingThreadClass(String name) {
            super(name);
            this.name = name;
            System.out.println("Child Thread:: " + this);
            start();
        }

        public void run() {
            for (int i = 0; i < 4; i++) {
                System.out.println(name + ": " + i);
            }
        }
    }

    // Testing the thread
    public class CreateMultiThread {
        public static void main(String[] args) {
            new ThreadUsingThreadClass("first");
            new ThreadUsingThreadClass("second");
            new ThreadUsingThreadClass("third");
            try {
                Thread.sleep(1000);
            } catch (InterruptedException ee) {
                System.out.println("Main Interrupted!!");
            }
            System.out.println("Main Exiting!!");
        }
    }
  
    OUTPUT>>
      Child Thread:: Thread[first,5,main]
      Child Thread:: Thread[second,5,main]
      first: 0
      first: 1
      first: 2
      first: 3
      Child Thread:: Thread[third,5,main]
      second: 0
      second: 1
      second: 2
      second: 3
      third: 0
      third: 1
      third: 2
      third: 3
      Main Exiting!!


Which method is best for implementing threads?

1st Reason>>

Creating your thread implementation class by extending the Thread class makes it heavy, as it inherits all the properties and methods of the Thread class even though it only needs run().
When implementing Runnable, our aim is to provide the implementation of the run() method, not a new kind of Thread. As we know, the Thread class itself implements Runnable to get the run() method. Hence an object implementing Runnable carries only the run() method and stays a lightweight object.


2nd Reason>>

In Java a class can extend only one class, so if we extend the Thread class we can't extend any other class, while by implementing the Runnable interface we still have the option to extend some other class.

Hence, create your thread implementation by extending the Thread class only when you want to override some behavior of the Thread class.
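
The second reason can be sketched as follows. BaseReport, ReportTask and ReportDemo are hypothetical names invented for this illustration; the point is only that the task class extends another class and can still be run on a thread.

```java
class BaseReport {
    String title() {
        return "report";
    }
}

// Extends BaseReport AND can run on a thread, because Runnable is an interface.
class ReportTask extends BaseReport implements Runnable {
    private volatile String result;

    @Override
    public void run() {
        result = title() + " built on " + Thread.currentThread().getName();
    }

    String result() {
        return result;
    }
}

class ReportDemo {
    public static void main(String[] args) throws InterruptedException {
        ReportTask task = new ReportTask();
        Thread worker = new Thread(task, "worker-1");
        worker.start();
        worker.join(); // wait for run() to finish before reading the result
        System.out.println(task.result()); // prints "report built on worker-1"
    }
}
```

Had ReportTask extended Thread instead, it could not also have extended BaseReport.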


Leave your comments/suggestions below. Otherwise, until next time :)


    


Monday, July 14, 2014

MarkLogic and Alfresco Integration


What is MarkLogic?

       MarkLogic is a NoSQL database.
       MarkLogic comes in three flavors:

·  Developer Edition - Free, full-featured version. The included APIs extend to all versions of MarkLogic.
·  Essential Enterprise - Supports replication, backup, high availability, recovery, fine-grained security, location services, and alerting. Semantics and advanced language packs are options.
·  Global Enterprise - Designed for large, globally distributed applications. Semantics, tiered storage, geospatial alerting and advanced language packs are options.

What is Alfresco?

       Alfresco is a team-oriented collaboration and ECM (Enterprise Content Management) system.
       Alfresco comes in three flavors:

·  Community Edition - It has some important limitations in terms of scalability and availability, since the clustering feature is not available in this edition.
·  Enterprise Edition - It supports clustering; its design is geared towards users who require a high degree of modularity and scalable performance.
·  Cloud Edition - It is a SaaS (Software as a Service) version of Alfresco.

Why Alfresco?

     Its rich feature set is completely accessible over a REST-based interface and can be extended and enhanced with simple server-side JavaScript (i.e. no compilation required), although Java, Groovy, etc. are also common choices.
     It allows for management of data of any format (not only document content and images).
     It provides rich collaboration tools such as wikis, forums, and issue logs, and functionality to edit and manage image files.
     It enables easy web design for people who are not technical users.
     It provides publishing channels such as Google Docs, YouTube, Flickr, SlideShare, Facebook and LinkedIn out of the box.
     Office documents can be edited within the CMS using Google Docs, as well as offline using its built-in checkout feature.
     Rich add-ons from the community, i.e. plug-ins/tools, can be integrated with Alfresco easily.
     Alfresco is compatible with the most commonly used operating systems such as Linux, Mac and Windows; it can be fully integrated with an organization's office suite, such as Microsoft Office or OpenOffice.org.
     Supports workflow with the help of Activiti and jBPM.
     Supports multiple databases.
     Provides search over the uploaded documents using the Lucene API.

Note: Alfresco uses the Apache PDFBox library (an open source Java library) to extract text from PDFs and index it.  (http://pdfbox.apache.org/)

Why MarkLogic?

       High-performing XML database.
    Offers a Google-style search engine.
    It's a document-centric NoSQL solution using XML as the content model, meaning you get a very scalable repository that can adapt to content changes effortlessly.
    It unifies structured, semi-structured and unstructured data into a single database from where organizations can store, retrieve, analyze and manipulate these very large data sets.
    With its immediate consistency and a searchable-everything philosophy, there's no need to compromise on your ACID promise.

Combined benefits of MarkLogic and Alfresco:

    Alfresco can be used as an editorial and content production system, so you can create, curate, edit, run workflows on, and semantically enrich your content, and finally publish the documents to MarkLogic.
    Once published to MarkLogic, you have the power to perform fast searches using the powerful standards-based XQuery language.
    Versions of documents can be controlled in MarkLogic. Alfresco also supports document versioning, but you cannot customize it.
    Using MarkLogic as your publishing target, you gain not only a Big Data content store, but also a rich and expressive query language where search and retrieval are combined.
    As MarkLogic is a document-centric content store, you don't need to worry about the type and structure of the content.

Supported Alfresco versions : 4.x and 5.x
Supported MarkLogic versions : 5.x and above

Steps to integrate Alfresco with MarkLogic:

1.     Download the Alfresco MarkLogic integration plugin from 
2.     Build the jar file from the plugin using Maven (for 4.x versions).
3.     Build the AMP using Ant (for the 5.0.a Community version).
4.     Add the jar file to /alfresco/WEB-INF/lib.
5.     Copy the config directories under "/marklogic-integration-alfresco/src/main/config" of the plugin to /tomcat/shared/classes/.
6.     Restart Alfresco.
7.     Download the Alfresco MarkLogic integration REST API from

       Then configure MarkLogic as follows:

       I. Create a database named "alfresco-pub".

      II. Create a forest named "alfresco-pub-forest".

     III. Attach the "alfresco-pub-forest" forest to the "alfresco-pub" database.

      IV. Create a role named "alfresco-publisher" and assign the following privileges to it under the 'Execute Privileges' section:

          A- admin-module-read
          B- admin-module-write
          C- any-collection
          D- xdmp:add-response-header
          E- xdmp:eval
          F- xdmp:document-get
          G- any-uri
          H- xdmp:get-session-field
          I- xdmp:http-delete, xdmp:http-get, xdmp:http-head, xdmp:http-post, xdmp:http-put, xdmp:http-options
          J- xdmp:invoke

       V. Save the role.

      VI. Assign the following permissions to the 'alfresco-publisher' role under the 'default permissions' section:

          · Select the role 'alfresco-publisher', select 'update' from the dropdown, and save the role.
          · Select the role 'alfresco-publisher', select 'insert' from the dropdown, and save the role.
          · Select the role 'alfresco-publisher', select 'read' from the dropdown, and save the role.
          · Select the role 'alfresco-publisher', select 'execute' from the dropdown, and save the role.

     VII. Create a user 'alfrescopub-admin', set a password, assign the 'alfresco-publisher' role to it, and save the user.

       Note: You can enable or disable MarkLogic authentication by setting ml.auth.enabled=true or ml.auth.enabled=false in marklogic-integration/alfresco-global.properties. To use authentication, also update the MarkLogic username and password in the same file.

    VIII. Create an HTTP server on the MarkLogic server and name it 'Alfresco-Publishing-HTTP'.

      IX. Set the port number to '9000'.

       X. In the root field, provide the path of the REST API that you downloaded.
          For example, if the download location is "d:\alfresco-marklogic-publish-unpublish-webservices", enter that path as the root.

      XI. Select 'file-system' for modules.
     XII. Select the 'alfresco-pub' database created in step 7.I.
    XIII. Select 'digest' authentication if MarkLogic authentication is enabled in marklogic-integration/alfresco-global.properties; otherwise select 'application-level' authentication.
     XIV. Select the default user 'alfrescopub-admin' created in step 7.VII.
      XV. Add /url-rewriter.xqy in the url rewriter field.
     XVI. Click 'OK' to save the changes to the HTTP server.

Your REST services are now ready for use:
Publish URI: http://127.0.0.1:9000/alfrescopub/publish?uri=someuri
Unpublish URI: http://127.0.0.1:9000/alfrescopub/unpublish?uri=someuri
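
If you call these services from code, the two URIs can be built with a small helper. This is only a sketch: the class AlfrescoPubUrls is hypothetical, and it assumes that the service accepts a percent-encoded value for the uri query parameter, which the steps above do not confirm.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds the publish/unpublish URIs shown above.
class AlfrescoPubUrls {
    private final String base;

    AlfrescoPubUrls(String host, int port) {
        this.base = "http://" + host + ":" + port + "/alfrescopub/";
    }

    String publish(String uri) {
        return base + "publish?uri=" + encode(uri);
    }

    String unpublish(String uri) {
        return base + "unpublish?uri=" + encode(uri);
    }

    // Assumption: the value is percent-encoded so that ':' and '/' survive as a query parameter.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

For example, new AlfrescoPubUrls("127.0.0.1", 9000).publish("someuri") returns http://127.0.0.1:9000/alfrescopub/publish?uri=someuri, matching the publish URI above.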


8.     Configure Alfresco: go to "Admin Console > Channel Publishing > Channel Manager".

9.     Select "MarkLogic" as the channel; Alfresco will authorize it.

10.    The channel is added to Alfresco.

11.    Click on the "MarkLogic" channel icon to configure the channel endpoint.
       Provide the MarkLogic server hostname, e.g. "127.0.0.1", and the MarkLogic server port, e.g. "9000".

12.    Click on "Save". The channel is now ready for use.



Test publishing:

1.  Go to "Project Library" > "Documents" > "Agency File" > "Images".
2.  Select an image.
3.  Click on the "Publish" link on the right-hand side.
4.  Select "MarkLogic" as the publishing channel and click "Publish".
5.  The image is published to MarkLogic. You can see the status at the bottom of the page.

    You can also verify the published content via the MarkLogic qconsole or via the following service:

     http://127.0.0.1:9000/alfrescopub/get?id=workspace://SpacesStore/e9528c29-dbbc-49c5-ae63-ae35b67bea33

    Here the value of id is the URI of the content inside Alfresco; it can be seen in the browser URL while the content is being viewed.

6.  You can unpublish by clicking the "Unpublish" link on the right side of the history section and then clicking "OK". The content will be queued for unpublishing.
7.  You can see the unpublish status in the "Publishing History" section.




References: 
https://docs.alfresco.com/4.2/
https://docs.marklogic.com

Leave your comments/suggestions below. Otherwise, until next time :)