Performance improvement techniques in Serialization
This topic illustrates the performance improvement techniques in
Serialization with the following sections:
Overview of Serialization
Serialization is the process of writing the complete state of a Java object to an
output stream. That stream can be a file, a byte array, or a stream associated with
a TCP/IP socket.
Deserialization is the process of reading that serialized Java object back
from an input stream.
A Java object is serializable and deserializable if its class follows these
rules:
A) The class must implement the java.io.Serializable interface or the
java.io.Externalizable interface, or inherit that implementation from one of
its superclasses.
B) All non-transient instance variables of the class must implement the Serializable
or Externalizable interface, or inherit the implementation from one of their superclasses.
All primitive data types and some of the standard Java API classes are
serializable, so you need not explicitly implement the Serializable or Externalizable
interface for them. The serialization process ignores class (static)
variables.
The Externalizable interface lets you write your own custom implementation of
serialization. This section focuses only on the Serializable interface.
Serializable is a marker interface and does not have any methods. Major
Java technologies such as RMI and EJB rely on the serialization process to pass
objects across the network, and they implicitly do all the
serialization work for you: you simply implement the java.io.Serializable
interface. If you want to do your own serialization, that is, read from or
write to streams yourself, you can use ObjectInputStream and ObjectOutputStream.
These methods read objects from and write objects to a stream:
ObjectInputStream.readObject(); // to read object
ObjectOutputStream.writeObject(Object obj); // to write object
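As a minimal sketch of how these two methods fit together, the following example writes an object into a byte array and reads it back. The Point class and the helper names toBytes and fromBytes are hypothetical and exist only for this illustration. The sketch also assumes Java 7 or later for try-with-resources, which is newer than the JDK used elsewhere in this section.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for this illustration.
class Point implements Serializable {
    private static final long serialVersionUID = 1L;
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class RoundTripDemo {

    // Serializes an object into a byte array with ObjectOutputStream.writeObject.
    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an object from those bytes with ObjectInputStream.readObject.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        Point copy = (Point) fromBytes(toBytes(new Point(3, 4)));
        System.out.println(copy.x + "," + copy.y); // prints 3,4
    }
}
```

The same two calls work unchanged when the byte array streams are replaced by file or socket streams.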
To improve performance, we first need to understand the default mechanism of the
serialization process.
The default mechanism:
When you write or read an object to or from a file, network, or other stream using
serialization, the complete object state is written or read. That means
the object, its instance variables, and its superclass instance
variables, except transient variables and class (static) variables. Look at this
object hierarchy.
[Figure: object hierarchy of Address, HomeAddress, Employee, and CorporateEmployee]
In this class hierarchy, when I write a CorporateEmployee object to a file
and read it back from that file, Address is processed first, HomeAddress
second, Employee third, and CorporateEmployee last. So the
whole object hierarchy is written to the file, except transient and class (static)
variables. The topmost superclass is processed first, and so on down to the end of the
hierarchy. You need to keep an eye on this mechanism and act on it; otherwise you
will end up writing everything. The next section explains how to keep unnecessary
data out of streams and improve performance.
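The order described above can be observed with a small sketch. The four class names come from the text, but the inheritance chain is only inferred from the order given there, and the fields, the ORDER list, and the writeOrder helper are assumptions made for this illustration. Each class declares the private writeObject hook that the serialization runtime invokes for its own level, records its name, and then falls back to the default mechanism.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Assumed chain: Address <- HomeAddress <- Employee <- CorporateEmployee.
class Address implements Serializable {
    String city = "austin"; // hypothetical field
    private void writeObject(ObjectOutputStream out) throws IOException {
        HierarchyDemo.ORDER.add("Address");
        out.defaultWriteObject(); // default mechanism for this class level
    }
}

class HomeAddress extends Address {
    String street = "1 Main St"; // hypothetical field
    private void writeObject(ObjectOutputStream out) throws IOException {
        HierarchyDemo.ORDER.add("HomeAddress");
        out.defaultWriteObject();
    }
}

class Employee extends HomeAddress {
    String name = "SID"; // hypothetical field
    private void writeObject(ObjectOutputStream out) throws IOException {
        HierarchyDemo.ORDER.add("Employee");
        out.defaultWriteObject();
    }
}

class CorporateEmployee extends Employee {
    String company = "Example Corp"; // hypothetical field
    private void writeObject(ObjectOutputStream out) throws IOException {
        HierarchyDemo.ORDER.add("CorporateEmployee");
        out.defaultWriteObject();
    }
}

class HierarchyDemo {
    // Records the class levels in the order the runtime writes them.
    static final List<String> ORDER = new ArrayList<>();

    // Serializes one CorporateEmployee and returns the recorded order.
    static List<String> writeOrder() {
        ORDER.clear();
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new CorporateEmployee());
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        return new ArrayList<>(ORDER);
    }

    public static void main(String[] args) {
        // prints [Address, HomeAddress, Employee, CorporateEmployee]
        System.out.println(writeOrder());
    }
}
```

Every level of the hierarchy contributes its fields to the stream, which is why a deep hierarchy with many instance variables costs more to serialize.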
The examples in this section were tested on Windows Millennium with 320 MB RAM and JDK 1.3.
Note: This section assumes that the reader has some basic knowledge of Java.
Optimization with 'transient'
Variables declared with the 'transient' modifier are not read
from or written to streams. This lets you avoid writing unnecessary data
into streams and, in turn, improves performance. The following code snippet
benchmarks the serialization process with and without a transient variable.
package com.performance.serialization;

import java.util.Vector;
import java.io.*;

public class SerializationTest {
    static long start, end;
    OutputStream out = null;
    InputStream in = null;
    OutputStream outBuffer = null;
    InputStream inBuffer = null;
    ObjectOutputStream objectOut = null;
    ObjectInputStream objectIn = null;

    // Builds a Person that carries a Vector of 7000 strings.
    public Person getObject() {
        Person p = new Person("SID", "austin");
        Vector v = new Vector();
        for (int i = 0; i < 7000; i++) {
            v.addElement("StringObject" + i);
        }
        p.setData(v);
        return p;
    }

    // Times one write of the Person to a file followed by one read.
    public static void main(String[] args) {
        SerializationTest st = new SerializationTest();
        start = System.currentTimeMillis();
        st.writeObject();
        st.readObject();
        end = System.currentTimeMillis();
        System.out.println("Time taken for writing and reading : " + (end - start) + " milli seconds");
    }

    // Reads the serialized Person back from the file.
    public void readObject() {
        try {
            in = new FileInputStream("c:/temp/test.txt");
            inBuffer = new BufferedInputStream(in);
            objectIn = new ObjectInputStream(inBuffer);
            objectIn.readObject();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (objectIn != null)
                try { objectIn.close(); } catch (IOException e) { e.printStackTrace(); }
        }
    }

    // Writes the Person to the file through a buffered stream.
    public void writeObject() {
        try {
            out = new FileOutputStream("c:/temp/test.txt");
            outBuffer = new BufferedOutputStream(out);
            objectOut = new ObjectOutputStream(outBuffer);
            objectOut.writeObject(getObject());
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (objectOut != null)
                try { objectOut.close(); } catch (IOException e) { e.printStackTrace(); }
        }
    }
}

class Person implements java.io.Serializable {
    private String name;
    private Vector data;
    private String address;

    public Person(String name, String address) {
        this.name = name;
        this.address = address;
    }

    public String getAddress() {
        return address;
    }

    public Vector getData() {
        return data;
    }

    public String getName() {
        return name;
    }

    public void setData(Vector data) {
        this.data = data;
    }
}
It writes the Person object to a file and reads it back from that file.
The output is
Time taken for writing and reading : 390 milli seconds
If I use the 'transient' modifier for the Vector in the Person object, then the
output is
Time taken for writing and reading : 110 milli seconds
That is more than three times faster.
Use the 'transient' keyword for unnecessary variables to increase
performance.
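Timings like these vary from machine to machine, so another way to see the effect is to compare how many bytes are written. The following is a minimal sketch of that idea, not the benchmark used above. The classes FullPerson and SlimPerson and the helpers sampleData and serializedSize are hypothetical names introduced for this illustration, and the sketch assumes Java 7 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Vector;

// Hypothetical variant whose Vector is written to the stream.
class FullPerson implements Serializable {
    String name = "SID";
    Vector<String> data;

    FullPerson(Vector<String> data) {
        this.data = data;
    }
}

// Hypothetical variant whose Vector is marked transient and therefore skipped.
class SlimPerson implements Serializable {
    String name = "SID";
    transient Vector<String> data;

    SlimPerson(Vector<String> data) {
        this.data = data;
    }
}

class TransientSizeDemo {

    // Builds the same 7000-element Vector as the benchmark above.
    static Vector<String> sampleData() {
        Vector<String> v = new Vector<>();
        for (int i = 0; i < 7000; i++) {
            v.addElement("StringObject" + i);
        }
        return v;
    }

    // Returns the number of bytes produced by serializing obj.
    static int serializedSize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Vector<String> data = sampleData();
        System.out.println("non-transient: " + serializedSize(new FullPerson(data)) + " bytes");
        System.out.println("transient:     " + serializedSize(new SlimPerson(data)) + " bytes");
    }
}
```

The transient variant writes only the class description and the name field, so its stream is far smaller than the one that carries all 7000 strings.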
Key Points
- Use the 'transient' keyword for unnecessary variables that need not be read from or written to streams.
- When you use RMI, EJB, or any other technology that relies on built-in serialization to pass objects across the network, use the 'transient' keyword for
unnecessary variables.
- Class (static) variables are ignored by the serialization
process, like 'transient' variables.
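The key points above can be checked with a short sketch. The Session class, its fields, and the roundTripWithStaticChange helper are hypothetical and serve only this illustration, which assumes Java 7 or later. The static field is changed between writing and reading to show that its old value was never stored in the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for this illustration.
class Session implements Serializable {
    static int instances = 1;             // class variable: not written
    String user = "SID";                  // instance variable: written
    transient String password = "secret"; // transient: not written
}

class SkippedFieldsDemo {

    // Writes a Session, changes the static field, then reads the copy back.
    static Session roundTripWithStaticChange() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Session());
            }
            Session.instances = 99; // changed after the object was written
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Session copy = roundTripWithStaticChange();
        System.out.println(copy.user);         // prints SID
        System.out.println(copy.password);     // prints null
        System.out.println(Session.instances); // prints 99
    }
}
```

Because a transient field comes back as null (or zero for primitives), code that reads it after deserialization has to be prepared to rebuild the value.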