Thursday, March 31, 2005


ANT (Another Neat Tool) is a make-like build utility for Java. It gets its interoperability from being written in Java, so it runs wherever Java runs.

The creators of ANT claim they wanted a tool to edit class paths in Tomcat, but it has evolved over time. You now have tags like project, property, tasks etc.

XML Compression

Today I had a presentation on XML compression and it turned out well. We prepared well and in the end were able to answer most of the questions the students had. The presentation covered the following things:

Motivation for XML Compression
Techniques for achieving XML compression
XMill Architecture

Structured nature of XML makes it understandable to humans.
Downside: XML is verbose
Each non-empty element tag must end with a matching closing tag: <tag>data</tag>
Ordering of tags is often repeated in a document (e.g. multiple records)

XML documents are text-based: well-known compression schemes such as Huffman and LZ can be easily applied
Compression can yield significant savings, due to the highly structured nature of XML
XML is being used more frequently in real-time applications (e.g. web service-based e-commerce applications); increasing interest in finding ways to reduce overall size of XML documents

Usually some degree of repetition in an XML document (multiple occurrences of tags, attribute or data values)
Compression schemes like Huffman and LZ can use this repetition to achieve some degree of compression

Many existing (and efficient) implementations of these algorithms are readily available (e.g. gzip)
Downside is that these techniques aren’t fully capable of exploiting the structure of XML to achieve greater compression

Huffman coding example, using the text ACDABA:
Since these are 6 characters, this text is 6 bytes or 48 bits long
A tree is built that replaces the symbols with shorter bit sequences. In this particular case, the algorithm would use the following substitution table: A=0, B=10, C=110, D=111
01101110100 (ACDABA = 11 bits)
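
To check the arithmetic above, here is a minimal Java sketch that builds a Huffman tree for ACDABA and encodes the text. The class name HuffmanSketch and its methods are made up for this note. The exact bit patterns depend on how ties are broken, so they may differ from the table above, but the code lengths (1, 2, 3, 3) and the 11-bit total come out the same.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical demo class, not taken from any library.
final class HuffmanSketch {

    // A tree node: leaves carry a symbol, inner nodes carry two children.
    private static final class Node implements Comparable<Node> {
        final int weight;
        final char symbol;
        final Node left, right;

        Node(int weight, char symbol, Node left, Node right) {
            this.weight = weight;
            this.symbol = symbol;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() { return left == null && right == null; }

        public int compareTo(Node other) { return Integer.compare(weight, other.weight); }
    }

    // Count symbol frequencies, then repeatedly merge the two lightest nodes.
    static Map<Character, String> buildCodes(String text) {
        Map<Character, Integer> freq = new HashMap<Character, Integer>();
        for (char ch : text.toCharArray()) {
            Integer old = freq.get(ch);
            freq.put(ch, old == null ? 1 : old + 1);
        }
        PriorityQueue<Node> queue = new PriorityQueue<Node>();
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            queue.add(new Node(e.getValue(), e.getKey(), null, null));
        }
        while (queue.size() > 1) {
            Node a = queue.poll();
            Node b = queue.poll();
            queue.add(new Node(a.weight + b.weight, '\0', a, b));
        }
        Map<Character, String> codes = new HashMap<Character, String>();
        assign(queue.poll(), "", codes);
        return codes;
    }

    // Walk the tree: going left appends 0, going right appends 1.
    private static void assign(Node node, String prefix, Map<Character, String> codes) {
        if (node == null) return;
        if (node.isLeaf()) {
            codes.put(node.symbol, prefix.isEmpty() ? "0" : prefix);
            return;
        }
        assign(node.left, prefix + "0", codes);
        assign(node.right, prefix + "1", codes);
    }

    static String encode(String text, Map<Character, String> codes) {
        StringBuilder bits = new StringBuilder();
        for (char ch : text.toCharArray()) bits.append(codes.get(ch));
        return bits.toString();
    }

    public static void main(String[] args) {
        Map<Character, String> codes = buildCodes("ACDABA");
        String bits = encode("ACDABA", codes);
        System.out.println(codes + " -> " + bits + " (" + bits.length() + " bits)");
    }
}
```

A appears three times and the other symbols once each, which is why A ends up with the single-bit code.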

Lempel-Ziv 77 algorithm
Dictionary is a portion of encoded sequence
The encoder examines the input sequence through a sliding window
The window consists of two parts:
a search buffer that contains a portion of the recently encoded sequence, and
a look-ahead buffer that contains the next portion of the sequence to be encoded.
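
The sliding window idea can be sketched in Java roughly as follows. This is a simplified, assumption-laden version: the class Lz77Sketch, the Token triple (offset, length, next character) and the buffer sizes are made up for this note, and real implementations such as zlib are far more elaborate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class illustrating an LZ77-style sliding window.
final class Lz77Sketch {

    // One output triple: how far back the match starts, how long it is,
    // and the literal character that follows the match.
    static final class Token {
        final int offset;
        final int length;
        final char next;

        Token(int offset, int length, char next) {
            this.offset = offset;
            this.length = length;
            this.next = next;
        }
    }

    // searchSize = size of the search buffer, lookAheadSize = size of the
    // look-ahead buffer (must be at least 1).
    static List<Token> encode(String input, int searchSize, int lookAheadSize) {
        List<Token> out = new ArrayList<Token>();
        int pos = 0;
        while (pos < input.length()) {
            int bestLength = 0;
            int bestOffset = 0;
            int searchStart = Math.max(0, pos - searchSize);
            // Keep one character in reserve so every token has a 'next'.
            int maxLength = Math.min(lookAheadSize, input.length() - pos) - 1;
            for (int candidate = searchStart; candidate < pos; candidate++) {
                int length = 0;
                while (length < maxLength
                        && input.charAt(candidate + length) == input.charAt(pos + length)) {
                    length++;
                }
                if (length > bestLength) {
                    bestLength = length;
                    bestOffset = pos - candidate;
                }
            }
            out.add(new Token(bestOffset, bestLength, input.charAt(pos + bestLength)));
            pos += bestLength + 1;
        }
        return out;
    }

    // Rebuild the text by copying from the already decoded output.
    static String decode(List<Token> tokens) {
        StringBuilder text = new StringBuilder();
        for (Token t : tokens) {
            int from = text.length() - t.offset;
            for (int k = 0; k < t.length; k++) text.append(text.charAt(from + k));
            text.append(t.next);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String input = "abracadabra abracadabra";
        List<Token> tokens = encode(input, 16, 8);
        System.out.println(tokens.size() + " tokens for " + input.length() + " characters");
        System.out.println(decode(tokens));
    }
}
```

The decoder needs no separate dictionary: the already decoded text is the dictionary, which is what "dictionary is a portion of the encoded sequence" means.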

XMill relies heavily on zlib, the compression library used in gzip
It also defines a few data-type-specific compressors; user-defined compressors can be added using SCAPI (Semantic Compressor API)
During compression, each XML tag is examined to see which compression technique(s) should be applied

View XML as a tree
Separate the tree structure and what is stored in leaves
Save the tree structure so that it can be restored
The compressed file may or may not remember the tree structure

XMill applies 3 principles during compression:
Separate structure (element tags and attribute names) from data
Group related data items in a single container; compress each container separately
Apply appropriate semantic compressors to each container

Start tags and attribute names are dictionary-encoded (as T1, T2, etc.)
End tags replaced with ‘/’ token
Data values replaced with their container number
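
A rough Java sketch of this separation, using the JDK's SAX parser. It is only an illustration of the idea, not XMill's actual code or file format: the class StructureDataSplit is made up, containers are chosen simply by element or attribute name, and no compressor is applied to them.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: splits an XML document into a structure stream
// (T1, T2, /, C1, ...) and per-name data containers.
final class StructureDataSplit extends DefaultHandler {
    final List<String> structure = new ArrayList<String>();
    final List<List<String>> containers = new ArrayList<List<String>>();
    private final Map<String, Integer> tagIds = new LinkedHashMap<String, Integer>();
    private final Map<String, Integer> containerIds = new LinkedHashMap<String, Integer>();
    private final Deque<String> openElements = new ArrayDeque<String>();
    private final StringBuilder text = new StringBuilder();

    static StructureDataSplit split(String xml) {
        StructureDataSplit handler = new StructureDataSplit();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return handler;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        flushText();
        openElements.push(qName);
        structure.add("T" + idFor(tagIds, qName));        // dictionary-encoded start tag
        for (int i = 0; i < attrs.getLength(); i++) {
            String key = "@" + attrs.getQName(i);
            structure.add("T" + idFor(tagIds, key));      // dictionary-encoded attribute name
            store(key, attrs.getValue(i));
            structure.add("/");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        structure.add("/");                               // every end tag becomes '/'
        openElements.pop();
    }

    private void flushText() {
        String value = text.toString().trim();
        text.setLength(0);
        if (!value.isEmpty()) store(openElements.peek(), value);
    }

    // Put the value in its container and leave only the container number behind.
    private void store(String containerKey, String value) {
        int id = idFor(containerIds, containerKey);
        if (id > containers.size()) containers.add(new ArrayList<String>());
        containers.get(id - 1).add(value);
        structure.add("C" + id);
    }

    private static int idFor(Map<String, Integer> ids, String key) {
        Integer id = ids.get(key);
        if (id == null) {
            id = ids.size() + 1;
            ids.put(key, id);
        }
        return id;
    }
}
```

For the input <book><title lang="en">Views</title><title lang="en">Data</title></book> the structure stream is T1 T2 T3 C1 / C2 / T2 T3 C1 / C2 / / and the two containers hold [en, en] and [Views, Data]. Similar values sit next to each other, which is what makes each container compress well on its own.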

SAX Parser
sends tokens to the path processor.

Path Processor
determines how to map data values to containers.

Semantic Compressors
compresses the input and copies it to the container – in the memory window.

Wednesday, March 30, 2005

Design Patterns

A good breakdown of the order in which to read the design patterns

Design Patterns Navigation

Factory Method (creational), Session 1
Begin with Factory Method. This pattern is used by a number of patterns in the book and throughout the patterns literature.
Strategy (behavioral), Session 2
Strategy is used frequently throughout the book, and an early knowledge of it helps in understanding other patterns.
Decorator (structural), Session 3
For an early dose of elegance, nothing is better than the Decorator. The discussion of "skin" vs. "guts" is a great way to differentiate Decorator from the previous pattern, Strategy.
Composite (structural), Session 4
The Composite pattern appears everywhere and is often used with Iterator, Chain of Responsibility, Interpreter, and Visitor patterns.
Iterator (behavioral), Session 5
Reinforce the reader's understanding of Composite by studying Iterator.
Template Method (behavioral), Session 6
The author's footnote to Iterator explains that a method called "Traverse" in the Iterator example code is an example of a Template Method. This pattern also reinforces Strategy and Factory Method.
Abstract Factory (creational), Session 7
The reader now returns to the second-easiest creational pattern, the Abstract Factory. This pattern also helps reinforce Factory Method.
Builder (creational), Session 8
The reader now may compare another creational pattern, the Builder, with the Abstract Factory.
Singleton (creational), Session 9
Singleton is often used to model Abstract Factories, as the "Related Patterns" section of Singleton describes.
Proxy (structural), Session 10
The reader now has a chance to learn how Proxy is used to control access to an object. This pattern leads directly into the next pattern, Adapter.
Adapter (structural), Session 11
The Adapter pattern may be compared with what the reader understands about Decorator, Proxy, and later, Bridge.
Bridge (structural), Session 12
Finally, the reader learns how the Bridge pattern differs from both the Adapter and Proxy patterns.
Mediator (behavioral), Session 13
Now the reader learns the Mediator pattern, in preparation for understanding Observer and the Model-View-Controller design.
Observer (behavioral), Session 14
Discover how the Mediator is used by the Observer to implement the classic Model-View-Controller design.
Chain of Responsibility (behavioral), Session 15
After exploring how messages are passed using the Observer and Mediator patterns, the reader now may contrast how messages are handled by the Chain of Responsibility pattern.
Memento (behavioral), Session 16
The reader now moves on to Memento. This pattern leads directly into a discussion of undo and redo, which is related to the next pattern, Command.
Command (behavioral), Session 17
The Command pattern is used in a number of ways, one of which relates to the previous pattern, Memento.
Prototype (creational), Session 18
Perhaps the most complex creational pattern, Prototype is often used with the Command pattern.
State (behavioral), Session 19
The reader may now study State to understand another way an object's behavior changes.
Visitor (behavioral), Session 20
Visitor is often combined with the Composite and/or Iterator patterns.
Flyweight (structural), Session 21
The Flyweight pattern is one of the more complex patterns. An example use of this pattern is described in the next pattern, Interpreter.
Interpreter (behavioral), Session 22
The Interpreter pattern is complex. It makes reference to and helps reinforce one's understanding of Flyweight and Visitor.
Facade (structural), Session 23
The final pattern to read is Facade. Facade is relatively straightforward and follows nicely after Interpreter since the example code is similar in theme to example code in the Interpreter.

Thursday, March 24, 2005


Had a nice presentation in class about XMLHttpRequest. This thing sounds cool, and we had a good debate about it in class.

Some of the good points about it:
i) When I submit a form, the data is sent to the server and the result comes back into the same page, rather than the form being submitted and a whole new page being loaded.
ii) It is normally used from JavaScript.
iii) Gmail, Google Suggest, Ta-da List etc. use this technology.

There was a good debate on why XMLHttpRequest should be used, and the conclusion was that we need structure. Structure could just as well be obtained by wrapping things in { }: if I need structure in C I use a struct, and if I need structure on the web I use XML. But XML is not the only option. Byte streams are another, though they are not standardized, and parentheses could be used as well (as in Apple's property scripts, where methods can be put inside, something that is not possible in XML).

So the reason to use it is mostly convenience.

RSS is another important thing that came out of this talk. It is used for news feeds: the data has dates, times and links and is standardized for that purpose, so I would not really need it in a chat environment.

Learnt a bit of XPath (how it acts as a devil in comparison remains to be seen).



Considering I have not been posting a lot, does that mean I am busy? Perhaps not, but my lethargy wins over me again.

Good link for shuffle

public class Shuffle {
    public static void main(String[] args) {
        int N = Integer.parseInt(args[0]);
        String[] a = new String[N];

        // read in data (StdIn is a helper class from the linked page, not part of the JDK)
        for (int i = 0; i < N; i++) {
            a[i] = StdIn.readString();
        }

        // shuffle: swap each element with a random one at or before it
        for (int i = 0; i < N; i++) {
            int r = (int) (Math.random() * (i+1)); // int between 0 and i
            String swap = a[r];
            a[r] = a[i];
            a[i] = swap;
        }

        // print permutation
        for (int i = 0; i < N; i++) {
            System.out.println(a[i]);
        }
    }
}


Thursday, March 17, 2005

XSL and Adaline

This blog is really cool because it reflects what I learnt today.

1. Squeak installation had no success (iconv_open failed again).
2. Learnt XSL, xmlns (this is just a trick), namespaces, xsl:stylesheet, xsl:attribute.
Entities in DTD. (Need to study XSL and will update this site.)
3. Really need to catch up with my neural networks class because I think I am lagging behind with respect to perceptron and Adaline.

So, the presentations that I need to study for:

1. XMill
2. SVM in neural networks
3. C# Neural networks framework.
4. Query management in sensor networks


Wednesday, March 16, 2005

XML DTD ( Document Type Definition)

So what the heck is a DTD?
i) It is like a grammar for the document, used to check its validity (well-formedness is checked by the parser even without a DTD).
A DTD can be internal or external to the XML file. Its syntax is whitespace-sensitive: in an element declaration a space is needed between the element name and what the element may contain. An example of an inline DTD could look like
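
For illustration, here is a small made-up inline DTD, checked from Java with the JDK's validating parser. This is only a sketch: the class InlineDtdDemo and the element names note, to and body are hypothetical and not from the class material.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class: validates a document against its inline DTD.
final class InlineDtdDemo {

    // The DOCTYPE block between [ and ] is the inline (internal) DTD.
    static final String SAMPLE =
            "<?xml version=\"1.0\"?>\n"
          + "<!DOCTYPE note [\n"
          + "  <!ELEMENT note (to, body)>\n"
          + "  <!ELEMENT to (#PCDATA)>\n"
          + "  <!ELEMENT body (#PCDATA)>\n"
          + "]>\n"
          + "<note><to>reader</to><body>hello</body></note>";

    // Returns true only if the document is well-formed AND valid.
    static boolean isValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            final boolean[] valid = { true };
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { valid[0] = false; } // validity errors
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return valid[0];
        } catch (Exception e) {
            return false; // not well-formed, or the parser could not be set up
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(SAMPLE));
        System.out.println(isValid(SAMPLE.replace("<to>reader</to>", "")));
    }
}
```

The second call drops the required to element, so the document is still well-formed but no longer valid against the DTD.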

Tuesday, March 15, 2005

Oracle Accounts

How do you check whether an account exists in the database for a particular user?

In Oracle, as we are aware, there are system tables that give us this information. Oracle stores a lot of names in uppercase internally, so the user name in the query should be given in capitals. dba_users is the view to look at. The check can be done with this query:

1. select username from dba_users where username = '';

If this shows an entry that means the user exists in the database.

Some more interesting views for user management are :

i) dba_ts_quotas ( for viewing tablespace quotas )
ii) dba_blockers
iii) dba_waiters

( Will update this more as I find out about them)

OK, so today I got some time to update this site. Some more views that might interest a DBA in Oracle 9i are:

i) dba_role_privs
iii) dba_data_files
iv) dba_tablespaces

Monday, March 14, 2005

XML, Sensor Networks

So I finished my XML assignment and got XMLSpy to verify the well-formedness of the document. It was a decent assignment. The second part was actually good, with some recursion and for loops: the output of ls -laR had to be shown in an XML format. I was able to do it, with most of the metadata shown as attributes and whatever I wanted to show on the page as elements. I still need to write the ReadMe for it and install rxp to test it again, but I am rather too bored to do anything now.

When I get bored I usually change the topic so that something might interest me, so here I choose sensor networks. Last time we discussed routing algorithms in sensor networks. That was an interesting discussion; we covered all kinds of things:
i) Should there be acknowledgments in sensor networks?
ii) Should I always broadcast?
iii) Should it have all the concepts of headers and so on, as with TCP/IP?
(Hmm, I need to check my notes for everything we discussed, but again my lethargy wins over me and I am in no mood to get my notes to write that up.)

Sending a message to the base station is called UPSTREAM, and sending a message from the base station down to the sensor nodes is called DOWNSTREAM.

So we finally came to the conclusion that sensor networks use broadcasting together with something similar to the OS concept of a mutex: the base station initiates the transmission by broadcasting, and the nodes in the vicinity of the base station get the message. Every node in the sensor network has a route table which it needs to fill so that it can send packets accordingly.
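
One way the route tables could get filled from such a broadcast, sketched in Java. This is my own simplification and not what was presented in class: the class FloodRouting is made up, the network is given as a neighbor map, each node simply remembers the neighbor it first heard the broadcast from as its upstream next hop, and the mutex-like channel access is not modeled at all.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical demo class: the base station floods a broadcast and every
// node records who it first heard it from.
final class FloodRouting {

    // neighbors: for each node id, the ids of the nodes within radio range.
    // Returns, for each reachable node, the next hop on the way UPSTREAM.
    static Map<Integer, Integer> buildUpstreamRoutes(Map<Integer, List<Integer>> neighbors,
                                                     int baseStation) {
        Map<Integer, Integer> nextHop = new HashMap<Integer, Integer>();
        Queue<Integer> rebroadcasting = new ArrayDeque<Integer>();
        nextHop.put(baseStation, baseStation); // the base station is its own destination
        rebroadcasting.add(baseStation);
        while (!rebroadcasting.isEmpty()) {
            int sender = rebroadcasting.poll();
            List<Integer> inRange = neighbors.get(sender);
            if (inRange == null) inRange = Collections.<Integer>emptyList();
            for (int receiver : inRange) {
                if (!nextHop.containsKey(receiver)) {
                    nextHop.put(receiver, sender);  // first time heard: fill the route table
                    rebroadcasting.add(receiver);   // and pass the broadcast on
                }
            }
        }
        return nextHop;
    }
}
```

With base station 0 in range of nodes 1 and 3, and node 2 only in range of node 1, node 2 ends up sending upstream via node 1.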

Friday, March 11, 2005

Squeak and XML

Today I tried to get my hands dirty with Squeak. I had never heard of it before, but a professor asked me to install it. Squeak is a Smalltalk implementation. I have to look into its installation and see what it is all about.

Then I tried my hand at XML. The million-dollar question that comes up is "what should be an element and what should be an attribute?" From reading some articles, it seems that whatever you want to show on the page should be an element and the rest should be attributes. Some people claim that elements should hold the information that preserves the structure, while attributes are for metadata.

I know of some classes in .NET, like XmlReader and XmlWriter, that could be used for the assignment I am doing, but it has to be done in Java without any such helper classes.
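
A tiny Java sketch of that rule of thumb, built with plain strings and no XML classes. The class FileEntryXml and the element and attribute names (file, permissions, size) are invented for this note and are not the assignment's required format: the file name is what would be shown, so it is the element content, while the metadata goes into attributes.

```java
// Hypothetical demo class: one directory entry as XML, without XML libraries.
final class FileEntryXml {

    // Escape the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Metadata (permissions, size) as attributes, the displayed name as content.
    static String toXml(String name, String permissions, long size) {
        return "<file permissions=\"" + escape(permissions) + "\""
             + " size=\"" + size + "\">"
             + escape(name)
             + "</file>";
    }

    public static void main(String[] args) {
        System.out.println(toXml("notes & todo.txt", "-rw-r--r--", 120));
    }
}
```

Running it prints <file permissions="-rw-r--r--" size="120">notes &amp; todo.txt</file>.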

Apart from all this, India is on the verge of a victory against Pakistan, which keeps me happy today :)

Thursday, March 10, 2005

Solaris NIS and NFS

I think I understand a bit of NIS: it is a kind of network-wide database used by Solaris to maintain accounts centrally, and NFS is the network file system used to share files between machines.

But what are the intricacies with respect to it? What happens when the following things are run:
i) make passwd, passwd -r nis and make passwd.adjunct?

This is something that interests me and it may be worthwhile to spend some time on it. So the plan for the day is to at least find out what these do. Good place to start

Request Tracker Installation
1. ( This has to be followed)

Wednesday, March 09, 2005

Request Tracker Installation

I have been trying to get Request Tracker installed on Solaris. I got it installed on one machine, but in moving it to another machine I am facing some serious problems.

I need to get this out of my head soon, because it is high time I finished it.
It is not just Request Tracker but also RTFM (which is not Read The Fascinating (;)) Manual but Request Tracker FAQ Manager).

The final steps for installing RT, which are very important, are:
3. The error could be resolved by modifying the file in /lib/ and adding the line socket=> 'inet' at the last occurrence of stderr.

4. Installing RTFM requires modifying the path in the RTFM Makefile to point to the current instance of RT, and the path to mysql needs to be on the shell's PATH.

Some packages need to be installed from CPAN for RTFM.


Tuesday, March 08, 2005

Shell Programming


i) Precede a script name with dot-slash when executing interactively so UNIX knows that the script is in the current directory.

ii) Redirect stderr, either to the same destination as stdout or to a unique file.

./ >> ${LOG_FILE} 2>&1

2. Permutations algorithm?
Is it difficult? Let's try to work through an example, "abc":
abc bac cab
acb bca cba

i) As seen from the pattern, the first character stays fixed while the remaining characters keep changing.
Thinking about the algorithm (the natural thought is recursion, but for now I want
to avoid it and do it by iteration)
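
One well-known way to do it without recursion is to step from each permutation to the next one in lexicographic order. The Java sketch below uses that technique; it is not something I derived from the pattern above, and the class name IterativePermutations is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: all permutations of a string, no recursion.
final class IterativePermutations {

    static List<String> permutations(String s) {
        char[] c = s.toCharArray();
        Arrays.sort(c); // start from the smallest arrangement
        List<String> result = new ArrayList<String>();
        while (true) {
            result.add(new String(c));

            // 1. Find the rightmost position i whose character is smaller
            //    than the one after it.
            int i = c.length - 2;
            while (i >= 0 && c[i] >= c[i + 1]) i--;
            if (i < 0) break; // fully descending: that was the last permutation

            // 2. Find the rightmost character larger than c[i] and swap them.
            int j = c.length - 1;
            while (c[j] <= c[i]) j--;
            swap(c, i, j);

            // 3. Reverse the tail so it is the smallest possible again.
            for (int left = i + 1, right = c.length - 1; left < right; left++, right--) {
                swap(c, left, right);
            }
        }
        return result;
    }

    private static void swap(char[] c, int a, int b) {
        char tmp = c[a];
        c[a] = c[b];
        c[b] = tmp;
    }

    public static void main(String[] args) {
        System.out.println(permutations("abc"));
    }
}
```

For "abc" this prints [abc, acb, bac, bca, cab, cba], the same six strings as in the table above. It matches the observation in i): the leading character stays put until every arrangement of the rest has been produced.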