Monday, May 23, 2005

Concept Hierarchy

1. From Han and Kamber

Concept hierarchy is also important

    Discovered knowledge might be more understandable when represented at a high level of abstraction

    Interactive drill up/down, pivoting, slicing and dicing provide different perspectives on the data

Different kinds of knowledge require different representations: association, classification, clustering, etc.
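
To make the drill-up idea concrete, here is a minimal Java sketch of rolling data up one level of a concept hierarchy. The city -> country hierarchy, the sales figures and the class name RollUpSketch are all made up for illustration; they are not from Han and Kamber.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RollUpSketch {
    // Hypothetical one-level concept hierarchy: city -> country.
    static final Map<String, String> CITY_TO_COUNTRY = new LinkedHashMap<>();
    static {
        CITY_TO_COUNTRY.put("Vancouver", "Canada");
        CITY_TO_COUNTRY.put("Toronto", "Canada");
        CITY_TO_COUNTRY.put("Chicago", "USA");
    }

    // Rolls city-level totals up to the country level (a "drill up").
    static Map<String, Integer> rollUp(Map<String, Integer> salesByCity) {
        Map<String, Integer> salesByCountry = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : salesByCity.entrySet()) {
            String country = CITY_TO_COUNTRY.get(entry.getKey());
            if (country == null) {
                throw new IllegalArgumentException("City not in hierarchy: " + entry.getKey());
            }
            // Add this city's total to the running total of its country.
            salesByCountry.merge(country, entry.getValue(), Integer::sum);
        }
        return salesByCountry;
    }

    public static void main(String[] args) {
        Map<String, Integer> salesByCity = new LinkedHashMap<>();
        salesByCity.put("Vancouver", 10);
        salesByCity.put("Toronto", 20);
        salesByCity.put("Chicago", 5);
        System.out.println(rollUp(salesByCity));
    }
}
```

The same data reads differently at each level: three city rows at the low level, two country rows after the roll-up.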


Monday, May 16, 2005

Writing a Good Technical Paper

From Stanford's site


The Actual Paper: Writing

  1. Start from the outline.
  2. Make the outline reflect the level of subsections: for each subsection, write no more than two lines describing the purpose/goal of that subsection. This text will NOT be part of the paper - it is only there to remind you what you are trying to accomplish. It is ESSENTIAL that you be able to capture the purpose of a subsection in one or two lines. If you cannot do this, then you probably don't understand what the subsection is really about, and when you try to write the text, it will be jumbled.
  3. Then, for each subsection, map out specific paragraphs: for each paragraph, write one sentence that explains the topic or main goal of just that paragraph. Again, this sentence probably will NOT make it into the actual text. It's important to keep it to one sentence. (As every style manual will tell you, including Strunk & White, virtually every well-formed paragraph does indeed have one sentence that explains the point of the paragraph, with the other sentences supporting or expanding on the point of the topic sentence.) If you cannot fit the point of the paragraph into 1 sentence, the paragraph is probably making >1 point, so should be split into multiple paragraphs.
  4. Read through everything you have written and see if it has a logical flow, i.e. if you believe it represents your work adequately.
  5. Give what you have written to a technical colleague completely unfamiliar with your work (but able to understand the computer science part), have them read it, then have them tell you (without looking at it) what they think the main point and contributions are.
  6. If all goes well, now replace the topic sentences with complete paragraphs.

    This way of writing will not yield a Shakespearean work of literature, but it is consistent and will result in readable, logically organized prose by construction.

The Actual Paper: Revising/Editing

  • Your section organization will change. Sometimes it will be shuffled dramatically. This is fine; it means you're understanding what presentation order works best. If you don't go through at least three or four major revisions (where you move around or chop entire sections), it's probably lousy.
  • After doing some edits on each draft, give it a full top-to-bottom reading to evaluate its coherence and flow of ideas. Then, take a couple of hours and do something else; once you get close enough to your paper, you start missing the forest for the trees.
  • Even early drafts are valuable for getting your colleagues' comments. Get comments from people who you think may be skeptical of your approach. Get comments from people who will really rip your writing style apart. Remember, at least they are your friends; the conference referees probably are not.
  • Cite, cite, cite! Ask your colleagues for suggestions and pointers. You never want to be asked: "What about the work done by xxx, which obviously has something in common with your own?" (or worse: "...which refutes your own?") Give due credit to those whose efforts you build on, as well as pointing out how your approach is different from (and better than) previous ones.

About Writing

It's often said, correctly I think, that most technical people don't write well. This doesn't mean that they lack knowledge of grammar or spelling (though this is sometimes the case), but that they don't know how to organize their writing at the level of paragraphs.

  • Don't artificially formalize your writing style. Technical writing must be clear and concise. Overblown writing rarely fools anyone and it makes the paper boring to read.
    Bad: "Problem X is clearly a critical area that impacts our research agenda and hypothesis. Our ideas about problem X are embryonic and still evolving, and doubtless our ongoing work in this area will quickly yield fruitful results."
    Better: "We recognize that problem X is central to our agenda, but we have only begun to investigate it."
  • If you haven't read Strunk and White's The Elements of Style, read it now. If you have, read it again. If you can't organize a paragraph, you won't have much luck organizing a chapter.
  • Omit needless words. Don't be surprised if this turns out to be 30-40% of the words you originally wrote. Your first effort rarely captures the most vigorous or concise way to say something. Spend time tersifying.
  • Run your paper by someone who is anal retentive about grammar to catch common errors: misuse of which and that, non-words and non-phrases such as for all intensive purposes or irregardless, lack of parallel sentence structure...

Final Checks

Remember that this will be read by people who (a) have never heard of you and the review is anonymous anyway, (b) have never heard of your project, (c) are reading about 15-20 papers apiece, all in different subject areas. They will spend the first 5 minutes deciding if your paper is actually good enough to be worth a fully detailed read; they will then spend an hour or so reading it in detail, trying to figure out (a) what your contribution is, (b) if the contribution is substantial enough to be worth publishing, (c) if the contribution is "feasible" (i.e. it is implementable and therefore would be useful to someone).

  • Does the paper make clear precisely what your new contributions are, and how they are different/better than existing approaches to this or similar problems?
  • Does the outline of the paper (sections, subsections, etc.) cohere regardless of the granularity at which you view it? (The Outline mode of MS Word is a valuable feature for this check. I also wrote a simple Perl script that does this for LaTeX files.)
  • Have you observed the following invariant: Before telling me what you did, tell me why I should care.
  • Have you made every important point three times--once in the introduction/abstract, once in the body of the paper, and once in the conclusions? (Bulleted conclusions are usually a good idea)
  • Have you had it read by at least one person familiar with each of the areas the paper impinges on? (Think of them as consultants in that area. There is a risk that you will get some of the details wrong in talking about an area that is tangential to the paper but that you're not very familiar with, and if a reviewer happens to be versed in that area, it decreases your credibility. Such references are easy to get right, so there is no excuse.)
  • Have you searched carefully for any related work, and properly acknowledged it? The availability of papers and search indices on the Web makes it worse than ever to overlook significant related work.
  • Are you able to capture the non-experts in the audience with the opening of your paper, and impress the experts in the body of the paper?
  • Can you read only the abstract and conclusions and be able to give someone else a 30-second digest of what the paper claims it says?

The Mac OS X Kernel

http://www3.interscience.wiley.com:8100/legacy/college/silberschatz/0471694665/appendices/appb.pdf

Sunday, May 15, 2005

XML Exam Fever Contd.... (XPATH)

Well, one of the basics that we keep forgetting is that to learn XSL we need to know XPATH.
XPATH can be thought of as something like traversing the Unix file tree structure, with commands to walk the tree.

XPATH identifies the following 7 types of nodes
i) The root
ii) Element nodes
iii) Attribute nodes
iv) Text nodes
v) Comment nodes
vi) Processing instruction nodes
vii) Namespace nodes

XPATH is used to traverse the XML document or parts of the XML document.
The XML declaration and the DTD are not addressable via XPATH.

The grammar of an XPATH location path can be written in this fashion:

location-path: "/"? location-step ( "/" location-step )*;
location-step: axis "::" node-test ( "[" predicate "]" )*;
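
The grammar above can be tried out from Java. This is a minimal sketch, assuming the JDK's built-in javax.xml.xpath API; the class name XPathDemo, the sample document and the library/book/title element names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathDemo {
    // Hypothetical sample document, not from the notes.
    static final String XML =
        "<library>"
      + "<book id=\"b1\"><title>XML Basics</title></book>"
      + "<book id=\"b2\"><title>XSLT Basics</title></book>"
      + "</library>";

    // Evaluates a location path against XML and returns the
    // string value of the first matching node.
    static String eval(String locationPath) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(locationPath, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + locationPath, e);
        }
    }

    public static void main(String[] args) {
        // Unabbreviated form: each step is axis "::" node-test "[" predicate "]".
        System.out.println(eval("/child::library/child::book[attribute::id='b2']/child::title"));
        // Abbreviated form of the same location path.
        System.out.println(eval("/library/book[@id='b2']/title"));
    }
}
```

Both calls select the same title element; the second path is just shorthand for the first, much like a relative path in a Unix shell.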

XML Exam Fever Contd....

XSL

Today I shall talk about XSL

1. XSL, as we know, consists of
i) XSLT
ii) XPATH
iii) XSL-FO

Now XSLT has the following built-in templates
1. <xsl:template match='/|*'>
<xsl:apply-templates/>
</xsl:template>
Arranges for a visit to all elements.

2. <xsl:template match='text()|@*'>
<xsl:value-of select='.'/>
</xsl:template>
Arranges to output text nodes and attribute values; however, attributes are not visited by default.

3. <xsl:template match='processing-instruction()|comment()'/>
Skips the processing instructions and comments in the document.
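
One way to see the built-in templates at work is to run a stylesheet that defines no templates of its own. This is a minimal sketch, assuming the JDK's javax.xml.transform API; the class name BuiltinTemplatesDemo and the sample note document are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class BuiltinTemplatesDemo {
    // A stylesheet with no templates, so only the built-in rules fire.
    static final String EMPTY_XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "</xsl:stylesheet>";

    // Transforms the given XML with the empty stylesheet and returns the output.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(EMPTY_XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input with an attribute, a comment and two text nodes.
        String xml = "<note lang=\"en\"><!-- hidden --><to>Kalyan</to><body>Hello</body></note>";
        System.out.println(transform(xml)); // prints "KalyanHello"
    }
}
```

Only the text nodes come out: the elements are visited by rule 1, the text is copied by rule 2, the comment is dropped by rule 3, and the lang attribute never shows up because attributes are not visited by default.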

-Kalyan

Saturday, May 14, 2005

XML Exam Fever Contd...

Checking the Document for Well Formedness

Some of the characteristics:

1. A document must have exactly one root element.
2. Elements must not overlap.
3. Attribute values must be quoted.
4. An element cannot have two attributes with the same name.
5. The < and & characters in data need to be escaped.
6. Every start tag must have a matching end tag.
7. Comments and processing instructions may not appear inside tags.
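
A quick way to check a document against these rules is to hand it to a parser and see whether it complains. This is a minimal sketch, assuming the JDK's javax.xml.parsers API; the class name WellFormedCheck and the sample strings are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class WellFormedCheck {
    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // A well-formedness violation is reported as a fatal parse error.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("Parser could not run", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));           // true
        System.out.println(isWellFormed("<a><b></a></b>"));        // false: elements overlap
        System.out.println(isWellFormed("<a x=1/>"));              // false: unquoted attribute
        System.out.println(isWellFormed("<a x=\"1\" x=\"2\"/>"));  // false: repeated attribute
        System.out.println(isWellFormed("<a>1 < 2</a>"));          // false: unescaped <
    }
}
```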

I like to end the chapter with a punch line, so here is the punch line for this chapter:

XML names can contain letters and digits. The following are the only special characters allowed:
i) Underscore (_)
ii) Hyphen (-)
iii) Full stop (.)

Element names cannot start with a hyphen, a full stop or a digit,
and they cannot contain any special characters other than those mentioned above.
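
The naming rules can be written down as a regular expression. This is a simplified sketch: it covers ASCII letters only, whereas real XML names also allow many non-ASCII letters, and it leaves out the colon, which is reserved for namespaces. The class name XmlNameCheck is made up for illustration.

```java
import java.util.regex.Pattern;

final class XmlNameCheck {
    // ASCII-only simplification of the XML name rules:
    // first character is a letter or underscore, the rest may
    // also be digits, hyphens or full stops.
    private static final Pattern NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_.-]*");

    static boolean isValidName(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("first-name")); // true
        System.out.println(isValidName("_tmp.1"));     // true
        System.out.println(isValidName("-name"));      // false: starts with a hyphen
        System.out.println(isValidName("1st"));        // false: starts with a digit
        System.out.println(isValidName("first name")); // false: contains a space
    }
}
```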

XML Exam Fever

1. So we know XML is case sensitive.
2. Well-formedness relates to
i) ALL OPEN TAGS BEING CLOSED
ii) ALL ATTRIBUTE VALUES BEING QUOTED.
3. XML documents are nothing but trees.
4. XML gives each child exactly one parent: if an element's start tag appears inside another element, its end tag must appear inside that element too.
5. <strong><em></strong></em> is NOT allowed in XML

6. The root element is also called the document element. Every well-formed XML document has exactly one root element; elements may not overlap, and every element other than the root has exactly one parent.

7. Mixed content model:
An element can contain child elements as well as text; this is called the mixed content model.
8. Attributes:
Attribute values must be quoted in XML documents.
An element cannot have the same attribute name repeated.
Single quotes around an attribute value are useful when the value itself contains double quotes.

9. Legal characters in XML names are
1. A-Z, a-z, 0-9, with the special characters being
i) underscore _
ii) hyphen -
iii) period .
XML names cannot contain spaces, and they may start only with a letter or an underscore, never with a digit, a hyphen or a period. There is no limit to the length of an element name or any other XML name.

Query Cost Estimates

A query is normally associated with various costs
1. Access Cost
2. Storage Cost
3. Computation Cost
4. Communication Cost

To estimate the cost, some information is needed, and this information can be obtained from catalogs. Catalogs store all kinds of statistics that the cost functions need in order to work out the cost associated with a query.

So in the paper I can still use the concept of the cost of a query in a sensor network, where the statistics are nothing but the parameters (characteristics) of the sensor network.
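
As a rough sketch of how the four cost components might be combined, the Java below adds them up per candidate plan and picks the cheapest. The plain sum, the numbers and the class name QueryCostSketch are assumptions for illustration only; a real cost model would weight the components using the catalog statistics.

```java
final class QueryCostSketch {
    // Each plan is {access, storage, computation, communication},
    // in arbitrary hypothetical cost units.
    static double totalCost(double[] plan) {
        double total = 0.0;
        for (double component : plan) {
            total += component;
        }
        return total;
    }

    // Returns the index of the plan with the lowest total cost.
    static int cheapestPlan(double[][] plans) {
        int best = 0;
        for (int i = 1; i < plans.length; i++) {
            if (totalCost(plans[i]) < totalCost(plans[best])) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] plans = {
            {10, 2, 3, 50}, // cheap to access, expensive to communicate
            {40, 2, 5, 5},  // more local work, less communication
        };
        System.out.println("Cheapest plan: " + cheapestPlan(plans));
    }
}
```

In a sensor network the communication component would probably dominate, which is why the second hypothetical plan wins here.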

-Kalyan

Thursday, May 12, 2005

XPATH

XPATH has 7 types of nodes

1. Document root
2. Element
3. Attribute
4. Text
5. Processing instruction
6. Comment
7. Namespace

XSL

What we keep forgetting about XSL: XSL consists of

1. XSLT - XSL Transformations
2. XPATH - navigational language in XML
3. XSL-FO - XSL Formatting Objects

XSLT is the most important part of XSL; it converts a source XML tree into a result XML tree.

XSLT uses XPATH

The W3Schools XSLT tutorial is a good reference to begin with.

The <xsl:template> element contains the rules to apply when a specified node is matched.
The match attribute is used to associate the template with an XML element; match="/" matches the whole document.
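
Here is a small sketch of a template with match="/" being applied from Java, assuming the JDK's javax.xml.transform API. The class name XsltMatchDemo, the stylesheet and the book/title/year document are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltMatchDemo {
    // One template, matched against the whole document via match="/".
    // The select attribute is an XPATH expression relative to the match.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/\">Title: <xsl:value-of select=\"book/title\"/></xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet above to the given XML and returns the result.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<book><title>XML</title><year>2005</year></book>";
        System.out.println(transform(xml)); // prints "Title: XML"
    }
}
```

The year never appears in the output: the template for "/" takes over the whole document and does not call xsl:apply-templates, so nothing else is visited.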

Monday, May 02, 2005

Gaussian Values

Gaussian Distribution
------------------------

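The normal distribution critical values (z) by confidence level

confidence level (%)    90      95      99
z (one-sided)           1.28    1.645   2.33

These are one-sided values; for a two-sided confidence interval the corresponding z values are 1.645, 1.96 and 2.576.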
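
A sketch of how these z values get used, reading the table as one-sided critical values: the upper confidence bound on a mean is mean + z * sd / sqrt(n). The class name ZBound and the sample numbers (mean 100, standard deviation 15, n = 25) are made up for illustration.

```java
final class ZBound {
    // One-sided critical values from the table above.
    static double z(int confidencePercent) {
        switch (confidencePercent) {
            case 90: return 1.28;
            case 95: return 1.645;
            case 99: return 2.33;
            default:
                throw new IllegalArgumentException("No z value tabulated for " + confidencePercent + "%");
        }
    }

    // One-sided upper confidence bound for a mean with known standard deviation.
    static double upperBound(double mean, double stdDev, int n, int confidencePercent) {
        double standardError = stdDev / Math.sqrt(n);
        return mean + z(confidencePercent) * standardError;
    }

    public static void main(String[] args) {
        // Standard error is 15 / sqrt(25) = 3, so the 95% bound is 100 + 1.645 * 3.
        System.out.println(upperBound(100.0, 15.0, 25, 95));
    }
}
```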

-Kalyan

Some Patterns in Java Programming

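Need to read a file line by line

1. Take the filename
2. FileReader fr = new FileReader(filename);
3. BufferedReader br = new BufferedReader(fr);

// needs java.io.FileReader and java.io.BufferedReader
String line;

// readLine() returns null at the end of the file
while ((line = br.readLine()) != null) {
    // process the line here
}
br.close();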