Wednesday, August 24, 2005



1. Binary Search
2. Permutation
3. Combination
4. Telephone words


Friday, August 19, 2005

Condor Project

Site Reference:

High throughput computing
- Develop

Specialized workload management system for compute-intensive jobs.
Condor provides
- a job queuing system
- scheduling policy
- priority scheme
- resource monitoring
- resource management

Users submit their serial or parallel jobs to Condor, Condor places them into a queue, chooses when and where to run the jobs based upon a policy, carefully monitors their progress, and ultimately informs the user upon completion.

Condor is similar to a batch queueing system. It can be used to seamlessly combine all of an organization's computational power into one resource.

Condor provides high throughput
- Makes use of the available resources more efficient.
- Expands the resources available to users.

Thursday, August 18, 2005

What is a Random Variable?

Random Variable

The outcome of an experiment need not be a number; for example, the outcome when a coin is tossed can be 'heads' or 'tails'. However, we often want to represent outcomes as numbers. A random variable is a function that associates a unique numerical value with every outcome of an experiment. The value of the random variable will vary from trial to trial as the experiment is repeated.

There are two types of random variable - discrete and continuous.

A random variable has either an associated probability distribution (discrete random variable) or probability density function (continuous random variable).


  1. A coin is tossed ten times. The random variable X is the number of tails that are noted. X can only take the values 0, 1, ..., 10, so X is a discrete random variable.
  2. A light bulb is burned until it burns out. The random variable Y is its lifetime in hours. Y can take any positive real value, so Y is a continuous random variable.

Probability Distribution

The probability distribution of a discrete random variable is a list of probabilities associated with each of its possible values. It is also sometimes called the probability function or the probability mass function.

More formally, the probability distribution of a discrete random variable X is a function which gives the probability p(xi) that the random variable equals xi, for each value xi: p(xi) = P(X=xi)

It satisfies the following conditions:

  1. 0 <= p(xi) <= 1
  2. sum of all p(xi) is 1
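
The two conditions can be checked numerically. The following is a minimal Python sketch, assuming the fair-coin example above (X = number of tails in ten tosses); the helper name binomial_pmf is made up for illustration and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Return {x: P(X = x)} for the number of successes in n independent trials."""
    p = Fraction(p)
    return {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}


# X = number of tails in ten tosses of a fair coin (assumed example)
pmf = binomial_pmf(10, Fraction(1, 2))

# Condition 1: every probability lies between 0 and 1
assert all(0 <= prob <= 1 for prob in pmf.values())

# Condition 2: the probabilities sum to exactly 1
assert sum(pmf.values()) == 1

print(pmf[5])  # P(X = 5)
```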

Cumulative Distribution Function

All random variables (discrete and continuous) have a cumulative distribution function. It is a function giving the probability that the random variable X is less than or equal to x, for every value x.

Formally, the cumulative distribution function F(x) is defined to be: F(x) = P(X<=x)
for -infinity < x < infinity

For a discrete random variable, the cumulative distribution function is found by summing up the probabilities as in the example below.

For a continuous random variable, the cumulative distribution function is the integral of its probability density function.

Discrete case: Suppose a random variable X has the following probability distribution p(xi):

    xi       0      1      2      3      4      5
    p(xi)    1/32   5/32   10/32  10/32  5/32   1/32

This is actually a binomial distribution: Bi(5, 0.5) or B(5, 0.5). The cumulative distribution function F(x) is then:

    xi       0      1      2      3      4      5
    F(xi)    1/32   6/32   16/32  26/32  31/32  32/32

F(x) does not change at intermediate values. For example:
F(1.3) = F(1) = 6/32
F(2.86) = F(2) = 16/32
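
The summation can be sketched in a few lines of Python. This is only an illustration using the B(5, 0.5) probabilities from the example; the function name cdf is an assumption, not part of any library.

```python
from fractions import Fraction

# Probability distribution p(xi) of the B(5, 0.5) example
pmf = {
    0: Fraction(1, 32),
    1: Fraction(5, 32),
    2: Fraction(10, 32),
    3: Fraction(10, 32),
    4: Fraction(5, 32),
    5: Fraction(1, 32),
}


def cdf(x, pmf):
    """F(x) = P(X <= x): add up p(xi) for every value xi that does not exceed x."""
    return sum((prob for value, prob in pmf.items() if value <= x), Fraction(0))


# F(x) is a step function: it only changes at the values X can take
assert cdf(1.3, pmf) == cdf(1, pmf) == Fraction(6, 32)
assert cdf(2.86, pmf) == cdf(2, pmf) == Fraction(16, 32)
```

Note that Fraction reduces its values, so Fraction(6, 32) prints as 3/16.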

Tuesday, August 16, 2005

SQLPLUS formatting

set recsep off
set linesize 100
set newpage 0
set pagesize 999
column famname format a15
column rufname format a10
column gebdat format a10
select famname, rufname, gebdat, om from e01e001 where...
FAMNAME        RUFNAME    GEBDAT             OM
-------------- ---------- ---------- ----------
Lippmann       Richard    09.07.1966         12
Meier          Luise      01.01.1834         13


Oracle Tablespace usage query

select tablespace_name, owner, sum(bytes)/1024/1024 Mb
from dba_segments
where owner not in ('SYS','SYSTEM')
group by tablespace_name, owner
order by 3

A good reference for sqlplus


Monday, August 15, 2005

Market Basket Data set


THE X Window System
Wikipedia site are good references for this


Wednesday, August 10, 2005

Perl Tricks

The code:

BEGIN{unshift @INC, "/tmp"}

can be replaced with the more elegant:

use lib "/tmp";

This is almost equivalent to the BEGIN block and is the recommended approach.

The range operator fills an array with the letters a to z:

@alphabet_array = ('a' .. 'z');


Tuesday, August 09, 2005

/etc/shadow file structure


As with the passwd file, the fields in the shadow file are separated by colon (":") characters. The fields are as follows:

  • Username, up to 8 characters. Case-sensitive, usually all lowercase. A direct match to the username in the /etc/passwd file.

  • Password, 13 characters, encrypted. A blank entry (e.g. ::) indicates a password is not required to log in (usually a bad idea), and a "*" entry (e.g. :*:) indicates the account has been disabled.

  • The date the password was last changed, expressed as the number of days since January 1, 1970.

  • The number of days before password may be changed (0 indicates it may be changed at any time)

  • The number of days after which password must be changed (99999 indicates user can keep his or her password unchanged for many, many years)

  • The number of days to warn user of an expiring password (7 for a full week)

  • The number of days after password expires that account is disabled

  • The date on which the account is disabled, expressed as the number of days since January 1, 1970

  • A reserved field for possible future use
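
The nine fields can be pulled apart with a plain split on the colon. Below is a minimal Python sketch; the sample line, the ShadowEntry type and both helper functions are made up for illustration, and real files may differ between Unix flavors.

```python
from datetime import date, timedelta
from typing import NamedTuple


class ShadowEntry(NamedTuple):
    username: str
    password: str
    last_change: str    # days since January 1, 1970
    min_days: str
    max_days: str
    warn_days: str
    inactive_days: str
    expire_date: str    # days since January 1, 1970
    reserved: str


def parse_shadow_line(line):
    """Split one shadow-style line into its nine colon-separated fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    return ShadowEntry(*fields)


def days_to_date(days):
    """Convert a 'days since January 1, 1970' field to a calendar date."""
    return date(1970, 1, 1) + timedelta(days=int(days))


# Hypothetical entry: disabled account, password last changed on day 12990
entry = parse_shadow_line("jdoe:*:12990:0:99999:7:::")
print(entry.username, entry.password == "*", days_to_date(entry.last_change))
```

Empty fields are kept as empty strings, so the caller can tell "not set" apart from 0.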

/etc/passwd structure



Monday, August 08, 2005

SQL tutorial

For a quick SQL tutorial,


Friday, August 05, 2005

Unix for advanced users

2.3. Who is permitted: /etc/passwd, /etc/shadow, and /etc/group

The system needs some way of verifying that you are who you say you are when you log in. Likewise, the system needs to know what you are authorized to do, once you have gained access.

2.3.1. What is /etc/passwd?

/etc/passwd is the authentication database for a Unix machine. (It is also a file which maps usernames to user IDs or UIDs by which the Unix kernel recognizes a user.) It contains a list of users that the system recognizes. Each line in the file represents a different user account.

You can look at the password file on your machine by typing cat /etc/passwd at the prompt. Entries in /etc/passwd look something like this:

arushkin:Igljf78DS:132:20:Amy Rushkin:/usr/people/arushkin:/bin/csh
trsmith:*:543:20:Trent Smith, sys adm:/usr/people/trsmith/:/bin/tcsh

Although these entries differ in terms of the way the information is presented within the fields, they are both valid /etc/passwd entries. Note how each line contains seven different fields, each separated by a colon.

  • Login name - The username of the account

  • Encrypted password - the password that has been encrypted by the system. Note that in the second example the encrypted password has been replaced by an asterisk.

    In the entry for arushkin the password has been encrypted by the system and appears as a nonsensical string of characters. In the entry for trsmith the password field is occupied by a placeholder. This can mean that the user does not have a password, or that a shadow password file is in use. In the latter case, the actual password is kept in /etc/shadow.

    If an account does not use a password, a placeholder is put in the password field rather than leaving the field blank. A blank field constitutes a security hole through which an unauthorized user could gain access to the system.

  • User ID - each user on the system is assigned a unique identification number. That number is contained here.

  • Default Group ID - The Group ID is the number of the group that the user is a member of when they log in.

  • GCOS field - this field has no defined syntax and is generally used for personal information about the user; full name, phone number, room number, etc. Sometimes this field is not used at all.

    The curious may ask what GCOS means. The acronym GCOS comes from GECOS - General Electric Comprehensive Operating System. This was later shortened to General Comprehensive Operating System while competitors at Honeywell sarcastically referred to it as God's Chosen Operating System. The name is merely nostalgic residue from a General Electric machine that spooled print jobs from one of the first UNIX machines at Bell Labs.

  • Home directory - contains the path of the user's home directory

  • Login shell - contains the path of the user's default shell after login
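
As a rough illustration of the seven fields, the two sample entries can be split in Python. The PasswdEntry type and parse_passwd_line function are names invented for this sketch; it also builds the UID to username mapping that programs rely on.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    login: str
    password: str
    uid: int
    gid: int
    gcos: str
    home: str
    shell: str


def parse_passwd_line(line):
    """Split one /etc/passwd-style line into its seven colon-separated fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    login, password, uid, gid, gcos, home, shell = fields
    return PasswdEntry(login, password, int(uid), int(gid), gcos, home, shell)


# The two example entries shown earlier
lines = [
    "arushkin:Igljf78DS:132:20:Amy Rushkin:/usr/people/arushkin:/bin/csh",
    "trsmith:*:543:20:Trent Smith, sys adm:/usr/people/trsmith/:/bin/tcsh",
]
entries = [parse_passwd_line(line) for line in lines]

# User ID to username translation
uid_to_name = {entry.uid: entry.login for entry in entries}
print(uid_to_name[132], entries[1].shell)
```

On a real system, Python's standard pwd module is usually a better choice than parsing the file by hand.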

2.3.2. What is /etc/shadow?

/etc/passwd has to be world readable so that programs can make User ID to username translations. Using an encrypted password in that file would mean that anyone with access to the machine could use password cracking programs (such as crack) to break into the accounts of others. To fix this problem, the shadow password system was created.

The /etc/passwd file in the shadow system is world-readable but does not contain the encrypted passwords. Another file, /etc/shadow, which is readable only by root, contains the passwords. SVR4-based systems support a command called pwconv, which creates and updates /etc/shadow with information from /etc/passwd. When /etc/shadow is used, an 'x' is placed in the password field of each entry in /etc/passwd. This tells pwconv not to modify this field because the passwords are kept in /etc/shadow.

If /etc/shadow doesn't exist pwconv will create it using the information in the /etc/passwd file. Any password aging controls found in /etc/passwd will be copied to /etc/shadow. If the /etc/shadow file already exists, pwconv adds entries in /etc/passwd to it as well as removing entries that are not found in /etc/passwd.

Entries in /etc/shadow look something like this:


The various fields are:

  • Login name.

  • Encrypted password.

  • Date that the user's password was last changed.

  • Minimum number of days that a password must be in existence before it can be changed.

  • Password's life span. This is the maximum number of days that a password can remain unchanged. If this time elapses and the user does not change the password, the system administrator must change it for them.

  • The sixth field is used to dictate how long before the password's expiration the user will begin receiving messages about changing the password.

  • The seventh field contains the number of days that an account can remain inactive before it is disabled and the user can no longer log in.

  • The eighth field can be used to specify an absolute date after which the account can no longer be used. This is useful for setting up temporary accounts.

  • The last field is the flag field and is not used.
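
To make the aging arithmetic concrete, here is a small Python sketch that turns the third, fifth and sixth fields into calendar dates. The function name and the sample values (day 12990, a 90-day life span, 7 days of warning) are assumptions for illustration; exact semantics vary between platforms.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def password_timeline(last_change, max_days, warn_days):
    """Return (changed_on, expires_on, warn_from) for one account.

    last_change: day number of the last password change (days since the epoch)
    max_days:    password life span in days
    warn_days:   how many days before expiration the warnings start
    """
    changed_on = EPOCH + timedelta(days=last_change)
    expires_on = changed_on + timedelta(days=max_days)
    warn_from = expires_on - timedelta(days=warn_days)
    return changed_on, expires_on, warn_from


# Hypothetical values, not taken from a real shadow file
changed_on, expires_on, warn_from = password_timeline(12990, 90, 7)
print(changed_on, expires_on, warn_from)
```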

Not all flavors of Unix use all of these controls. In addition, the syntax of aging controls varies from platform to platform. To find out which aging controls can be set on a particular system it is best to consult the man page for passwd, usermod, etc. On some systems aging controls can also be added to an account at the time it is created using graphic tools.

2.3.3. What is /etc/group?

/etc/group contains the names of valid groups and the usernames of their members. This file is owned by root and only root may modify it. When a new user is added, information on what groups they are a member of must be added here. Group IDs (GIDs) from the /etc/passwd file are mapped to the group names kept in this file.

Each user in a system belongs to at least one group. Users may belong to multiple groups, up to a limit of 8 or 16. A list of all valid groups for a system is kept in /etc/group. This file contains entries like:


Each entry consists of four fields separated by a colon. The first field holds the name of the group. The second field contains the encrypted group password and is frequently not used. The third field contains the GID (group ID) number. The fourth field holds a list of the usernames of group members separated by commas.

The commands id or groups can be used to see which group(s) you belong to.
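
The four-field layout described above can be sketched in Python as follows. The sample line and the function name parse_group_line are hypothetical; the member names are borrowed from the /etc/passwd examples.

```python
def parse_group_line(line):
    """Split one /etc/group-style line into name, password, GID and members."""
    name, password, gid, members = line.rstrip("\n").split(":")
    return {
        "name": name,
        "password": password,
        "gid": int(gid),
        # An empty fourth field means no members are listed explicitly
        "members": [member for member in members.split(",") if member],
    }


# Hypothetical entry, for illustration only
group = parse_group_line("user:*:20:arushkin,trsmith")
print(group["name"], group["gid"], group["members"])

empty = parse_group_line("lp:*:7:")
print(empty["members"])
```

Python's standard grp module offers the same information on a live system without parsing the file by hand.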

GIDs, like UIDs, must be distinct integers between 0 and 32767. GIDs of less than 10 are reserved for system groups. These default GIDs are assigned during the installation of the operating system. Typical system groups and GIDs are listed below.

For Linux:

    GID 0 root
    GID 1 bin
    GID 2 daemon
    GID 3 sys
    GID 4 adm
    GID 5 tty
    GID 6 disk
    GID 7 lp
    GID 8 mem
    GID 9 kmem

For Solaris:

    GID 0 root
    GID 1 other
    GID 2 bin
    GID 3 sys
    GID 4 adm
    GID 5 uucp
    GID 6 mail
    GID 12 daemon


    GID 0 sys, root
    GID 1 daemon
    GID 2 bin
    GID 3 adm
    GID 4 mail
    GID 5 uucp
    GID 20 user

For HP-UX:

    GID 0 root
    GID 1 other
    GID 2 bin
    GID 3 sys
    GID 4 adm
    GID 5 uucp
    GID 6 mail
    GID 20 users

Tuesday, August 02, 2005

phpATM vulnerability

(7) HIGH: phpATM Remote File Include Vulnerability
phpATM version 1.21 and earlier

Description: phpATM software provides file upload and download functions
for web servers. This software contains a file include vulnerability. An
attacker can pass a PHP file location to the "include_location"
parameter, and execute arbitrary PHP code on the webserver running
phpATM. This flaw has reportedly been exploited in the wild.

Status: phpATM has released version 1.30 that fixes the issue.

Council Site Actions: The affected software and/or configuration is not
in production or widespread use, or is not officially supported at any
of the council sites. They reported that no action was necessary.

Posting by Ingvar
Vendor Homepage
SecurityFocus BID

Monday, August 01, 2005

MySQL Import, Export Utility

Database Backups in MySQL

Database Backups

Because MySQL tables are stored as files, it is easy to do a backup.

mysqldump --user=root --password=xxxx --opt mysql > mysql.sql
mysqldump --user=root --password=xxxx --quick mysql > mysql.dump
mysqlhotcopy --user=root --password=xxxx --allowold --keepold mysql /home/zahn/backup
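
If the dump has to be run from a script, the first command above could be wrapped roughly as follows. This is only a sketch: the function names are made up, it assumes mysqldump is on the PATH, and passing the password on the command line exposes it to other local users.

```python
import subprocess


def mysqldump_command(user, password, database, options=("--opt",)):
    """Build the mysqldump argument list used in the commands above."""
    return ["mysqldump", f"--user={user}", f"--password={password}", *options, database]


def dump_database(user, password, database, outfile, options=("--opt",)):
    """Run mysqldump and write its output to outfile (same effect as '> outfile')."""
    with open(outfile, "wb") as out:
        subprocess.run(
            mysqldump_command(user, password, database, options),
            stdout=out,
            check=True,  # raise CalledProcessError if mysqldump fails
        )


# Only builds and prints the command; call dump_database(...) to actually run it
print(" ".join(mysqldump_command("root", "xxxx", "mysql")))
```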