1. Introduction To Database

  2. Data Base Management System

  3. Data Base Administrator

  4. Data Base Models, Integrity

  5. E/R Diagram and Data Base Schema

  6. Normalization

     

  7. Introduction To SQL

  8. Data Dictionary

  9. AS/400 Useful Commands

  10. DBS201 Material

  11. DBS301 Material

 

Introduction To Database

Definition:
--------------
Data base system is essentially nothing more than computerized record
keeping system (Date).


In other words Data base can be regarded as a kind of electronic filing cabinet,
i.e. repository for a collection of computerized data files.
Users are given the same facilities as they have with a normal filing system.

-- Creating new files
-- Adding new information
-- Updating existing information
-- Retrieving information
-- Discarding unwanted files
-- Controlling access to the files.

As indicated earlier Data base is a file system therefore, the
terminology associated with the files is also applicable to the
Data bases. Some key definitions are:

Data Item: Its the smallest unit of named data. Its equivalent to
the field in the file system.

Data Attrinute: Its a logical grouping of the data items. Essentially
data in the data base exists as data items.

Record: Its group of data items that are logically connected.

File: Its group of records.

Physical Record: Its a basic unit of data which is read or written
by a single input/output operation.

Extent: Its a collection of physical records that are contiguous
in the secondary storage.

Need for Data Bases
----------------------------

-- Data is a vital part of all the businesses, organizations and employers.
-- Services which collect, organize and process data are increasing.
-- Data is just not numbers its include images, graphs, speech...
-- With advancement of technology, data production is growing and
storage cost is decreasing. Its important that data is stored
intelligently and managed efficiently.

The key element that makes Data bases different from the file system
is the property of that requires files comprising Data bases must
be inter-related. Therefore, the definition of the Data base is:

Its a file system in which files are inter-related.

Some key desirable properties of the Data base are:

-- Minimum Redundancy: This means data duplication or data replication.
Based on the definition of the data base given above we cannot
eliminate data redundancy but objective is to minimize it as much as
possible.

-- Performance: This property requires response time to be as fast as
possible. This means that request from the user should be processed
by the system with minimum possible delay.

-- Integrity: This property implies that data in the data base system
is trust worthy. This means no surprises when accessing data from the
data base.

-- Security and Privacy: Data base system must be secured that means
no unauthorized users should be allowed access and data in the data
base should be made avialble on per need basis. Elements of security
are - Authorization (user id) and authetication (password). Privacy
involves data hiding.

-- Interface with Past and Future: this property enables data bases
to be flexible. This means that the design of the data bases must
ensure that existing data can be accomadated as well its easy to
expand the data base without any problem to accomadate the expansion.

-- Turnability: It implies that changes into the data bases can be
made easily and put into operation without any difficulty.

-- Migration: This property means data bases can be easily ported
from one system to another.

The objective of the Data base is - To make application development
easier, faster, cheaper and flexible.
The objectives can be grouped under 2 categories:
Primary and Sceondary.

Primary Objectives
-----------------------

-- Performance
-- Clarity
-- Intellectual Investment Protection
-- Ease of use and flexibility
-- Adaptability to instant requests
-- Privacy and Security
-- Integrity
-- Less data proliferation
-- Easier to maintain
-- Low cost

Secondary Objectives
-------------------------

-- Physical data independence
-- Logical data independence
-- Controlled redundancy
-- Suitable retrieval
-- Data standardization
-- Data Dictionary
-- High Level language interface
-- Integrity control (range checking)
-- Design and monitoring tools
-- Turnability
-- Automatic migration and reorganization
-- Easy convertibility to distributed data base

Back To Content

***

Data Base Management System

Definition:
---------------
Database management system is a software that handles access
to the database.


Operational Process

--User issues an access request (via DB language)
--DBMS intercepts the request and analyses the request.
--DBMS inspects it, retrieves relevant mappings and
executes the request.
--DBMS presents the request to the user.

Following diagram describes the above operational process.



Following are some of the reasons for the success of the DBMS.

-- Top management understanding and the support of the objectives.
-- Concentration of well specified, profitable uses of DB.
-- Development of DB which relates to corporate subjects rather
than computer applications.
-- Corporate wide planning by the DBA.
-- Co-existence of old DB with the new DB.
-- Thorough education/trainning.
-- Tight control by the DBA.
-- Powerful DB integration language.
-- End user involvement.
-- Good data dictionary.

Following are some of the reasons for the failure of the DBMS.

-- Dissenting opinions, factions etc..
-- High expectation.
-- Trying to satisfy each business phase and application.
-- Fragmented plans.
-- No support between old and new DBs.
-- Lack of knowledge/training.
-- Lack of control.
-- No user involvement.
-- Casual approach to DB standardization and version control.

Back To Content

***

Data Base Administrator

Definition:
--------------
Data Base Administrator is a person or a group of persons,
responsible for the overall control of the whole DBMS.

Roles and Responsibilities of the DBA.


-- Deciding the information content of the DB (interface
with end users).
-- Deciding the storage structure and access strategy (Interface
with the system architectures and key technical persons).
-- Defining security and integrity criteria.
-- Defining strategy for Backup and Recovery.
-- Monitoring performance and responding to changing
requirements (Interface with 3rd party vendors).

Some of the tools or utilities used by the DBA for
perforing its tasks include:

-- Load routines.
-- Dump/Restore.
-- Reorganizing.
-- Statistics collection.
-- Analysis.

Some of the daily functions of DBA include:

-- Data Definition, standardization and liasion.
-- Design of DB organization and techniques.
-- Perforamnce measurement and improvement.
-- Software and hardware selection.
-- Security control.

Back To Content

***

Data Base Models, Integrity

There are 3 main types of Data Base Models.

1. Hierarchical
2. Network
3. Relational

Hierarchical Model
----------------------
This model is referred to as a upside down Tree structure.
The elements of this model are known as nodes.
The uppermost node is called Root node. The later nodes are either Parent node or Child node depending upon the situation. Parent- child nodes are inter-changeable.
The main restriction of this model is - Each child can have only One Parent. One parent can have more than 1 child.

Network Model
-----------------
This model is also known as PLEX structure. In this model the restriction of child having 1 parent does not exist. Every child can have more than 1 parent and every parent can have more than 1 child.
This model is very flexible. Network model usually consists of records and links.
In the network model any item can be related to any item.
In the network model, mapping between parents and children is similar to hierarchical model, but mapping between parent-to-children is very complex.

Relational Model
--------------------
Relational model represents a model that combines the simplicity of the hierarchical model with the flexibility of the network model.
Relational model is constructed using the entities. An entity is defined as an item about which information is stored in the data base. An entity can be Tangible or non-tangible. An example of Tangible entity would be Employee. An example of Non-tangible entity would be Customer account.
Entities are defined using attributes. An attribute is the property of the entity for which information is stored.
These attributes are also known as columns. The group of column is known as row or a tuple. Also, a row can be defined as an instant of an entity.
The entities are linked to each other using realtionships. The relationship between entities can be of different types. The main types of the relationship are:

One-To-One
--------------
In this type of relationship one entity is connected or linked to another entity. Eg.,
Manager<------------->Department
Every manager can manage only 1 department and every department can have only 1 manager.

One-To-Many
---------------
The realtionship between 2 entities is of more than 1. Eg.,
Employee<--------------->Department
Every employee can be in 1 department only but 1 department can have more than 1 employee.

Many-To-Many
----------------
At a given time each entity can be linked to another in many form. Eg.,
Employee<-------------->Project
An employee can be assigned to many projects and 1 project can have many employees.
In relational model the entities and their relationships are represented by 2 dimensional array or table.
Every table represents an entity.
Every table consists of Rows and Columns.
Relationship between entities are represented by columns.
Each column represents an attribute of the entity.
The values in the columns are drawn from a domain or set of all possible values.
The columns of the entity that are used to link the entities are known as Keys.
There are 2 types of keys. Primary and Foreign.
Primary key is defined as the Entity identifier. It uniquely identifies the entity. Example: The Social-Insurance-Number. The SIN identifies every person uniquely.
Foreign Key is defined as a Primary key of 1 entity that exists as an Attribute in another entity.

Advantages of Relational Model.

1. Ease of Use.
2. Flexibility.
3. Data Independence.
4. Security.
5. Ease of implementation.
6. Data merging.
7. Data integrity.

Disadvantages.

1. Redundancy.
2. Performance.

Data Base Integrity
----------------------
Following are the types of integrities.

1. Entity Integrity.
2. Column Integrity.
3. Row Integrity.
4. Referential Integrity.
Entity Integrity is ensured by not allowing duplicate table or entity names in the same data base. All entities in the data base have unique name.
Column Integrity is ensured by not allowing 2 columns within the same entity have same names. In other words no duplicate names are allowed for the column within same entity.
Row Integrity is ensured by the Primary Key value. Every value of the Primary key represents an unique instant of the entity.
Referential Integrity refers to the link or relationship. This integrity ensures that No child in the data base are exists for which there is no parent. In other words no Foreign key can exists without a Primary key.

Back To Content

***

E/R Diagram and Data Base Schema

To design the Data Base it is desirable to design system by:

· Listing files/tables with their properties or attributes.
· Identifying relationship between the tables without any restriction.

Such approach is possible with the Entity Relationship Model or Diagram.
This model was introduced by P.P.Chen in 1976. The model is based on the
fundamental concepts of data base:

· Entities
· Relationship

The other advantage of this model is that it allows us to represent relationships which are of many-to-many types.
Some of the key points about this model are:

· Entities are represented by rectangular boxes.
· Relationships are shown by straight lines with arrow heads representing the relationship type.
Single head represents 1 and double head represents many. Some authors represent relationships using number and alphabets. In this notation 1 is used to represent One and any alphabet from A to Z can be used to represent many.

Following are some of the optional symbols that are also used by the model.

· Diamond is used to show the flow of the relationship.
· Ovals are used to represent attributes of the entity.
· Double rectangle is used to represent Weak entity.

Weak entity is defined as an entity that is dependent upon the other entity. For example if we have 2 entities, Employee and Employee-History then Employee-History will be called Weak entity as when employee leaves its history will have to be eliminated. In other words if we delete Employee entity from the data base we will have to delete Employee-History from the data base as well.
Lets construct the E/R diagram for the following problem.

A company has number of employees that are working on number of projects. These projects are being developed for various clients using different equipment. The company assigns a representative to every client for the follow up. Company maintains the work history of each employee. Draw the E/R diagram and create the Data Base Schema.
The given problem has the following entities:

· Company
· Employee
· Work-History
· Client
· Project
· Tool
· Rep

The relationship between the entities is of the following type.

· Company and Employee, One-To-Many
· Employee and Work-History, One-To-Many
· Employee and Project, Many-To-Many
· Project, Client and Equipment, Many-To-Many
· Client and Representative, One-To-One.

E/R Diagram



The above E/R diagram represents the Relations or Entities with their interaction or Relationships with each other. The E/R diagram needs to be converted into a schema or layout so that it can be stored into the system. This schema is called Data Base Schema. To convert an E/R diagram into Data Base Schema we can apply the following rules.

· Represent every Entity of an E/R diagram as a Table with its attributes.
· If the 2 entities have One-To-One relationship then take the PK of one entity and put as an attribute into the other entity.
· If the 2 entities have One-To-Many relationship then take the PK of the entity represented by 1 and place it as an attribute in the other entity represented by Many.
· If 2 or more entities have Many-To-Many relationship the create a new entity with PK being the composite key consisting of PK of the connecting entities.

Applying the above rules the Data Base Schema for the E/R diagram given above would be as follows.
DATA BASE Schema

Company[Company-id, Company-Name]
Employee[Employee-id, Employee-Name, Company-id]
Work-History[Record-id, Employee-id]
Assign[Project-id, Employee-id]
Project[Project-id, Project-description]
Tool[Tool-id, Tool-description]
Client[Client-id, Client-Name, Rep-id]
User-For[Project-id, Tool-id, Client-id]
Rep[Rep-id, Rep-Name]

In the above Schema the attributes that are underlined is the Primary Key (PK)


Back To Content

***

Normalization

Data Normalization is a valuable logical design technique or method in data base.
This method can be used to normalize any data bases but it is mainly used to normalize relational data bases.

Benefits of Data Normalization
 

 

Back To Content

***

Introduction To SQL

General

SQL (structured Query Language) is a language that allows Data base interaction. SQL can be used to create Data base, maintain the data base and retrieve data from the data base. The SQL statements can be grouped into 3 major categories:

DDL, DML and DCL.

DDL allows the users to create data base objects. DML allows users to manipulate the objects and DCL allows the users to control the objects.

The CREATE statement of the SQL is the major statement of the DDL component. The DML component consists of the SELECT, INSERT, DELETE, UPDATE, ALTER, and DROP statements. DCL component consists of GRANT and REVOKE statements.

SQL in principle is a standard language that is controlled by the body called ANSI (American National Standard Institute). However, different vendors like IBM, ORACLE have different versions of  SQL as the rules governing the DDL component of the language are not very well defined. The major area is the definition of the fields or columns that define the Table object. The naming conventions for fields or columns are different and the size for the vendor defined data types could be different as well. Therefore, when moving from one vendor to the other we have to only understand the creation of the objects.

FORMAT DDL

Following is the general format for entering the SQL statements (consult user provided manual for exact syntax).

CREATE: Create object_type user_defined_object_name (user_defined_fields_OR_Columns);

Object_type can be Table, View, Index or vendor specific objects.

User_defined_fields_OR_Columns contain 3 parts, name, datatype with size id applicable and constraint.

Constraints can be put on the column or can be defined at the end after defining all the fields. Constraints are not mandatory. To define a field or column you must give it a name, datatype with size.

Constraints are: Primary Key, Foreign Key, Not Null, Unique, Check. Default values can also be assigned to the column using Default clause.

FORMAT DML

SELECT: Select field_names_or_column_names From Object_name Where User_defined_condition;

Where clause is optional. This clause can be used to select data from the object that meets required criteria. Condition is build using conditional operators, =, <,>,<>(not equal),=<,=> and Logical operators, NOT, AND, OR.

The Select statement is the most used statement as this statement allows retrieval of the data from the object, Table. Therefore, it has number of clauses, Functions and Predicates. Clause can be to sort data, group data, perform arithmetic operations or select unique values. Functions are dependent upon data types. They allow the user to perform operations to provide specific value or manipulate the value contained in the field or column in a specific way. The predicates are used to manipulate existing data to return True or false result.

To merge data from multiple objects perform the Join operation.

Select can also use another select as part of the conditional clause (where). In this case the other select is called sub-query.

INSERT: Insert Into Object_name (field_or_column_names) Values (user_defined_values_for_field_or_column_names_that_meet_the_data_type);

Field_or_column_names are not mandatory if you are using the same names that are defined in the object. If you are giving the column or field names then they must be separated by comma(,). The values must match the data types of the column or field. Every value is separated by a comma(,). Numeric value is specified without using single quotes but character or string or alphanumeric value is specified within single quotes(‘value’). Value for Date data type is dependent upon the vendor so refer to the manual.

DELETE: Delete object_name where user_defined_condition;

Delete only removes the rows. It does not remove the object. The where clause is not mandatory, if you want to empty the entire object. To remove only particular rows then you have to use the Where clause.

UPDATE: Update Object_name Set field_or_column_name = new_value where user_defined_contition; Again the where clause is not mandatory if you want to reset the existing value to the new value in the entire object. You can specify values for more than 1 column or field by separating each one using comma(,).

ALTER: Alter object_type Object_name option. The option can be ADD to add new column, Modify to modify the existing columns or Drop to drop existing columns or constraints. Refer to the vendor manual for exact format as Alter has to be used with caution.

DROP: Drop object_type Object_name;

This statement removes the specified object.

FORMAT DCL

GRANT: Grant Object_privileges ON object_name To Users;

Object_privileges can be any DML statements discussed above. Column names can be specified depending upon the statement. User can be public or specific Ids.

REVOKE: Revoke object_privileges ON Object-name From Users;

 

All vendors have Transaction Control statements as part of their SQL. These statements are COMMIT, ROLLBACK and SAVEPOINT. Commit is to ensure data base is updated after the DML statement (except Select) has been issued. Rollback is used to restore the original value and Savepoint is used to selectively restore the original value. Read the vendor manual for proper syntax.

Back To Content

***

Data Dictionary

Coming soon

















Back To Content

 

AS/400 Useful Commands

The AS/400 Cheat Sheet lists commands useful for performing basic operational tasks on the AS/400. The command parameters can be viewed by utilizing the AS/400's online help.

How do I...?

Hit Counter


Contact: Arun Agarwal
E-Mail: arun@nithosting.com
Main Phone Number: (416) 490-1897