Thursday, August 28, 2008 by Rajeev Singh
Data Models
A model is a representation of reality: 'real world' objects and events, and their associations. It is an abstraction that concentrates on the essential, inherent aspects of an organization and ignores the accidental properties. A data model represents the organization itself. It should provide the basic concepts and notations that allow database designers and end users to communicate their understanding of the organizational data unambiguously and accurately.
A data model can be defined as an integrated collection of concepts for describing and manipulating data, relationships between data, and constraints on the data in an organization.
A data model comprises three components:
• A structural part, consisting of a set of rules according to which databases can be constructed.
• A manipulative part, defining the types of operation that are allowed on the data (this includes the operations that are used for updating or retrieving data from the database and for changing the structure of the database).
• Possibly a set of integrity rules, which ensure that the data is accurate.
The purpose of a data model is to represent data and to make the data understandable.
There have been many data models proposed in the literature. They fall into three broad categories:
• Object Based Data Models
• Physical Data Models
• Record Based Data Models
The object based and record based data models are used to describe data at the conceptual and external levels; the physical data model is used to describe data at the internal level.
Thursday, August 14, 2008 by Ankit Goyal
A DBMS is a software package that carries out many different tasks, including the provision of facilities that enable the user to access and modify information in the database. The DBMS is an intermediate link between the physical database, the computer and operating system, and the users. To provide the various facilities to different types of users, a DBMS normally provides one or more specialized programming languages called database languages.
Database languages come in different forms. They are: -
- Data Description Language (DDL)
- Data Manipulation Language (DML)
Data Description Language (DDL)
As the name suggests, this language is used to define the various types of data in the database and their relationship with each other.
The basic functions performed by DDL are: -
- Create tables, files, databases and data dictionaries.
- Specify the storage structure of each table on disk.
- Specify integrity constraints on the various tables.
- Specify security and authorization information for each table.
- Specify the structure of each table.
- Define the overall design of the database.
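The DDL functions listed above can be illustrated with a minimal sketch. It assumes SQL as the DDL and uses Python's built-in sqlite3 module as a stand-in DBMS; the table and column names (course, student, roll_no and so on) are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# DDL: define table structure and integrity constraints.
# All table and column names here are hypothetical.
conn.executescript("""
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL UNIQUE
);
CREATE TABLE student (
    roll_no   INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    course_id INTEGER NOT NULL REFERENCES course(course_id)
);
""")

# The DBMS records these definitions in its own catalog.
ddl_tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

Storage structure and authorization are handled differently by each DBMS, so they are left out of this sketch.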
Data Manipulation Language (DML)
A language that enables users to access or manipulate data (retrieve, insert, update, delete) as organized by a certain Data Model is called the Data Manipulation Language (DML). It can be of two types: -
- Procedural DML - It describes what data is needed and how to get it. Relational algebra is an example.
- Non-Procedural DML - It describes what data is needed without specifying how to get it. Relational calculus is an example.
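As a rough illustration of the two styles, the sketch below uses Python's sqlite3 module with SQL as the DML. The student table and its rows are made up for the example. The SQL query states only what is wanted, while the Python loop spells out the steps (scan, filter, project), which is closer in spirit to a procedural DML.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, course TEXT)")

# DML: insert, update and delete rows (sample data is hypothetical).
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(10, "Asha", "Computer"), (11, "Ravi", "Accounts"), (12, "Meena", "Computer")])
conn.execute("UPDATE student SET course = 'Arts' WHERE roll_no = 11")
conn.execute("DELETE FROM student WHERE roll_no = 12")

# Non-procedural retrieval: say WHAT is wanted, the DBMS decides how.
declarative = [row[0] for row in conn.execute(
    "SELECT name FROM student WHERE course = 'Computer' ORDER BY name")]

# Procedural retrieval: say HOW, step by step.
all_rows = conn.execute("SELECT roll_no, name, course FROM student").fetchall()
selected = [row for row in all_rows if row[2] == "Computer"]  # selection
procedural = sorted(row[1] for row in selected)               # projection
```

Both approaches return the same names; the difference lies in who decides the sequence of steps.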
Friday, May 23, 2008 by Ankit Goyal
Functions of DBMS
• A DBMS frees programmers from the need to worry about the organization and location of the data, i.e. it shields the users from complex hardware level details.
• A DBMS can organize, process and present data elements from the database. This capability enables decision makers to search and query database contents in order to extract answers that are not available in regular reports.
• Programming is speeded up because the programmer can concentrate on the logic of the application.
• It includes special user friendly query languages which are easy to understand for non-programming users of the system.
Common examples of DBMS are Oracle, Access, SQL Server, Sybase, FoxPro, Dbase etc.
The services provided by a DBMS include:
• Authorization services such as logging on to the DBMS, starting the database, stopping the database etc.
• Transaction support such as recovery, rollback etc.
• Import and export of data.
• Maintaining the data dictionary.
• User monitoring.
by Ankit Goyal
ADVANTAGES OF DBMS
The DBMS (Database Management System) is preferred over the conventional file processing system due to the following advantages:
1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors when the same data is updated in different files.
• Wasted time, since the same data is entered again and again.
• Needless use of computer resources.
• Difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. If they are not, the data becomes inconsistent. So we need to remove this duplication of data in multiple files to eliminate inconsistency.
For example, consider a student result system. Suppose that the STUDENT file indicates that Roll No. 10 has opted for the 'Computer' course, but the RESULT file indicates that Roll No. 10 has opted for the 'Accounts' course. The two entries for a particular student do not agree with each other, so the database is said to be in an inconsistent state. To eliminate this conflicting information we need to centralize the database. Centralizing the database controls the duplication and hence removes the inconsistency.
Data inconsistency is often encountered in everyday life. Consider another example: we have all come across situations when a new address is communicated to an organization that we deal with (e.g. telecom, gas company, bank). We find that some of the communications from that organization are received at the new address while others continue to be mailed to the old address. Combining all the data in a database reduces redundancy as well as inconsistency, so it is likely to reduce the costs of collecting, storing and updating data.
Let us again consider the example of the result system. Suppose that a student with Roll No. 201 changes his course from 'Computer' to 'Arts'. The change is made in the SUBJECT file but not in the RESULT file. This may lead to inconsistency of the data. So we need to centralize the database so that a change, once made, is reflected in all the tables where a particular field is stored. The update is then carried out automatically; this is known as propagating updates.
3. Better service to the users - A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and how up to date it is are likely to improve, since the data can now be shared and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, use of a DBMS should allow users who don't know programming to interact with the data more easily, unlike a file processing system where the programmer may need to write new programs to meet every new demand.
4. Flexibility of the System is Improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems the data is duplicated in multiple files, so updates or changes may sometimes lead to incorrect data being entered in some of the files where it exists.
For example, take the result system that we have already discussed. Since multiple files have to be maintained, you may sometimes enter a value for course which does not exist. Suppose course can have the values (Computer, Accounts, Economics, and Arts) but we enter the value 'Hindi' for it. This leads to inconsistent data and hence a lack of integrity.
Even if we centralize the database it may still contain incorrect data. For example: -
• The salary of a full time employee may be entered as Rs. 500 rather than Rs. 5000.
• A student may be shown to have borrowed books but has no enrollment.
• A list of employee numbers for a given department may include a number of non existent employees.
These problems can be avoided by defining the validation procedures whenever any update operation is attempted.
6. Standards can be enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an ad hoc/temporary manner. Often different systems of an organization access different components of the operational data, and in such an environment enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions since the data is now centralized. It is easier to control who has access to what parts of the database. Different checks can be established for each type of access (retrieve, modify, delete etc.) to each piece of information in the database.
Consider the example of a bank in which employees at different levels may be given access to different types of data in the database. A clerk may be given the authority to know only the names of all the customers who have a loan with the bank, but not the details of each loan a customer may have. This can be accomplished by granting the appropriate privileges to each employee.
8. Organization's requirements can be identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, as the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. So it may become necessary to ignore some requests for information if they conflict with higher priority needs of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for an organization.
For example: - A DBA must choose the best file structure and access method to give a faster response for highly critical applications than for less critical applications.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up a database and developing and maintaining application programs to be far lower than for a similar service using conventional systems, since the productivity of programmers can be higher using the non-procedural languages that have been developed with DBMS than using procedural languages.
10. Data Model must be developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed to suit the needs of particular applications, and the overall view is often not considered. Building an overall view of an organization's data is usually cost effective in the long term.
11. Provides backup and Recovery - Centralizing a database provides schemes for backup and recovery from failures, including disk crashes, power failures and software errors. These help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods are very complex.
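The validation idea from point 5 (integrity) can be sketched as follows. This is only an illustration, assuming SQL constraints in SQLite via Python's sqlite3 module; the student and loan tables and the helper try_insert are hypothetical names introduced for the example. The DBMS rejects a course outside the allowed set and a book loan for a student who is not enrolled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: the CHECK constraint lists the permitted courses,
# and every loan must refer to an existing student.
conn.executescript("""
CREATE TABLE student (
    roll_no INTEGER PRIMARY KEY,
    course  TEXT NOT NULL
            CHECK (course IN ('Computer', 'Accounts', 'Economics', 'Arts'))
);
CREATE TABLE loan (
    book_id INTEGER PRIMARY KEY,
    roll_no INTEGER NOT NULL REFERENCES student(roll_no)
);
""")

def try_insert(sql, params):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

ok_student = try_insert("INSERT INTO student VALUES (?, ?)", (10, "Computer"))
bad_course = try_insert("INSERT INTO student VALUES (?, ?)", (11, "Hindi"))
orphan_loan = try_insert("INSERT INTO loan VALUES (?, ?)", (1, 999))
```

Constraints of this kind cannot catch every error (a salary of Rs. 500 instead of Rs. 5000 may still pass a range check), so they complement rather than replace careful data entry.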
Tuesday, April 1, 2008 by Ankit Goyal
Relational algebra is a collection of operations used to manipulate relations (tables). These operations enable users to specify retrieval requests, each of which results in a new relation built from one or more relations.
Relational algebra is a procedural language: it specifies the operations to be performed on the existing relations to derive result relations. This means that the user has to specify both what is required and the sequence of steps to be performed on the database to obtain the required output. When operations are performed on existing relations to produce new relations, the original relation(s) are not affected, i.e. they remain the same, and the resultant relation can act as an input to some other operation. So relational algebra operations can be composed together into a relational algebra expression. Composing relational algebra operations into a relational expression is similar to composing arithmetic operations (+, -, *) into arithmetic expressions. R1+R2 is a relational expression where R1 and R2 are relations.
It is important that the result of applying relational algebraic operations to relations (tables) is itself a relation (table). This is because these operators can be used sequentially in various combinations to obtain the desired results. Thus each operation, on completion, must leave the data as a relation (table) for the next operator to use. This property, which all the operators must have, is referred to as relational closure.
Relational algebra is a formal language and is not user friendly. It illustrates the basic operations required of any data manipulation language, but it is rarely used directly in commercial languages because it lacks syntactic details, although it acts as a fundamental technique for extracting data from the database.
Relational Algebraic Operations
The relational algebraic operations can be divided into two groups.
- Basic Set Oriented Operations or Traditional set operations – These are derived from mathematical set theory. They are applicable because each relation is defined to be a set of tuples. These include Union, Intersection, Difference and Cartesian Product. All of these operations are binary operations, which means that each operation applies to a pair of relations.
- Special Relational operations - These include join, selection, projection and division. These operations were designed specifically for relational databases. They not only add power to the algebra but also simplify common queries that are lengthy to express using the basic set oriented operations.
These operations were introduced by Dr. Codd. But they could not meet all the requirements, so some additional operations were introduced. These include aggregate functions such as SUM, AVERAGE and COUNT, and operations such as OUTER JOIN.
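To make relational closure and composition concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a standard library: the function names (relation, select, project, union, natural_join) and the sample data are hypothetical. A relation is modelled as a set of tuples, and every operation returns a new relation without modifying its inputs.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a frozenset of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)

def select(rel, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return frozenset(t for t in rel if predicate(dict(t)))

def project(rel, attributes):
    """Projection: keep only the named attributes (duplicates vanish, as in a set)."""
    return frozenset(
        frozenset((a, dict(t)[a]) for a in attributes) for t in rel)

def union(r1, r2):
    """Union of two relations with the same attributes."""
    return r1 | r2

def natural_join(r1, r2):
    """Join tuples that agree on every attribute the two relations share."""
    result = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            shared = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in shared):
                result.add(frozenset({**d1, **d2}.items()))
    return frozenset(result)

# Hypothetical sample relations.
students = relation({"roll_no": 10, "name": "Asha"},
                    {"roll_no": 11, "name": "Ravi"})
results = relation({"roll_no": 10, "course": "Computer"},
                   {"roll_no": 11, "course": "Accounts"})

# Closure lets the operations be composed: join, then select, then project.
computer_names = project(
    select(natural_join(students, results),
           lambda t: t["course"] == "Computer"),
    ["name"])
```

Because each step returns a relation, the nested calls read like an arithmetic expression, and the original relations students and results remain unchanged.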
Tuesday, March 18, 2008 by Ankit Goyal
Database Administrator (DBA)
The DBA is a person or a group of persons responsible for the management of the database. The DBA is responsible for authorizing access to the database by granting and revoking permissions to users, for coordinating and monitoring its use, for managing backups and repairing damage due to hardware and/or software failures, and for acquiring hardware and software resources as needed. In a small organization the role of DBA is performed by a single person; in large organizations there is a group of DBAs who share the responsibilities.
Database Designers are responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store the data. It is the responsibility of database designers to communicate with all prospective database users in order to understand their requirements, so that they can create a design that meets those requirements.
End Users are the people who interact with the database through applications or utilities. The various categories of end users are:
- Casual End Users - These users occasionally access the database but may need different information each time. They use a sophisticated database query language to specify their requests. For example: high level managers who access the data weekly or biweekly.
- Naive End Users - These users frequently query and update the database using standard types of queries. The operations that can be performed by this class of users are very limited and affect a precise portion of the database.
For example: - Reservation clerks for airlines/hotels check availability for a given request and make reservations. Persons using Automated Teller Machines (ATMs) also fall under this category, as they have access to a limited portion of the database.
- Standalone End Users/On-line End Users - These end users interact with the database directly via an on-line terminal or indirectly through menu or graphics based interfaces.
For example: - Users of a text package, or of library management software that stores a variety of library data such as the issue and return of books for fine purposes.
Application Programmers are responsible for writing application programs that use the database. These programs could be written in general purpose programming languages such as Visual Basic, Developer, C, FORTRAN, COBOL etc. to manipulate the database. These application programs operate on the data to perform various operations such as retrieving information, creating new information, and deleting or changing existing information.
Tuesday, March 18, 2008 by Ankit Goyal
Entity - Relationship Model
The Entity - Relationship Model (E-R Model) is a high-level conceptual data model developed by Chen in 1976 to facilitate database design. Conceptual modeling is an important phase in designing a successful database. A conceptual data model is a set of concepts that describe the structure of a database and the associated retrieval and update transactions on the database. A high level model is chosen so that all the technical aspects are also covered.
The E-R data model grew out of the exercise of using commercially available DBMSs to model the database. The E-R model is a generalization of the earlier available commercial models such as the Hierarchical and the Network Model. It also allows the representation of the various constraints as well as their relationships.
So to sum up, the Entity-Relationship (E-R) Model is based on a view of the real world that consists of a set of objects called entities and relationships among entity sets, which are basically groups of similar objects. The relationship between entity sets is represented by a named E-R relationship and is of 1:1, 1:N or M:N type, which tells the mapping from one entity set to another.
The E-R model is shown diagrammatically using Entity-Relationship (E-R) diagrams which represent the elements of the conceptual model that show the meanings and the relationships between those elements independent of any particular DBMS and implementation details.
Features of the E-R Model:
- The E-R diagram used for representing E-R Model can be easily converted into Relations (tables) in Relational Model.
- The E-R Model is used by the database developer to produce a good database design, so that the resulting data model can be used in various DBMS.
- It is helpful as a problem decomposition tool as it shows the entities and the relationship between those entities.
- It is inherently an iterative process. On later modifications, the entities can be inserted into this model.
- It is very simple and easy to understand for various types of users and designers, because specific standards are used for its representation.
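The first feature above, converting an E-R design into relations, can be sketched with a small hypothetical example. Assume two entity sets, Student and Course, connected by an M:N relationship Enrolls; each entity set becomes a table, and the M:N relationship becomes a third table holding the keys of both. The sketch uses Python's sqlite3 module, and all names and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- One table per entity set.
CREATE TABLE student (roll_no   INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE course  (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- The M:N relationship 'enrolls' becomes its own table of key pairs.
CREATE TABLE enrolls (
    roll_no   INTEGER NOT NULL REFERENCES student(roll_no),
    course_id INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (roll_no, course_id)
);

INSERT INTO student VALUES (10, 'Asha'), (11, 'Ravi');
INSERT INTO course  VALUES (1, 'Computer'), (2, 'Accounts');
INSERT INTO enrolls VALUES (10, 1), (10, 2), (11, 1);
""")

# One student can take many courses ...
courses_of_asha = [row[0] for row in conn.execute("""
    SELECT c.title FROM enrolls e
    JOIN course c ON c.course_id = e.course_id
    WHERE e.roll_no = 10 ORDER BY c.title""")]

# ... and one course can have many students.
students_in_computer = [row[0] for row in conn.execute("""
    SELECT s.name FROM enrolls e
    JOIN student s ON s.roll_no = e.roll_no
    WHERE e.course_id = 1 ORDER BY s.name""")]
```

A 1:N relationship would usually need no separate table: the key of the "one" side is stored as a foreign key in the table on the "many" side.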
Saturday, February 23, 2008 by Ankit Goyal
STRUCTURE OF DBMS
DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update and retrieval) on the database. The components of DBMS perform these requested operations on the database and provide necessary data to the users. The various components of DBMS are shown below: -
Fig. 2.1 Structure Of DBMS
- DDL Compiler - Data Description Language compiler processes schema definitions specified in the DDL. It includes metadata information such as the name of the files, data items, storage details of each file, mapping information and constraints etc.
- DML Compiler and Query optimizer - The DML commands from the application program, such as insert, update, delete and retrieve, are sent to the DML compiler for compilation into object code for database access. The query optimizer then works out the best way to execute the query, and the optimized object code is sent to the data manager.
- Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System.
The main functions of the Data Manager are:
- It converts the operations in users' queries, which come from the application programs or from the Query Processor (the combination of DML compiler and query optimizer), from the user's logical view to the physical file system.
- It controls access to the DBMS information that is stored on disk.
- It handles buffers in main memory.
- It enforces constraints to maintain the consistency and integrity of the data.
- It synchronizes the simultaneous operations performed by concurrent users.
- It controls the backup and recovery operations.
- Data Dictionary - Data Dictionary is a repository of description of data in the database. It contains information about
- Data - names of the tables, names of attributes of each table, length of attributes, and number of rows in each table.
- Relationships between database transactions and data items referenced by them which is useful in determining which transactions are affected when certain data definitions are changed.
- Constraints on data i.e. range of values permitted.
- Detailed information on physical database design such as storage structure, access paths, files and record sizes.
- Access Authorization - the description of database users, their responsibilities and their access rights.
- Usage statistics such as frequency of query and transactions.
The data dictionary is used to control data integrity, database operation and accuracy. It is an important part of the DBMS.
Importance of Data Dictionary -
Data Dictionary is necessary in the databases due to following reasons:
- It improves the DBA's control over the information system and the users' understanding of how to use the system.
- It helps in documenting the database design process by storing documentation of the result of every design phase and of the design decisions.
- It helps in searching for views on the database and the definitions of those views.
- It provides great assistance in producing a report of which data elements (i.e. data values) are used in all the programs.
- It promotes data independence, i.e. application programs are not affected by the addition or modification of structures in the database.
- Data Files - These contain the data portion of the database.
- Compiled DML - The DML compiler converts high level queries into low level file access commands known as compiled DML.
- End Users - They are already discussed in previous section.
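As a small illustration of the Data Dictionary component described above, most DBMS let users query the catalog like any other data. The sketch below assumes SQLite through Python's sqlite3 module, where the catalog is the sqlite_master table and column details come from PRAGMA table_info; other DBMS expose their catalogs differently. The student table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, course TEXT)")
conn.execute("INSERT INTO student VALUES (10, 'Asha', 'Computer')")

# Names of the tables, read from SQLite's catalog table.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = {
    row[1]: {"type": row[2], "not_null": bool(row[3]), "primary_key": bool(row[5])}
    for row in conn.execute("PRAGMA table_info(student)")
}

# Number of rows in the table, another item the text lists under 'Data'.
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Usage statistics and access authorization are not part of SQLite's catalog, so they are not shown here.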
Friday, February 15, 2008 by Ankit Goyal
REQUIREMENTS FOR A DBMS
The various software packages which handle the data in a database, i.e. DBMS (like Oracle, FoxPro, SQL Server etc.), should meet the following requirements: -
- Provide data definition facilities.
- Define Data Definition Language (DDL)
- Provide user accessible catalog (Data Dictionary)
- Provide facilities for storing, retrieving and updating data.
- Define Data Manipulation Language (DML)
- Support Multiple View of Data
- End user or application should see only the data and information it needs.
- Provide facilities for specifying integrity constraints.
- Primary Key Constraints
- Foreign Key Constraints
- More General Constraints
- Provide facilities for controlling access to data.
- Prevent unauthorized access and update.
- Allow simultaneous access and update by multiple users.
- Provide concurrency control mechanism.
- Support Transactions.
- A sequence of operations to be performed as a whole.
- All operations are performed or none.
- Provide facilities for database recovery.
- Bring database back to consistent state after a failure such as disk failure, faulty program etc.
- Provide facilities for database maintenance.
- Maintenance operations: unload, reload, mass Insertion and deletion, validation etc.
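The transaction requirement in the list above ("all operations are performed or none") can be sketched as follows. This is a minimal illustration using Python's sqlite3 module; the account table, the amounts and the transfer function are hypothetical. A transfer that would leave a negative balance violates a CHECK constraint, and the whole transaction, including the credit already applied, is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(amount, src, dst):
    """Move money between accounts as one transaction; return True on success."""
    try:
        with conn:  # commit if both updates succeed, roll back if either fails
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Return the current balances as {account id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

first = transfer(30, src=1, dst=2)    # both updates succeed
second = transfer(500, src=1, dst=2)  # debit fails, so the credit is undone too
final = balances()
```

The credit is deliberately applied before the debit so that the failed transfer shows a partial update being undone.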
Wednesday, February 13, 2008 by Ankit Goyal
A database system is a computer based record keeping system whose overall purpose is to record and maintain information that is relevant to the organization and necessary for making decisions.
With the growth of databases, these systems are used in various real world applications such as
- Banking Systems and ATMs.
- Stock Trading Systems.
- Flight Reservation Systems.
- Computerized Library Systems.
- Super Market Product Inventory System.
- Credit Card/Credit Limit Check System.
Databases can range from those of a single user with a desktop computer to those on mainframe computers with thousands of users.
COMPONENTS OF DATABASE SYSTEM
A database system is composed of four components (data, hardware, software and users), which coordinate with each other to form an effective database system.
Fig. 1.1 Data Base System
- Data - It is a very important component of the database system. Most organizations generate, store and process large amounts of data. The data acts as a bridge between the machine parts, i.e. hardware and software, and the users, who access it directly or through application programs.
Data may be of different types.
- User Data - It consists of table(s) of data called relation(s), where columns are called fields or attributes and rows are called records. A relation must be structured properly.
- Metadata - A description of the structure of the database is known as metadata. It basically means "data about data". System tables store the metadata, which includes:
- Number of Tables and Table Names
- Number of fields and field Names
- Primary Key Fields
- Application Metadata - It stores the structure and format of queries, reports and other application components.
- Hardware - The hardware consists of the secondary storage devices such as magnetic disks (hard disk, zip disk, floppy disks), optical disks (CD-ROM), magnetic tapes etc. on which data is stored, together with the input/output devices (mouse, keyboard, printers), processors, main memory etc. which are used for storing and retrieving the data in a fast and efficient manner. Since databases can range from those of a single user with a desktop computer to those on mainframe computers with thousands of users, proper care should be taken in choosing appropriate hardware devices for the required database.
- Software - The software part consists of the DBMS, which acts as a bridge between the user and the database. In other words, it is the software that interacts with the users, the application programs, and the database and file system of a particular storage medium (hard disk, magnetic tapes etc.) to insert, update, delete and retrieve data. To perform operations such as insertion, deletion and update we can use either query languages like SQL, QUEL, Gupta SQL or application software such as Visual Basic, Developer etc.
- Users - Users are those persons who need the information from the database to carry out their primary business responsibilities i.e. Personnel, Staff, Clerical, Managers, Executives etc. On the basis of the job and requirements made by them they are provided access to the database totally or partially.
The various types of users which can access the database are:-
- Database Administrators (DBA)
- Database Designers
- End Users
- Application Programmers
A database is a very well organized collection of data, arranged so that operations like insertion, deletion, update and retrieval can be carried out. Thus, a database needs to be managed by an appropriate package of software, which is called a DBMS (Database Management System).
The primary purpose of a DBMS, which is basically a collection of programs, is to allow a user to store, update and retrieve data and thus make it easy to maintain and retrieve information from a database. The DBMS relieves the user from knowing how the data is stored physically and the complex algorithms used for performing operations on the database. It only concentrates on how the operations are to be performed to retrieve the data from the database.
The DBMS is in charge of access, security, storage and a host of other functions for the database system. It does this through a collection of computer programs. This allows it to manage the large, structured sets of data which make up the database, and to provide access to the data for multiple, concurrent users while maintaining the integrity of the data.
The DBMS provides security facilities in a variety of forms, both to prevent unauthorized access and to allow authorized users to access data concurrently without introducing inconsistency in the database. To prevent unauthorized users from accessing the system, it uses passwords to identify operators, programs and individual machines, together with the set of privileges assigned to them. These privileges can include the ability to read, write and update data in the database.
As we know, a DBMS is a collection of programs which acts as an intermediary between the user and the database. Since databases can be small, like the database maintained in a small organization, or huge, like the databases of large organizations, there are different types of DBMS, ranging from small systems that run on personal computers to huge systems that run on mainframes.
The Various Applications Using DBMS Are: -
- Super Market Product Inventory System
- Stock Trading Systems
- Computerized Library Systems
The various commercially available Database Management Systems are:
- For small organizations - MS-Access, File Maker Pro, DB Text Works, Superbase etc.
- For the enterprise (client/server) - Oracle, SQL Server, FoxPro, Sybase, DB2, Informix, Paradox, Dbase etc.