Overview on RDBMS
Overview on RDBMS
RDBMS is a term used to describe an entire suite of programs for both managing a relational database and communicating with that relational database engine. Sometimes Software Development Kit (SDK) front-end tools and complete management kits are included with relational database packages (eg: MS Access) In other words, an RDBMS is both the database engine and any other tools that come with it.
What is RDBMS?
RDBMS stands for Relational Database Management System. RDBMS data is structured in database tables, fields and records. Each RDBMS table consists of database table rows. Each database table row consists of one or more database table fields.
RDBMS store the data into collection of tables, which might be related by common fields (database table columns). RDBMS also provide relational operators to manipulate the data stored into the database tables. Most RDBMS use SQL as database query language.
What is database?
A database is a logically coherent collection of data with some inherent meaning, representing some aspect of real world and which is designed, built and populated with data for a specific purpose.
What is DBMS?
It is a collection of programs that enables user to create and maintain a database. In other words it is general-purpose software that provides the users with the processes of defining, constructing and manipulating the database for various applications.
What is a Database system?
The database and DBMS software together is called as Database system.
Disadvantage in File Processing System
Data redundancy & inconsistency.
Difficult in accessing data.
Concurrent access is not possible.
Describe the three levels of data abstraction
The are three levels of abstraction:
Physical level: The lowest level of abstraction describes how data are stored.
Logical level: The next higher level of abstraction, describes what data are stored in database and what relationship among those data.
View level: The highest level of abstraction describes only part of entire database.
What is System R? What are its two major subsystems?
System R was designed and developed over a period of 1974-79 at IBM San Jose Research Center. It is a prototype and its purpose was to demonstrate that it is possible to build a Relational System that can be used in a real life environment to solve real life problems, with performance at least comparable to that of existing system.
Its two subsystems are
System Relational Data System.
How is the data structure of System R different from the relational structure
Unlike Relational systems in System R
Domains are not supported
Enforcement of candidate key uniqueness is optional
Enforcement of entity integrity is optional
Referential integrity is not enforced
What is Data Independence
Data independence means that ‘the application is independent of the storage structure and access strategy of data’. In other words, The ability to modify the schema definition in one level should not affect the schema definition in the next higher level.
Two types of Data Independence:
Physical Data Independence: Modification in physical level should not affect the logical level.
Logical Data Independence: Modification in logical level should affect the view level.
NOTE: Logical Data Independence is more difficult to achieve
What is a view?How it is related to data independence
A view may be thought of as a virtual table, that is, a table that does not really exist in its own right but is instead derived from one or more underlying base table. In other words, there is no stored file that direct represents the view instead a definition of view is stored in data dictionary. Growth and restructuring of base tables is not reflected in views. Thus the view can insulate users from the effects of restructuring and growth in the database. Hence accounts for logical data independence.
What is Data Model?
A collection of conceptual tools for describing data, data relationships data semantics and constraints.
What is E-R model
This data model is based on real world that consists of basic objects called entities and of relationship among these objects. Entities are described in a database by a set of attributes
The Entity-Relationship data model is based on a perception of the real world which consists of basic objects called entities and relationships among theseobjects.
A key is an attribute or set of attributes of an entitywhich can be used to identify it.
Super Keyis a set of one or more attributes which, takencollectively, allows an entity to be uniquelyidentified in an entity set
Candidate Keyis a super key for which no proper subset is a superkey (i.e., a minimal super key)
Primary Keyis a candidate key chosen by the database designeras the principle means of identifying entities withinan entity set.
ER Model: Constraints
Key Constraints: These are constraints implied by the existence of candidate keys. The table definition includes aspecification implying uniqueness of the attributesconstituting the primary key or alternate keys.
A primary key constraint also implies a no-nulls constraint.Referential Constraints:Constraints implied by the existence of foreign keys inthe table definition.Other Constraints:Constraints enforcing checks of business logic of theapplication in the table definition.
Features of RDBMS
Relational database management system has various following features:-
1-It can solve any complex queries.
2-RDBMS is very secure.A fully RDBMS can prevent from any unauthorized access.
Advantages of RDBMS over DBMS:
• Reliability is improved because the data is not spread across the network and several applications. Only one process handles the data.
• Network traffic is greatly reduced. Let’s say in desktop database model – entire database along with indexes is to be sent to client. Where as in Client/server database model only result is to be sent to client.
• Upgrading a heavily used desktop database to a well-designed client/server database will reduce database-related network traffic by more than 95%.
• Performance is improved as database operations are handled over server, so client PCs will have less processing requirement.
• Security is improved as data are kept within a single server. Hacking into a data file that is protected within the server is much more difficult than hacking into a data file on desktop database model.
• Data integrity constraints and business rules can be enforced at server level. You can specify rules such as “Marks of any subject can not exceed 100 in any subject”
• Data Sharing: SQL is used to coordinate data sharing by concurrent users, ensuring that they do not interfere with one another.
• Data can be integrated using multiple platform i.e. LAN, WAN and WAN. One can incorporate data from INTERNET
A relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model as introduced by E. F. Codd. Most popular commercial and open source databases currently in use are based on the relational model.
A short definition of an RDBMS may be a DBMS in which data is stored in the form of tables and the relationship among the data is also stored in the form of tables.
E.F. Codd, the famous mathematician has introduced 12 rules for the relational model for databases commonly known as Codd’s rules. The rules mainly define what is required for a DBMS for it to be considered relational, i.e., an RDBMS. There is also one more rule i.e. Rule00 which specifies the relational model should use the relational way to manage the database. The rules and their description are as follows:-
Rule 0: Foundation Rule
A relational database management system should be capable of using its relational facilities (exclusively) to manage the database.
Rule 1: Information Rule
All information in the database is to be represented in one and only one way. This is achieved by values in column positions within rows of tables.
Rule 2: Guaranteed Access Rule
All data must be accessible with no ambiguity, that is, Each and every datum (atomic value) is guaranteed to be logically accessible by resorting to a combination of table name, primary key value and column name.
Rule 3: Systematic treatment of null values
Null values (distinct from empty character string or a string of blank characters and distinct from zero or any other number) are supported in the fully relational DBMS for representing missing information in a systematic way, independent of data type.
Rule 4: Dynamic On-line Catalog Based on the Relational Model
The database description is represented at the logical level in the same way as ordinary data, so authorized users can apply the same relational language to its interrogation as they apply to regular data. The authorized users can access the database structure by using common language i.e. SQL.
Rule 5: Comprehensive Data Sublanguage Rule
A relational system may support several languages and various modes of terminal use. However, there must be at least one language whose statements are expressible, per some well-defined syntax, as character strings and whose ability to support all of the following is comprehensible:
a. data definition
b. view definition
c. data manipulation (interactive and by program)
d. integrity constraints
f. Transaction boundaries (begin, commit, and rollback).
Rule 6: View Updating Rule
All views that are theoretically updateable are also updateable by the system.
Rule 7: High-level Insert, Update, and Delete
The system is able to insert, update and delete operations fully. It can also perform the operations on multiple rows simultaneously.
Rule 8: Physical Data Independence
Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representation or access methods.
Rule 9: Logical Data Independence
Application programs and terminal activities remain logically unimpaired when information preserving changes of any kind that theoretically permit unimpairment are made to the base tables.
Rule 10: Integrity Independence
Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and storable in the catalog, not in the application programs.
Rule 11: Distribution Independence
The data manipulation sublanguage of a relational DBMS must enable application programs and terminal activities to remain logically unimpaired whether and whenever data are physically centralized or distributed.
Rule 12: Nonsubversion Rule
If a relational system has or supports a low-level (single-record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules or constraints expressed in the higher-level (multiple-records-at-a-time) relational language.
• Normalisation is a design techniques that is widely used as guide in designing relational database.
• It is a two-step process that puts data into tabular form by removing repeating groups ant then removes duplicated data from relational table.
• Thus Normalisation is the process of structuring an unstructured relation into structural one with the purpose of removing redundancy and anomalies.
• Normalisation theory is based on normal forms.A relational table is said to be particular normal forms if it satisfy a certain set of constraints.
Functional Dependency : It describe the relationship between 2 attributes of same relational database tables.One of the attributes is called determinant and other attribute is called determined.
For eg.- For each value of determinant there is associated one and only one value of determined.
A -> B
The A is determinant and B is determined then we say that A functionally determines B or Bis functionally dependent on A.
Fully Functional Dependence : It states that if A and B are two attributes of a relation then B is Fully Functionaly Dependent on A if B is Functional Dependent on A and not a proper subset of A.
Transitive Dependency : If Bis functional Dependent on A and C is functional dependent on B then C is transitive dependent on A.
The following are the types of normal forms:
First Normal Form (1NF)
When a table is broken up (decomposed)into more tables with all repeating groups(records) of data eliminated, table data is said to be in the first normal form (1NF).
A table is said to be in 1st Normal form if:
(a) There is no repeating group.
(b) All the key attributes are defined.
(c) All the non-key attributes are dependent on a primary key.
Second Normal Form (2NF)
A table is said to be in 2nd Normal Form(2NF) if each record is in the table is in the First Normal form(1NF) and each column in the record is fully dependent on its primary key.
A table is in 2nd Normal Form if:
(a) It is in 1st Normal Form.
(b) If no non-key attribute is dependent on a part of composite key( combination of two or more attributes declared as primary key), that is all the attributes must be dependent on the whole composite key not just a part of it.
Note: 2nd Normal Form can be applied only to the table which has any composite key.
Third Normal Form (3NF)
Table is said to be in 3rd Normal Form when all the transitive dependencies are removed from the data.
Transitive dependency is the dependency of a non-key attribute on another non-key attribute of the table.
A table is said to be in 3NF if:
(a) It is in 2NF.
(b) It doesn’t contain any transitive dependencies.
There are certain situations when normalization can be avoided. Those situations are as follows:
(a) There is/are no repeating group/groups in the table.
(b) A primary key is defined.
(c) All non-key attributes are fully dependent on key attribute(primary key) or key attributes (composite key).
(d) There is no transitive dependencies.
Boyce Codd Normal Form (BCNF): A relation is said to be in BCNF if and only if the determinants are candidate keys. BCNF relation is a strong 3NF, but not every 3NF relation is BCNF.