Database management system

A database management system (DBMS) is software that is used to manage databases.

Our DBMS Tutorial includes all topics of DBMS such as introduction, ER model, keys, relational model, join operation, SQL, functional dependency, transaction, concurrency control, etc.

What is a Database

A database is a collection of interrelated data organized so that the data can be retrieved, inserted, and deleted efficiently. It is also used to organize the data in the form of tables, schemas, views, reports, and so on.

For example: A college database organizes the data about the administration, staff, students, and faculty.

Using the database, you can easily retrieve, insert, and delete the information.

Database Management System https://techzone360.shop/dbms-tutorial/

A database management system is software that is used to manage the database. For example: MySQL and Oracle are very popular database systems that are used in many different applications.

DBMS provides an interface to perform various operations like database creation, storing data in it, updating data, creating a table in the database and a lot more.

It provides protection and security to the database. In the case of multiple users, it also maintains data consistency.

A DBMS allows users to perform the following tasks:

Data Definition: It is used for the creation, modification, and removal of the definitions that describe the organization of data in the database.

Data Update: It is used for the insertion, modification, and deletion of the actual data in the database.

Data Retrieval: It is used to retrieve data from the database so that applications can use it for various purposes.

User Administration: It is used for registering and monitoring users, maintaining data integrity, enforcing data security, handling concurrency control, monitoring performance, and recovering information corrupted by unexpected failure.
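
The definition, update, and retrieval tasks above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration under stated assumptions: the table name student and its columns are invented for the example and do not come from any particular DBMS.

```python
import sqlite3

# An in-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Data definition: create the structure that organizes the data.
conn.execute(
    "CREATE TABLE student (stu_no INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Data update: insert, modify, and delete actual rows.
conn.execute("INSERT INTO student VALUES (10112, 'Rama', 'Delhi')")
conn.execute("INSERT INTO student VALUES (12839, 'Shyam', 'Delhi')")
conn.execute("UPDATE student SET city = 'Gurugram' WHERE stu_no = 12839")
conn.execute("DELETE FROM student WHERE stu_no = 10112")
conn.commit()

# Data retrieval: query the data for use by an application.
rows = conn.execute("SELECT stu_no, name, city FROM student").fetchall()
print(rows)
```

After these statements only the modified row for student 12839 remains in the table.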

Characteristics of DBMS

It uses a digital repository established on a server to store and manage the information.

It can provide a clear and logical view of the process that manipulates data.

DBMS contains automatic backup and recovery procedures.

It supports ACID properties, which keep the data in a consistent state in case of failure.

It can simplify complex relationships between data items.

It is used to support manipulation and processing of data.

It is used to provide security of data.

It can present the database from different viewpoints according to the requirements of the user.

Advantages of DBMS

Controls data redundancy: It can control data redundancy because all the data is stored in a single database rather than in separate copies.

Data sharing: In a DBMS, the authorized users of an organization can share the data with each other.

Easy maintenance: It is easy to maintain because of the centralized nature of the database system.

Reduces time: It reduces development time and maintenance needs.

Backup: It provides backup and recovery subsystems that automatically back up data to protect against hardware and software failures and restore the data when required.

Multiple user interfaces: It provides different types of user interfaces, such as graphical user interfaces and application program interfaces.

Disadvantages of DBMS

Cost of hardware and software: Running DBMS software requires a high-speed processor and a large amount of memory.

Size: It occupies a large amount of disk space and memory to run efficiently.

Complexity: A database system creates additional complexity and requirements.

Higher impact of failure: A failure has a large impact because, in most organizations, all the data is stored in a single database.

ER (Entity Relationship) Diagram in DBMS

ER model stands for Entity-Relationship model. It is a high-level data model. This model is used to define the data elements and the relationships between them for a specified system.

It develops a conceptual design for the database. It also gives a simple, easy-to-design view of the data.

In ER modeling, the database structure is portrayed as a diagram called an entity-relationship diagram.

For example, suppose we design a school database. In this database, the student will be an entity with attributes like address, name, id, age, etc. The address can be another entity with attributes like city, street name, pin code, etc., and there will be a relationship between them.

Components of an ER Diagram

  1. Entity:

An entity may be any object, class, person, or place. In the ER diagram, an entity is represented as a rectangle.

Consider an organization as an example: manager, product, employee, department, etc. can each be taken as an entity.

a. Weak Entity

An entity that depends on another entity is called a weak entity. The weak entity doesn’t contain any key attribute of its own. The weak entity is represented by a double rectangle.

  2. Attribute

An attribute is used to describe a property of an entity. An ellipse is used to represent an attribute.

For example, id, age, contact number, name, etc. can be attributes of a student.

a. Key Attribute

The key attribute is used to represent the main characteristics of an entity. It represents a primary key. The key attribute is represented by an ellipse with the text underlined.

b. Composite Attribute

An attribute that is composed of several other attributes is known as a composite attribute. The composite attribute is represented by an ellipse, and the ellipses of its component attributes are connected to it.

c. Multivalued Attribute

An attribute that can have more than one value is known as a multivalued attribute. A double oval is used to represent a multivalued attribute.

For example, a student can have more than one phone number.

d. Derived Attribute

An attribute that can be derived from another attribute is known as a derived attribute. It is represented by a dashed ellipse.

For example, a person’s age changes over time and can be derived from another attribute such as date of birth.
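
As a small illustration of a derived attribute, the sketch below computes age from a stored date of birth instead of storing the age itself. The function name age_on and the sample dates are assumptions made for this example.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive the age in whole years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


print(age_on(date(2000, 6, 15), date(2024, 6, 14)))
print(age_on(date(2000, 6, 15), date(2024, 6, 15)))
```

Because the value is computed on demand, it never goes stale as time passes.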

  3. Relationship

A relationship is used to describe the relation between entities. A diamond (rhombus) is used to represent a relationship.

The types of relationship are as follows:

a. One-to-One Relationship

When only one instance of an entity is associated with only one instance of another entity through the relationship, it is known as a one-to-one relationship.

For example, a female can marry one male, and a male can marry one female.

b. One-to-many relationship

When only one instance of the entity on the left and more than one instance of the entity on the right are associated with the relationship, it is known as a one-to-many relationship.

For example, a scientist can invent many inventions, but each invention is made by only one specific scientist.

c. Many-to-one relationship

When more than one instance of the entity on the left and only one instance of the entity on the right are associated with the relationship, it is known as a many-to-one relationship.

For example, a student enrolls in only one course, but a course can have many students.

d. Many-to-many relationship

When more than one instance of the entity on the left and more than one instance of the entity on the right are associated with the relationship, it is known as a many-to-many relationship.

For example, an employee can be assigned to many projects, and a project can have many employees.
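
One common way to store a many-to-many relationship in tables is a separate junction table that holds one row per pairing. The sketch below uses Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
-- Junction table: one row per (employee, project) assignment.
CREATE TABLE assignment (
    emp_id  INTEGER REFERENCES employee(emp_id),
    proj_id INTEGER REFERENCES project(proj_id),
    PRIMARY KEY (emp_id, proj_id)
);
INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Website');
INSERT INTO assignment VALUES (1, 10), (1, 20), (2, 10);
""")

# One employee, many projects.
projects_of_asha = [row[0] for row in conn.execute(
    "SELECT p.title FROM project p "
    "JOIN assignment a ON a.proj_id = p.proj_id "
    "WHERE a.emp_id = 1 ORDER BY p.title")]

# One project, many employees.
staff_on_payroll = [row[0] for row in conn.execute(
    "SELECT e.name FROM employee e "
    "JOIN assignment a ON a.emp_id = e.emp_id "
    "WHERE a.proj_id = 10 ORDER BY e.name")]

print(projects_of_asha, staff_on_payroll)
```

A one-to-many relationship would not need the junction table; a foreign key column on the "many" side would be enough.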

Relational Model in DBMS

The relational model makes querying much easier than in hierarchical or network database systems. It was developed by E. F. Codd in 1970. A relational database is defined as a group of independent tables that are linked to each other through common fields. A relation can be represented as a table with columns and rows. Each row is known as a tuple, and each column of the table has a name, called an attribute. The model is well known in database technology because it is commonly used to represent real-world objects and the relationships between them. Popular relational databases in use today include Oracle, Sybase, DB2, and MySQL Server.

Relational Model Terminologies:

Following are the terminologies of Relational Model:

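| Term        | Meaning                              |
|-------------|--------------------------------------|
| Relation    | Table                                |
| Tuple       | Row, Record                          |
| Attribute   | Column, Field                        |
| Domain      | It consists of a set of legal values |
| Cardinality | It consists of the number of rows    |
| Degree      | It contains the number of columns    |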

Let’s explain each term one by one in detail with the help of example:

Example: STUDENT Relation

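| Stu_No | S_Name | PHONE_NO   | ADDRESS    | Gender |
|--------|--------|------------|------------|--------|
| 10112  | Rama   | 9874567891 | Islam ganj | F      |
| 12839  | Shyam  | 9026288936 | Delhi      | M      |
| 33289  | Laxman | 8583287182 | Gurugram   | M      |
| 27857  | Mahesh | 7086819134 | Ghaziabad  | M      |
| 17282  | Ganesh | 9028939884 | Delhi      | M      |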

Relation: A relation is usually represented as a table, organized into rows and columns. A relation consists of multiple records. For example: the Student relation, which contains tuples and attributes.

Tuple: The rows of a relation that contain the values corresponding to the attributes are called tuples. For example: in the Student relation there are 5 tuples.

One of these tuples is (10112, Rama, 9874567891, Islam ganj, F).

Data Item: The smallest unit of data in the relation is the individual data item. It is stored at the intersection of a row and a column, which is also known as a cell. For example: 10112, “Rama”, etc. are data items in the Student relation.

Domain: A domain is the set of atomic values that an attribute can take. It can be specified explicitly by listing all possible values, or by stating conditions that every value in the domain must satisfy. For example: the domain of the Gender attribute is the set of data values “M” for male and “F” for female. Database software typically offers only limited support for domains, letting users define very simple data types such as numbers, dates, and characters.

Attribute: An attribute is a named column of a relation. Each attribute Ai must have a domain, dom(Ai). For example: Stu_No, S_Name, PHONE_NO, ADDRESS, and Gender are the attributes of the Student relation. In relational databases, a column entry in any row is a single value that contains exactly one item.

Cardinality: The total number of rows in a relation at a given time is called the cardinality of that relation. For example: in the Student relation, the total number of tuples is 5, so the cardinality of the relation is 5. The cardinality of a relation changes with time as tuples get added or deleted.

Degree: The total number of attributes in a relation is called its degree. A relation with one attribute is called a unary relation, one with two attributes is known as a binary relation, and one with three attributes is known as a ternary relation. For example: in the Student relation, the total number of attributes is 5, so the degree of the relation is 5. The degree of a relation does not change with time as tuples get added or deleted.
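
The two counts can be checked with a short sketch that models the STUDENT relation as a Python set of tuples. The variable names and the extra row added at the end are assumptions made for illustration.

```python
# Attribute names of the STUDENT relation, taken from the example table.
attributes = ("Stu_No", "S_Name", "PHONE_NO", "ADDRESS", "Gender")

# The relation instance: a set of tuples, so duplicates cannot occur.
student = {
    (10112, "Rama", "9874567891", "Islam ganj", "F"),
    (12839, "Shyam", "9026288936", "Delhi", "M"),
    (33289, "Laxman", "8583287182", "Gurugram", "M"),
    (27857, "Mahesh", "7086819134", "Ghaziabad", "M"),
    (17282, "Ganesh", "9028939884", "Delhi", "M"),
}

degree = len(attributes)    # number of columns
cardinality = len(student)  # number of rows

# Inserting a (hypothetical) tuple changes the cardinality, not the degree.
student.add((40001, "Sita", "9000000000", "Noida", "F"))
new_cardinality = len(student)

print(degree, cardinality, new_cardinality)
```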

Relational instance: In the relational database system, the relational instance is represented by a finite set of tuples. Relation instances do not have duplicate tuples.

Relational schema: A relational schema contains the name of the relation and name of all columns or attributes.

Relational key: A relational key is a set of one or more attributes whose values identify each row in the relation uniquely.

Properties of Relations

Each attribute in a relation has only one data value corresponding to it i.e. they do not contain two or more values.

Name of the relation is distinct from all other relations.

Each relation cell contains exactly one atomic (single) value.

Each attribute has a distinct name.

The order of attributes has no significance.

A relation has no duplicate tuples.

The order of tuples has no significance.

It also provides information about metadata.

Transaction

A transaction is a set of logically related operations. It contains a group of tasks.

A transaction is an action or series of actions performed by a single user to access or modify the contents of the database.

Example: Suppose a bank employee transfers Rs 800 from X’s account to Y’s account. This small transaction contains several low-level tasks:

X’s Account

Open_Account(X)  

Old_Balance = X.balance  

New_Balance = Old_Balance – 800  

X.balance = New_Balance  

Close_Account(X)  

Y’s Account

Open_Account(Y)  

Old_Balance = Y.balance  

New_Balance = Old_Balance + 800  

Y.balance = New_Balance  

Close_Account(Y)  

Operations of Transaction:

Following are the main operations of a transaction:

Read(X): The read operation reads the value of X from the database and stores it in a buffer in main memory.

Write(X): Write operation is used to write the value back to the database from the buffer.

Let’s take an example of a debit transaction on an account, which consists of the following operations:

1.  R(X);

2.  X = X - 500;

3.  W(X);

Let’s assume the value of X before the transaction starts is 4000.

The first operation reads X’s value from the database and stores it in a buffer.

The second operation decreases the value of X by 500, so the buffer will contain 3500.

The third operation writes the buffer’s value to the database, so X’s final value will be 3500.

However, because of a hardware, software, or power failure, the transaction may fail before finishing all the operations in the set.

For example: If the debit transaction above fails after executing operation 2, then X’s value will remain 4000 in the database, which is not acceptable to the bank.

To solve this problem, we have two important operations:

Commit: It is used to save the work done permanently.

Rollback: It is used to undo the work done.
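
How commit and rollback protect a transfer can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions: the account table, the starting balance of 1000 for Y, the helper names transfer and balances, and the simulated failure are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('X', 4000), ('Y', 1000)")
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, src))
        if fail_midway:
            raise RuntimeError("simulated failure after the debit")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, dst))
        conn.commit()    # save the work permanently
    except RuntimeError:
        conn.rollback()  # undo the partial work


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


transfer(conn, "X", "Y", 800, fail_midway=True)
after_failure = balances(conn)   # unchanged: the debit was rolled back

transfer(conn, "X", "Y", 800)
after_success = balances(conn)   # both updates were committed together

print(after_failure, after_success)
```

The failed attempt leaves both balances untouched, so money is never debited from X without being credited to Y.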

DBMS Concurrency Control
Concurrency Control is the management procedure that is required for controlling concurrent execution of the operations that take place on a database.

But before knowing about concurrency control, we should know about concurrent execution.

Concurrent Execution in DBMS
In a multi-user system, multiple users can access and use the same database at the same time. This is known as concurrent execution of the database.
When multiple users need to perform different operations on the database, their transactions are executed concurrently.
The simultaneous execution should be performed in an interleaved manner, and no operation should affect the other executing operations, so that the consistency of the database is maintained. Concurrent execution of transaction operations therefore raises several challenging problems that need to be solved.
Problems with Concurrent Execution
In a database transaction, the two main operations are READ and WRITE. These operations must be managed during concurrent execution because, if they are interleaved without control, the data may become inconsistent. The following problems can occur with concurrent execution:

Lost Update Problem (W-W Conflict)
The problem occurs when two different database transactions perform the read/write operations on the same database items in an interleaved manner (i.e., concurrent execution) that makes the values of the items incorrect hence making the database inconsistent.

For example:

Consider the below diagram where two transactions TX and TY, are performed on the same account A where the balance of account A is $300.

At time t1, transaction TX reads the value of account A, i.e., $300 (read only).
At time t2, transaction TX deducts $50 from account A, giving $250 (deducted locally but not yet written).
Meanwhile, at time t3, transaction TY reads the value of account A, which is still $300 because TX has not written its update yet.
At time t4, transaction TY adds $100 to account A, giving $400 (added locally but not yet written).
At time t6, transaction TX writes the value of account A, which is updated to $250, as TY has not written its value yet.
Similarly, at time t7, transaction TY writes the value it computed at time t4, i.e., $400. This means the value written by TX is lost, i.e., $250 is lost.
Hence the data becomes incorrect, and the database is left in an inconsistent state.
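
The schedule above can be replayed step by step in a few lines of Python. This is a single-threaded simulation written for illustration; the dictionary db and the local variables stand in for the database and each transaction's buffer, and the function name is an assumption.

```python
def lost_update_schedule(initial_balance):
    """Replay the interleaved schedule t1..t7 and return the final balance."""
    db = {"A": initial_balance}

    tx_buffer = db["A"]   # t1: TX reads A
    tx_buffer -= 50       # t2: TX deducts 50 in its own buffer
    ty_buffer = db["A"]   # t3: TY reads A (still the old value)
    ty_buffer += 100      # t4: TY adds 100 in its own buffer
    db["A"] = tx_buffer   # t6: TX writes its result
    db["A"] = ty_buffer   # t7: TY overwrites it; TX's update is lost

    return db["A"]


final_balance = lost_update_schedule(300)
print(final_balance)
```

Running the two transactions one after the other would give 300 - 50 + 100 = 350, so the interleaved result of 400 shows that the deduction made by TX has disappeared.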

Dirty Read Problems (W-R Conflict)
The dirty read problem occurs when one transaction updates a database item and then fails, and before the update is rolled back, the updated item is read by another transaction. This creates a write-read conflict between the two transactions.

For example:

Consider two transactions TX and TY in the below diagram performing read/write operations on account A where the available balance in account A is $300:

At time t1, transaction TX reads the value of account A, i.e., $300.
At time t2, transaction TX adds $50 to account A that becomes $350.
At time t3, transaction TX writes the updated value in account A, i.e., $350.
Then at time t4, transaction TY reads account A that will be read as $350.
Then at time t5, transaction TX rolls back due to a server problem, and the value changes back to $300 (as initially).
But transaction TY still holds the value $350 for account A, which was never committed. This is a dirty read, and the situation is therefore known as the Dirty Read Problem.
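
The same schedule can be replayed as a short single-threaded simulation. The dictionaries committed and working are assumptions made for illustration: committed holds the last committed state, and working holds uncommitted changes that other transactions are (wrongly) allowed to see.

```python
def dirty_read_schedule(initial_balance):
    """Replay t1..t5 and return (value TY read, value left in the database)."""
    committed = {"A": initial_balance}
    working = dict(committed)   # uncommitted state, visible to everyone

    tx_buffer = working["A"]    # t1: TX reads A
    tx_buffer += 50             # t2: TX adds 50
    working["A"] = tx_buffer    # t3: TX writes, but has not committed
    ty_seen = working["A"]      # t4: TY reads the uncommitted value
    working = dict(committed)   # t5: TX rolls back to the committed state

    return ty_seen, working["A"]


ty_seen, final_balance = dirty_read_schedule(300)
print(ty_seen, final_balance)
```

TY ends up working with 350 even though the database never durably held that value.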
Unrepeatable Read Problem (R-W Conflict)
This is also known as the Inconsistent Retrievals Problem. It occurs when, within one transaction, two different values are read for the same database item.

For example:

Consider two transactions, TX and TY, performing the read/write operations on account A, having an available balance = $300. The diagram is shown below:

At time t1, transaction TX reads the value from account A, i.e., $300.
At time t2, transaction TY reads the value from account A, i.e., $300.
At time t3, transaction TY updates the value of account A by adding $100 to the available balance, and then it becomes $400.
At time t4, transaction TY writes the updated value, i.e., $400.
After that, at time t5, transaction TX reads the available value of account A, and that will be read as $400.
It means that within the same transaction TX, two different values of account A are read: $300 initially, and $400 after the update made by transaction TY. This is an unrepeatable read, and the situation is therefore known as the Unrepeatable Read Problem.
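
A short single-threaded replay of this schedule makes the two different reads visible. The dictionary db and the function name are assumptions made for illustration.

```python
def unrepeatable_read_schedule(initial_balance):
    """Replay t1..t5 and return the two values that TX reads."""
    db = {"A": initial_balance}

    tx_first_read = db["A"]    # t1: TX reads A
    ty_buffer = db["A"]        # t2: TY reads A
    ty_buffer += 100           # t3: TY adds 100
    db["A"] = ty_buffer        # t4: TY writes the new value
    tx_second_read = db["A"]   # t5: TX reads A again

    return tx_first_read, tx_second_read


first_read, second_read = unrepeatable_read_schedule(300)
print(first_read, second_read)
```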
Thus, to maintain consistency in the database and avoid the problems that arise in concurrent execution, management is needed, and that is where the concept of Concurrency Control comes into play.

Concurrency Control
Concurrency control is the mechanism for controlling and managing the concurrent execution of database operations and thus avoiding inconsistencies in the database. To keep the database consistent under concurrent execution, we use concurrency control protocols.

Challenges in Concurrent Transactions
Isolation is one of the major issues to consider in order to achieve data consistency, integrity, and good performance in a DBMS that supports concurrent transactions. These are a few of the major difficulties:

Data Consistency:
Lost Updates: Happen when two or more transactions read the same record, modify it, and write it back, so that only one of the changes is kept.

Temporary Inconsistencies: Arise when a transaction reads data that another transaction is modifying at the same time, resulting in interim or unpredictable values.

Data Integrity:
Non-Repeatable Reads: Occur when a transaction repeats a read operation and obtains a different value than before because another transaction has updated the value in between.

Phantom Reads: Happen when a transaction reruns a query that returns the set of rows meeting a condition and discovers that the set has changed because another transaction inserted or deleted rows.

Isolation Levels and Performance
Low Isolation Levels: Read Uncommitted and Read Committed, for example, make queries run faster but allow anomalies: Read Uncommitted permits dirty reads, and both levels permit non-repeatable reads.

High Isolation Levels: Repeatable Read and Serializable give stronger consistency guarantees but reduce concurrency and system performance.

Deadlocks:
Detection and Resolution: Identifying deadlocks requires monitoring the ongoing transactions and the resources they hold and wait for, which is computationally intensive. Resolving a deadlock requires aborting one or more transactions, which reduces system throughput and user satisfaction.

Prevention and Avoidance: Policies that prevent or avoid deadlocks can reduce concurrency and leave resources underutilized.

Resource Contention
Lock Contention: When many transactions need the same locks, throughput can drop and waits can become long, especially at high transaction rates.

Hardware Resource Constraints: A high number of concurrent transactions creates contention for a limited amount of CPU, memory, and I/O capacity.

Concurrency Control Mechanisms
Locks are very important in DBMS concurrency control because they coordinate multiple transactions and prevent unsynchronized changes to the database. Key concepts in lock-based protocols are shared locks, exclusive locks, two-phase locking (2PL), and strict two-phase locking.
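
The idea of an exclusive lock can be sketched with Python threads. This is not an implementation of two-phase locking; it only shows a single lock guarding one data item so that each read-modify-write runs without interference. The Account class and its method names are assumptions made for illustration.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        # Stands in for an exclusive lock on this one data item.
        self._lock = threading.Lock()

    def apply(self, delta):
        # Hold the lock for the whole read-modify-write sequence,
        # so no other thread can read a value that is about to change.
        with self._lock:
            current = self.balance
            self.balance = current + delta


account = Account(300)
threads = [
    threading.Thread(target=account.apply, args=(delta,))
    for delta in (-50, 100)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(account.balance)
```

With the lock in place, both updates are applied regardless of which thread runs first, so the final balance reflects the deduction and the deposit.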

Concurrency Control Protocols
The concurrency control protocols ensure the atomicity, consistency, isolation, durability and serializability of the concurrent execution of the database transactions. Therefore, these protocols are categorized as:

Lock Based Concurrency Control Protocol
Time Stamp Concurrency Control Protocol
Validation Based Concurrency Control Protocol

File Organization

  • A file is a collection of records. Using the primary key, we can access the records. The type and frequency of access are determined by the type of file organization used for a given set of records.
  • File organization is a logical relationship among the various records. It defines how file records are mapped onto disk blocks.
  • File organization describes the way in which records are stored in blocks, and how the blocks are placed on the storage medium.
  • The first approach to mapping the database to files is to use several files and store records of only one fixed length in any given file. An alternative approach is to structure the files so that they can hold records of multiple lengths.
  • Files of fixed-length records are easier to implement than files of variable-length records.

Objective of file organization

  • Records should be selected optimally, i.e., as fast as possible.
  • Insert, delete, and update operations on the records should be quick and easy.
  • Duplicate records should not be introduced as a result of insert, update, or delete operations.
  • Records should be stored efficiently to minimize the cost of storage.

Indexing in DBMS
Indexing is used to optimize the performance of a database by minimizing the number of disk accesses required when a query is processed.
The index is a type of data structure. It is used to locate and access the data in a database table quickly.
Index structure:
Indexes can be created using some database columns.

The first column of the index is the search key, which contains a copy of the primary key or candidate key of the table. The values of the key are stored in sorted order so that the corresponding data can be accessed easily.
The second column of the index is the data reference. It contains a set of pointers holding the address of the disk block where the value of the particular key can be found.
Indexing Methods
Ordered indices
The indices are usually sorted to make searching faster. The indices which are sorted are known as ordered indices.

Example: Suppose we have an employee table with thousands of records, each of which is 10 bytes long. Suppose the IDs start with 1, 2, 3, and so on, and we have to search for the employee with ID 543.

In the case of a database with no index, we have to search the disk blocks from the start until we reach record 543. The DBMS will read the record after reading 543 * 10 = 5430 bytes. In the case of an index with entries of 2 bytes each, the DBMS will read the record after reading 542 * 2 = 1084 bytes, which is far less than in the previous case.
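
The arithmetic in this example can be written out as a small sketch. The 10-byte record size comes from the example; the 2-byte index entry size is inferred from the figure of 1084 bytes, and the function names are assumptions made for illustration.

```python
RECORD_SIZE = 10       # bytes per data record (from the example)
INDEX_ENTRY_SIZE = 2   # bytes per index entry (inferred from 542 * 2)


def bytes_read_without_index(target_id):
    # Scan the records one by one, up to and including the target.
    return target_id * RECORD_SIZE


def bytes_read_with_index(target_id):
    # Scan the index entries that come before the target's entry,
    # following the counting used in the example.
    return (target_id - 1) * INDEX_ENTRY_SIZE


print(bytes_read_without_index(543))
print(bytes_read_with_index(543))
```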
Primary Index
If the index is created on the basis of the primary key of the table, then it is known as primary indexing. Primary key values are unique to each record, so there is a 1:1 relation between index entries and records.
As primary keys are stored in sorted order, the performance of the searching operation is quite efficient.
The primary index can be classified into two types: Dense index and Sparse index.
Dense index
The dense index contains an index record for every search key value in the data file. It makes searching faster.
In a dense index, the number of records in the index table is the same as the number of records in the main table.
It needs more space to store the index records themselves. Each index record holds the search key and a pointer to the actual record on the disk.
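
A dense index can be sketched as a sorted list of (search key, pointer) pairs with one entry per record. In this illustration the "disk" is just a Python list and a pointer is a list position; the sample records and the lookup helper are assumptions made for the example.

```python
from bisect import bisect_left

# Hypothetical data file: records stored in arrival order.
data_file = [(309, "Meena"), (101, "Asha"), (543, "Kiran"), (205, "Ravi")]

# Dense index: one (search_key, pointer) entry for every record,
# kept in sorted order of the search key.
dense_index = sorted((key, pos) for pos, (key, _) in enumerate(data_file))
index_keys = [key for key, _ in dense_index]


def lookup(search_key):
    """Binary-search the index, then follow the pointer to the record."""
    i = bisect_left(index_keys, search_key)
    if i < len(index_keys) and index_keys[i] == search_key:
        _, pointer = dense_index[i]
        return data_file[pointer]
    return None


print(lookup(543))
print(lookup(999))
```

The index has exactly as many entries as the data file has records, which is the extra space cost described above.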

RAID (Redundant Array of Independent Disk)

RAID stands for redundant array of independent disks. It is a technology that connects multiple secondary storage devices for increased performance, data redundancy, or both. It gives you the ability to survive one or more drive failures, depending on the RAID level used.

It consists of an array of disks in which multiple disks are connected to achieve different goals.

RAID technology

There are 7 standard levels of RAID schemes: RAID 0, RAID 1, ..., RAID 6.

These levels contain the following characteristics:

  • It contains a set of physical disk drives.
  • In this technology, the operating system views these separate disks as a single logical disk.
  • In this technology, data is distributed across the physical drives of the array.
  • Redundant disk capacity is used to store parity information.
  • In case of disk failure, the parity information can help to recover the data.

Standard RAID levels

RAID 0

  • RAID level 0 provides data striping, i.e., data is spread across multiple disks. Because it relies on striping alone, if one disk fails, all data in the array is lost.
  • This level doesn’t provide fault tolerance but increases the system performance.

Example:

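| Disk 0 | Disk 1 | Disk 2 | Disk 3 |
|--------|--------|--------|--------|
| 20     | 21     | 22     | 23     |
| 24     | 25     | 26     | 27     |
| 28     | 29     | 30     | 31     |
| 32     | 33     | 34     | 35     |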

In this figure, blocks 20, 21, 22, and 23 form a stripe.

Instead of placing just one block on a disk at a time, we can place two or more blocks on a disk before moving on to the next one.

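| Disk 0 | Disk 1 | Disk 2 | Disk 3 |
|--------|--------|--------|--------|
| 20     | 22     | 24     | 26     |
| 21     | 23     | 25     | 27     |
| 28     | 30     | 32     | 34     |
| 29     | 31     | 33     | 35     |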

In the figure above, there is no duplication of data. Hence, a block once lost cannot be recovered.
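
Both layouts follow one simple rule, sketched below. The function name and parameters are assumptions made for illustration; block_index is the zero-based position of a block within the array, so the block numbered 20 in the figures has index 0.

```python
def disk_for_block(block_index, num_disks, blocks_per_unit=1):
    """Return the disk that holds a block in a striped (RAID 0) array.

    blocks_per_unit is how many consecutive blocks are placed on one
    disk before moving on to the next disk.
    """
    return (block_index // blocks_per_unit) % num_disks


# First figure: one block per disk at a time.
one_block_layout = [disk_for_block(b - 20, 4) for b in range(20, 28)]

# Second figure: two blocks per disk before moving on.
two_block_layout = [disk_for_block(b - 20, 4, 2) for b in range(20, 30)]

print(one_block_layout)
print(two_block_layout)
```

In the second layout, blocks 28 and 29 wrap around to Disk 0 again, which matches the third and fourth rows of the second figure.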

Pros of RAID 0:

  • In this level, throughput is increased because multiple data requests are probably not for the same disk.
  • This level fully utilizes the disk space and provides high performance.
  • It requires a minimum of 2 drives.

Cons of RAID 0:

  • It doesn’t contain any error detection mechanism.
  • RAID 0 is not a true RAID because it is not fault-tolerant.
  • In this level, failure of any disk results in complete data loss in the respective array.

RAID 1

This level is called mirroring of data as it copies the data from drive 1 to drive 2. It provides 100% redundancy in case of a failure.

Example:

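| Disk 0 | Disk 1 | Disk 2 | Disk 3 |
|--------|--------|--------|--------|
| A      | A      | B      | B      |
| C      | C      | D      | D      |
| E      | E      | F      | F      |
| G      | G      | H      | H      |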

Only half space of the drive is used to store the data. The other half of drive is just a mirror to the already stored data.

Pros of RAID 1:

  • The main advantage of RAID 1 is fault tolerance. In this level, if one disk fails, then the other automatically takes over.
  • In this level, the array will function even if any one of the drives fails.

Cons of RAID 1:

  • In this level, one extra drive is required per drive for mirroring, so the expense is higher.

RAID 2

  • RAID 2 consists of bit-level striping using Hamming code parity. In this level, each data bit in a word is recorded on a separate disk, and the ECC code of the data words is stored on a different set of disks.
  • Due to its high cost and complex structure, this level is not commercially used. The same performance can be achieved by RAID 3 at a lower cost.

Pros of RAID 2:

  • This level uses one designated drive to store parity.
  • It uses the hamming code for error detection.

Cons of RAID 2:

  • It requires an additional drive for error detection.

RAID 3

  • RAID 3 consists of byte-level striping with dedicated parity. In this level, the parity information is stored for each disk section and written to a dedicated parity drive.
  • In case of drive failure, the parity drive is accessed, and data is reconstructed from the remaining devices. Once the failed drive is replaced, the missing data can be restored on the new drive.
  • In this level, data can be transferred in bulk. Thus high-speed data transmission is possible.
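
| Disk 0 | Disk 1 | Disk 2 | Disk 3     |
|--------|--------|--------|------------|
| A      | B      | C      | P(A, B, C) |
| D      | E      | F      | P(D, E, F) |
| G      | H      | I      | P(G, H, I) |
| J      | K      | L      | P(J, K, L) |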

Pros of RAID 3:

  • In this level, data is regenerated using parity drive.
  • It provides high data transfer rates.
  • In this level, data is accessed in parallel.

Cons of RAID 3:

  • It requires an additional drive for parity.
  • It gives slow performance when operating on small files.

RAID 4

  • RAID 4 consists of block-level striping with a parity disk. Instead of duplicating data, RAID 4 adopts a parity-based approach.
  • This level allows recovery from at most 1 disk failure because of the way parity works. If more than one disk fails, there is no way to recover the data.
  • Level 3 and level 4 both require at least three disks to implement RAID.
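
| Disk 0 | Disk 1 | Disk 2 | Disk 3 |
|--------|--------|--------|--------|
| A      | B      | C      | P0     |
| D      | E      | F      | P1     |
| G      | H      | I      | P2     |
| J      | K      | L      | P3     |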

In this figure, we can observe one disk dedicated to parity.

In this level, parity can be calculated using an XOR function. If the data bits are 0, 1, 0, 0, then the parity bit is XOR(0, 1, 0, 0) = 1. If the data bits are 0, 0, 1, 1, then the parity bit is XOR(0, 0, 1, 1) = 0. That means an even number of ones results in parity 0 and an odd number of ones results in parity 1.

| C1 | C2 | C3 | C4 | Parity |
|----|----|----|----|--------|
| 0  | 1  | 0  | 0  | 1      |
| 0  | 0  | 1  | 1  | 0      |

Suppose that in the above figure, C2 is lost due to some disk failure. Then using the values of all the other columns and the parity bit, we can recompute the data bit stored in C2. This level allows us to recover lost data.
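
The parity calculation and the recovery of a lost column can be sketched in a few lines of Python. The function names are assumptions made for illustration, and None marks the bit that was lost.

```python
from functools import reduce
from operator import xor


def parity(bits):
    """XOR all data bits together to get the parity bit."""
    return reduce(xor, bits)


def recover(bits_with_gap, parity_bit):
    """Recompute the single missing bit (marked None) from the rest."""
    known = [b for b in bits_with_gap if b is not None]
    # XOR of the parity bit with all surviving bits equals the lost bit.
    return reduce(xor, known, parity_bit)


print(parity([0, 1, 0, 0]))           # first row of the table
print(parity([0, 0, 1, 1]))           # second row of the table
print(recover([0, None, 0, 0], 1))    # C2 lost in the first row
print(recover([0, None, 1, 1], 0))    # C2 lost in the second row
```

Recovery works for exactly one missing bit per row, which is why a single parity disk tolerates only one disk failure.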

RAID 5

  • RAID 5 is a slight modification of the RAID 4 system. The only difference is that in RAID 5, the parity rotates among the drives.
  • It consists of block-level striping with DISTRIBUTED parity.
  • Same as RAID 4, this level allows recovery of at most 1 disk failure. If more than one disk fails, then there is no way for data recovery.
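
| Disk 0 | Disk 1 | Disk 2 | Disk 3 | Disk 4 |
|--------|--------|--------|--------|--------|
| 0      | 1      | 2      | 3      | P0     |
| 5      | 6      | 7      | P1     | 4      |
| 10     | 11     | P2     | 8      | 9      |
| 15     | P3     | 12     | 13     | 14     |
| P4     | 16     | 17     | 18     | 19     |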

This figure shows how the parity block rotates among the disks.

This level was introduced to make the random write performance better.
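
The rotation shown in the figure can be generated by a short sketch. The placement rule used here (parity moves one disk to the left on each stripe, and data blocks continue on the disk after the parity block) is read off the figure; real RAID 5 implementations may use other layouts. The function name is an assumption made for illustration.

```python
def raid5_layout(num_disks, num_stripes):
    """Build a table of block labels, one row per stripe."""
    rows = []
    data_per_stripe = num_disks - 1
    for stripe in range(num_stripes):
        # Parity starts on the last disk and moves left each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = [None] * num_disks
        row[parity_disk] = f"P{stripe}"
        # Data blocks start on the disk after the parity block and wrap.
        for k in range(data_per_stripe):
            disk = (parity_disk + 1 + k) % num_disks
            row[disk] = str(stripe * data_per_stripe + k)
        rows.append(row)
    return rows


layout = raid5_layout(num_disks=5, num_stripes=5)
for row in layout:
    print(row)
```

Because every disk holds parity for some stripes, parity updates are spread over all disks instead of queuing on one dedicated parity disk.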

Pros of RAID 5:

  • This level is cost effective and provides high performance.
  • In this level, parity is distributed across the disks in an array.
  • It is used to make the random write performance better.

Cons of RAID 5:

  • In this level, recovery from a disk failure takes longer, as parity has to be calculated from all available drives.
  • This level cannot survive two concurrent drive failures.

RAID 6

  • This level is an extension of RAID 5. It uses block-level striping with two parity blocks per stripe (P and Q).
  • RAID 6 can survive 2 concurrent disk failures. With RAID 5 or RAID 1, when a disk fails you need to replace it promptly, because if another disk fails at the same time you will not be able to recover the data. RAID 6 covers this case: the array survives two concurrent disk failures before you run out of options.
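
| Disk 1 | Disk 2 | Disk 3 | Disk 4 |
|--------|--------|--------|--------|
| A0     | B0     | Q0     | P0     |
| A1     | Q1     | P1     | D1     |
| Q2     | P2     | C2     | D2     |
| P3     | B3     | C3     | Q3     |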

