Home > Research Essay > Distributed Database – Nested Transaction

Distributed Database – Nested Transaction

October 19th, 2006 Leave a comment Go to comments


Author: T. Y. Fam
Research Assignment

Distributed Database Systems (Advanced Topics in Distributed Systems)

Nested Transaction

In the mid 80’s, all record-keeping required database application. The users started by using the traditional record-keeping application, and then advanced to more complex applications such as Office Information System or Computer Aided Design. Information plays an important role in this sophisticated world. Without this information, we cannot know what is happening in or outside our country. Information is very important as a mean of communication in the business world. Since information is so vital to everyone, the only way to keep the information secure is by using record keeping. Record keeping can be done using applications; most people will use database applications because it will enhance the efficiency of their work.

There are two main types of database systems, centralized systems and decentralized systems; also known as distributed systems. The main advantage for centralized systems is the quality control. This is so as the database will only work within a system and it is also easily maintained by a small group of people. However, the centralized system is not a cost effective system due to the duplication of data. The storage cost for the centralized system will be high compared to a distributed system. The source of distributed systems will always be online because all the systems will not be offline at the same time. A good distributed system can reduce the duplication of data as well, thus reducing the cost for storage media within the network. The only disadvantage of the distributed system is the difficulty of keeping the database up-to-date, because the data can be located all over the world (Centralized versus distributed database systems, 1997, pg.54).

Nowadays, most of the people use nested transaction model, this model is used to satisfy the requirement of advanced database applications. What is nested transaction model? This model was originally introduced to increase transaction reliability in the distributed system. Transactions in advanced database application are considered long and complex. However, conventional flat transaction model cannot support all requirements in advanced applications. As a result, the basic nested transaction model has been extended and several new models have been added as well. So, the nested transaction model and its extensions are currently being used in most of the advanced database systems. A nested transaction is a hierarchy of sub transactions, whereby each sub transaction may consist of other sub transactions. A nested transaction can be defined as a collection of sub transactions that are combined together to form one atomic unit for execution (Hassanein, El-Sharkawi, 2000, pg.4).

A transaction is a series of processes executing on a database to perform any kind of calculation. Each of the transactions has to terminate at the end of process. There are only two possibilities for a transaction result, a failure or a success. When a transaction has failed, it needs to rollback or perform with any other action depending on how the system is designed. If the transaction is successful, it will be saved and will continue on to the next transaction. Transactions such as a flat transaction were widely used in current DBMS system. But the technology changes from time to time and requires very complex transaction calculation. The traditional flat transaction will not be powerful enough to handle all these new queries. In this stage, nested transaction is introduced to handle all the problems (Alkhatib, Labban, 2002, pg.95).

The concept of nested transactions is much more complicated than other systems. Therefore, it is impossible to avoid every single possible occurrence of the error. When an error occurs, we should have a fine transaction recovering system in the nested transactions. There are a few common failures that may happen in nested transaction such as transaction failure, node failure, system failure and media failure. Transaction failure often happens in large DBMS when the transaction program could not be saved correctly due to deadlock. If this problem happens in a small transaction, only a short amount of time is needed to recover the error or recovering during the same execution time when the transaction is executed. For a large transaction error, it may require a longer time to completely do a rollback-and-recover process for the lost transaction. Another type of failure is node failure, that is, one of the single systems not functioning. This kind of failure should only affect that particular failed process and thus reducing the problem area as much as possible. System failure means that the entire computer system crashed for some reason and needs to be rebooted. This case is similar to node failure, which can be handled without affecting other running processes in a distributed environment. Media failure refers to a disk or any other storage failure and thus the process needs to be recovered or rollback to the last known backup (Haerder and Rothermel, 1987, pp.239-248).

There are two main types of algorithms when dealing with recovery in a nested transaction. First algorithm is UNDO, or called recovering without save points. UNDO will wipe out all the remaining transactions and processes to the beginning, and do the transaction from the start. The second algorithm is recovering using save points. The main difference compared to the first algorithm is that it will add save points to the UNDO list. It allows for partial rollback which reduces the loss of work effort compared to the first algorithm. Each algorithm has two types or interfaces which are single call interfaces and conversational interfaces. In a single call interface, a successful transaction recovery will require the complete removal of the entire transaction made. However, this will affect all its relation either in process or saved data. When a transaction failure happens, the parent of that transaction will receive a message called ‘aborted’ rather than ‘committed’. After understanding the failure and error message, a new transaction should be created to recover the failed transaction (Haerder and Rothermel, 1987, pp.239-248).

In conversational interface, they have a better parent and child control system. When a transaction is being made between grandparent, parent and child, it makes sure no error or wrong information is passed to the grandparent level. If any transaction error occurs between transaction and child level, it may only need to rollback and recover within that transaction rather than rolling back the whole system (Haerder and Rothermel, 1987, pp.239-248).

The second recovery system uses save points for tracking the recovering point in the transaction. Save points is actually an upgrade or better add-on for UNDO. It allows failed transaction to be recovered at any stage which is needed, thus creating a greater flexibility for recovering in nested transaction. Save points are also known as restart points, when the transaction failed, it will try to locate the best stage and restart the transaction recovering procedure (Haerder and Rothermel, 1987, pp.239-248).

There are many well known transaction models in the world. A majority of them is the flat transaction model, which is a modification of basic transaction. Flat transaction is a collection of ACID properties which are Atomicity, Consistency, Isolation and Durability. Atomicity is a process to make sure all the transaction processes are in a single unit. Consistency is to control every single execution when transactions are being concurrently executed. The function of Isolation is to isolate each transaction to prevent them from reading the same database at the same time. Durability is to make sure everything is saved properly to reduce the risk of data loss (Bertino, Catania, Ferrari, 2001, pp.321-370).

The main problem of flat transaction is that there is no way to stop the transaction halfway. If the transaction is being forced to stop or abort, the whole transaction will be cancelled and restarted from the beginning. The following example will show the limitation of flat transaction. Mr. A is in Melbourne and wish to travel to Sandakan, a small town in Malaysia. There is no direct flight from Melbourne to Sandakan. Therefore Mr. A should fly to Singapore or Kuala Lumpur, then transit to Kota Kinabalu. The travel agent will start making the transaction and book the ticket from Melbourne to Kuala Lumpur and Kuala Lumpur to Kota Kinabalu. Soon, he finds out that there is no way to travel from Kota Kinabalu to Sandakan, but there is an available flight from Brunei to Sandakan. So what the travel agent needs to do is to change the route from Kuala Lumpur to Kota Kinabalu with Kuala Lumpur to Brunei and then Brunei to Sandakan. In a flat transaction model, the agent has to either rollback or cancels all the operations that were incorrect. However, both operations will require a massive rollback that will destroy all the previous efforts that have been made; these methods will be costly if the transaction is much longer than this given example (Bertino, Catania, Ferrari, 2001, pp.321-370).

The best method for this example is using a selective rollback, rather than throwing away all the previously-made transaction. By using selective rollback, it is possible to just cancel the flight from Kuala Lumpur to Kota Kinabalu and replace it with a flight from Kuala Lumpur to Brunei. To improve the function of a flat transaction, a new or good extension has to be implemented into the transaction system. One of the more popular and powerful system is the use of nested transaction. The characteristics of nested transaction as discussed before; has a different level that controls each single layer of transaction. Each transaction can be rollback individually without affecting the whole transaction itself (Bertino, Catania, Ferrari, 2001, pp.321-370).

There are three rules of nested transaction: commit rule, rollback rule and visibility rule. Bear in mind that there are parent and children layer in a nested transaction system. Parent will not commit until all its children or sub transactions are committed or aborted. Rollback rule in nested transaction are much more flexible compared to a normal transaction system. Parent level transaction will still be able to be processed even when the child level is having problems. Parent level can carry on by just issuing a new process to replace that aborted sub-transaction. However, if the parent level transaction is aborted, all the sub transactions should be terminated and rollback. All the processes made within sub transactions are visible to parent level whether their execution is completed or aborted (Bertino, Catania, Ferrari, 2001, pp.321-370).

To prevent any lost updates or uncommitted dependency problems, locking is a kind of concurrency control mechanism that can solve the entire problem. The basic algorithm is such that it will lock the object of the database to prevent the transaction to make any changes with it. By doing so, the working transaction will be able to continue its task until some stages that it wish to stop. Thus, the locked data will remain safe throughout the whole process. There are two types of locks: exclusive locks and shared locks. If transaction A uses the exclusive lock method to lock data, the other transaction that tries to access the data will not be immediately granted the access (Date, 2004, pp.465-479).

However, the shared lock mechanism functions differently. Transaction B will be able to access the data, but will also lock the data with the shared lock method. All these methods are able to solve most concurrency problems. Nonetheless, this method is not perfect – it has a problem called deadlock. The reason for using locking is to prevent different transactions accessing the same data and causing problems. But when two transactions are in a locked state, none of them will be able to read each other. This is known as a deadlock. When this problem happens, Date (2004, pp.465-479) claims that a good system will be able to detect it, but some other systems might require a timeout mechanism to detect a deadlock. This may not be accurate because sometimes a delay lock may not be an error but a human delay. When someone execute a transaction, he may use a long period to collect information from the customer. The system might consider this as a deadlock and do a rollback, thus this will cause some inconvenience for the staff and customer. When a deadlock has been detected, one of the transactions should be rolled back and the lock released, so that both transactions will resume normal execution (Date, 2004, pp.465-479).

Security is an important criterion to be focused on when dealing with nested transaction in a multilevel environment. All sub transaction has the same security level with the single nested transaction. A transaction does not have written permission; it cannot create any low-level sub transactions. A transaction does not allow creating any high-level sub transactions as well; otherwise it will lose all the advantages in a nested transactions environment. Basically, data access is based on two rules, L (d) <= L (T) and L (T) = L (d). These two formulas are based on two transactions d and T. L (d) <= L (T) show that the security level of T is greater than transaction d, therefore T can only be able to read the data. For L (T) = L (d), transaction T security level is equal to transaction d, it will then allow the transaction T to write the data. From these rules we understand that a lower level transaction cannot process any writing command to any transaction level on a higher level (Bertino, Catania, Ferrari, 2001, pp.321-370).

Most of the commercial DBMS systems are using 2PL protocol (Bertino, Catania, Ferrari, 2001, pp.321-370). The algorithm for this protocol is when a transaction is trying to read or write a particular data, it will first perform a lock on that data. They are read lock and write lock, read lock is applied when a transaction is reading the data and write lock being applied when trying to write the data. In an interactive manner, when a data does not have any lock on it, any transaction can apply a read-or-write lock on it. When the data has a write lock on it, no transaction is able to do a read or write lock on it. However multiple transactions can have read lock on the same data but not write lock. A write lock will only be applicable when the data is totally free with any lock. Otherwise, they need to wait until the transaction, that holds the lock, releases it. Within this 2PL protocol, transactions will need to go through two steps when locking and releasing a lock, which are growing phase and shrinking phase. To ensure the serializability for the transaction, no other lock can be held by a transaction after entering a shrinking phase. Bertino, Catania, Ferrari (2001, pp.321-370) mention that 2PL is not a good protocol for the current execution of nested transactions because of the bad timing channels in a multilevel environment. However, a better protocol called Bertino et al uses save points which is more secure for timing channels (Bertino, Catania, Ferrari, 2001, pp.321-370).

A new lock called signal lock is introduced in order to avoid the timing channels. In 2PL protocol, a transaction needs to apply a read lock when reading a data. But in Bertino et al protocol, the transaction can read the data without applying any locks. Signal lock is acquired by a transaction when it wants to access a lower security level data. A nested transaction is similar to a normal transaction method, which uses read, signal and write lock before accessing any data. The significant difference compared to normal transaction is that there is no shrinking process within the nested transaction. Nested transaction is too complicated to apply shrinking process. Therefore, when a process is done any lock will be automatically released to the upper level in a nested transaction rather than releasing the lock itself by using the shrinking phase as discussed before in 2PL transaction (Bertino, Catania, Ferrari, 2001, pp.321-370).

Locking rules for a nested transaction is significantly different compared to a normal transaction. If transaction A wants to apply a write lock to data B, the data must not have any lock held by other transactions. This is similar to non-nested transaction; however, nested transaction has a special rule. When all the transactions that hold either read or write lock on the data B are under the same tree of transaction A, A will still be able to apply a write or read lock on data A. Signal lock will only be applied if data A does not hold any write locks. When a transaction is complete, its parent will keep all the locks that have made by the transaction. If the transaction is aborted, all the locks will be released and will not have any relation with their parents. The reason for all the above rules is to prevent any delay of the data read by lower level transaction. Signal lock will not be blocked by any lower level transaction to lock a data. High-level transaction will have to wait if the data is locked by a write lock. This is to ensure that the high level transaction is reading the most updated version of the data (Bertino, Catania, Ferrari, 2001, pp.321-370).

The benefits of using nested transaction over flat transaction are immense; nonetheless, nested transaction is not perfect. There are still many problems which exist in certain areas, such as the complexity of coding structure. It is very complicated to implement a nested transaction system. A company might need to spend a very high cost to upgrade their system to a nested transaction system. Deadlock still happens in a nested transaction environment just that nested transaction is able to solve the problem flawlessly.

In conclusion, nested transaction model can bring many advantages to their users. Why do users still need to use the traditional transaction instead of using nested transaction model for making transaction and record keeping? Living in this sophisticated world, everyone must focus on nested transaction model in order to get the best out of database application. Nested transaction model has been introduced for a long time; it has been use in many large scale DBMS systems. Therefore, the use of nested transaction will bring more benefits to the transaction systems compared to normal flat transaction systems.

List of References

Alkhatib, G., Labban, R. S. (2002), “Transaction management in distributed database systems”, Journal of Information Systems Education, Vol.13, pg.95

Bertino, E., Catania, B., Ferrari, E. (2001), “A nested transaction model for multilevel secure database management systems”, ACM Transactions on Information and System Security (TISSEC), Vol.4, pp.321-370

“Centralized versus distributed database systems” (1997), Bioscience, Vol.47, pg.54

Date, C. J. (2004), An introduction to Database Systems, (8th edn), Pearson/Addison Wesley

Haerder, T., Rothermel, K. (1987), “Concepts for Transaction Recovery in Nested Transactions”, ACM SIGMOD, pp.239-248

Hassanein, H. S., El-Sharkawi, M. E. (2000), “Performance modeling of nested transactions in database systems”, IBM Centre for Advanced Studies Conference, pg.4

Categories: Research Essay Tags: