Often you will come across a use case where you have lot of data in your database which has to be processed from multiple machines. There are few ways to do this
One master server which allocates batches or work to different slave machines. For example first 100k records in slave 1, second 200k records in slave 2..and so on. In this case the master is responsible for ensuring that the same record is not processed in two different slave machines. The disadvantage of this approach is, the master is now controlling the shots and if for some reason the master node fails then it becomes tedious to control. Also master has to keep track of what slave machines are online and keep checking whether they are available and processing.
However in Oracle this new functionality called for update skip locked is introduced. An example of how this works is
You have a table which contains 100 records. Another process is adding records to this table at regular intervals.
select * from <table> where <condition> for update skip locked;
Slave 1 issues this query which picks all the records which meets the condition given. For example all the 100 records are now picked up by slave 1 for processing. Please note the for update, locks those records so if you give any update query at this point in time, the update query has to wait for slave 1 to commit or roll back the transaction. But a select query will return all the 100 records.
Now if another process adds 20 records in the table so the count is now 120. If a normal select query issued at this moment from Slave 2, then all the 120 records will be fetched. However if the select query contains skip locked class added then only the records which are not locked by any other process is returned.
select * from <table> where <condition> for update skip locked;
This will only return the new added 20 records not the 120 records. So you can process the records from multiple machines and you dont have to allocate data ranges for different machines to process.
However this does have a catch. If you want to add rownum to the query then you have a problem. In the same example above, in the initial 100 records, if you select only the top 50 by adding rownum to the query like below.
select * from <table> where <condition> for update skip locked where rownum < 51;
It will return the first 50 records and if the Slave 2 issues the same query with the rownum < 51 , it will return zero rows. It looks like the Oracle adds the skip locked filter after applying row number filter. So it selects the first 50 by applying rownum and then it skips the locked records which returns only zero rows.
I am not sure is there any straight forward solution for this problem. For now, if you dont have to restrict your records using rownum concept then you can use skip locked for general purposes.
One master server which allocates batches or work to different slave machines. For example first 100k records in slave 1, second 200k records in slave 2..and so on. In this case the master is responsible for ensuring that the same record is not processed in two different slave machines. The disadvantage of this approach is, the master is now controlling the shots and if for some reason the master node fails then it becomes tedious to control. Also master has to keep track of what slave machines are online and keep checking whether they are available and processing.
However in Oracle this new functionality called for update skip locked is introduced. An example of how this works is
You have a table which contains 100 records. Another process is adding records to this table at regular intervals.
select * from <table> where <condition> for update skip locked;
Slave 1 issues this query which picks all the records which meets the condition given. For example all the 100 records are now picked up by slave 1 for processing. Please note the for update, locks those records so if you give any update query at this point in time, the update query has to wait for slave 1 to commit or roll back the transaction. But a select query will return all the 100 records.
Now if another process adds 20 records in the table so the count is now 120. If a normal select query issued at this moment from Slave 2, then all the 120 records will be fetched. However if the select query contains skip locked class added then only the records which are not locked by any other process is returned.
select * from <table> where <condition> for update skip locked;
This will only return the new added 20 records not the 120 records. So you can process the records from multiple machines and you dont have to allocate data ranges for different machines to process.
However this does have a catch. If you want to add rownum to the query then you have a problem. In the same example above, in the initial 100 records, if you select only the top 50 by adding rownum to the query like below.
select * from <table> where <condition> for update skip locked where rownum < 51;
It will return the first 50 records and if the Slave 2 issues the same query with the rownum < 51 , it will return zero rows. It looks like the Oracle adds the skip locked filter after applying row number filter. So it selects the first 50 by applying rownum and then it skips the locked records which returns only zero rows.
I am not sure is there any straight forward solution for this problem. For now, if you dont have to restrict your records using rownum concept then you can use skip locked for general purposes.
No comments:
Post a Comment