由事务隔离级别分隔的并发进程的死锁

时间:2011-08-23 13:36:06

标签: sql sql-server-2005 transactions deadlock sql-server-agent

我们有一张工作订单表。服务器代理程序作业从游标中的此表中获取100个条目并执行一些操作。要并行化这个,有10个服务器代理作业,它们调用以下过程(每个都有自己的@process_id):

CREATE PROCEDURE sp_do_workorder @process_id INT
AS
BEGIN TRY
    DECLARE @wo_id NCHAR(40),
        @wo_action NVARCHAR(100),
        @created_at DATETIME,
        @source_proc_name NVARCHAR(100),

    UPDATE procedure_ctrl SET [status]='running' WHERE [procedure]='sp_do_workorder_'+CAST(@process_id AS NVARCHAR(100)) AND [status]='idle'

    WHILE 1=1
    BEGIN
        IF NOT EXISTS (SELECT * FROM procedure_ctrl WHERE [procedure]='sp_do_workorder_'+CAST(@process_id AS NVARCHAR(100)) AND [status]='running') BREAK

        SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
        BEGIN TRANSACTION
            UPDATE workorder SET hash=CAST(@process_id AS NVARCHAR(100))
            FROM workorder x
            INNER JOIN (
                SELECT TOP 100 id FROM workorder WHERE hash='' AND workorder_step=0 ORDER BY created_at ASC
            ) y ON x.id=y.id
        COMMIT TRANSACTION
        SET TRANSACTION ISOLATION LEVEL READ COMMITTED

        DECLARE wo_cur CURSOR FAST_FORWARD FOR SELECT id,action,created_at,optin_source FROM workorder WHERE hash=CAST(@process_id AS NVARCHAR(100)) AND workorder_step=0 ORDER BY created_at ASC
        OPEN wo_cur
        FETCH NEXT FROM wo_cur INTO @wo_id,@wo_action,@created_at,@source_proc_name
        WHILE @@FETCH_STATUS=0
        BEGIN
            EXEC sp_basisprozess @wo_id,@wo_action,@created_at,@source_proc_name,@process_id
            FETCH NEXT FROM wo_cur INTO @wo_id,@wo_action,@created_at,@source_proc_name
        END
        CLOSE wo_cur
        DEALLOCATE wo_cur

        WAITFOR DELAY '00:00:01'
    END

    UPDATE procedure_ctrl SET [status]='idle' WHERE [procedure]='sp_do_workorder_'+CAST(@process_id AS NVARCHAR(100)) AND [status]='running'
END TRY
BEGIN CATCH
    EXEC dbo.sp_listerror
    DECLARE @error NVARCHAR(4000)
    SET @error='[sp_do_workorder]_'+CAST(@process_id AS NVARCHAR(100))+': critical problem'
    RAISERROR(@error, 12, 1)    
END CATCH

对于这10个代理作业中的大多数,我们经常遇到僵局。有没有人暗示为什么会这样?为了防止副作用,我们使用序列化事务隔离级别,因此只有一个代理作业可以获取一个工作订单条目。如果没有设置事务隔离,那么死锁就会消失,但经常发生,两个(或更多)代理作业会抓住相同的工作订单条目。

1 个答案:

答案 0 :(得分:2)

我没有尝试具体解决为什么在您的情况下会发生死锁但似乎您实际上正在使用表作为队列,在这种情况下请参阅this linked article以获取使用{的方法{1}}子句和锁定提示,以在没有死锁的情况下最大化并发性。

相关问题