Thursday, March 29, 2012

error reportings -- to except or not expect or return?

You have seen a chair in a public room and go back after a while to acquire it. Is it unexpected that the object is not there anymore?

I implement the business logic in stored procedures. Once user orders some service, the actions carry out are: a)check they have sufficient funds on their balance; and, b) a price check is made selecting the latest from pricelist. The newly created job entry refers the price in the pricelist table according to service_id and the issued pricelist date.

These are data validity checks, which must be done in a multiuser database. In the session beginning, user gets prices and funds. These data are not locked throughout the session -- another process may change the records. Therefore, the extra "business" checks are needed to preserve data correctness at the start of every transaction. One way to report the errors would be to raise an error.

Consider the primary-foreign key violations. They are raised by system with severity level of 16. Uniqueness violation is raised with severity 14. The MSSQL reference http://msdn2.microsoft.com/en-us/library/ms164086.aspx tells us that this is a sort of error, which can be corrected by user. Indeed, (s)he can. But the exceptions are unexpected situations. Exception processing consumes huge CPU processing power if some occurs. Normally, programs allocate resources and use them. If resource cannot be allocated -- app exits with exception. It just has nothing to do. However, there is nothing exceptional in shared data access by multiple concurrent processes in a multithread/multiuser/distributed application. Since, we cannot lock the user resources during user session, we should be prepared to concurrent data changes. Therefore, I find the exception raising inappropriate tool.

Consider a situation, when one user is assigned to a group. Nothing prevents from another admin to remove the group at the same time. Therefore, our user management application should be ready to expected "group does not exist" "error" result. I even disagree that this is an error, since it is perfectly valid to have such a situation. It is like you have seen a chair in a room and later return to take it. But the chair is not there anymore. Is it unexpected? The case should not be exceptional, IMO.

So we have:
1. Severity 1-10 "informational messages";
2. Severity 11-16 errors; and
3. RETURN error_code.

Which should be used? Anything else? Do my considerations make sense?

When I want the SQL Code to with certainly 'talk' back to the application code, I use RAISERROR.

For SQL Procedures to 'talk' with each other, I may also use the RETURN value.

I may use 'WITH LOG' to force the error condition into the Application Log.

And you are wise to question what is an error condition. Is it a 'system' error, a 'logic' error, or a 'business rule' exception?

|||Thank you, Arnie for sharing your experience.


> And you are wise to question what is an error condition. Is it a 'system'
> error, a 'logic' error, or a 'business rule' exception?

I believe that the "business logic" is an oxymoron likewise "market order", which is chaos and disorder in fact. Nevertheless, I do not distinguish between "rules" and "logic" here. It is logical that user balance should not go below zero, it is logical that the price must be the latest one, it is logical that user must belong to an existing group.

I understand the exceptions in a serial programing where your threads allocate (lock) resources but not in distributed programing where you cannot be sure about current condition of resources. It is ok for the local copies of the data to be incoherent in multiprocessor system. When data become incoherent, they may cause logic/rules/constraint/etc violations.

Some recommend checking file existence before opening it. But another process may remove the file between the check and open op. System returns "file does not exist" open error. It does not investigate whether the thread checked the file existence or it is making a reckless open. In other words, it does not examine the intentions and conditions of the caller. One utility would, for instance, set up admin rights for a list of users. This 'setadmin' app assumes the group 'admins' exists. It has nothing to do if there is no such group. It is exceptional situation for the app. Another application requests available groups, uses one and sends result back. Multiuser application should be prepared to situation where the used group does not exist anymore. A more realistic is a stock with multiple operators. When you have selected the list of available goods into a local copy, the goods can be removed by another operator.
You should tell the user that the product list is outdated, that it is incoherent with reality rather than complaining that the operation cannot be accomplished.

Looks like, error handling in distributed systems is more general than a DB errors issue. But any advices are welcomed.
|||

Valentin,

Note that I didn't use the term 'business logic', but instead used 'business rule exception' -and yet I agree with you that often business needs sometimes seem to defy logic; and that when describing the condition of a market, 'market order' could easily be a oxymoron, but when used as a entity, 'market order' has a definitive 'thing' quality about it.

This, and your other posts on Locking behavior demonstrate a deep concern for data integrity, and I can certainly appreciate your questions. When there are opportunities for instability, instability will occur. Stasis can only exist when ignoring temporality. Yet, in fact, Stasis cannot be separated from temporal considerations. The certainity of a state of data is increasingly uncertain as the system becomes more complex -distributed, multiprocessor, and multi-threaded. We often resort to extreme control when faced with apparant chaos. As TS Elliot would describe it, the time 'between' is the shadow, and we cannot ignore the shadow, for all things unknown exists in the shadow.

I once worked on a project where, for speed, data was cached on each web server in a very large web farm, and yet many data items were unique and ONLY one customer would be allowed to purchase that one item. There could be thousands of concurrent customers considering the same item. It was necessary to create a lot of communication between the cach objects, tenative 'sold' indicators, and definitive 'sold' indicators, 'cancellation', etc. You can imagine the chaos -but it did have a solution.

Since, as you put it so well, there are opportunities for chaos between the moment we determine IF an action can be taken AND the moment the action is actually taken, AND the 'system' doesn't help 'protect' us from that potential chaos, we are faced with having to find ways to invoke 'extreme control'. So like forcing all passengers through the security checkpoint as a single point of control, it may be useful to consider how to force all sensitive data activity through a 'single point of control'.

One useful approach is to force all data changes through Stored Procedures (No direct table access). Then in the Stored Procedure used to UPDATE a set of resouces in a TRANSACTION, also use sp_getapplock / sp_releaseapplock as a way to more tightly control access to the resources.

Here is an example: (Use Northwind database)

Code Snippet


CREATE PROCEDURE dbo.Employees_U_LastName
( @.EmployeeID int,
@.LastName varchar(20)
)
AS
BEGIN
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRANSACTION


DECLARE @.LockResult int

EXECUTE @.LockResult = sp_getapplock
@.Resource = 'ThisIsMyProcess-StayOut',
@.LockMode = 'Exclusive',
@.LockTimeout = 0

IF @.LockResult <> 0
BEGIN
ROLLBACK TRANSACTION
PRINT 'SomeOne Else is Using this Resource'
RETURN
END


PRINT 'TRANSACTION IS Active'
SELECT
EmployeeID,
LastName,
FirstName
FROM Employees
WHERE EmployeeID = @.EmployeeID


PRINT 'Waiting...'
WAITFOR DELAY '00:00:15'

PRINT 'Second UPDATE to Active TRANSACTION'
UPDATE Employees
SET LastName = @.LastName
WHERE EmployeeID = @.EmployeeID


EXECUTE sp_releaseapplock
@.Resource = 'ThisIsMyProcess-StayOut'


COMMIT TRANSACTION


PRINT 'Waiting for Other Process to Complete'
WAITFOR DELAY '00:00:05'

PRINT 'CHECK Results of TRANSACTION'
SELECT
EmployeeID,
LastName,
FirstName
FROM Employees
WHERE EmployeeID = @.EmployeeID


END
GO

Then execute this from two different connections.

Code Snippet


EXECUTE dbo.Employees_U_LastName
@.EmployeeID = 1,
@.LastName = 'Davolio-Jones'

In the second connection, you will get the message: 'SomeOne Else is Using this Resource'. But you have to be adamant that there is only one 'egress' point to the resources -through the procedure.

Perhaps this will help you along on your quest to find a secure and reliable method to control data

|||

Sometimes maybe it doesn't matter if the 'problem' is a 'system' error, a 'logic' error, or a 'business rule' error -the 'problem' has to be handled, and in the case at hand, they may all be handled the same. It is still very useful to keep clarity about the differences between them. For at times, the effect of treating them the 'same' is confusing at best.

valentin tihomirov wrote:

I do not distinguish between "rules" and "logic" here. It is logical that user balance should not go below zero, it is logical that the price must be the latest one, it is logical that user must belong to an existing group.

Logic follows defined mathematical sylogisms.

1 = 2

2 = 3

Therefore 3 = 1

is based on logic. That balance MUST be >= 0 is a Business decision (busines rule) solely BECAUSE [ bal < 0 ] is mathematically valid -but in this situation, this business has determined to not allow that to occur. A bank 'should' not allow an account to overdraw (balance >= 0 ) at all times, in other words, as the trustee for your money, the bank 'should' not give others money claiming that it is your money if you don't have money to give out. However, the banking industry decided that there was money to be earned by allowing overdrafts, and you can't charge for an overdraft UNLESS the balance is allowed to be less than zero, so the business rule changed, and balance can be less than zero ( to some limit ).

Logic is easy to code, business rules, subject to change, require a lot more effort and consideration.

Of course, I realize that you know what I'm talking about. I'm using your posting to help others that may be reading, and still learning, and still uncertain... (So I hope you accept the 'lecture' in that guise. I often respond to a posting with the realization that I may also be responding to other readers with similar -but not exactly the same, questions.)

|||

For clarification, in the demonstration procedure I posted earlier, I used a PRINT statement when checking if @.LockResult. Normally, I would raise an error so that the calling application (or procedure) can capture the error number and react accordingly.

First, create the Error in the database, (Use whatever Error Number ( > 50000 ) and message you wish.)

Code Snippet

EXECUTE sp_addmessage
@.msgnum = 51001,
@.severity = 16,
@.msgtext = N'Resource NOT Available',
@.lang = 'us_english',
@.replace = REPLACE

Then in the procedure:

Code Snippet

IF @.LockResult <> 0
BEGIN
ROLLBACK TRANSACTION
RAISERROR ( 51001, 16, 1 )
RETURN
END

Now the application (or calling procedure) can handle ErrorNumber 51001 as deemed necessary. Wait a few seconds and Re-Submit perhaps...

|||

Arnie Rowland wrote:

One useful approach is to force all data changes through Stored Procedures (No direct table access).

SPs are really how I'm trying to do it. Thanks for hinting about sp_getapplock. It looks like a 'lock' object (mutex, critical section) in conventional programming language. However, I think it is too coarse-grain. All the users who wish to change their last names will lock each other. In addition, this lock shell be taken by parties who affect the last names in different way (removing the record, for instance). And you get the whole database locked/synchronized by one big lock. The granularity of automatic transactional locks are one record. However, using full db object names like db_name.table_1.row_id.field_x we can outperform the locking granularity down to field finesse. These are my first ideas about the procedure. Thanks for the hint.

Arnie Rowland wrote:

Logic follows defined mathematical sylogisms.

I was learnt it is the math, which is a science of proving, which is based on logic But I think I can understand the difference between logic and rules now. Thank you for the explanation.

Arnie Rowland wrote:

Of course, I realize that you know what I'm talking about. I'm using your posting to help others that may be reading, and still learning, and still uncertain... (So I hope you accept the 'lecture' in that guise. I often respond to a posting with the realization that I may also be responding to other readers with similar -but not exactly the same, questions.)


That is ok, I am a communist. The sharing is a means of saving resources and a condition of intelligent life survival on the planet Earth.

|||

You're right, using sp_getapplock is kind of 'course-grained'; sp_getapplock creates a 'lock' object, using the name you provide. The only thing 'locked' is the entry to the sproc code. The included (and required) TRANSACTION locks the underlying data. However, blocking others from using the sproc code in practice creates a single 'gateway' through which all controlled activity must pass, effectively forcing serialization of the controlled activity. I would want to make the process as streamlined and efficient as possible in order to cause the minimal amount of queue stacking.

In a very high performance / high utilitzation system, using sp_getapplock just may prove to be to much of a 'bottleneck'. But, if one was concerned about reading into an active TRANSACTION, and the effects on other activities as a result of being able to read into an active TRANSACTION, sp_getapplock is one way to reduce the 'paranoia'. It may be too 'heavy-handed' for most use.

...deleted...

Keep up the good questions.

No comments:

Post a Comment