Database Error Detection and Recovery
CowboyRobot writes "ACM Queue has an interview by Steve Bourne with Bruce Lindsay, responsible for a lot of the SQL and RDBMS we use today, in which they discuss error detection and recovery.
My favorite part other than the photos is the definition of Heisenbugs - those problems that disappear only when you explicitly look for them."
Now where's that article again? (Score:1)
Where is it?
Article Text (Score:1)
Engineering for Failure
If you were looking for an expert in designing database management systems, you couldn't find many more qualified than IBM Fellow Bruce Lindsay. He has been involved in the architecture of RDBMS (relational database management systems) practically since before there were such systems. In 1978, fresh out of graduate school at the University of California at Berkeley with a Ph.D. in computer science, he joined IBM's San Jose Research Labo
Re:Now where's that article again? (Score:1)
Rite of Passage (Score:5, Interesting)
Re:Rite of Passage (Score:2, Interesting)
I have been dealing recently with a Heisenbug in Internet Explorer while trying to design a web page with floats. (Web designers, weep with me.) The trouble is that a certain page renders wrong (what I think is wrong), the first time you look at it after opening Internet Explorer, and then displays correctly every time you look at it afterward, even with 'refresh'.
And yes, I really do have to design it for Internet Explorer.
Also, early on in the development of the page, I was encountering a similar situat
Re:Rite of Passage (Score:1, Informative)
The trouble is that a certain page renders wrong (what I think is wrong), the first time you look at it after opening Internet Explorer, and then displays correctly every time you look at it afterward, even with 'refresh'.
That sounds like one of these bugs [positioniseverything.net]. I've had even worse - all the text on the page disappearing, but minimising and then maximising the window fixes it! Internet Explorer really is a piece of shit.
Re:Rite of Passage (Score:1)
Re:Rite of Passage (Score:2)
(Note: I don't actually think Firefox is a piece of shit)
Re:Rite of Passage (Score:2)
My least favourite in Internet Explorer is that if you have a site that is delivered chunked via mod_gzip and press refresh, IE decides that you only wanted the last chunk!
Re:Rite of Passage (Score:3, Interesting)
Re:Rite of Passage (Score:2)
Re:Rite of Passage (Score:5, Informative)
For stuff like this, a wonderful debugging tool is valgrind [kde.org] -- it takes about 5 minutes to download and install (GPL, Linux/x86), and will find all kinds of memory-usage bugs in your program that you never even knew existed.
Re:Rite of Passage (Score:2)
Debugging code is harder than writing code. If you write the most complex code you are capable of, you are by definition not smart enough to debug it.
Re:Rite of Passage (Score:2)
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
more here [dotgeek.org]
Re:Rite of Passage (Score:2)
-
Re:Rite of Passage (Score:3, Interesting)
Re:Rite of Passage (Score:4, Interesting)
Re:Rite of Passage (Score:2, Interesting)
if (1=0)
print("The end of the world...");
Of course, another senior pro
Re:Rite of Passage (Score:2, Insightful)
As well he should have. If you need something like that to stay, you need a comment explaining its purpose. "The following statement is never executed but necessary to work around a compiler bug" would be helpful. You could even describe the bug so they could check if it's still necessary once the next version of the compiler is released.
Re:Rite of Passage (Score:2)
The proper endgame mindset should be, "Does this change (or more generally, this action) help me ship the product? If not, I won't do it."
Re:Rite of Passage (Score:2)
So for example, Perl initializes variables to 'undef' by default. I am already aware of this without setting the value explicitly. How do you justify setting the value to undef explicitly?
this is to test a bug in slashcode (Score:3, Funny)
Re:this is to test a bug in slashcode (Score:2)
Heisenbugs (Score:5, Funny)
This is a really cool article, but it was especially fun to see the heisenbug [jargon.net] mention. Years ago, some fellow CS people and I conjectured a similar phenomenon that seemed to manifest once in a while, in which a computer malfunction goes away after one "proves" that there's no cause for the error to exist.
Here's a list of heisenbug anecdotes [c2.com], but note that some of these submissions aren't strictly heisenbugs.
Re:Heisenbugs (Score:5, Funny)
Then someone figured out that the system had the 'pipes' screensaver on that came with NT3.51. Of course, as soon as we started to diagnose the machine, the screensaver would disappear. And yes... the screensaver turned out to be the culprit, sucking all the system resources away. We removed it and all was well.
Does anyone know who coined the term 'heisenbug' by the way?
Re:Heisenbugs (Score:1, Informative)
Re:Heisenbugs (Score:1)
Re:Heisenbugs (Score:2)
Re:Heisenbugs (Score:2)
Yeah, like Linux admins are all Einsteins.
It's even worse than on Windows - while on Windows a screensaver had to be liked by a particularly stupid admin in order to be enabled, on Linux it was enabled by the manufacturer - how fucking stupid is that?
As recently as Red Hat 8.0, they had the screensaver activated by default, and it was known to cause system freeze-ups. Not to mention that X-Windows shouldn't have been installed by default anyway.
It's inexcusable to be stupid, but how stup
Re:Heisenbugs (Score:2)
Re:Heisenbugs (Score:2)
My point is that a stupid admin is a stupid admin and will always be a stupid admin no matter what OS he or she uses.
I know some stupid admins who used to be stupid NT admins and their company has switched to Linux lately. They're now stupid Linux admins. Same people. There's no need to single out Windows users.
Re:Heisenbugs (Score:2)
A 3D screensaver on a server? Now that's the sort of innovative shit that Linux needs to succeed in the marketplace ;)
(Yes, I know about Xscreensaver [jwz.org])
Re:Heisenbugs (Score:2)
Re:Heisenbugs (Score:2)
I think what you're describing is a corollary to the well-known fact that when the support person is investigating the problem it never manifests again until after they're gone.
Proving there is no cause for the error is like all of the passengers believing the plane will fly (the real
Re:Heisenbugs (Score:1)
Some programs die, but some programs live, I guess.
KFG
The Heart Monitor Case... (Score:4, Funny)
SB: But also in the heart monitor case, it's hard to ask users if they want to keep the heart going because the answer is pretty obvious, whereas in the Word case, you can ask the user in some cases what to do about it.
New Microsoft Pace - Heart Monitor and Pacemaker
STOP: 0x0000000A (0x0000015a, 0x0000001c, 0x00000000, 0x80116bf4)
IRQL_NOT_LESS_OR_EQUAL - Beat.exe
Please hold your breath while a dump file is created...
Re:The Heart Monitor Case... (Score:1)
Technology where the screen and the user become the same color. Now that's bonding.
Re:The Heart Monitor Case... (Score:2)
Picture (Score:5, Funny)
Re: (Score:1)
Re:Picture (Score:2)
That's what it looks like when the beard/pipe/suspenders crowd groom themselves. Sort of like an aged flower child version of Bill Gaines [allaboutcomix.com] a la R. Crumb.
Re:Picture (Score:1)
Re:Picture (Score:3, Informative)
Re:Picture (Score:2)
Oh dis!
That reminds me, I'm only a week or so from a mullet. And I can't have that happen, I don't want people thinking I'm MacGyver with glasses.
Heisenbugs - Oh my gawd (Score:1, Interesting)
It's like the equipment is playing hide and seek with you. You found the problem and the
Re:Heisenbugs - Oh my gawd (Score:2)
Your sig (Score:2)
This is why MySQL ignites flamewars (Score:5, Insightful)
A good design principle is: either do what you're told to do or tell us you didn't do it and why, but don't do something completely different.
Exactly. Compare and contrast with MySQL's behaviour [sql-info.de].
That's why there are loads of people who point out that you can't trust MySQL for important data, or that it isn't a "real" database. A real database tells you when it fails, which is something that is necessary for trusting it with data integrity.
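To make the quoted principle concrete, here is a minimal sketch using Python's built-in sqlite3 module (SQLite, not MySQL). The table `codes` and the helper `store_code` are made up for illustration. The idea is that an oversized value gets refused with a reason rather than quietly cut down to fit.

```python
import sqlite3

# Hypothetical table: the CHECK constraint turns an oversized value into an
# error instead of something to be silently truncated.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (code TEXT NOT NULL CHECK (length(code) <= 5))")


def store_code(code):
    """Return None on success, or the database's reason for refusing."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO codes (code) VALUES (?)", (code,))
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


ok = store_code("abcde")          # fits, so it is stored as given
refused = store_code("abcdefgh")  # too long, so nothing is stored
stored = [row[0] for row in conn.execute("SELECT code FROM codes")]
```

After this runs, `stored` holds only the value that fit, and `refused` carries the database's own explanation for the one that did not.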
The key point here is that if you go to sea with only one clock, you can't tell whether it's telling you the right time.
Ahh... but a man with one clock always knows the time - but a man with two is never quite sure :).
A man with one clock only thinks he knows the time (Score:1, Funny)
Re:A man with one clock only thinks he knows the t (Score:1, Funny)
Of course, both clocks could be totally broken but have been set to the same (unchanging) time by some obsessive-compulsive person "tidying up". Then the clocks are each right twice a day but not when you think.
Re:This is why MySQL ignites flamewars (Score:1, Interesting)
All data is imperfect without proper reference, and all reference data is external. See the art of calibration.
Even a room full of atomic clocks will not reflect the correct time in 100 years because the earth-sun system isn't slave to the rhythms of any process other than it (at least in any quantifiable sense). They could be several
Re:This is why MySQL ignites flamewars (Score:2, Insightful)
Any second now, the parent is going to be modded down, right? Everything noted in the parent is well documented functionality in MySQL, which takes the approach of not generating application-breaking exceptions, and allowing you to split off data validation as either a pre-processing step or a more macroscopic endeavor.
You may not like this approach. If you don't, then don't use MySQL. There are lots of other (very
Re:This is why MySQL ignites flamewars (Score:4, Insightful)
Everything noted in the parent is well documented functionality in MySQL
Well documented, perhaps, but nevertheless utterly wrong and often in violation of the SQL specifications.
Slashdot doesn't need your redundant and off-topic flames.
Try reading the article. I was pointing out that MySQL's behaviour goes 100% in opposition to what the article calls a "good design principle". How on earth is that off-topic?
Re:This is why MySQL ignites flamewars (Score:1, Interesting)
You're right that pre-processing is a good way to do constraints. We gave up on using Oracle constraints, because they aren't very featureful. Well, that's not Oracle's fault. It's the fault of the SQL standard. You can do *much* better error checking with a Turing-complete programming language than you can with SQL. That's what we do with our applications even though we use Oracle. The biggest problem is that
Re:This is why MySQL ignites flamewars (Score:1, Insightful)
It's funny how that idiot isn't smart enough to argue the philosophical differences
Before calling me and everybody who agrees with me names, go back and read what I posted. Most of it's a case of asking MySQL to do something, MySQL being unable to do it, and instead of throwing an error, doing something completely different.
Whether you prefer your data integrity checks in your database or in your application is irrelevant to the point I was making. The person claiming that MySQL's trying not to gen
Re:This is why MySQL ignites flamewars (Score:2, Troll)
Which you obviously don't...
If your database has one user.
Re:This is why MySQL ignites flamewars (Score:1)
It would be nice if such an option was a system setting (or database-scope setting) somewhere. However, I rarely see apps where truncating numbers to fit is considered a good thing. COBOL does that and it has been the source of many bad bugs, such as w
Re:This is why MySQL ignites flamewars (Score:2)
To some degree or another, essentially everything. Are you aware of any systems which have absolutely no errors, including errors of omission?
An incorrect value being sneaked into a large database is far, far harder to detect and correct than your query coming back with an error.
True. If you attempt to put a gallon of worms into a half-pint container, you should expect troubles. Assuming that the table has any variable-length column, there
Re:This is why MySQL ignites flamewars (Score:2)
> not generating application-breaking exceptions, and allowing you to split off data validation as
> either a pre-processing step or a more macroscopic endeavor.
Failing to provide exceptions to obvious errors is now 'well documented functionality'? Really? Exactly where? I remember when MySQL listed most of these issues as 'known problems'.
I think you're just misinterpreting the cause of these e
Re:This is why MySQL ignites flamewars (Score:2, Funny)
Re:This is why MySQL ignites flamewars (Score:1, Troll)
Re: (Score:1)
Good god, that picture! (Score:4, Funny)
Re:Good god, that picture! (Score:1, Funny)
That was my first thought too! Unfrozen, caveman database genius.
Too much slashdot. (Score:2)
Perhaps
Re:Too much slashdot. (Score:2, Insightful)
Heisenpages (Score:2, Funny)
Make error message meaningful! (Score:5, Insightful)
One of the things that is addressed to some extent in the article is the need to make error messages meaningful! There is nothing more frustrating to me than to encounter an error message like "syntax error."
At a minimum, an error message should have a Unique ID of where in the code this message is coming from, what was expected, what was actually found, and the context where it was found.
EXAMPLE:
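Which would you prefer:
1. Syntax error in line 1.
2. ERROR [ID=WXY1234] found "'" where expected """ in statement: "{printf "%d\n', i}" on line: 1.
In my experience, meaningful error messages save more debugging time than it takes to put them in.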
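For what it's worth, a message with those four pieces (unique ID, what was expected, what was found, and the context) takes very little code to produce. A rough Python sketch follows; the function name, the ID "E0042" and the sample statement are all invented for illustration.

```python
def format_error(error_id, expected, found, context, line):
    """Build one message from the ID, the expectation, the finding and the context."""
    return (
        f"ERROR [ID={error_id}] expected {expected} but found {found} "
        f"in statement: {context} on line {line}"
    )


# Hypothetical example: an assignment with nothing on the right-hand side.
message = format_error("E0042", "an expression", "';'", "x = ;", 3)
print(message)
```

The ID only earns its keep if it is unique per reporting site, so that searching the source for it lands on the exact line that raised the complaint.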
Re:Make error message meaningful! (Score:1, Interesting)
Re:Make error message meaningful! (Score:1, Funny)
Re:Make error message meaningful! (Score:2)
Re:Make error message meaningful! (Score:4, Insightful)
I hand crafted a (simple) C compiler when an undergrad, and figuring out where the stream of good tokens turns to mush is very hard. Often by the time you realize there's a problem, you already missed the real problem.
I agree you should be as explicit and precise as you can in telling the user, but there are so many ways to screw things up, and they look so much like unusual-but-legal syntax that it's probably better to tell the user / developer what you actually do know, rather than guess about what might have been wrong.
Now, on the other hand, if your statement was
the compiler should probably be able to tell that the equals operator needs an operand of some kind on the right, and there was none. It ought to tell you immediately that the problem was a missing right hand operand for the equals operator, and it should be able to tell you the exact position of the equals that is missing the operand. Just spitting out "syntax error" in a case like that is a little weak.
Re:Make error message meaningful! (Score:2)
Re:Make error message meaningful! (Score:1, Interesting)
1. Syntax error in line 1.
2. ERROR [ID=WXY1234] found "'" where expected """ in statement: "{printf "%d\n', i}" on line: 1.
I would prefer an error message indicating the real source of the problem, an unterminated string literal. What you list as the 2nd option doesn't describe the problem that the compiler runs into when trying to compile that code.
I submit to you option #3:
3. ERROR [ID=123123] Unterminated string literal in line 1
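As a rough illustration of how a scanner could arrive at option #3, here is a Python sketch. It is deliberately simplified: it assumes only double-quoted, single-line literals with backslash escapes, and the function name is made up. It is not how any particular compiler does it.

```python
def find_unterminated_string(source):
    """Return (line, column) of the first double-quoted literal that is not
    closed on its own line, or None if every literal is closed.

    Simplifying assumption: strings are double-quoted, never span lines,
    and a backslash escapes the next character inside a string.
    """
    for line_no, text in enumerate(source.splitlines(), start=1):
        start = None  # 1-based column of the currently open quote, if any
        i = 0
        while i < len(text):
            ch = text[i]
            if start is not None and ch == "\\":
                i += 2  # skip the escaped character inside the string
                continue
            if ch == '"':
                start = i + 1 if start is None else None
            i += 1
        if start is not None:
            return line_no, start
    return None


bad = find_unterminated_string('printf "%d\\n\', i')
good = find_unterminated_string('printf "%d\\n", i')
if bad is not None:
    print(f"ERROR [ID=123123] Unterminated string literal in line {bad[0]}, column {bad[1]}")
```

Reporting the column where the literal opened, instead of where the scanner finally gave up, is what points the user at the real problem.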
Re:Make error message meaningful! (Score:2)
Some languages do have support for error detection (Score:2)
Heisen-whats? (Score:2)
Are there Heisen-features as well?
Re:Heisen-whats? (Score:4, Funny)
Re:Heisen-whats? (Score:2)
Or is that another aspect of duality where it's really both at the same time?
Heisensoftware (Score:2)
Exception Handling (Score:3, Interesting)
What a nightmare.
Many people code unique inserts like this.
Check for duplicate record.
if not found, then insert.
else, prompt user.
Using exception handling, you code like this.
insert.
if error thrown, prompt user.
One less query, lots less code.
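A sketch of the second style, using Python's built-in sqlite3 module rather than ColdFusion (the table and function names are invented for the example). It assumes the table has a UNIQUE constraint, since that is what lets the database itself detect the duplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The UNIQUE constraint is what makes the database reject a duplicate.
conn.execute("CREATE TABLE users (name TEXT NOT NULL UNIQUE)")


def add_user(name):
    """Attempt the insert directly; return False if the database refuses it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        # Constraint violation (here: a duplicate name); the caller would
        # prompt the user at this point.
        return False


first = add_user("alice")
second = add_user("alice")
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Note that `sqlite3.IntegrityError` covers any constraint violation, not only duplicates, so code that needs to tell them apart has to look closer at the error.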
One problem, the web application language treated all db errors as fatal. When asked, I was told this was by design.
Thinking about it, I feel that Macromedia didn't want me to code efficiently. You don't sell extra ColdFusion servers if you can offload all your data logic to the SQL server. (Where it belongs)
Re:Exception Handling (Score:1)
I don't think that is the case. ColdFusion can capture DB errors. For example, it has a statement something like <cfCatch errorType="database">....
Whether it always works or not is perhaps another matter.
Re:Exception Handling (Score:2)
A simple example.
Insert record
If duplicate record error then update record instead
if other error throw exception
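In a language where a database error is an ordinary, catchable exception, those three steps might look roughly like this. It is Python with the built-in sqlite3 module; the `stock` table and `save_qty` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER NOT NULL)")


def save_qty(item, qty):
    """Insert the row; if the key already exists, update it instead."""
    try:
        with conn:
            conn.execute("INSERT INTO stock (item, qty) VALUES (?, ?)", (item, qty))
        return "inserted"
    except sqlite3.IntegrityError:
        # Treated as "duplicate key": fall back to an update. If the update
        # fails too, that exception reaches the caller.
        with conn:
            conn.execute("UPDATE stock SET qty = ? WHERE item = ?", (qty, item))
        return "updated"
    # Any other sqlite3.Error is deliberately not caught here.


r1 = save_qty("bolt", 5)
r2 = save_qty("bolt", 7)
qty = conn.execute("SELECT qty FROM stock WHERE item = 'bolt'").fetchone()[0]
```

Only the one expected failure is handled; everything else is left to propagate as a real error.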
Now, a duplicate record would throw a fatal error to coldfusion. I could then catch that error in CF, but now I have to code for 2 possible outcomes, a fatal error that isn't fatal, and a fatal error that is fatal. At that point, you might as well split it int
Re:Exception Handling (Score:2)
"Check for duplicate record.
if not found, then insert.
else, prompt user."
There's a race condition in this- a window between the checks and the insert.
So you could get an error anyway- a DB error if there's a unique constraint, data corruption if there isn't.
What you could do is lock (prevent other checks and inserts), then check, then insert if the record would be unique. But the lock could significantly affect performance, depending on the locking and the database.
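The window can be shown with two sessions on the same database. This is only a sketch using Python's sqlite3 module and a throwaway file, with the interleaving forced by hand rather than by real concurrency; all names are invented.

```python
import os
import sqlite3
import tempfile


def exists(conn, name):
    """The 'check for duplicate record' step."""
    row = conn.execute("SELECT 1 FROM users WHERE name = ?", (name,)).fetchone()
    return row is not None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    a = sqlite3.connect(path)  # two sessions on the same database file
    b = sqlite3.connect(path)
    with a:
        a.execute("CREATE TABLE users (name TEXT NOT NULL UNIQUE)")

    # Both sessions run their check before either one inserts.
    a_sees_free = not exists(a, "alice")
    b_sees_free = not exists(b, "alice")

    with a:
        a.execute("INSERT INTO users (name) VALUES ('alice')")

    try:
        with b:
            b.execute("INSERT INTO users (name) VALUES ('alice')")
        b_result = "inserted"
    except sqlite3.IntegrityError:
        b_result = "rejected by unique constraint"

    a.close()
    b.close()
```

Both checks pass, yet the second insert still fails, so the error handler is needed whether or not the check is there. Without the UNIQUE constraint the second insert would simply succeed and leave a duplicate row.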
As you sa
Fix it like this.... (Score:2)
reindex
restore backup.....
repeat
Ahah! (Score:1)
Java does exactly what Bruce wants (Score:4, Insightful)
I bet he didn't look into Java. Java (at least) allows and enforces that. A method will only throw an exception if it declares that it does. A caller is forced to provide appropriate handlers or to declare that it throws the exceptions not handled at its level. If a method can throw A, B or C but gets D during its execution, it has to in some way map D to either A, B or C (or not throw an exception at all).
Of course, I am talking here about checked exceptions. Unchecked exceptions are supposed to represent *bugs*, and nobody should be trying to capture those.
The sad thing is that even seasoned Java programmers do not understand how to write code w.r.t. exception handling. And beginners are usually turned off by the verbosity required by exception handling, so it is usual to see code where people capture (because they are forced by the language) and ignore exceptions (because they are too lazy and/or stupid to understand the consequences).
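The "map D to A" step is not specific to Java. Python has no checked exceptions, so nothing below is enforced by the language, but the translation itself can be sketched like this (the exception class and the data are invented for the example):

```python
class ProfileNotFound(Exception):
    """The one failure load_profile() documents (hypothetical name)."""


_profiles = {"alice": {"theme": "dark"}}


def load_profile(name):
    """Return the stored profile, or raise ProfileNotFound."""
    try:
        return _profiles[name]
    except KeyError as exc:
        # The low-level KeyError plays the role of "D"; callers only ever see
        # the documented exception, with the original kept as its cause.
        raise ProfileNotFound(f"no profile named {name!r}") from exc


found = load_profile("alice")
try:
    load_profile("bob")
    caught = None
except ProfileNotFound as err:
    caught = err
```

Chaining with `from exc` keeps the original error available as `caught.__cause__`, so the mapping does not throw away diagnostic detail.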
Re:Java does exactly what Bruce wants (Score:5, Insightful)
In my Java code I'm pretty paranoid about catching exceptions and handling them in as intelligent a way as I can, and even so I've run into plenty of situations where there's really no good way to recover from an underlying error and I end up just repackaging the exception into a higher-semantic-level one and tossing it upstream, where the upstream code does the same thing, all the way back out to the UI code, which displays an error message. At which point all I've achieved is cluttering up the intermediate layers of code with useless exception handlers when I could have gotten exactly the same effect by just catching a superclass exception in the UI code and displaying the same error message. (In addition to catching any specific exceptions that would cause a different result, of course.)
Most likely anyone who's written a Java app of any appreciable size has run into exactly the same thing. In theory, and in small sample snippets of code, checked exceptions seem great. In practice, even some experienced Java gurus find them more hassle than they're worth. I'm quite certain that over the years I've spent far more time writing code to handle checked exceptions than they've saved me in debugging or diagnosis time. That to me is not the sign of a helpful language feature.
Re:Java does exactly what Bruce wants (Score:2)
When a seemingly-impossible thing happens, I wrap it in a RuntimeException. Something like this:
Re:Java does exactly what Bruce wants (Score:1)
or WORSE:
But it's a good mechanism, and just bad programmer practice.
Cool! (Score:2, Funny)
ps: not a troll, this guy's a freakin genius. I hope I look like that in 20+ years.
Re:Cool! (Score:1)
Heisenbugs (Score:2)
ROFL! (Score:2)
BL: In the heart monitor case, you better keep the heart going, whereas in the Microsoft Word case, you can just give them a
Oh my heart! (Score:1)
But blue screens probably cause a lot of stress heart-attacks, so that the end result is the same.
Has language in CS matured? (Score:3, Funny)
- "You asked me to do X, I didn't do it."
- "Aha, this seems like I should go further."
- "Oh, I see this as one of those really bad ones."
- "I'm going to initiate the massive dumping now."
Obviously he is an expert in his field but I'm not sure if he talks this way because of his personality or because there isn't a vocabulary big enough to describe it.
Would you imagine a medical doctor talking this way?
- "So the white blood cells fight with the cancer cells: die evil cell, die!!"
Or an engineer:
- "The little peg asks its big brother: can you help me convert this energy into circular motion?"
Re:Has language in CS matured? (Score:4, Interesting)
In fact, the correlation is so strong that I am suspicious of folks who *cannot* boil an arbitrarily complex interaction into an easily understood metaphor.
Jargon in no way denotes true understanding.
Re:Has language in CS matured? (Score:2, Interesting)
Applying that idea to coding, it would mean that talking like that in relation to code (.. so I will ask the other component
Schroedinbugs (Score:2)
The on-line hacker Jargon File, version 4.1.0
Only on slashdot... (Score:2)
Not everybody reads the same obscure material as you do. You're SOOO up-to-date. Do you watch television as well?