My personal definitive guide to (Java) Exceptions
Premises:
- I mostly compare C and Java, as my two main examples of a language without and with exceptions. You can swap in your favourite two languages; I’m not picking on C in particular.
- The C and Java snippets in this article don’t necessarily compile; consider them pseudocode that makes a point.
- I know exceptions predate Java. I’m just talking about the language I’m most proficient with.
- Excuse my English errors, it’s not my native language.
- Everything here is my opinion: when I say “something should be this way”, it is a shortcut for “I think something should be this way”.
Exceptions are one of the most poorly understood concepts in Java. Even experienced professionals can struggle with them, and IMHO few people use them properly, mostly because there’s a huge amount of misinformation about the “proper” way to use them, which makes their usage far less effective than it could be. I have the feeling that a lot of the hatred programmers have for exceptions originates from their first days with Java. We’ve all been hit hard by the “unreported exception ... must be caught or declared to be thrown” compiler error in those days: you couldn’t write the simplest method without being hampered by it. With time you learn your way around it, but the unpleasant feeling about exceptions remains. And that’s a bad thing, because exceptions are really neat.
I’ll go as far as to say that you can tell how good a Java programmer is by looking at how they use exceptions.
Why exceptions
In a time before Java or C++, there was C. C didn’t offer any kind of error handling support: you could only handle errors with a custom solution built from regular language tools, mostly function return values (or sometimes pointer parameters). Things looked like this:
There’s no standard, though:
- some libraries would return 0 on success and some error code on failure
- others would return 0 on failure and non-0 on success
- others need to mix the error status with the actual business return value, so maybe a function could return NULL on failure and a valid pointer on success.
All in all, it was a mess. A typical videogame setup function, for example, would look like this:
This shows a very realistic mix of the error handling mechanics you have to deal with. Also note that, besides devoting the whole return value to error handling (and so basically giving up proper functions altogether), you had no easy way to DESCRIBE your error, other than the fact that it happened. Thus even uglier solutions, like global “errcode” variables (the standard library’s errno being the classic example), were used to pass more information.
So basically, in C, FOR EVERY FUNCTION OR API CALL, YOU NEED TO MANUALLY CHECK FOR ERRORS AND HANDLE OR RECOVER FROM THEM in a proper way. You really have no other option. Since code in general is mostly calls to functions and APIs (besides some assignments and arithmetic), you have two choices: either litter your code with error handling boilerplate everywhere, or don’t handle errors properly. At worst, any screenful of C code could be something like 70% error handling and 30% actual code (numbers that I just made up, obviously).
Additionally, in some cases the return value doesn’t offer “free slots” to devote to error handling (the way a NULL pointer does), because the range of valid return values covers the entire domain of the return type. For example, something as simple as an arithmetic function that takes two integers and returns an integer has no way to report an error to the caller, such as an integer overflow, other than horrors like adding a third parameter or a global variable.
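To make the contrast concrete, here is a minimal Java sketch of the same situation. Math.addExact is a standard library method; the class and method names (SafeAdd, add) are made up for the example. The failure travels outside the return type, so the whole int range stays available for results:

```java
class SafeAdd {
    // Every int is a valid sum, so no return value is left over to mean "error".
    // The failure is signalled with an exception instead.
    static int add(int a, int b) {
        return Math.addExact(a, b); // throws ArithmeticException on overflow
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
        try {
            add(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```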
And don’t forget to throw in resource management, that is, resources you allocate and need to deallocate before returning. If you allocated a working buffer (for example to manipulate a string), you’d better remember to deallocate it not just at the end of the function, but at every error check where it’s required:
As you can see, error handling blocks get bigger and more complex with each resource you acquire. Obviously the code could be refactored a bit, but there’s only so much you can clean up, because resources must be deallocated properly, in reverse order, and only those actually acquired.
Another problem in C is that it’s impossible to tell how a function reports an error just by looking at its signature (or sometimes even from a quick look at the source), so you have to resort to checking the docs. There’s no contract in the signature telling you how errors are managed. Java actually tried to add error declarations to method signatures, which had its own problems (see checked exceptions later).
Your C programs were (are, if you still code in it) a continuous jump between business code and error handling code. Since error handling is a manual task you constantly have to do, it’s easy to forget it every now and then, especially for newbies, leaving silent time bombs waiting in your code. What you want to concentrate on is business logic, not error handling. Which, by the way, is the same reasoning that brought about automatic memory management and garbage collection (“you want to concentrate on the business logic, not on memory management”).
Personally I consider this one of the biggest deal breakers for using C (besides maybe the lack of modularity and namespacing), which would otherwise certainly still have its uses and raison d’être.
So this is the first and only reason exceptions were invented:
- To free the programmer from having to write boilerplate error handling code ON EVERY SINGLE FUNCTION CALL.
That’s it. Keep this in mind.
In doing that, they provide some added value, like unifying and formalizing error handling mechanics and relegating error handling code to its own separate block.
How they work
Well, each language with exceptions has its own flavour, but some general things are common to all. The main one is a way to signal an error (outside the bounds of a function signature) that interrupts the current flow and transfers it to an error handling routine, along with some information about the error. Now, if it were just a matter of calling a handler function, it could just as well be done in C. The problem is where to go after the handler is done: you cannot simply resume execution as after a regular interrupt handler. So designers devised a way to specify just that, by letting you define a “catch” block, which is the jump target for any error that happens inside its try block (even in nested calls). Catch blocks are not “linear” like the rest of the code; rather, the call stack gets unwound up to the closest “catch” block. Handling cuts across the call stack, and this is what lets you write a single catch-all handler with proper semantics.
But let’s look at the example from above. In Java it could be something like:
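A minimal sketch of such a setup routine, with hypothetical step names (initVideo, initAudio, loadAssets) and a simulated failure in the audio step:

```java
import java.util.ArrayList;
import java.util.List;

class GameSetup {
    static void initVideo(List<String> log) { log.add("video ok"); }

    // Simulated failure, to show where control goes when a step breaks.
    static void initAudio(List<String> log) {
        throw new IllegalStateException("no audio device");
    }

    static void loadAssets(List<String> log) { log.add("assets ok"); }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        try {
            // Business code: a plain sequence of calls, no checks in between.
            initVideo(log);
            initAudio(log);
            loadAssets(log);          // skipped once initAudio() throws
            log.add("game started");
        } catch (IllegalStateException e) {
            // Error handling lives in one place and gets a description of the error.
            log.add("setup failed: " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```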
I think nobody would disagree that this is a huge improvement. Business code is not cluttered with error handling, error handling is nicely relegated to its own (small) block, and readability is dramatically improved.
BUT where exceptions really shine is when you have nested calls, i.e. method A calls method B, which calls method C.
If you program in C, error handling cuts across function calls: the main function has to handle it, the A function has to handle it, and so do B and C. Each will have its own ifs and/or early returns and the usual boilerplate.
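For comparison, here is a minimal Java sketch of such a chain. The methods a, b and c are hypothetical; only the top-level caller has a catch, and the intermediate methods contain no error handling at all:

```java
class NestedCalls {
    // Deepest call: may throw NumberFormatException on bad input.
    static int c(String raw) {
        return Integer.parseInt(raw);
    }

    // b and a know nothing about errors; exceptions pass straight through them.
    static int b(String raw) {
        return c(raw) * 2;
    }

    static int a(String raw) {
        return b(raw) + 1;
    }

    // The single catch site for the whole chain.
    static String handle(String raw) {
        try {
            return "result: " + a(raw);
        } catch (NumberFormatException e) {
            return "bad input: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("20"));
        System.out.println(handle("twenty"));
    }
}
```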
In Java you can think about the whole call chain and decide which is the best point to put a catch block (typically the “main” of whatever logic you’re programming, but more on this later), and that’s it. Sub-functions A, B and C never need to handle errors in any way. You might raise an eyebrow here, because after all your code is full of “catch” blocks and error handling. And this is the first, huge and most important bit of “misinformation” about exceptions:
YOU ARE OFTEN ENCOURAGED TO WRITE CATCH BLOCKS, WHICH IS WRONG.
There are two reasons for this. The first one is mostly human: if you don’t place catch blocks, it doesn’t feel like you’re properly dealing with errors. They “escape”! If you write a service function, like “downloadFile” or “produceReport”, and it doesn’t contain catches, your boss will look at it and say “hey, you forgot to handle exceptions, that’s a rookie mistake”. It’s a cultural thing that’s also encouraged by the compiler itself, which brings us to the second reason: the compiler actually makes it hard to do exceptions right because of a thing called checked exceptions.
Suppose you have to write a function “downloadDocument” that takes a URL. It’s from an interface your boss at work gave you to implement. Its signature could be:
byte[] downloadDocument(String url) throws CompanySpecificException
It throws a checked exception that your company’s code usually uses internally.
Good, you start implementing it:
The problem is, it doesn’t compile, because “download” actually throws an IOException. You face the dilemma all other Java programmers faced before you: what to do with an unhandled exception. Here’s where most developers go wrong. They usually add a catch statement and somehow “manage” the exception. But that’s wrong, because 90% of the time that is not the right place to put a catch.
THE RIGHT THING TO DO WHEN FACING AN UNHANDLED EXCEPTION IS TO PROPAGATE IT, NOT TO CATCH IT.
Now, and this is super important, the right thing to do here is to add IOException to the throws declaration list. This is the second bit of misinformation: touching the throws clause is treated like a taboo that only the gurus at your workplace are allowed to even think about. This is WRONG: all exceptions thrown by the code inside a method should be reported in the throws clause to let them “bubble up” to the caller. This is what exceptions are all about (see point #1). If you’re adding custom code to handle an exception (when you have no business handling it), you’re going against their purpose. You’re adding the boilerplate code that exceptions are meant to free you from.
But back to our example: you’re a good programmer and you go on and add the exception to the throws clause, modifying the interface and risking the ire of your bosses:
byte[] downloadDocument(String url) throws CompanySpecificException, IOException
But now you have another problem: the call site where your “downloadDocument” is used doesn’t compile any more, because it’s inside another method (prepareReport()) that only throws CompanySpecificException. You want to make things right, though, so you get even bolder and change prepareReport(), adding IOException to its declared exceptions. But oops, it turns out prepareReport() is also used by a totally different branch of the company software, managed by a different team. What now?
Are you seeing the picture now? Many people will tell you checked exceptions are bad, but few will give you the right reason.
THE REAL REASON CHECKED EXCEPTIONS SUCK IS BECAUSE THEY INTERFERE WITH THE NATURAL BUBBLING OF EXCEPTIONS.
Facing the wrath of TWO different development teams is too much even for the bravest of coders, so you quickly revert your changes and do the next best thing:
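A minimal sketch of that workaround, where download() is a hypothetical stand-in that always fails and CompanySpecificException is assumed to have a (message, cause) constructor:

```java
import java.io.IOException;

// Hypothetical checked exception, standing in for the company one in the text.
class CompanySpecificException extends Exception {
    CompanySpecificException(String message, Throwable cause) {
        super(message, cause);
    }
}

class DocumentService {
    // Stand-in for a real network call; it always fails to keep the example self-contained.
    static byte[] download(String url) throws IOException {
        throw new IOException("connection refused: " + url);
    }

    static byte[] downloadDocument(String url) throws CompanySpecificException {
        try {
            return download(url);
        } catch (IOException e) {
            // Passing e as the cause keeps the original stack trace reachable.
            throw new CompanySpecificException("Error downloading document", e);
        }
    }

    public static void main(String[] args) {
        try {
            downloadDocument("http://example.invalid/doc");
        } catch (CompanySpecificException e) {
            e.printStackTrace(); // the trace includes a "Caused by:" section for the IOException
        }
    }
}
```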
You catch the exception and throw a different one you are allowed to throw (in this case CompanySpecificException, but in the worst case of a method with no throws declaration, it would be RuntimeException). If you’re a good programmer, you wrap the original one inside the new one (chaining them), so that a stack trace will reveal the real error location. If you don’t, and just do this:
throw new CompanySpecificException("Error downloading document");
then the stack trace will end there, leaving you clueless about the true origin of the error. This is a very common mistake, and I often see my coworkers make it, because the whole thing is error prone. If they’re really terrible or really new, they may even commit THE WORST ATROCITY IN THE EXCEPTION WORLD: eating up the exception silently:
catch(Exception e)
{
// nothing or perhaps e.printStackTrace();
}
I’m sure you’ve seen this around. This is the worst thing because the error is silently ignored, and something else will break later on, reporting an exception and a stack trace that have nothing to do with the REAL cause of the error. Hello debugging!
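Here is a minimal sketch of how that plays out, with made-up names (readConfig, loadSilently, configLength). The real problem is an IOException, but what finally surfaces is a NullPointerException thrown somewhere else:

```java
import java.io.IOException;

class SwallowDemo {
    // Hypothetical loader that always fails, to keep the example self-contained.
    static String readConfig() throws IOException {
        throw new IOException("config file missing");
    }

    static String loadSilently() {
        String config = null;
        try {
            config = readConfig();
        } catch (IOException e) {
            // Swallowed: the real cause of the failure is lost right here.
        }
        return config;
    }

    static int configLength() {
        // Fails later, with an exception that says nothing about the missing file.
        return loadSilently().length();
    }

    public static void main(String[] args) {
        configLength();
    }
}
```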
Note that this whole error-prone “catch game” you have to play to circumvent checked exceptions is totally DEVOID OF ANY ADDED VALUE: you only have to do it because of the flawed checked exception system. It’s pure boilerplate and goes directly against the #1 reason exceptions exist (see point 1 above).
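Even IDEs often get it wrong. For example, when Eclipse finds an unhandled exception, it offers you two quick fixes: “Add throws declaration” and “Surround with try/catch”.
Notice that, correctly, adding it to the throws declaration is the first choice, but if you decide to surround it, you get a catch block containing only a “// TODO Auto-generated catch block” comment and a call to e.printStackTrace().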
To a distracted programmer, or one in a hurry, this now compiles and so it’s “right”. Except it is not: the exception is eaten! Yes, you have the TODO reminder, but I have thousands of them in my workplace codebase, and I can tell you they’re not very effective at getting developers’ attention later. A much more sensible default would be:
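Something along these lines, sketched here around a plain Reader (the class and method names are made up for the example):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class SurroundTemplate {
    static int firstChar(Reader in) {
        try {
            return in.read();
        } catch (IOException e) {
            // Rethrow unchecked, keeping the original exception as the cause.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println((char) firstChar(new StringReader("A")));
    }
}
```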
At least now if the programmer just mindlessly uses it as-is, it will properly handle the exception and chain it so that it reports the correct origin.
So the main takeaway here is the following:
THE PROPER WAY TO HANDLE EXCEPTIONS IN JAVA SHOULD BE TO DO ABSOLUTELY NOTHING AND LET THEM BUBBLE UP TO THE PROPER CATCH SITE.
Really, we have a wonderful system that solves an ugly problem in the best possible way: by doing nothing at all. And we waste it.
Again, this is hindered by the checked exception system. The problem with it is that it is too restrictive. I still want the “throws” declaration, but it should be used as a “contract” that documents which kinds of errors a function can throw, so the caller can distinguish them and deal with them in different ways. It shouldn’t prevent other exceptions from being thrown, or require the caller to explicitly declare them too. For example, a method like this:
public void socketConnect(…) throws HostNotFoundException, TimeoutException, NoNetworkException
would be helpful for devs: they can differentiate the behaviour properly if they need to, but propagating the exception should still be the main thing.
In theory, entire service libraries should be writable without a single “catch” block (or with very few of them), because catch blocks belong to the caller, to the end users of a service.
You could say: hey, but if I use my fancy library like FancyExcel, I expect its methods to throw FancyExcelException, right? Well, for a time it was like that (just look at Jasper’s JRException for an example), but now people are coming to their senses and either just throw RuntimeException, or simply declare the standard exceptions (IOException etc.) as needed, with no wrapping. For example, Gson has its own JsonParseException, but it extends RuntimeException. And it still declares it in the throws clause! Which is about the best you can do now: https://www.javadoc.io/doc/com.google.code.gson/gson/latest/com.google.gson/com/google/gson/JsonParser.html
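A minimal sketch of that style, with hypothetical names (ReportParseException, checkPercent). Because the exception is unchecked, the throws clause acts as documentation only, and callers are not forced to catch or redeclare anything:

```java
// Hypothetical library exception; extending RuntimeException makes it unchecked.
class ReportParseException extends RuntimeException {
    ReportParseException(String message) {
        super(message);
    }
}

class Percentages {
    // The throws clause documents the failure mode without constraining callers.
    static int checkPercent(int value) throws ReportParseException {
        if (value < 0 || value > 100) {
            throw new ReportParseException("Percentage out of range: " + value);
        }
        return value;
    }

    // No throws clause and no catch: the exception simply bubbles up through here.
    static int sum(int... values) {
        int total = 0;
        for (int v : values) {
            total += checkPercent(v);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(10, 20, 30));
    }
}
```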
If you think about it, wrapping an exception just to throw one with your custom company/library name really has limited use, and it makes stack traces harder to read with all the chaining. If a library never does its own “catch”es, the stack traces generated within it will be clear, linear and straight to the point, just as they were intended to be. Chaining exceptions also pollutes the final error message. Have you ever seen something like “java.lang.RuntimeException: fancy.library.FancyLibraryException: java.io.IOException: java.foo.bar.FileNotFoundException: unable to find file”? I have.
Actually, catch-and-rethrow can be useful in some cases, especially in long procedures where a single “No data found” exception could come from many different systems. Something like “Error retrieving client detail: No data found” is certainly more helpful for a first assessment. The stack trace would still be the main evidence in both cases, though.
Exceptions let you handle errors properly by doing absolutely nothing, so throw them at will, but only catch them where you absolutely need to.
A JAVA PROGRAM SHOULD HAVE PLENTY OF THROWS BUT VERY VERY FEW CATCHES.
I’ll go as far as to say that the quality of a Java program could be measured by the number of “catch” blocks it has (or at least, that could be one of the metrics).
So which are good places for “catch”es then?
Catches should happen at the top of a logical “unit”, where not catching would propagate the exception up to the end user or exit the whole batch program. If you have understood exceptions, where to put catches should come pretty naturally; anyway, some good places would be:
- Just before returning from a user-initiated action: if the user pushes a “print report” button, which (hopefully asynchronously) calls printReport(), then printReport() is a good place to put a catch.
- When looping over a list of items to process, where a problematic one shouldn’t block the whole process. For example, if you’re parsing a file containing lines with stock information, you may want to skip the lines that give errors and report them at the end, while processing all the others. Then you put a catch inside the loop and handle the error there.
- In an event system, the main event loop would certainly have a catch to capture the errors of single events without terminating itself.
- A servlet needs a catch to capture problems with page generation (or with any “action” initiated) and show a proper error message to the user.
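The loop case from the list above can be sketched like this. The “SYMBOL,quantity” line format and all names are assumptions made for the example; the parsing method contains no error handling, and the only catch sits inside the loop:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StockImport {
    // Assumed line format: "SYMBOL,quantity". Throws on anything else.
    static int parseQuantity(String line) {
        String[] parts = line.split(",");
        return Integer.parseInt(parts[1].trim());
    }

    // Processes every line; bad ones are collected instead of aborting the run.
    static int importAll(List<String> lines, List<String> rejected) {
        int total = 0;
        for (String line : lines) {
            try {
                total += parseQuantity(line);
            } catch (RuntimeException e) {
                rejected.add(line + " (" + e + ")");
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> rejected = new ArrayList<>();
        int total = importAll(Arrays.asList("ACME,10", "broken", "INIT,x", "FOO, 5"), rejected);
        System.out.println("total quantity: " + total);
        System.out.println("rejected lines: " + rejected);
    }
}
```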
What to show to the end user?
Certainly, one important question is what to show to the user when an exception happens. Well, you have many options here, and it also depends on what you’re programming and who your typical user is. But one thing should be clear: exceptions are a programmer’s tool, not an error reporting tool. Showing a raw exception message or a stack trace is rarely the right thing to do. For one, they expose the internal architecture of the system: they can tell a hacker which language, libraries, frameworks etc. you’re using, and so are a security vulnerability. Don’t be the application that greets its users with a raw stack trace.
The safest thing is to show the user a generic message and divert the actual error and stack trace to the logging system. Another option is to use custom subclasses of RuntimeException and add extra fields (yes, you can), like error codes or user messages.
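Both options can be combined in a top-level handler. The sketch below is only one possible shape: UserFacingException, ErrorPresenter and the message texts are all made up, and java.util.logging stands in for whatever logging system you actually use:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical unchecked exception carrying extra fields meant for the end user.
class UserFacingException extends RuntimeException {
    private final String errorCode;
    private final String userMessage;

    UserFacingException(String errorCode, String userMessage, Throwable cause) {
        super(errorCode + ": " + userMessage, cause);
        this.errorCode = errorCode;
        this.userMessage = userMessage;
    }

    String getErrorCode() { return errorCode; }
    String getUserMessage() { return userMessage; }
}

class ErrorPresenter {
    private static final Logger LOG = Logger.getLogger(ErrorPresenter.class.getName());

    // Full details go to the log; the user only ever sees a safe message.
    static String toUserMessage(Throwable t) {
        LOG.log(Level.SEVERE, "Unhandled exception", t);
        if (t instanceof UserFacingException) {
            UserFacingException u = (UserFacingException) t;
            return u.getUserMessage() + " (code " + u.getErrorCode() + ")";
        }
        return "Something went wrong. Please try again later.";
    }

    public static void main(String[] args) {
        System.out.println(toUserMessage(new IllegalStateException("db password rejected")));
        System.out.println(toUserMessage(
                new UserFacingException("E42", "The report is not available", null)));
    }
}
```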
A word about resource management
As said before, resource management goes hand in hand with exception management. Freeing a resource is something you have to do no matter what, and without help it would force you to intercept exceptions just to clean up. For this reason, exceptions usually come equipped with a “finally” block, which lets you execute code whether or not an exception was thrown (code which can, in turn, raise an exception, but that is another can of worms). Since I don’t use “catch” much, what I usually do is use the try/finally variant, like this:
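A minimal sketch, using a made-up DemoConnection class in place of a real resource. There is no catch at all: the exception still propagates to the caller, but the connection is closed on the way out:

```java
// Hypothetical resource that remembers whether it has been released.
class DemoConnection {
    boolean open = true;

    void send(String message) {
        if (message.isEmpty()) {
            throw new IllegalArgumentException("empty message");
        }
    }

    void close() {
        open = false;
    }
}

class TryFinally {
    static void sendAndClose(DemoConnection connection, String message) {
        try {
            connection.send(message);
        } finally {
            // Runs on success and on failure; the exception keeps bubbling up afterwards.
            connection.close();
        }
    }

    public static void main(String[] args) {
        DemoConnection connection = new DemoConnection();
        try {
            sendAndClose(connection, "");
        } catch (IllegalArgumentException e) {
            System.out.println("failed: " + e.getMessage() + ", still open: " + connection.open);
        }
    }
}
```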
It has the unfortunate “feature” that variables declared in the try scope are not available in the finally block, forcing you to move declarations up, which is more boilerplate. Some help comes from the semi-new “try-with-resources” construct, but it needs explicit support from the resource itself (it must implement AutoCloseable). What I’ve seen in other languages, and liked a lot, is delegating acquisition and release to the resource class itself and passing it a lambda expression/closure with the code to execute on the resource. In Java it could be something like this:
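A minimal sketch, assuming a hypothetical TextFile wrapper and a small ReaderAction callback interface (neither exists in the JDK; the callback type is needed because java.util.function.Function cannot throw IOException):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class TextFile {
    // Hypothetical callback type that is allowed to throw IOException.
    interface ReaderAction<T> {
        T apply(BufferedReader reader) throws IOException;
    }

    private final Path path;

    TextFile(Path path) {
        this.path = path;
    }

    // Opens the file, runs the caller's code, and always closes the file.
    <T> T with(ReaderAction<T> action) throws IOException {
        BufferedReader reader = Files.newBufferedReader(path);
        try {
            return action.apply(reader);
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, Arrays.asList("first line", "second line"));
        String first = new TextFile(tmp).with(reader -> reader.readLine());
        System.out.println(first);
        Files.delete(tmp);
    }
}
```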
The “with()” method opens the file, calls the closure and closes the file in a finally block. The nice thing is that, by delegating management to the resource, it moves from the client side to the service side, so even distracted programmers are safe and literally can’t forget to clean up. Hopefully we’ll start to see this pattern in Java too, now that it has lambdas.
Other approaches
Some languages are experimenting with other approaches, most notably algebraic data types (ADTs). With ADTs you can, for example, have a return type that’s not fixed, but can be one of several types from a list. An example could be:
Error|String readFileContent(File file)
With this, your method can return either a String or an Error. Through the returned type you can signal whether something went wrong or everything was OK, and you have the full String domain available for actual content. Now, IMHO this is marginally better than C: it lets you put a contract on how you report errors (it’s immediately clear from the signature), and gives you a full error object, with messages etc. Some languages let you chain calls with an “Error|?” return value, automatically stopping when an error is encountered and returning it (note the ?. syntax):
This will either return the String at the end of a call chain, or any Error returned by intermediate calls.
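Java has no such syntax, but the shape of the idea can be imitated with a small hand-written type. The sketch below is only an approximation, with made-up names (Result, then, readFileContent, parse) and a plain String as the error; real ADT languages give you compiler support that this lacks:

```java
import java.util.function.Function;

// Minimal either-style result: holds a value or an error message, never both.
final class Result<T> {
    private final T value;
    private final String error;

    private Result(T value, String error) {
        this.value = value;
        this.error = error;
    }

    static <T> Result<T> ok(T value) { return new Result<>(value, null); }
    static <T> Result<T> fail(String error) { return new Result<>(null, error); }

    boolean isOk() { return error == null; }
    T value() { return value; }
    String error() { return error; }

    // Runs the next step only if this one succeeded; otherwise passes the error along.
    <R> Result<R> then(Function<T, Result<R>> next) {
        return isOk() ? next.apply(value) : Result.<R>fail(error);
    }
}

class ResultDemo {
    // Hypothetical steps: each one reports failure through its return value.
    static Result<String> readFileContent(String name) {
        return name.isEmpty() ? Result.<String>fail("no file name") : Result.ok("42");
    }

    static Result<Integer> parse(String text) {
        try {
            return Result.ok(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return Result.fail("not a number: " + text);
        }
    }

    public static void main(String[] args) {
        System.out.println(readFileContent("data.txt").then(ResultDemo::parse).value());
        System.out.println(readFileContent("").then(ResultDemo::parse).error());
    }
}
```

Notice that every step in the chain still has to go through then(): the error check is shorter, but it is still spelled out at each call.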
There’s certainly some merit in this approach, and I’ll say clearly that I don’t have much experience with it, but what strikes me is that, despite all the syntactic sugar and compiler support, you’re BACK TO HANDLING ERRORS ON A PER-CALL LEVEL, just like in C. In every single call to a method, you’re required to explicitly deal with possible errors, however succinctly.
In conclusion, I encourage you to think of exceptions as something that you usually throw and seldom catch, rather than the opposite. When in doubt, let the caller deal with it. It’s not a “someone else will do that” mentality, it’s how exceptions work. Embrace RuntimeExceptions until the checked exception system gets cleaned up (as most other JVM languages have already done).
Think of the program you’re developing, try to identify the good “catch” sites in it, and compare them with how many actual “catch” blocks it has.
Thanks for reading!