Maybe I'm coming in a little late for feature suggestions, but in light of what is "hot" in the academic world right now, I would suggest adding the following:
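1. First, add a softmax output function for the ANN, trained with the matching cross-entropy objective. For multi-class problems, each output becomes the exponential of its original activation divided by the sum of the exponentials of all the output activations, so the outputs are positive and sum to one. It also corresponds to the canonical link function for a multinomial output.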
2. Then, add the Bayesian neural network training originally proposed by MacKay and explained well in Bishop's latest book, "Pattern Recognition and Machine Learning"
http://research.microsoft.com/~cmbishop/PRML/index.htm
Yes, the ability to train ANNs in a Bayesian fashion is not new, but it's also not part of any industrial-strength package yet.
This would allow you to train without a hold-out set, because the weight-decay coefficient (the regularization hyperparameter) is estimated from the training data itself instead of being tuned on validation data.
3. Add support for time-varying (non-stationary) weights, i.e. let the weights depend on time. One way to do this is with an extended Kalman filter. The following paper shows how to do this for a single-node ANN:
http://citeseer.ist.psu.edu/penny99dynamic.html
You could extend that to larger networks, and you could use Bishop's book to get the derivatives you need.
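To make suggestion 1 concrete, here is a minimal sketch in plain Python. The function names (`softmax`, `cross_entropy`, `cross_entropy_grad`) and the example numbers are my own, chosen for illustration, and are not taken from any existing package. It shows the softmax outputs and why the pairing with a cross-entropy objective is convenient: for a one-hot target, the gradient with respect to the output activations is just output minus target.

```python
import math

def softmax(activations):
    """Map raw output activations a_k to exp(a_k) / sum_j exp(a_j)."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    shift = max(activations)
    exps = [math.exp(a - shift) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(activations, target):
    """Negative log-likelihood of a one-hot target under the softmax outputs."""
    probs = softmax(activations)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

def cross_entropy_grad(activations, target):
    """Gradient of the loss w.r.t. the activations (one-hot target): output - target."""
    return [p - t for p, t in zip(softmax(activations), target)]

# Made-up raw outputs for a 3-class problem, with class 0 as the true label.
activations = [2.0, 1.0, 0.1]
target = [1.0, 0.0, 0.0]

probs = softmax(activations)                    # positive values that sum to one
grad = cross_entropy_grad(activations, target)  # what gets backpropagated
```

The gradient coming out of the output layer has the same simple form as for a linear output with squared error, so the rest of an existing backpropagation routine should not need to change.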
Just my suggestions!