Right. At the risk of being obtuse, I think I understood what the OP meant, but I wanted to clarify because I know of no XML implementation of regex, and I can’t think of a situation where I would want to mix regex and XML. I am simply asking because there may be a more appropriate alternative that the OP could consider rather than going down this path. I don’t know, hence the question.
Python’s regex support is particularly excellent, even if it isn’t Java. It will at least give a solid foundation to someone who is approaching regular expressions for the first time. It is also really nice because you can see the results of every single step very quickly in the interpreter, and you don’t need to worry so much about types and all of the overhead that a Java example brings. Not to mention, regex in just about any language is going to be the same, or should be. That is the whole point of regex, isn’t it? The re module documentation provides a brief and excellent synopsis of the features and functions that I would expect any language to implement. I am also biased, since this is how I learned regex, and I simply offer it as an excellent and clear resource. I am sure there are others.
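To show what I mean by checking each step, here is a minimal sketch using only the standard re module. The sample string and the date pattern are my own made-up illustration, not anything from the OP’s data; you could type each line into the interpreter and inspect the result before moving on.

```python
import re

# Hypothetical sample text, invented for this example.
text = "Order 66 shipped on 2021-03-04"

# Step 1: compile a pattern with three capture groups (year, month, day).
pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# Step 2: search() returns a match object, or None if nothing matched.
match = pattern.search(text)

# Step 3: pull out the captured groups as a tuple of strings.
date_parts = match.groups()

# Step 4: findall() returns every non-overlapping run of digits.
numbers = re.findall(r"\d+", text)

print(date_parts)
print(numbers)
```

Note that findall picks up the 66 as well as the pieces of the date, which is exactly the kind of thing that is easy to spot when you look at each intermediate result.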
Unfortunately, I have to absolutely disagree with your last point. Problems are much easier to fix if you worry about them early in the process rather than later. If all the OP wanted to do was find specific tags, I wouldn’t use regex. Throughout my career I have probably used regex hundreds of times, and in every single case something went wrong at some point. Regular expressions are very rigid, hard to read, and can be extremely complex. If the user needs language support other than English, things that sound simple turn out to be more difficult. None of this is impossible to overcome, but to state offhandedly that a matching error is easy to debug is simply wrong. Some (in my opinion most) matching errors are difficult to track down and even harder to fix. This is particularly true with freeform text, and when you are new to regular expressions and don’t know what to expect.
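As a small illustration of the non-English point, here is a sketch in Python (again, the sample words are my own invention, and this is only one of many ways a pattern can go quietly wrong). A character class that looks perfectly reasonable for English silently chops accented words into pieces, and nothing raises an error to tell you so.

```python
import re

# Hypothetical sample: "cafe", "naive" and "resume" written with their
# accents, using escapes so the precomposed characters are unambiguous.
text = "caf\u00e9 na\u00efve r\u00e9sum\u00e9"

# An English-only idea of a "word": runs of ASCII letters.
ascii_words = re.findall(r"[A-Za-z]+", text)

# For str patterns in Python 3, \w matches Unicode word characters.
unicode_words = re.findall(r"\w+", text)

print(ascii_words)    # fragments split at every accented letter
print(unicode_words)  # the three whole words
```

The first pattern still returns matches, just the wrong ones, and that is why I say these errors are hard to track down. Even the second pattern assumes the text uses precomposed characters, which freeform input does not guarantee.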
I didn’t mean to come off badly so I am sorry that I did.