Regex for GSM 03.38 7bit character set

If your application sends SMS messages, you want to make sure that the characters to be sent comply with GSM 03.38 (i.e. the 7-bit default alphabet).

If you need to check (user) input for invalid characters, you can do that pretty easily with regular expressions. For Java the following regex is ready to use. For other languages you can base your implementation on the “pure” unescaped regex shown in the first line of the code comment.
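
The original code listing is not reproduced here, so what follows is only a minimal sketch of such a validator, not the regex from the original post. The class name `Gsm7Validator` and the method `isGsm7` are made up for illustration, and the character list is transcribed from my reading of the GSM 03.38 default alphabet plus its extension table, so check it against the spec before relying on it.

```java
import java.util.regex.Pattern;

// Hypothetical helper: checks whether a string uses only GSM 03.38 7-bit characters.
final class Gsm7Validator {

    // Basic character set (assumed transcription of the default alphabet table).
    // Non-ASCII characters are written as regex-level Unicode escapes so the
    // source file encoding does not matter. The ESC code itself is left out.
    private static final String BASIC =
          "@\\u00A3$\\u00A5\\u00E8\\u00E9\\u00F9\\u00EC\\u00F2\\u00C7\\n\\u00D8\\u00F8\\r\\u00C5\\u00E5"
        + "\\u0394_\\u03A6\\u0393\\u039B\\u03A9\\u03A0\\u03A8\\u03A3\\u0398\\u039E"  // Greek capitals
        + "\\u00C6\\u00E6\\u00DF\\u00C9"
        + " !\"#\\u00A4%&'()*+,\\-./0-9:;<=>?"
        + "\\u00A1A-Z\\u00C4\\u00D6\\u00D1\\u00DC\\u00A7"
        + "\\u00BFa-z\\u00E4\\u00F6\\u00F1\\u00FC\\u00E0";

    // Extension table: form feed ^ { } \ [ ~ ] | and the euro sign.
    // Each of these costs two septets on the wire; this sketch only validates.
    private static final String EXTENSION = "\\f\\^\\{\\}\\\\\\[~\\]|\\u20AC";

    private static final Pattern GSM7 =
        Pattern.compile("[" + BASIC + EXTENSION + "]*");

    private Gsm7Validator() {
        // static helper, not meant to be instantiated
    }

    // Returns true if every character of text is in the assumed GSM 03.38 set.
    static boolean isGsm7(String text) {
        return GSM7.matcher(text).matches();
    }
}
```

With this sketch, `Gsm7Validator.isGsm7("Hello World!")` accepts the input, while a string containing a backtick or a CJK character is rejected. Because `matches()` has to consume the whole input, no `^` or `$` anchors are needed.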

Gee, who allowed the Greeks to smuggle part of their alphabet into GSM 03.38? Since those characters don’t fit into Latin-1 (ISO-8859-1), they have to be written as Unicode in the regex, either in a UTF-8 encoded source file or as escapes such as \u0394 for Δ. More on that in this excellent regex tutorial: http://www.regular-expressions.info/unicode.html. Oh yes, and I do recommend using RegexBuddy: it really is my regex life-saver.
