A list of confusable Unicode characters would be handy in general. Surely there must be one somewhere by now?
A suspicious case would be a URL that contains characters from such a list mixed in with characters from the script they imitate.
E.g. if you encounter a Cyrillic 'а' in a URL composed mostly of Latin characters (or any mix of the two, really), that's probably suspicious.
A legit Cyrillic domain name probably wouldn't do that (I think?).
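
For what it's worth, Unicode does publish such a list (confusables.txt, part of UTS #39), but the mixed-script check can be roughly sketched without it. This is only a sketch under some assumptions: the function names are made up, it looks only at Latin vs. Cyrillic, it guesses the script from the character's Unicode name, and it checks each dot-separated label on its own so that a Cyrillic name under a Latin TLD isn't flagged.

```python
import unicodedata

# Assumption: only these two scripts are tracked; a real check would cover more.
TRACKED_SCRIPTS = ("LATIN", "CYRILLIC")

def scripts_in(label):
    """Return which tracked scripts the letters in `label` belong to."""
    found = set()
    for ch in label:
        if not ch.isalpha():
            continue  # digits and hyphens carry no script information here
        name = unicodedata.name(ch, "")  # e.g. "CYRILLIC SMALL LETTER A"
        for script in TRACKED_SCRIPTS:
            if name.startswith(script):
                found.add(script)
    return found

def looks_suspicious(hostname):
    """True if any single label mixes Latin and Cyrillic letters."""
    return any(len(scripts_in(label)) > 1 for label in hostname.split("."))

print(looks_suspicious("p\u0430ypal.com"))  # True: Cyrillic 'а' among Latin letters
print(looks_suspicious("paypal.com"))       # False: all Latin
print(looks_suspicious("пример.com"))       # False: each label uses one script
```

Note this won't catch a name spelled entirely in look-alike Cyrillic letters; for that you'd need the confusables list itself.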