How to detect non-ASCII characters in C++ on Windows?

  ascii, c++, codepages, windows

I’m simply trying to detect non-ASCII characters in my C++ program on Windows.
Using something like isascii() or:

bool is_printable_ascii = (ch & ~0x7f) == 0 &&
                          (isprint(ch) || isspace(ch));

does not work because non-ASCII characters are getting mapped to ASCII characters before or while getchar() is doing its thing. For example, if I have some code like:

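#include <cstdio>    // getchar, printf
#include <ctype.h>   // isascii (POSIX/MSVC extension, not standard C++)
#include <iostream>
using namespace std;
int main()
{
    int c;
    c = getchar();               // read one narrow character from stdin
    cout << isascii(c) << endl;  // nonzero if c is in 0..0x7f
    cout << c << endl;           // numeric value of the character
    printf("0x%x\n", c);         // same value in hex
    cout << (char)c;             // the character itself
    return 0;
}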

and input a 😁 (because I am so happy right now), the output is

1
63
0x3f
?

Furthermore, if I feed the program something outside the extended ASCII range (code page 437), like ‘Ĥ’, the output is

1
72
0x48
H

The same happens with similar inputs such as Ĭ or ō (which become I and o), so this seems algorithmic and not just mojibake. A quick check in Python (via the same terminal) with a program like

i = input()
print(ord(i))

gives me the actual code point I expect instead of the ASCII-mapped one (so it’s not the code page or the terminal (?)). This makes me believe getchar() or the C++ compilers (tested with the VS compiler and g++) are doing something funky. I have also tried cin and many other alternatives. Note that I cannot reproduce this on Linux, which makes me inclined to believe it has something to do with Windows (10 Pro). Can anyone explain what is going on here?
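
One way to test the hypothesis above is to bypass the narrow-character path altogether and read the console input as UTF-16. The following is only a sketch under assumptions, not a confirmed fix. It assumes the mapping happens when console input is converted to the active code page for narrow reads such as getchar(), and that the Win32 function ReadConsoleW returns the unconverted UTF-16 code units. The helper names all_ascii and read_console_line are hypothetical and exist only for this illustration. On non-Windows systems the sketch falls back to std::wcin so that it still compiles.

```cpp
#include <iostream>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: true when every code unit is 7-bit ASCII (0x00..0x7F).
bool all_ascii(const std::wstring& text) {
    for (wchar_t unit : text) {
        // Cast to unsigned so a negative wchar_t value is never treated as ASCII.
        if (static_cast<unsigned long>(unit) > 0x7F) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: reads one line of console input as wide characters.
// Assumption: on Windows, ReadConsoleW hands back UTF-16 without converting
// to the console code page. It fails if stdin is redirected from a file or
// pipe, in which case this sketch simply returns an empty string.
std::wstring read_console_line() {
#ifdef _WIN32
    wchar_t buffer[256];
    DWORD count = 0;
    HANDLE input = GetStdHandle(STD_INPUT_HANDLE);
    if (!ReadConsoleW(input, buffer, 256, &count, nullptr)) {
        return std::wstring();
    }
    std::wstring line(buffer, count);
    // Drop the trailing carriage return and line feed left by Enter.
    while (!line.empty() && (line.back() == L'\r' || line.back() == L'\n')) {
        line.pop_back();
    }
    return line;
#else
    std::wstring line;
    std::getline(std::wcin, line);
    return line;
#endif
}
```

Calling all_ascii(read_console_line()) from main() and typing Ĥ would then be expected to report a non-ASCII character, since the code unit seen should be 0x124 and not 0x48, if the assumption about the narrow-read conversion holds. An emoji such as 😁 would arrive as two UTF-16 code units (a surrogate pair), both outside the ASCII range.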

Source: Windows Questions
