UTF-8 configuration parse errors - exposed after #5767 #5840
Comments
This is not surprising. We had to remove some code because we could not track down the author and get them to agree to changing the license. We need a new implementation.
You can revert that commit for yourself, until someone comes up with a fix.
I will change the title: s/regression/exposed/. For some reason the config parser fails if the last character is the Greek small letter omega. The command passes with values like "ωψ" (omega psi), i.e. when omega is not the last character.
@petrovr, I can have a look at this in the next few days if it's not urgent (I'm on vacation). Thanks for the hint about how to reproduce the error without e_nss.
Fixes openssl#5778, openssl#5840 The various IS_*() macros did not work correctly for 8-bit ASCII characters with the high bit set, because the CVT(a) preprocessor macro and'ed the given ASCII value with 0x7F, effectively folding the high value range 128-255 over the low value range 0-127. As a consequence, some of the IS_*() erroneously returned TRUE. This commit fixes the issue by mapping CVT(a) to 127 for all values a >= 127. This works because the lookup tables CONF_type_default and CONF_type_win32 have all bits cleared at entry 127, whence IS_*(a) returns FALSE for all values outside the range 0-127. The IS_*() macros were also changed to return TRUE or FALSE (1 or 0) instead of a nonzero or zero value. Note that with this change, the macro IS_*(c,a) evaluates the 'a' argument twice, unless CHARSET_EBCDIC is defined. This prohibits the use of arguments with side effects like IS_*(c, *p++).
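To make the folding concrete: the Greek small letter omega (U+03C9) is encoded in UTF-8 as the bytes 0xCF 0x89, and 0x89 & 0x7F is 0x09, the ASCII tab. The following is a minimal sketch, not the OpenSSL code. The names `fold_7bit` and `is_ws_masked` are hypothetical, and the whitespace set (space, tab, LF, CR) is an assumption about what a whitespace lookup table might flag.

```c
/* Hypothetical stand-in for the old masking behaviour: keep only the low 7 bits. */
static unsigned char fold_7bit(unsigned char c)
{
    return (unsigned char)(c & 0x7F);
}

/*
 * Hypothetical whitespace test applied to the folded value.
 * Assumption: only space, tab, LF and CR count as whitespace.
 */
static int is_ws_masked(unsigned char c)
{
    unsigned char f = fold_7bit(c);

    return f == ' ' || f == '\t' || f == '\n' || f == '\r';
}
```

Under these assumptions, `is_ws_masked(0x89)` is true, while `is_ws_masked(0x88)` (the second byte of psi, U+03C8) is false because 0x88 folds to 0x08 (backspace). That would be consistent with "ωψ" passing while a trailing omega fails.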
Roumen, could you please be so kind as to check whether #5844 fixes the e_nss test? I already verified that it fixes your openssl example:
|
I did not undertake the effort to find out how exactly the false positives of the |
So the issue was discovered with omega as the last character of a line in a UTF-8 encoded file. The line has the format key = value. On such a line, the spaces around '=' are expected to be removed.
Now I understand the reason for CONF_HIGHBIT: it was an inefficient way to prevent the IS_* macros from returning a non-zero result outside the 7-bit ASCII range.
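One plausible reading of the failure is that trailing-whitespace stripping, combined with a 0x7F mask, cuts off the last byte of a trailing omega and leaves an invalid UTF-8 sequence. The sketch below illustrates that mechanism only. `masked_is_ws` and `rtrim_masked` are hypothetical helpers, not functions from the OpenSSL config parser, and the whitespace set is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classifier that masks the byte with 0x7F before testing it. */
static int masked_is_ws(unsigned char c)
{
    c &= 0x7F;
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/*
 * Strip trailing "whitespace" in place using the masked classifier and
 * return the new length. A trailing 0x89 byte is folded to 0x09 (tab)
 * and therefore stripped as well.
 */
static size_t rtrim_masked(char *s)
{
    size_t n;

    assert(s != NULL);
    n = strlen(s);
    while (n > 0 && masked_is_ws((unsigned char)s[n - 1]))
        s[--n] = '\0';
    return n;
}
```

Applied to the two bytes of "ω" (0xCF 0x89), this leaves a lone 0xCF lead byte, which is not valid UTF-8. Applied to "ωψ" or "ω-αβγ", nothing is stripped, because the final bytes (0x88 and 0xB3) do not fold onto whitespace.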
Now I finally appreciate how wise it was of the UTF-8 designers to ensure that 7-bit bytes (bytes where the most significant bit is 0) never appear in a multi-byte sequence. Otherwise we would get plagued by false positives of the |
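The UTF-8 property mentioned here can be checked directly: every byte of a multi-byte sequence has its most significant bit set, so a classifier that rejects bytes of 0x80 and above never mistakes part of a non-ASCII character for an ASCII one. A small sketch, where `count_ascii_bytes` is a hypothetical helper name:

```c
#include <stddef.h>

/* Count the bytes in a NUL-terminated string whose high bit is clear. */
static size_t count_ascii_bytes(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0x80) == 0)
            n++;
    return n;
}
```

For the UTF-8 bytes of "ωψ" the count is zero. For "k = ω" it is four: only the ASCII characters `k`, `=` and the two spaces are counted.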
Fixes openssl#5778, openssl#5840 The various IS_*() macros did not work correctly for 8-bit ASCII characters with the high bit set, because the CVT(a) preprocessor macro and'ed the given ASCII value with 0x7F, effectively folding the high value range 128-255 over the low value range 0-127. As a consequence, some of the IS_*() erroneously returned TRUE. This commit fixes the issue by adding range checks instead of cutting off high order bits using a mask. In order to avoid multiple evaluation of macro arguments, most of the implementation was moved from macros into a static function is_keytype(). Thanks to Румен Петров for reporting and analyzing the UTF-8 parsing issue openssl#5840.
Fixes #5778, #5840 The various IS_*() macros did not work correctly for 8-bit ASCII characters with the high bit set, because the CVT(a) preprocessor macro and'ed the given ASCII value with 0x7F, effectively folding the high value range 128-255 over the low value range 0-127. As a consequence, some of the IS_*() erroneously returned TRUE. This commit fixes the issue by adding range checks instead of cutting off high order bits using a mask. In order to avoid multiple evaluation of macro arguments, most of the implementation was moved from macros into a static function is_keytype(). Thanks to Румен Петров for reporting and analyzing the UTF-8 parsing issue #5840. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from #5903)
Fixed by #5903.
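The approach described in the commit message (a range check inside a function rather than a mask inside a macro) might look roughly like the sketch below. This is not the actual is_keytype() from OpenSSL: the name `is_keytype_demo`, the flags `DEMO_WS` and `DEMO_NUMBER`, and the contents of `demo_table` are all assumptions made for illustration.

```c
/* Hypothetical character-class flags, for illustration only. */
enum { DEMO_WS = 1, DEMO_NUMBER = 2 };

/* Hypothetical 128-entry lookup table covering 7-bit ASCII only. */
static const unsigned short demo_table[128] = {
    ['\t'] = DEMO_WS, ['\n'] = DEMO_WS, ['\r'] = DEMO_WS, [' '] = DEMO_WS,
    ['0'] = DEMO_NUMBER, ['1'] = DEMO_NUMBER, ['2'] = DEMO_NUMBER,
    ['3'] = DEMO_NUMBER, ['4'] = DEMO_NUMBER, ['5'] = DEMO_NUMBER,
    ['6'] = DEMO_NUMBER, ['7'] = DEMO_NUMBER, ['8'] = DEMO_NUMBER,
    ['9'] = DEMO_NUMBER
};

/*
 * Return 1 if c belongs to the given class, 0 otherwise.
 * Bytes outside 0-127 are rejected by a range check instead of being
 * folded into the table with a 0x7F mask.
 */
static int is_keytype_demo(unsigned char c, unsigned short type)
{
    if (c > 127)
        return 0;
    return (demo_table[c] & type) != 0;
}
```

Because the argument is passed to a function, it is evaluated exactly once, so a call such as `is_keytype_demo((unsigned char)*p++, DEMO_WS)` would be safe, unlike a macro that expands its argument twice.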
The e_nss regression test started to fail with master (last commit: "Documentation typo fix in EVP_EncryptInit.pod", 1238caa).
git bisect points to commit 9256510, which has the following comment:
"...After this commit the various IS_*() macros in the auto-generated file
conf_def.h may incorrectly return true if the supplied character has its
most significant bit set. The IS_*() macros should be able to correctly
handle 8-bit characters. Note that UTF-8 support is not a requirement...."
The e_nss regression test creates X.509 certificates with Cyrillic and Greek letters in the distinguished name.
With the above commit, the error is:
Country Name (2 letter code) [XX]:State or Province Name (full name) [World]:Locality Name (eg, city) [Somewhere cyrillic-АБВ-Яабв-я greek-ΑΒΓ-Ωαβγ-<<<>>>> :problems making Certificate Request
140215077046016:error:0D07A086:asn1 encoding routines:ASN1_mbstring_ncopy:invalid utf8string:crypto/asn1/a_mbstr.c:85:
As can be seen at https://gitlab.com/e_nss/e_nss/blob/master/tests/ca/catest.config#L50, <<<>>>> is the Greek small letter omega.
I cannot understand the relation between the commit and the above failure.
For instance, after replacing "αβγ-ω" with "ω-αβγ", the certificate is created!
Perhaps CONF_HIGHBIT previously masked another defect.