Token unknown error on formfeed in query [CORE5479] #5749

Closed
firebird-automations opened this issue Feb 4, 2017 · 9 comments

Comments

@firebird-automations
Collaborator

Submitted by: @mrotteveel

If a query contains a formfeed (\f or ASCII 0x0C), then Firebird produces the error:

Dynamic SQL Error; SQL error code = -104; Token unknown - line 3, column 1; [SQLState:42000, ISC error code:335544634]

(where the token is the formfeed character)

Instead, a formfeed should be considered whitespace.
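To illustrate how the reported position relates to the query text, here is a minimal sketch. The statement is hypothetical (the original query is not included in this report), the helper name `locate` is made up for illustration, and it assumes lines are counted by line feeds:

```python
def locate(sql: str, ch: str):
    """Return the 1-based (line, column) of the first occurrence of ch, or None."""
    index = sql.find(ch)
    if index == -1:
        return None
    # Lines are counted by the number of line feeds before the character.
    line = sql.count("\n", 0, index) + 1
    # The column is the offset from the start of the current line.
    column = index - (sql.rfind("\n", 0, index) + 1) + 1
    return line, column


# Hypothetical statement: the form feed is the first character of line 3.
QUERY = "select 1\nfrom rdb$database\n\fwhere 1 = 1"

print(locate(QUERY, "\f"))
```

For this statement the helper yields line 3, column 1, which matches the shape of the position in the error message above.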

@firebird-automations
Collaborator Author

Modified by: @mrotteveel

security: Developers [ 10012 ] =>

@firebird-automations
Collaborator Author

Commented by: @asfernandes

SQL standard:
------
3.1.6.62 white space
consecutive sequences of one or more characters that have no glyphs
NOTE 19 — White space is typically used to separate <nondelimiter token>s from one another in SQL text, and is
always permitted between two tokens in SQL text.
------

But Unicode is complex and has invisible characters that can even trick Java source code. How should we decide which characters to consider whitespace?

@Jhironsel

I was able to resolve this by removing the whitespace from a stored procedure call.
EXECUTE PROCEDURE SP_INSERT_USUARIOS(?, ?, ?, ?, ?, ?, ?, ?);
The test passed with this:
EXECUTE PROCEDURE SP_INSERT_USUARIOS(?,?,?,?,?,?,?,?)

@mrotteveel
Member

@Jhironsel Please post in English in the tracker. This ticket describes a bug in Firebird relative to the SQL standard: Firebird doesn't treat the formfeed character as white space, although it should. Your comment doesn't seem related to that problem (your statements contain normal spaces, 0x20, and work just fine).

I can't reproduce an error with spaces between parameters, but if you have a reproducible bug in a supported version (3.0, 4.0), then please create a separate ticket.

@mrotteveel
Member

@asfernandes I seem to have missed your comment back then. Your question is answered by section 4.2.4 Character repertoires of the SQL:2016 standard:

White space is any character in the Unicode General Category classes “Zs”, “Zl”, and “Zp”, as well as any of the following characters:

— U+0009, Horizontal Tabulation
— U+000A, Line Feed
— U+000B, Vertical Tabulation
— U+000C, Form Feed
— U+000D, Carriage Return
— U+0085, Next Line

NOTE 26 — The normative provisions of this International Standard impose no requirement that any character set have equivalents for any of these characters except U+0020 (<space>); however, by reference to this definition of white space, they do impose the requirement that every equivalent for one of these shall be recognized as a white space character.

The Unicode General Category classes “Zs”, “Zl”, and “Zp” are assigned to Unicode characters that are, respectively, space separators, line separators, and paragraph separators.

The only character that is a member of the Unicode General Category class “Zl” is U+2028 (Line Separator). The only character that is a member of the Unicode General Category class “Zp” is U+2029 (Paragraph Separator). The characters that are members of the Unicode General Category class “Zs” are: U+0020 (Space), U+00A0 (No-Break Space), U+1680 (Ogham Space Mark), U+180E (Mongolian Vowel Separator), U+2000 (En Quad), U+2001 (Em Quad), U+2002 (En Space), U+2003 (Em Space), U+2004 (Three-Per-Em Space), U+2005 (Four-Per-Em Space), U+2006 (Six-Per-Em Space), U+2007 (Figure Space), U+2008 (Punctuation Space), U+2009 (Thin Space), U+200A (Hair Space), U+202F (Narrow No-Break Space), U+205F (Space, Medium Mathematical), and U+3000 (Ideographic Space).

NOTE 27 — If and when the Unicode General Category classes “Zs”, “Zl”, and/or “Zp” are modified to add new characters or
remove characters, those modifications may be implemented by SQL-implementations without affecting conformance to this
International Standard.
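As a rough illustration of the definition quoted above, the check can be sketched in Python with the standard `unicodedata` module. This is only a sketch of the rule as quoted, not how Firebird's lexer works; the function and constant names are made up for the example:

```python
import unicodedata

# Characters the quoted definition lists explicitly, in addition to the Z* categories.
EXPLICIT_WHITE_SPACE = frozenset("\u0009\u000a\u000b\u000c\u000d\u0085")

# Unicode General Category classes named in the quote.
SEPARATOR_CATEGORIES = frozenset({"Zs", "Zl", "Zp"})


def is_sql_white_space(ch: str) -> bool:
    """Return True if the single character ch is white space per the quoted rule."""
    return (
        ch in EXPLICIT_WHITE_SPACE
        or unicodedata.category(ch) in SEPARATOR_CATEGORIES
    )


for ch in ("\f", "\u000b", " ", "\u00a0", "a"):
    print(f"U+{ord(ch):04X}", is_sql_white_space(ch))
```

Note that the result for individual characters depends on the Unicode version bundled with the runtime. For example, U+180E may be reported with category Cf rather than Zs, because later Unicode versions reclassified it; NOTE 27 explicitly allows for that kind of change.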

@pavel-zotov

I have a question: what about "U+000B, Vertical Tabulation"? Can it be used as a delimiter?

@mrotteveel
Member

@pavel-zotov According to the quote from the standard (see above), it should be considered whitespace, yes.

@mrotteveel
Member

@pavel-zotov However, the change seems to have been specifically applied only for the formfeed character.

@pavel-zotov
Copy link

Yes. Because of this, the QA test currently does not check U+000B.
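If coverage were ever extended beyond the formfeed, probe statements for the other control characters listed in the standard could be generated along these lines. This is a hypothetical sketch, not the actual QA test; the names are made up, and running the statements against a server is left out:

```python
# Control characters the SQL:2016 quote lists explicitly as white space.
CANDIDATES = {
    "Horizontal Tabulation": "\u0009",
    "Line Feed": "\u000a",
    "Vertical Tabulation": "\u000b",
    "Form Feed": "\u000c",
    "Carriage Return": "\u000d",
    "Next Line": "\u0085",
}


def build_probe_statements(candidates=CANDIDATES):
    """Return {name: sql}, using each character as the only token separator."""
    tokens = ["select", "1", "from", "rdb$database"]
    return {name: ch.join(tokens) for name, ch in candidates.items()}


for name, sql in build_probe_statements().items():
    print(f"{name}: {sql!r}")
```

Each generated statement contains no ordinary spaces, so whether it prepares without a "Token unknown" error would show directly whether that character is accepted as a delimiter.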
