Ipv4 and ipv6 domains #3669
} // namespace

void registerDataTypeDomainIPv4AndIPv5(DataTypeFactory & factory)
IPv5?
Fixed
/** Text serialization for the CSV format.
  */
void serializeTextCSV(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override;
/** delimiter - the delimiter we expect when reading a string value that is not double-quoted
Obsolete comment.
Fixed
void deserializeTextCSV(IColumn & column, ReadBuffer & istr, const FormatSettings &) const override;

/** Text serialization intended for using in JSON format.
  * force_quoting_64bit_integers parameter forces to brace UInt64 and Int64 types into quotes.
Obsolete comment.
Fixed
@@ -128,6 +115,48 @@ void DataTypeFactory::registerSimpleDataType(const String & name, SimpleCreator
}, case_sensitiveness);
}

void DataTypeFactory::registerDataTypeDomain(const String & domain_name, const String & family_name, DataTypeDomainUniquePtr domain, CaseSensitiveness case_sensitiveness)
{
const auto& data_type_creator = findCreatorByName(family_name);
Style.
Fixed too
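For readers following this thread, the idea of registering a domain on top of an existing type family can be sketched roughly as follows. This is a simplified, self-contained illustration and not the PR's code: ToyFactory, ToyType, ToyDomain and all of their members are hypothetical names. Only the general notion (look up the family by name, then register a new name that carries the domain) comes from the snippet above.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a domain: it only customizes the visible type name here.
struct ToyDomain
{
    virtual ~ToyDomain() = default;
    virtual std::string name() const = 0;
};

struct ToyIPv4Domain : ToyDomain
{
    std::string name() const override { return "IPv4"; }
};

// Hypothetical stand-in for a data type: a storage family plus an optional domain.
struct ToyType
{
    std::string family;                  // e.g. "UInt32"
    const ToyDomain * domain = nullptr;  // non-owning; the factory owns all domains
    std::string displayName() const { return domain ? domain->name() : family; }
};

class ToyFactory
{
public:
    void registerFamily(const std::string & family)
    {
        ToyType type;
        type.family = family;
        types[family] = type;
    }

    // Registers domain_name as a new type name backed by family_name's storage.
    void registerDomain(const std::string & domain_name, const std::string & family_name, std::unique_ptr<ToyDomain> domain)
    {
        const auto & base = findByName(family_name);  // throws if the family is unknown
        ToyType with_domain = base;
        with_domain.domain = domain.get();
        domains.push_back(std::move(domain));         // keep the domain alive
        types[domain_name] = with_domain;
    }

    const ToyType & get(const std::string & name) const { return findByName(name); }

private:
    const ToyType & findByName(const std::string & name) const
    {
        const auto it = types.find(name);
        if (it == types.end())
            throw std::out_of_range("Unknown type: " + name);
        return it->second;
    }

    std::map<std::string, ToyType> types;
    std::vector<std::unique_ptr<ToyDomain>> domains;
};
```

In this sketch, registering "IPv4" over "UInt32" yields a type whose storage family is still UInt32 while its visible name is IPv4.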
const auto& data_type_creator = findCreatorByName(family_name);
all_domains.reserve(all_domains.size() + 1);

// We can't move the pointer to the Creator since creator is copied later.
shared_ptr is Ok.
Simplified that piece of code; shared_ptr is not needed anymore.
struct FormatSettings;
class IColumn;

class IDataTypeDomain
Missing comment.
Fixed
dbms/src/DataTypes/IDataTypeDomain.h
Outdated
virtual void deserializeTextCSV(IColumn & column, ReadBuffer & istr, const FormatSettings &) const = 0;

/** Text serialization intended for using in JSON format.
  * force_quoting_64bit_integers parameter forces to brace UInt64 and Int64 types into quotes.
Obsolete comment.
Fixed
dbms/src/DataTypes/IDataTypeDomain.h
Outdated
  */
virtual void serializeTextCSV(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const = 0;

/** delimiter - the delimiter we expect when reading a string value that is not double-quoted
Obsolete comment.
Fixed
Hi @Enmk @alexey-milovidov, are there any plans to have an IP datatype that can hold both IPv4 and IPv6? Right now I'm storing all IPs as FixedString(16), so it's way simpler when querying.
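One common way to keep both address families in a single 16-byte value, as with the FixedString(16) column mentioned above, is the IPv4-mapped IPv6 form (::ffff:a.b.c.d). A minimal sketch of that mapping follows; the helper names mapIPv4ToIPv6 and isIPv4Mapped are hypothetical and are not taken from this PR, and whether the commenter uses this exact scheme is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using IPv6Bytes = std::array<std::uint8_t, 16>;

// Maps an IPv4 address given as a host-order integer (0x7f000001 is 127.0.0.1)
// to its IPv4-mapped IPv6 form ::ffff:a.b.c.d, stored in network byte order.
IPv6Bytes mapIPv4ToIPv6(std::uint32_t ipv4)
{
    IPv6Bytes out{};  // value-initialized: bytes 0..9 stay zero
    out[10] = 0xff;
    out[11] = 0xff;
    out[12] = static_cast<std::uint8_t>(ipv4 >> 24);
    out[13] = static_cast<std::uint8_t>(ipv4 >> 16);
    out[14] = static_cast<std::uint8_t>(ipv4 >> 8);
    out[15] = static_cast<std::uint8_t>(ipv4);
    return out;
}

// Returns true if the 16-byte address has the ::ffff:0:0/96 prefix.
bool isIPv4Mapped(const IPv6Bytes & addr)
{
    for (std::size_t i = 0; i < 10; ++i)
        if (addr[i] != 0)
            return false;
    return addr[10] == 0xff && addr[11] == 0xff;
}
```

With this layout a single 16-byte column can be queried uniformly, and isIPv4Mapped tells the two families apart when needed.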
ec3298b to eed1527
eed1527 to f19e284
31b74e8 to 21abf95
21abf95 to 4060285
Performance comparison result. Performance was measured on my machine against the base version (without these modifications) using the performance tests included in the PR. Negative numbers represent degraded performance, positive numbers improved performance. Please note that I've observed +/-1% variance of the results from run to run, so I believe the numbers here are accurate to within +/-2%.
ipv4 (ALL CASES), Queries: 444
ipv4 IPv4StringToNum, Queries: 100
ipv4 IPv4NumToString+IPv4StringToNum, Queries: 100
ipv4 IPv4NumToStringClassC+IPv4StringToNum, Queries: 100
ipv4 error, Queries: 44
ipv6 (ALL CASES), Queries: 426
ipv6 IPv6StringToNum, Queries: 100
ipv6 IPv6NumToString+IPv6StringToNum, Queries: 100
ipv6 error, Queries: 26
ipv6 mapped, Queries: 200
@@ -0,0 +1,59 @@
DROP TABLE IF EXISTS ipv4_test;
In stateless tests, tables should be created in the test database.
fixed
dbms/src/Common/formatIPv6.cpp
Outdated
 {
     const auto limit = IPV4_BINARY_LENGTH - zeroed_tail_bytes_count;

-    for (const auto i : ext::range(0, IPV4_BINARY_LENGTH))
+    for (const auto i : ext::range(0, limit))
     {
         UInt8 byte = (i < limit) ? src[i] : 0;
useless ternary operator
useless ternary operator
fixed
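The point of the remark is that once the loop itself stops at limit, the condition i < limit is always true inside it, so the ternary can go. A minimal sketch of the same idea, under stated assumptions: formatIPv4Sketch is a hypothetical helper that uses std::string for brevity (the code under review writes into a char buffer), it takes bytes in network order, and it prints the zeroed tail in a second loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

constexpr std::size_t IPV4_BINARY_LENGTH = 4;

// Formats 4 bytes (network order) as dotted decimal, replacing the last
// zeroed_tail_bytes_count bytes with 0. Hypothetical helper, not the PR's code.
std::string formatIPv4Sketch(const std::uint8_t * src, std::size_t zeroed_tail_bytes_count = 0)
{
    assert(zeroed_tail_bytes_count <= IPV4_BINARY_LENGTH);
    const std::size_t limit = IPV4_BINARY_LENGTH - zeroed_tail_bytes_count;

    std::string out;

    // Real bytes: i < limit always holds here, so no ternary is needed.
    for (std::size_t i = 0; i < limit; ++i)
    {
        if (i != 0)
            out += '.';
        out += std::to_string(src[i]);
    }

    // Zeroed tail: these positions are printed as 0 regardless of src.
    for (std::size_t i = limit; i < IPV4_BINARY_LENGTH; ++i)
    {
        if (i != 0)
            out += '.';
        out += '0';
    }

    return out;
}
```

Passing a tail count of 1 gives the class C style output, where the last octet is masked to 0.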
aaf70e8 to 9b7818a
dbms/src/Common/formatIPv6.cpp
Outdated
@@ -136,4 +145,134 @@ void formatIPv6(const unsigned char * src, char *& dst, UInt8 zeroed_tail_bytes_
*dst++ = '\0';
}

//UInt32 parseIPv4_fast(const char * src)
Do we need this function?
Nope, forgot to remove refactoring leftovers from commit. It is gone now.
         }
     }
 }

-std::vector<String> formatQueries(const String & query, StringToVector substitutions_to_generate)
+Queries formatQueries(const Query & query, const SubstitutionsMap& substitutions_to_generate)
const SubstitutionsMap &
Reverted all changes that I've made to PerformanceTest.cpp to reduce the scope of this PR.
dbms/src/Common/formatIPv6.cpp
Outdated
@@ -46,18 +48,24 @@ static void printInteger(char *& out, T value)
 }

 /// print IPv4 address as %u.%u.%u.%u
-static void formatIPv4(const unsigned char * src, char *& dst, UInt8 zeroed_tail_bytes_count)
+static void doFormatIPv4(const unsigned char * src, char *& dst, UInt8 zeroed_tail_bytes_count)
Do we need UInt8 zeroed_tail_bytes_count? There is only one invocation, and there zeroed_tail_bytes_count equals IPV4_BINARY_LENGTH.
That particular function was used by formatIPv6, which in turn is used by both CutIPv6 and IPv6NumToString. I've removed that implementation and unified it with the rest of the code, so formatIPv4 is now used by formatIPv6, IPv4NumToString and IPv4NumToStringClassC.
@@ -6,6 +6,8 @@ select IPv4StringToNum('127.0.0.1' as p) == (0x7f000001 as n), IPv4NumToString(n
select IPv4StringToNum(materialize('127.0.0.1') as p) == (materialize(0x7f000001) as n), IPv4NumToString(n) == p;
select IPv4NumToString(toUInt32(0)) == '0.0.0.0';
select IPv4NumToString(materialize(toUInt32(0))) == materialize('0.0.0.0');
select IPv4NumToString(toUInt32(0x7f000001)) == '127.0.0.1';
select IPv4NumToString(materialize(toUInt32(0x7f000001))) == materialize('127.0.0.1');
Unfortunately, this test fails on my pc. Here is the diff https://gist.github.com/s-mx/83ed11ed435525470869cf2edc33da3d
Fixed that
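The equalities these test queries check (0x7f000001 corresponds to '127.0.0.1', and 0 to '0.0.0.0') can be reproduced outside the database with a small round trip. The sketch below is an illustration only: parseIPv4Sketch and numToIPv4Sketch are hypothetical names, not the functions from Common/formatIPv6.h, and the parser is deliberately simple (it accepts leading zeros, for instance).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Parses dotted-decimal text into a host-order integer.
// Returns false on malformed input and leaves result untouched.
bool parseIPv4Sketch(const std::string & text, std::uint32_t & result)
{
    std::uint32_t value = 0;
    std::size_t pos = 0;

    for (int octet = 0; octet < 4; ++octet)
    {
        // Each octet must start with a digit.
        if (pos >= text.size() || text[pos] < '0' || text[pos] > '9')
            return false;

        std::uint32_t part = 0;
        std::size_t digits = 0;
        while (pos < text.size() && text[pos] >= '0' && text[pos] <= '9')
        {
            part = part * 10 + static_cast<std::uint32_t>(text[pos] - '0');
            ++pos;
            ++digits;
            if (digits > 3 || part > 255)
                return false;
        }

        value = (value << 8) | part;

        // Octets are separated by a single dot; none after the last one.
        if (octet < 3)
        {
            if (pos >= text.size() || text[pos] != '.')
                return false;
            ++pos;
        }
    }

    if (pos != text.size())
        return false;  // trailing garbage

    result = value;
    return true;
}

// Formats a host-order integer as dotted decimal.
std::string numToIPv4Sketch(std::uint32_t value)
{
    return std::to_string((value >> 24) & 0xff) + "." + std::to_string((value >> 16) & 0xff) + "."
        + std::to_string((value >> 8) & 0xff) + "." + std::to_string(value & 0xff);
}
```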
9b7818a to 59910eb
Added:
* IDataTypeDomain interface;
* method DataTypeFactory::registerDataTypeDomain for registering domains;
* DataTypeDomainWithSimpleSerialization domain base class with simple serialization/deserialization;
* Concrete IPv4 and IPv6 domain implementations: DataTypeDomainIPv6 and DataTypeDomainIPv4.
Updated:
* IDataType text serialization/deserialization methods;
* IDataType implementation to use domain for text serialization/deserialization;
* Refactored implementation of the IPv4/IPv6 functions to use formatIPv4/v6 and parseIPv4/v6 from Common/formatIPv6.h.
Tests:
* Added test cases for IPv4 and IPv6 domains;
* Updated IPv4/v6 functions tests to validate more cases;
* Added performance tests for IPv4 and IPv6 related functions.
59910eb to 2716df8
@Enmk : Out of curiosity, why is IPv6 represented using
@KochetovNicolai : Would a PR changing it to
Well, the major reason is compatibility. Existing IPv6-functions work with
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en