Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: ssl renegotiation causing handshake failure #494

Conversation

fedefrancescon
Copy link
Contributor

TLS renegotiation was not working properly, causing handshake to fail pretty much always.
Improved renegotiation by adding more error checks on mbedtls returns.

Adding the session reset and refactoring the read/write apparently solved the problem

I still suspect that there may be still some corner cases causing concurrency issues. (ie. calling a close when a read is currently happening)

Copy link

sonarqubecloud bot commented Feb 7, 2024

Quality Gate Passed Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud


createSecurityEvents(self->tlsConfig, ret, flags, self);
/* mbedtls_ssl_renegotiate mandates to reset the ssl session in case of errors */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it says:

*                  ................. you must stop using
 *                 the SSL context for reading or writing, and either free it
 *                 or call \c mbedtls_ssl_session_reset() on it before
 *                 re-using it for a new connection; the current connection
 *                 must be closed.

But by returning false here I believe it will flow down to INT_STATE_CLOSE_ON_ERROR and close it (via releaseSocket which then calls TCPSocket_close and then mbedtls_ssl_free), so it should not be reused.

So maybe after all this is not needed (as far as a more complex schema that we reset it instead of freeing it is done, but that would be much more complex I think).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I think the problem of relying on the releaseSocket leave the window open to potential issues that arises when testing continuous reads/writes.

If I get it correctly the releaseSocket calls TLSSocket_close which in turns call mbedtls_ssl_close_notify which then calls mbedtls_ssl_write_record which should not be done!
AS per mbedtls only mbedtls_ssl_session_reset or mbedtls_ssl_free should be done.

Also, by doing so we rely on higher level function to properly handle an mbed specific behaviour, which is not easy to get right. For example, during tests I've found that IedConnection_close in turns calls readSocket and this read CAN lead to a race condition with other reads in some cases, due to the read/write operations being executed on a separated thread. (that's also why I experimented with semaphores, with terrible results and multiple deadlocks 🙈)
Please note that I think that this PR is not a full formal fix, I suspect that there are still some possible "broken" corner cases, which are now just harder to get.
I consider this as good enough to be considered working as expected under normal conditions.

I thought a bit of using the mbedtls_ssl_free directly but, looking at the code this seems a rather "heavy" change that would require adding quite some error handling.

TLDR:
I think that the safest approach is mbedtls_ssl_session_reset as soon as possible to avoid "broken" connections. Also this approach let us keep the current code flow without much changes


DEBUG_PRINT("TLS", " started renegotiation\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed removing that or at least fix ot to "completed renegotiation"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yess, moved both completed/failed in TLSSocket_performHandshake right after the related mbedtls call


return -1;
int reset_err = mbedtls_ssl_session_reset(&(self->ssl));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above that this will be closed, so maybe not needed.

if ((ret == MBEDTLS_ERR_SSL_WANT_READ) || (ret == MBEDTLS_ERR_SSL_WANT_WRITE))
return 0;
int len = 0;
while (len < size) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why you put the loop here. I get the idea: you will loop reconstructing the packet until the desired size is reached.

But this as far as I understand is already done at a higher level in CotpConnection_readToTpktBuffer where a read is done (call to this function), then a check if we have enough needed data and if not it will return TPKT_WAITING which in IsoClientConnection_handleConnection will signal that the packet is not complete and will then retry later (using MmsConnection_tick) which also manages a kind of thread relaxation (via Thread_sleep calls).

So I'm not sure this change of logic is a good idea (you made the read waitiling loop at a lower level compared to before which was at state machine level), I would just in case augment the ret comparison with the two new MBEDTS_ERR_SSL_ and that's all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm, you're right and I had your same thoughts and doubts.
Here are my supposition, so take it with a big grain of salt:

Handling at a higher level, with sleeps and so on, left out the proper handling of MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS and/or MBEDTLS_ERR_SSL_CRYPTO_IN_PROGRESS which I think is happening during a renegotiation.
This not properly read buffer (or delayed read) left the door open to an incoplete data read which resulted renegotiations always ending with Handshake failed for unknown reason.

Also I suspect that, in a similar way of the session reset, the reads may end up being "interleaved" with other operations which then are using a "dirty" read/write buffer.

Again, please take this as is: just ideas! As I found it to very hard for me to follow the flow logic of the events, also because this is happening with INTENSE reads/writes.

I can eventually give it some more tests in a later moment.


checkForCRLUpdate(self);

if (startRenegotiationIfRequired(self) == false) {
return -1;
}

while ((ret = mbedtls_ssl_write(&(self->ssl), buf, len)) <= 0)
while (len < size)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the feeling that this is similar to the above: you moved the logic to complete the buffer at a lower level (so you loop chunk by chunk here until whole packet is done) which IMO is already done in CotpConnection_sendDataMessage which will chunk at appropriate size and loop. (see around here:

int sentBytes = writeToSocket(self, self->socketExtensionBuffer, self->socketExtensionBufferFill);
if (sentBytes > 0) {
if (sentBytes != self->socketExtensionBufferFill) {
int target = 0;
int i;
uint8_t* buf = self->socketExtensionBuffer;
for (i = sentBytes; i < self->socketExtensionBufferFill; i++) {
buf[target++] = buf[i];
}
self->socketExtensionBufferFill = self->socketExtensionBufferFill - sentBytes;
and
if (self->socketExtensionBufferFill == 0) {
sentBytes = writeToSocket(self, buffer, remainingSize);
}
if (sentBytes == -1)
goto exit_function;
if (sentBytes != remainingSize) {
/* write additional data to extension buffer */
if (self->socketExtensionBuffer) {
uint8_t* extBuf = self->socketExtensionBuffer;
int extCurrentPos = self->socketExtensionBufferFill;
int bytesNotSent = remainingSize - sentBytes;
int i;
for (i = 0; i < bytesNotSent; i++) {
extBuf[i + extCurrentPos] = buffer[sentBytes + i];
}
self->socketExtensionBufferFill = extCurrentPos + bytesNotSent;
}
else {
and then in CotpConnection_sendDataMessage)

So also here I would just keep possibly the two new return codes check without modifying the state machine too much.

@mzillgith mzillgith merged commit 790e3e6 into mz-automation:v1.6_develop May 9, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants