Firebird server stops accepting new connections after some time #7480
The current database level configuration is more suitable for SuperServer mode than SuperClassic. The following values are too large:
This is the configuration for my test server. The database uses a page size of 32768, and together with "DefaultDbCachePages = 32768" each connection takes ~1GB of RAM. I have 40GB of RAM installed and there are usually only a few simultaneous connections (it is a test server). If you think this setting could be the reason for the problem, I can reduce it. As for GCPolicy, I will change it to "cooperative". I have turned the sweep process off and sweeping is done manually with gfix; I thought that it doesn't affect anything important.
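The ~1GB figure follows directly from the two settings quoted in the comment. A small worked calculation, assuming (as the comment states) that each connection allocates its own page cache and ignoring every other memory consumer:

```python
# Worked arithmetic for the per-connection page cache described in the comment.
PAGE_SIZE_BYTES = 32768   # database page size quoted in the comment
CACHE_PAGES = 32768       # DefaultDbCachePages from databases.conf
TOTAL_RAM_GIB = 40        # RAM installed on the test server


def cache_bytes_per_connection(page_size: int, cache_pages: int) -> int:
    """Page cache size for one connection that owns a private cache."""
    return page_size * cache_pages


per_conn = cache_bytes_per_connection(PAGE_SIZE_BYTES, CACHE_PAGES)
per_conn_gib = per_conn / 2**30

# Number of connections whose caches alone would fill the installed RAM.
max_connections = int(TOTAL_RAM_GIB // per_conn_gib)

print(f"{per_conn_gib:.0f} GiB of page cache per connection")
print(f"cache alone fills RAM at about {max_connections} connections")
```

With only a few simultaneous connections this stays well under 40GB, which matches the poster's reasoning, but the headroom shrinks linearly with every extra attachment.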
What kind of connection string do you use?
It doesn't matter for non-SS architectures.
From IBExpert:
Mostly I am using Java and the Jaybird driver (4.0.8.java11).
Also, when connecting from IBExpert, it waits around one minute and then says the connection failed.
Not yet - I will try this the next time the server freezes.
From Java I'm using this connection string:
Today the FB4 server froze again. So I tried to connect with the isql tool from the same server.
It took around 1 minute before the error message appeared. After that I tried an embedded connection without specifying user/pass.
It never connected - there were no error messages or anything.
Could you provide a full memory dump of the firebird process and another one of the hung isql (with the embedded connection)?
Today I encountered the same problem. I made the requested memory dumps.
This core file should go to me, but how was it compressed? Please make sure you've used tar with the --sparse switch to process the core dump; otherwise I may have problems decompressing it. xz may also be used - it automatically detects sparse files.
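The packaging advice above can be sketched as a small helper that builds the tar command line. This is only an illustration under stated assumptions: GNU tar is installed, and the core file and archive paths are hypothetical placeholders, not the names used in this thread.

```python
import shlex
from pathlib import Path


def sparse_archive_command(core_path: str, archive_path: str) -> list[str]:
    """Build a GNU tar command that archives a core dump sparsely with xz."""
    core = Path(core_path)
    return [
        "tar",
        "--sparse",              # store holes instead of long runs of zeros
        "--xz",                  # compress the archive with xz
        "-cf", archive_path,
        "-C", str(core.parent),  # archive the bare file name, not the full path
        core.name,
    ]


# Hypothetical paths, for illustration only.
cmd = sparse_archive_command("/tmp/core.12345", "/tmp/core.12345.tar.xz")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Building the argument list separately from running it makes it easy to inspect the exact command before touching a multi-gigabyte dump.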
I just sent a link to the dump files.
Please use xz next time instead of gzip for core dumps (the file took more than an hour to decompress due to disk load). The issue is not the compressed size, it's the sparse core dump.
Also I need the following libraries from your box:
Sorry for using the wrong compression method.
Definitely wrong libraries:
Sorry. It looks like I copied some wrong files from the 32bit folder.
Most likely your hang is already fixed in the current codebase. Please try the current snapshot. In any case it should provide more informative core dumps. PS. If the snapshot hangs anyway (with the current dump it's hard to diagnose the exact reason), please do not try to attach to the server 16000 times - almost all of the core dump (>90% of its size) consists of stacks of attach threads waiting in the same place.
I installed the 4.0.3 snapshot build and it worked for almost 2 weeks without problems. But today Firebird stalled. I made another core dump and uploaded it to the same file share, in a folder named "2023-mar-27".
I also need the snapshot binaries + debug info.
I uploaded the Firebird 4.0.3 binaries I'm using. What kind of debug info do you need?
The one that came with that file - Firebird-debuginfo-4.0.3.2906-0.amd64.tar.gz
I don't have the debuginfo archive from that snapshot build :( and there is also no snapshot archive available on the Firebird download page. I didn't know that I had to save the debuginfo archive when downloading a snapshot.
After the last problem I installed the newest 4.0.3 snapshot build, and this time it worked for around 3 weeks without problems. But today Firebird stalled again. So I made another core dump and uploaded it to the same file share, in a folder named "2023-apr-24". I also included the Firebird snapshot binaries and the debuginfo package. If anything else is needed, just ask.
Once again, this is a new case never seen before in your dumps. The symptoms may look similar, but the reason is definitely different. Sorry, the only thing I could do this time is enhance the collection of debugging information (3019afa).
Sorry for the long silence on this issue. I was experimenting with different configurations to look for clues about this problem. Firebird stalled a few more times. I even restored the database from a backup to rule out the possibility of metadata corruption. This time Firebird worked for around 5 days after a fresh restart and then stalled again today (to be precise, last night).
Today I made another 2 dumps that I believe were taken right before Firebird hung.
From time to time we also have the same issue: Firebird stops accepting new connections and selects from MON$ tables freeze in active clients. Unfortunately we could not produce a dump. Hope this issue will be resolved with the help of the new dumps.
I do not remember where to download the core dumps from. Also please put the binaries & debug info there.
@AlexPeshkoff I just resent the access information for the core dumps to your email.
Looks like you have embedded connections to your database, and those embedded connections sometimes hang. I see no other reason for the current behavior. To better understand what happens, next time you have that problem please do the following in addition to the core dump:
Today FB again started to show hanging symptoms, and I made a core dump and also saved the fb_lock_print output, as suggested, into somefile.txt ;)
On 6/19/23 11:02, agx4ever wrote:
Today again FB started to show hanging symptoms and I made core dump
and also fb_lock_print as suggested into somefile.txt ;)
All requested files are uploaded to the same share under folder:
2023-jun-19
Does procedure XRF_IS_UNIT_COMPENSATED$S appear interesting (non-trivial) to you in any respect?
I've found something interesting in this dump / lock_print.
Are you ready to run a special build (with some development checks turned on that are missing in the regular production build)?
Yes, of course I'm ready to run a special build. Just give it to me and I will definitely give it a try.
Update on the issue.
I have successfully acquired 4 core dumps with the provided special FB build. All files are uploaded to the previous file share under the folder "2023-jul-29". The debuginfo, Firebird binaries and libs used are also there.
Good news - all 4 dumps show exactly the state that I expected; all are reasonably similar and rather informative.
Very good news!
I see you've sent a very truncated log. What is in the log AFTER the abort is not interesting; I want to see whether something happened right BEFORE the abort.
It's the full log as it is on the server. I haven't removed any entries. There is nothing interesting there.
Sorry - it looked truncated. And no - there are no such options. OK, a negative result is also a result.
Please install the new special build from
Thank you for your fast response!
FB3 is almost unaffected - an AST on a change of encryption state should not happen too often (unlike the TPC one since FB4). Anyway, I backported the required part of the fix to it.
@agx4ever You can upgrade to tomorrow's snapshot (just make sure it's OK on http://firebirdtest.com/); it will contain the fix for your bug. But if you can provide me with 2 or 3 more dumps, it will help us make sure we fixed all possible causes of the bug.
After I installed the snapshot build with this fix, everything works fine and the Firebird server hasn't crashed for two months already.
On 10/13/23 11:16, agx4ever wrote:
After I installed snapshot build with this fix - everything works fine
and Firebird server hasn't crashed already two months.
It seems that this issue is fixed. Thank you for your fast support and
problem debugging!
When will this fix be published in a regular release build?
It's present in 4.0.3, but I highly recommend that you wait for 4.0.4 - the new regressions are too bad.
I have a server that runs FB3 and I want to migrate to FB4. I created a new test server and installed the latest FB4. It works fine for a while: it can run for a few days, or at most 2 weeks, without problems, and then the Firebird server suddenly stops accepting new connections. On the server I can see firebird in the process list, but it simply doesn't accept new connections. When I stop and then start firebird, it works fine again. The error log does not show anything unusual.
I tried the same installation and the same configuration on a different server, to exclude hardware problems or software misconfiguration, and the result is the same: the FB process stops accepting new connections after some time.
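Because the stall appears only after days of uptime, a periodic probe can help record when it starts. The following is a minimal sketch, not part of the original report: it only checks that a TCP connection to the listener port can be opened within a timeout. A successful TCP connect does not prove that a database attach would complete, so treat it as a coarse first signal. The host name and timeout are assumptions; 3050 is the RemoteServicePort from the configuration in this report.

```python
import socket
import time


def probe_tcp(host: str, port: int, timeout_s: float = 5.0) -> tuple[bool, float]:
    """Try to open a TCP connection; return (connected, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            connected = True
    except OSError:
        connected = False
    return connected, time.monotonic() - start


if __name__ == "__main__":
    # "localhost" is an assumed host name, used for illustration only.
    ok, elapsed = probe_tcp("localhost", 3050)
    print(f"connected={ok} after {elapsed:.2f}s")
```

Run from cron or a loop, the timestamps of the first failed or unusually slow probe narrow down when to take a core dump.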
OS: Linux, CentOS Stream release 8
Firebird 4.0.2 - Firebird-4.0.2.2816-0.amd64
--- firebird.conf ---
TempDirectories = /mnt/data0/fb4/tmp/
DefaultDbCachePages = 2048
UseFileSystemCache = true
TempBlockSize = 8M
TempCacheLimit = 64M
InlineSortThreshold = 2048
AuthServer = Srp256
AuthClient = Srp256, Srp
UserManager = Srp
ReadConsistency = 0
RemoteServicePort = 3050
LockMemSize = 1M
LockHashSlots = 8191
ServerMode = SuperClassic
--- databases.conf ---
dev_main = /mnt/data0/fb4/dev_main.fdb
{
DatabaseGrowthIncrement = 128M
DeadlockTimeout = 10
DefaultDbCachePages = 32768
FileSystemCacheThreshold = 1048576
GCPolicy = combined
LockHashSlots = 49999
LockMemSize = 40M
}
--- no replication configuration ---
The last time the problem occurred I made dumps of the fbguard and firebird processes with the "gcore" command. I can send those dumps by email (or another convenient way, just tell me how).
If there is anything else I can do to provide more information, please tell me.
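For reference, the gcore step mentioned above can be sketched as a helper that builds the command for each process. The PIDs and output prefixes below are hypothetical placeholders; the sketch assumes the gcore script shipped with gdb, which takes `-o prefix` and writes the dump to `prefix.pid`.

```python
def gcore_command(pid: int, prefix: str) -> list[str]:
    """Build a gcore invocation that dumps the process with the given PID."""
    return ["gcore", "-o", prefix, str(pid)]


def dump_path(pid: int, prefix: str) -> str:
    """File name gcore is expected to write for this prefix and PID."""
    return f"{prefix}.{pid}"


# Hypothetical PIDs, for illustration only; look up the real ones with ps.
targets = {"firebird": 4321, "fbguard": 4300}
for name, pid in targets.items():
    prefix = f"/tmp/{name}-core"
    print(" ".join(gcore_command(pid, prefix)), "->", dump_path(pid, prefix))
# To actually run one: subprocess.run(gcore_command(pid, prefix), check=True)
```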