Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http gateway freezes: no response code will be returned #7032

Closed
RubenKelevra opened this issue Mar 25, 2020 · 10 comments
Closed

http gateway freezes: no response code will be returned #7032

RubenKelevra opened this issue Mar 25, 2020 · 10 comments
Labels
kind/bug A bug in existing code (including security flaws) need/author-input Needs input from the original author

Comments

@RubenKelevra
Copy link
Contributor

Version information:

go-ipfs version: 0.4.23-6ce9a355f
Repo version: 7
System version: amd64/linux
Golang version: go1.14

Description:

I tried to list a directory via the local web-gateway and couldn't get an answer via the browser. Doing the same via the ipfs command is on the other hand very fast:

$ time ipfs ls QmdQogukw7icepHraCEgrqUiYRehqXbGiXkRXYX1i8rUZq > /dev/null
real    0m0,839s
user    0m0,191s
sys     0m0,312s

using an (already cached IPNS is also very fast):

$ time ipfs ls /ipns/pkg.pacman.store/arch/x86_64/default/community > /dev/null
real    0m0,806s
user    0m0,197s
sys     0m0,299s

While this wget requests connects but never gets a code back (within 2 minutes).

$ time wget http://127.0.0.1:8080/ipfs/QmdQogukw7icepHraCEgrqUiYRehqXbGiXkRXYX1i8rUZq

After a restart of the node this issue disappeared. Any idea how to debug this in the future?

@RubenKelevra RubenKelevra added the kind/bug A bug in existing code (including security flaws) label Mar 25, 2020
@Stebalien
Copy link
Member

Follow the guide in https://github.com/ipfs/go-ipfs/blob/master/docs/debug-guide.md. The goroutine dump will be especially helpful in figuring out where the request is stuck.

@hsanjuan hsanjuan added the need/author-input Needs input from the original author label Mar 26, 2020
@RubenKelevra
Copy link
Contributor Author

RubenKelevra commented Mar 27, 2020

@Stebalien

I had a similar occurrence, where the Web-API basically stalled for 2-3 minutes, for no apparent reason.

I wrote a goroutine dump for you while it happened (the same computer/setup but a different IPFS-version, this time build from the git repo: 729e053)

ipfs.stacks.log
ipfs_config.txt

@Stebalien
Copy link
Member

Could you take the rest of the dumps? One thing I'm noticing is that zap (the logging framework) appears to be blocking. Do you have debug logging enabled?

@RubenKelevra
Copy link
Contributor Author

Yes, debug logging was on.

When it occurs the next time, I'll do all dumps :)

@RubenKelevra
Copy link
Contributor Author

RubenKelevra commented Mar 28, 2020

Well, my testing installation (for 0.5.0-dev-84724a360) now got completely stuck. I started this time with a completely empty repo/block storage and just tried to cluster follow the ipfs-websites cluster.

This is what the cluster-follower has logged on the console until I terminated it:

output_hanging_ipfs.txt

Nothing else was running - just the cluster follower and the WebUI.

The log of ipfs hasn't printed any errors (haven't changed the log level to debug this time) - and I cannot get any of the dumps from the API... the curls are running for 10 Minutes now with 0 bytes fetched.

Any idea how to debug this? :)

@RubenKelevra
Copy link
Contributor Author

A overview of the processes:

$ pstree -acnp | grep ipfs | grep -v grep
  |   |   |   `-ipfs,324159 diag sys
  |   |   |       |-{ipfs},324160
  |   |   |       |-{ipfs},324161
  |   |   |       |-{ipfs},324162
  |   |   |       |-{ipfs},324163
  |   |   |       |-{ipfs},324164
  |   |   |       |-{ipfs},324165
  |   |   |       |-{ipfs},324166
  |   |   |       `-{ipfs},324167
  |-ipfs,243084 daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
  |   |-{ipfs},243085
  |   |-{ipfs},243086
  |   |-{ipfs},243087
  |   |-{ipfs},243088
  |   |-{ipfs},243089
  |   |-{ipfs},243090
  |   |-{ipfs},243091
  |   |-{ipfs},243092
  |   |-{ipfs},243207
  |   |-{ipfs},243297
  |   |-{ipfs},244563
  |   `-{ipfs},244564
$ ps -eLfa | grep -E "324159|324160|324161|324162|324163|324164|324166|324165|324167|243084|243085|243086|243087|243088|243089|243090|243091|243092|243207|243297|244563|244564|^UID" | grep -v "grep"
UID          PID    PPID     LWP  C NLWP STIME TTY          TIME CMD
ruben     243084       1  243084  0   13 06:28 ?        00:00:00 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243085  3   13 06:28 ?        00:24:02 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243086  0   13 06:28 ?        00:00:02 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243087  0   13 06:28 ?        00:00:00 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243088  0   13 06:28 ?        00:00:02 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243089  0   13 06:28 ?        00:00:00 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243090  0   13 06:28 ?        00:00:02 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243091  0   13 06:28 ?        00:00:03 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243092  0   13 06:28 ?        00:00:00 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243207  0   13 06:29 ?        00:00:02 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243297 98   13 06:30 ?        10:26:05 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  244563  0   13 06:35 ?        00:00:00 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  244564  1   13 06:35 ?        00:12:12 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     324159  324140  324159  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324160  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324161  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324162  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324163  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324164  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324165  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324166  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324167  0    9 15:50 pts/13   00:00:00 ipfs diag sys

So just one thread is busy looping, how can we get a stack trace of it? :)

@RubenKelevra
Copy link
Contributor Author

I wrote with ttocslliwDiscor4 on IRC, he assumed it's a bug in go-yamux and go-libp2p 0.7.2 should have fixed it (my build is g84724a360).

I used SIGABRT to kill it and that's the stacktrace:

ipfs_stacktrace.txt

@Stebalien
Copy link
Member

So, there was a bug in go-libp2p but it's unrelated (potentially causes streams to fail).

The bug here looks like golang/go#38070 which will be fixed in go 1.14.2.

If you want to work around the issue now, you can use https://github.com/libp2p/go-yamux/tree/fix/re-disable-write-coalescing. It doesn't fix the bug in go, but this branch reduces the likelihood that the bug will be triggered.

@Stebalien
Copy link
Member

Closing in favor of libp2p/go-libp2p#858.

@RubenKelevra
Copy link
Contributor Author

If you want to work around the issue now, you can use https://github.com/libp2p/go-yamux/tree/fix/re-disable-write-coalescing. It doesn't fix the bug in go, but this branch reduces the likelihood that the bug will be triggered.

No it's not needed, I just want to test this build until it gets released to make sure I can point out some bugs early. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws) need/author-input Needs input from the original author
Projects
None yet
Development

No branches or pull requests

3 participants