This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

Manually connect to signaling server #2508

Closed
oed opened this issue Oct 3, 2019 · 21 comments
Labels
exp/expert Having worked on the specific codebase is important kind/bug A bug in existing code (including security flaws) P2 Medium: Good to have, but can wait until someone steps up status/ready Ready to be worked topic/libp2p Topic libp2p

Comments

@oed
Contributor

oed commented Oct 3, 2019

  • Version: 0.36.4
  • Platform: Browser: Chromium / Firefox
  • Subsystem: libp2p

Type: Question

Severity: Low (an optional functionality does not work)

Description:

I'm trying to figure out if it's possible to connect to a signaling server without putting it in config.Addresses.Swarm. I have two main reasons for this: (1) webrtc-signal can make ipfs throw an error when starting if the user has a plugin that blocks WebRTC, and (2) starting ipfs with signaling servers makes the startup a few seconds slower.

I've tried connecting to signaling servers using ipfs.swarm.connect (both with and without peerIds), but this throws errors, e.g.

// Websocket signaling server
Error: No available transports to dial peer QmbrLCJ9GcbR1sHh9pjwyn8eo45pCD6YW9F3jW3QJ42XBk!
    at createError (index.js:4)
    at CONNECTION_FAILED (errors.js:6)
    at nextTransport (index.js:227)
    at eval (index.js:240)
    at eval (transport.js:106)
    at eval (index.js:59)
    at eval (tryEach.js:78)
    at eval (once.js:12)
    at replenish (eachOfLimit.js:61)
    at iterateeCallback (eachOfLimit.js:50) undefined
// Webrtc signaling server
3box.js:1918 wr Error: Dial was aborted
    at createError (index.js:4)
    at DIAL_ABORTED (errors.js:7)
    at Queue.abort (queue.js:143)
    at Queue.blacklist (queue.js:168)
    at ClassIsWrapper.eval (queue.js:251)
    at Object.onceWrapper (events.js:238)
    at ClassIsWrapper.emit (events.js:146)
    at ClassIsWrapper.emit (base.js:36)
    at ClassIsWrapper.close (base.js:27)
    at nextTransport (index.js:226) undefined

Is there some other way of connecting to a signaling server without specifying it in the initial config?

@alanshaw alanshaw added kind/bug A bug in existing code (including security flaws) exp/expert Having worked on the specific codebase is important topic/libp2p Topic libp2p P2 Medium: Good to have, but can wait until someone steps up status/ready Ready to be worked labels Oct 21, 2019
@alanshaw
Member

Unfortunately no, and really, not being able to connect to the signalling server should not bring the node down (or at least there should be an option to allow it).

Please be aware that the "*-star" protocols are being phased out in favour of relay and distributed signalling. See more here libp2p/js-libp2p#385

@oed
Contributor Author

oed commented Oct 21, 2019

Thanks @alanshaw! I had a look at the issue you linked to but found it unclear what my alternatives are. Is it still a work in progress, or is there some other way of discovering other browser nodes that is available now?
I've looked at the docs for delegate routers, would that provide everything needed or would I still need to use something else as well?

@lidel
Member

lidel commented Nov 13, 2019

not being able to connect to the signalling server should not bring the node down (or at least there should be an option to allow).

Devil in the details :-) We switched to js-libp2p-websocket-star-multi some time ago.
This means, js-ipfs will fail to start only if NONE of signaling servers is up
(see how we set ignore_no_online in /src/core/runtime/libp2p-browser.js#L23).

TL;DR: If you have only one ws-star configured, it becomes a single point of failure.

@oed Try adding 2-3 backup signaling servers to make your setup more robust, or set libp2p.wsStarIgnoreErrors: true in options passed to the constructor of js-ipfs if you are ok with running node without any of them (keep in mind it means your node has no means of discovery, as DHT is not available in JS yet).
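As a rough sketch of the two suggestions above, the options object might be assembled like this. The helper name `buildIpfsOptions` and the three server addresses are hypothetical placeholders, and the placement of `wsStarIgnoreErrors` directly under `libp2p` simply follows the wording of the comment; it has not been checked against a specific js-ipfs release.

```javascript
// Sketch only: builds constructor options with several backup
// signaling servers. Addresses below are made-up examples.
function buildIpfsOptions (signalingServers, ignoreErrors = false) {
  return {
    // Assumption: option name and placement taken from the comment above.
    libp2p: { wsStarIgnoreErrors: ignoreErrors },
    config: {
      Addresses: {
        // Copy the array so later changes by the caller do not leak in.
        Swarm: [...signalingServers]
      }
    }
  }
}

const options = buildIpfsOptions([
  '/dns4/ws-star-1.example.com/tcp/443/wss/p2p-websocket-star',
  '/dns4/ws-star-2.example.com/tcp/443/wss/p2p-websocket-star',
  '/dns4/ws-star-3.example.com/tcp/443/wss/p2p-websocket-star'
], true)
```

With more than one server listed, a single unreachable server is no longer a single point of failure; with the ignore flag set, the node may start with no discovery at all, as noted above.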

@oed
Contributor Author

oed commented Nov 13, 2019

Thanks, that's very helpful @lidel!

achingbrain added a commit that referenced this issue Feb 5, 2020
The user may start the node with no swarm addresses to speed up
startup times - if they then use libp2p to listen on new transports
we should return the addresses currently being listened on instead
of those configured at startup.

refs #2508
@achingbrain
Member

Since the async/await refactor of ipfs and libp2p, a lot has changed behaviour-wise.

You can now specify no swarm multiaddrs during startup and use ipfs.libp2p.transportManager.listen() after the libp2p node has been started to listen on new addresses (though there's a bug with the node addresses reported by ipfs.id() which will be fixed by #2749).

It'll also not explode if some of the addresses are unlistenable on (try adding the multiaddr '/dns4/star-signal.cloud.ipfs.team/tcp/443/wss/p2p-webrtc-star' to the servers array during browser tests to the test added by the PR above - star-signal.cloud.ipfs.team is currently down a lot more than it's up but it doesn't stop the node from starting in the test as it can still connect to the other signalling server started by aegir before the test run).

Apart from that the same caveats apply though - you must pass some addresses to ipfs.libp2p.transportManager.listen() and at least one of them must be listenable on.
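A minimal sketch of that flow, assuming `ipfs` is an already started js-ipfs node and that `transportManager.listen()` accepts an array of addresses as described above. The helper name `listenOnSignalingServers` is hypothetical, and whether the addresses must be strings or Multiaddr instances depends on the libp2p version in use, so treat this as an illustration rather than a verified recipe.

```javascript
// Sketch only: start the node with no swarm addresses, then ask
// libp2p to listen on signaling-server addresses afterwards.
async function listenOnSignalingServers (ipfs, addrs) {
  // The caveat above: some addresses must be passed in.
  if (!Array.isArray(addrs) || addrs.length === 0) {
    throw new Error('At least one listen address is required')
  }
  // Assumption: listen() takes the addresses as its argument.
  await ipfs.libp2p.transportManager.listen(addrs)
}

// Usage in a page where Ipfs is loaded (server address is a placeholder):
// const ipfs = await Ipfs.create({ config: { Addresses: { Swarm: [] } } })
// await listenOnSignalingServers(ipfs, [
//   '/dns4/your-signal-server.example.com/tcp/443/wss/p2p-webrtc-star'
// ])
```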

N.b. you can't use js-libp2p-websocket-star-multi or js-libp2p-websocket-star with the new ipfs or libp2p right now as they've yet to be refactored to support the new async/await transport API.

alanshaw pushed a commit that referenced this issue Feb 6, 2020
The user may start the node with no swarm addresses to speed up
startup times - if they then use libp2p to listen on new transports
we should return the addresses currently being listened on instead
of those configured at startup.

refs #2508
alanshaw pushed a commit that referenced this issue Feb 7, 2020
The user may start the node with no swarm addresses to speed up
startup times - if they then use libp2p to listen on new transports
we should return the addresses currently being listened on instead
of those configured at startup.

refs #2508
@lidel
Member

lidel commented Feb 17, 2020

Ack, I believe the plan is to not port websocket-star and to switch to the new websocket stardust transport (cc @vasco-santos: where can this work be tracked?)

@vasco-santos
Member

Ack, I believe the plan is to not port websocket-star and to switch to the new websocket stardust transport (cc @vasco-santos: where can this work be tracked?)

That's true, we are going to release libp2p-stardust, which will be an improvement over libp2p-websocket-star. You can follow libp2p/js-libp2p-stardust#14. It is in a review stage now.

@oed
Contributor Author

oed commented May 28, 2020

@vasco-santos is the libp2p-stardust ready to be used? We are in the process of updating ipfs to > 0.41 and can no longer use websocket-star
I'm assuming I could just add it to the Swarm array like this on browser nodes?
(from the example in the repo readme, we would deploy our own instance)

Swarm: [
  '/dns4/stardust.mkg20001.io/tcp/443/wss/p2p-stardust/'
]

@vasco-santos
Member

Hello @oed

Thanks for reaching out. Are you able to use webrtc-star as an alternative to stardust? If yes, I highly recommend that you use webrtc-star instead.

Bear in mind that both of these libp2p transports need a signalling server. The publicly deployed signalling servers are still outdated AFAIK. If they were updated, people still relying on them would also have issues. The best option is to deploy your own, since the availability of these servers is not guaranteed. Check this recent twitter thread for more information: https://twitter.com/vascosantos10/status/1262769647482482689

If this does not help you let me know, and we can discuss further here.

@oed
Contributor Author

oed commented May 28, 2020

Thanks, yeah as I mentioned we are deploying our own instance. What's the reason you are recommending webrtc-star over stardust?

This is the webrtc package you are talking about right? https://github.com/libp2p/js-libp2p-webrtc-star

The reason I'm hesitant about using that is that we were running into issues in Firefox last time we tried it. Also, as you can see in issue #3022, webrtc is not supported if you want to run ipfs in a worker, which we intend on doing in the future.

Any other reason not to use stardust?

@vasco-santos
Member

@oed The reason for recommending webrtc-star is that it is widely used while stardust was mostly an experiment. For both of them, I think you will only need to change the swarm listening multiaddr to a valid one, and run your own instance of the server. The firefox support should not be a problem AFAIK.

This is the webrtc package you are talking about right? https://github.com/libp2p/js-libp2p-webrtc-star

Yes!

The reason I'm hesitant about using that is that we were running into issues in Firefox last time we tried it. Also, as you can see in issue #3022, webrtc is not supported if you want to run ipfs in a worker, which we intend on doing in the future.

I would not worry about this for now. Our goal is to get rid of the star servers during the next couple of months and fully rely on circuit relay nodes and the libp2p/js-libp2p-rendezvous protocol (libp2p/specs/rendezvous). The circuit relay is already implemented and you can use it in the browser context. The main reason that we still need to use a star server is that browser nodes need to discover each other to establish a connection. This connection is currently established via the star servers in the examples, but you could also do it using the circuit relay if you knew the peers' addresses. The remaining piece of this puzzle is discovery, which is where the rendezvous protocol will come in. You can also read more about this migration to remove the star servers on libp2p/js-libp2p#385

FYI, I just started working on the rendezvous protocol a couple of days ago and it will be my main focus for the following weeks. You can expect this to be released in [email protected] release. With this, I plan to write a browser user guide to explain in detail on how to use circuit relay and rendezvous to setup IPFS/libp2p in the browser. We just shipped today the rc for [email protected], so you can expect this in the next release! :)

@oed
Contributor Author

oed commented May 28, 2020

Thanks again @vasco-santos, very helpful. Will definitely try using webrtc-star again for now. Why it wasn't working before might have been a misconfiguration on our end.

@vasco-santos
Member

Let me know if you hit any issue @oed

@oed
Contributor Author

oed commented May 29, 2020

Still getting the same error as I remembered in firefox. Perhaps I'm doing something obvious wrong?
[Image: Screenshot 2020-05-29 at 11 11 50]

Seems to be working fine in Chromium btw 👍

@oed
Contributor Author

oed commented May 29, 2020

Actually, in Chromium I get the following error when opening two separate tabs (one normal, one incognito) that connect to the same webrtc-star:

[Image: Screenshot 2020-05-29 at 15 32 40]

Trying to manually connect one node to another doesn't work either.

[Image: Screenshot 2020-05-29 at 15 37 34]

Could this be due to some simple misconfiguration @vasco-santos ?

@vasco-santos
Member

Can you point me to a repo so that I can give it a try? A repo, or the configuration plus the app code you are running, would work. I will try to test this out over the weekend and get back to you.

@oed
Contributor Author

oed commented May 29, 2020

Thanks @vasco-santos!
Here is a very minimal example: https://gist.github.com/oed/8cdabbc75db2a544c46598d065fbc125

It will throw the errors above if:

  • Firefox: load the page and wait for a bit
  • Chromium: load the page in two tabs and wait for a bit

@vasco-santos
Member

vasco-santos commented May 30, 2020

@oed

I have done some experiments with this and found several things that we need to consider, which I would like to explain in parts.

First of all, I grabbed your example, but used it as I recommend you should:

const opts = {
  preload: { enabled: false },
  config: {
    Bootstrap: [],
    Addresses: {
      Swarm: ['/dns4/p2p.3box.io/tcp/9091/wss/p2p-webrtc-star/']
    }
  }
}
Ipfs.create(opts).then(ipfs => {
  window.ipfs = ipfs

  ipfs.libp2p.on('peer:discovery', (peer) => {
    console.log('discovered', peer)
  })

  ipfs.libp2p.on('peer:connect', async (peer) => {
    console.log('connected', peer)

    ipfs.swarm.peers().then(peers => console.log('current peers connected: ', peers))
  })
})

With this, I tested chrome and firefox.

In Chrome this just worked as expected. I start one peer and it logs its swarm listening address. When I start another peer in an incognito window, they both log that they discovered the other peer and then they both log that they are connected to it.

In Firefox, I ran into some problems. I started one peer and it logged the swarm listening address as expected. Then, I tried to start a new node in an incognito window and basically nothing happened.

With that, I tried to test one node in chrome and one node in firefox, which also worked. I ended up figuring out what was causing this and if you use 2 chrome tabs the same thing happens. Browser compatibilities life 😢 What is the "problem"? In firefox (any window) and chrome (same window), an IPFS node will use the same ipfs repo. This way, when we start a second node, it will simply not start because there was one peer already running.

So, I tried out setting a random repo string in the configuration for test purposes:

const opts = {
  preload: { enabled: false },
  repo: Math.random().toString(36).substring(7),
  config: {
    Bootstrap: [],
    Addresses: {
      Swarm: ['/dns4/p2p.3box.io/tcp/9091/wss/p2p-webrtc-star/']
    }
  }
}
Ipfs.create(opts).then(ipfs => {
  window.ipfs = ipfs

  ipfs.libp2p.on('peer:discovery', (peer) => {
    console.log('discovered', peer)
  })

  ipfs.libp2p.on('peer:connect', async (peer) => {
    console.log('connected', peer)

    ipfs.swarm.peers().then(peers => console.log('current peers connected: ', peers))
  })
})

This configuration guarantees that each tab/window will have a node running with a random repo. As a consequence, any time we open a tab/window a new node is created and the other is discovered and connected. I could have this working on Chrome and Firefox without visible issues.

The question here is whether this is what is expected to happen. @Gozala is working on defining this in ipfs/js-ipfs#3022, and you are welcome to provide your thoughts there. If a node is expected to be shared between tabs, this type of test would not be possible between two tabs in one browser, for example. Anyway, this is out of the scope of this issue; I just wanted to point out the uncertainty around this topic.


In your previous comments, you mentioned that you were doing a manual dial. While you are able to, I would recommend just having the peers dial each other automatically after discovery, since this is the default behaviour of js-ipfs and js-libp2p. Anyway, if you have a good reason for this manual dial, you can disable the autoDial feature as follows:

const opts = {
  preload: { enabled: false },
  repo: Math.random().toString(36).substring(7),
  libp2p: {
    config: {
      peerDiscovery: {
        autoDial: false
      }
    }
  },
  config: {
    Bootstrap: [],
    Addresses: {
      Swarm: ['/dns4/p2p.3box.io/tcp/9091/wss/p2p-webrtc-star/']
    }
  }
}
Ipfs.create(opts).then(ipfs => {
  window.ipfs = ipfs

  ipfs.libp2p.on('peer:discovery', (peer) => {
    console.log('discovered', peer)
  })

  ipfs.libp2p.on('peer:connect', async (peer) => {
    console.log('connected', peer)

    ipfs.swarm.peers().then(peers => console.log('current peers connected: ', peers))
  })
})

With this, you can do ipfs.swarm.connect('/dns4/p2p.3box.io/tcp/9091/wss/p2p-webrtc-star/p2p/QmNjXUEFBgijKK3FAicZvSpaoWTQYV2HJD37U2tyAcxySg') when you have two open windows. The event handlers for the connections will log the connection in both windows. I also tried this in both browsers and cross browser and it worked.
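The address passed to ipfs.swarm.connect above is the signaling server's multiaddr followed by /p2p/ and the remote peer's ID. A small sketch of building it is shown here; `toStarPeerAddr` is a hypothetical helper name, not part of js-ipfs or js-libp2p.

```javascript
// Sketch only: joins a signaling-server multiaddr and a peer ID into
// the dialable form used in the example above.
function toStarPeerAddr (signalAddr, peerId) {
  // Drop a trailing slash so the result has no empty path segment.
  const base = signalAddr.endsWith('/') ? signalAddr.slice(0, -1) : signalAddr
  return `${base}/p2p/${peerId}`
}

const target = toStarPeerAddr(
  '/dns4/p2p.3box.io/tcp/9091/wss/p2p-webrtc-star/',
  'QmNjXUEFBgijKK3FAicZvSpaoWTQYV2HJD37U2tyAcxySg'
)
// With autoDial disabled, the manual dial would then be:
// await ipfs.swarm.connect(target)
```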


Bear in mind that if you are using a web app and your friend on a different laptop is using the same web app via the same signal server, this should just work, since each of you will have a different ipfs repo, and consequently a different peer ID.

I could not reproduce the issues in your attached pictures, though. I have Firefox 76.0.1 and Chrome 81.0.4044.129. I had Chrome open for 15 minutes with both peers connected. From the output, it seems related to the webrtc protocol. Could you try the code that I provided above and, if the same problems occur, send me the full outputs?

But, I have an idea for the last one, the operation aborted.

Since a few versions ago, js-libp2p has had some improvements to its dialer. A peer might listen on several different multiaddrs, so js-libp2p will try to dial several multiaddrs in parallel and abort the pending dials once the first one succeeds and you have a connection. While this is not the case here, I think that this is related. As I wrote above, when a peer is discovered, libp2p will try to dial it immediately by default. If you did a manual dial while the automatic dial was already being performed, the manual dial is aborted because the first one won the "race". In normal circumstances, I think that this should not occur. I believe that the issue here is that the automatic dial on discovery uses the resolved multiaddr and not the dns4 multiaddr. This results in the libp2p dialer treating these as different multiaddrs. We will be working on adding DNS resolvers for multiaddrs, and we should take this case into consideration when we do it. cc @jacobheun
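To illustrate the "first success wins" behaviour described above, here is a generic sketch using Promise.any and AbortController. It is not js-libp2p's dialer code; `dialFirst` and the `dialOne(addr, { signal })` callback are hypothetical names used only for this example, and it assumes a runtime where both globals are available (recent browsers, Node.js 15 or later).

```javascript
// Illustrative only: dial every address in parallel, keep the first
// connection that succeeds, and abort the attempts still pending.
async function dialFirst (addrs, dialOne) {
  const controllers = addrs.map(() => new AbortController())
  const attempts = addrs.map((addr, i) =>
    dialOne(addr, { signal: controllers[i].signal })
      .then(conn => ({ conn, winner: i }))
  )
  // Promise.any fulfils with the first successful attempt and rejects
  // with an AggregateError only if every attempt fails.
  const { conn, winner } = await Promise.any(attempts)
  // The losers are aborted, which is what surfaces as an
  // "aborted" error to whoever started those dials.
  controllers.forEach((controller, i) => {
    if (i !== winner) controller.abort()
  })
  return conn
}
```

In this model, a manual dial that loses the race to an automatic dial of the same peer would be one of the aborted attempts.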

@oed
Contributor Author

oed commented May 30, 2020

Using a random repo for each instance makes sense. Figured out the main issue, however I'm still seeing inconsistencies.

Browser versions:
Chromium: 81.0.4030.0
Chrome: 83.0.4103.61
Firefox: 76.0.1

Now whenever I open the example I discover a node with PeerId: QmQazseSLaTELKbpjEjboRexkAibBSMVmqd6hS4u8x4xbr. I assume this might be your peer @vasco-santos?

I was having the errors below until I disabled my VPN. After my VPN was disabled I was able to connect to peers on my own computer.

Connection errors with VPN enabled

When I open any combination of browsers and tabs I get these errors when it's trying to connect.

In Firefox it looks like this:

[Image: Screenshot 2020-05-30 at 18 38 31]

In Chromium and Chrome it looks like this:

[Image: Screenshot 2020-05-30 at 18 40 53]

iOS Safari issues

Once I figured out that the issue was caused by my VPN, I wanted to see if this worked on mobile. I simply loaded the example in Safari on iOS, but I'm not seeing any new peers show up in the console of my desktop browser (I have not set up a console on mobile).

Is there any reason this would not work on mobile?

Testing with a friend

Asked a friend to load the same page through ngrok, using the same Chrome version as above. We are completely unable to establish a connection between our two browsers. Seeing the same issue in the console as in the images above.

Conclusion

Unless I'm missing something it seems like the webrtc star is still quite fragile, and not as usable as websocket-star was for us previously. Will try using the stardust service, but if there are similar problems there we won't be able to update from [email protected].* which we are currently using.
No mobile support is also a major issue (could there be something else wrong there?).

Edit:

We're not really interested in doing manual dials btw. Automatic dial is what we need.

@vasco-santos
Member

Hey @oed

Regarding the VPN issues and iOS Safari, I would ask you if you could open an issue on libp2p/js-libp2p. We need to consider and test this in libp2p.

Could you get this working with stardust? Rendezvous is on its way to hopefully solve all these issues.

@oed
Contributor Author

oed commented Jun 10, 2020

Hey @vasco-santos sorry for the slow response here. We are currently going ahead with webrtc-star. Will post new issues there.

This particular issue (connecting to a signaling server after ipfs construction) has been solved by #2508 (comment).
