Implement SOCKS v4, v5 Proxy #203

Closed
sethmlarson opened this issue Aug 13, 2019 · 25 comments
Labels
help wanted Extra attention is needed
Milestone

Comments

@sethmlarson
Contributor

Related: #36

@sethmlarson sethmlarson added the help wanted Extra attention is needed label Aug 13, 2019
@sethmlarson
Contributor Author

So the issue with this is that there's no sans-I/O implementation of the SOCKSv4 and SOCKSv5 protocols that doesn't require us to add 3 dependencies. The protocols are so simple that I'm actually in favor of writing our own library that has no dependencies.

@yeraydiazdiaz
Contributor

I'm happy to lend a hand on this @sethmlarson. Does it depend on #259?

@sethmlarson
Contributor Author

sethmlarson commented Sep 2, 2019

Thanks @yeraydiazdiaz! ❤️

It'll definitely intersect with the configuration stage on the client, but the dispatcher implementation is separate. Let's start by getting a sans-I/O implementation of SOCKSv4 and v5 without dependencies and go from there.

I actually think that the sans-I/O implementation should be its own library, maybe on the python-http org, but it can start on one of our personal accounts. Would you like to be the originator, or should I create a repo and add you so you can take it from there?

@yeraydiazdiaz
Contributor

I'll definitely need some help, so it might be easier if it's all set up in a non-personal repo from the start 🙂

@sethmlarson
Contributor Author

I pushed the initial commit: https://github.com/sethmlarson/socks
Feel free to make massive changes, as nothing currently works end to end; it's just the result of a few hours of programming.
I've sent collaborator requests to everyone interested. :)

The repo will live under python-http once we release for the first time!

@mIcHyAmRaNe

mIcHyAmRaNe commented Apr 10, 2020

While waiting for the implementation, here is a temporary workaround for people who want to use SOCKS4/5 with httpx, using PySocks:

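# pip install PySocks

import socket

import httpx
import socks

# Send every new socket through the local SOCKS5 proxy at 127.0.0.1:9050.
socks.set_default_proxy(socks.SOCKS5, "127.0.0.1", 9050)
# Monkey-patch the socket class; this affects every library in the process.
socket.socket = socks.socksocket

URL = 'http://ifconfig.me/ip'

with httpx.Client() as client:
    resp = client.get(URL)
    # Prints the public IP as seen by the remote server.
    print(resp.text)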

@romis2012

If it's relevant, there is a third-party SOCKS implementation: httpx-socks

@bbkane

bbkane commented Jul 10, 2020

It looks like that one doesn't support trio (my favorite async backend).

@romis2012

It looks like that one doesn't support trio (my favorite async backend).

Trio support was added in version 0.2.0.

@tomchristie tomchristie added this to the v1.1 milestone Jul 30, 2020
@cdeler
Member

cdeler commented Sep 16, 2020

@tomchristie

I wrote a small PoC with httpcore + PySocks that requests example.com through a local SOCKS5 proxy.

While working on the PoC, I added a new method to AsyncioBackend named open_socks_stream (I decided that SOCKS4/5 is a transport for us, like TCP, SSL, or UDS).

If this idea (PySocks and a new method in the backends) works for you, I can start adding SOCKS proxy support to httpcore.

@florimondmanca
Member

@cdeler From what I understand, SOCKS is an application-level protocol that sits on top of TCP (well, there's UDP in SOCKS5, but that's not something we should be thinking about for now), so in theory we shouldn't really need a new type of open_* method on concurrency backends.

Also, since I don't think it's been linked here yet, @yeraydiazdiaz started a lovely piece of work on HTTPCore a few months back, based on the socksio library: encode/httpcore#51. The benefit of socksio is that it's sans-I/O, meaning that we can use it with either sync or async code, just like h11 and h2 for HTTP/1.1 and HTTP/2.

So perhaps, if anyone's interested, there'd be room for getting that work up to date. I personally think socksio and the sans-I/O approach is our safest bet there if we want to have an as-simple-and-straightforward-as-possible implementation. :-)
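
To illustrate what "sans-I/O" means here, the following is a minimal sketch of the SOCKS5 method negotiation from RFC 1928, written as a state holder that never touches a socket. It is not socksio's API; the class name `Socks5Handshake` and its methods are hypothetical and exist only for illustration.

```python
class Socks5Handshake:
    """Hypothetical sans-I/O helper for the SOCKS5 method negotiation (RFC 1928).

    It only produces bytes to send and consumes bytes that were received;
    the caller owns the socket, so the same code serves sync and async clients.
    """

    NO_AUTH = 0x00
    NO_ACCEPTABLE_METHODS = 0xFF

    def __init__(self):
        self._outgoing = bytearray()
        self.method_accepted = False

    def start(self):
        # Client greeting: VER=5, NMETHODS=1, METHODS=[no authentication].
        self._outgoing += bytes([0x05, 0x01, self.NO_AUTH])

    def data_to_send(self):
        # Hand the pending bytes to the caller and clear the buffer.
        data = bytes(self._outgoing)
        self._outgoing.clear()
        return data

    def receive_data(self, data):
        # Server reply is exactly two bytes: VER, METHOD.
        if len(data) != 2 or data[0] != 0x05:
            raise ValueError("malformed SOCKS5 method selection reply")
        if data[1] == self.NO_ACCEPTABLE_METHODS:
            raise ValueError("proxy rejected all offered authentication methods")
        self.method_accepted = data[1] == self.NO_AUTH


handshake = Socks5Handshake()
handshake.start()
greeting = handshake.data_to_send()  # the caller writes these bytes to its own socket or stream
handshake.receive_data(b"\x05\x00")  # the caller feeds back whatever it read from the proxy
```

A sync client would pass `greeting` to `sock.sendall()`, while an async client would await its own stream write; the protocol logic stays identical in both cases.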

@cdeler
Member

cdeler commented Sep 16, 2020

@florimondmanca
Whoops, I missed that there is a PR in progress...

On one hand you are right, the proxy works at L7, but it is a transport for us...

Well, in terms of PySocks, this library provides us with a socket-like interface that wraps a connection.

You can check what I've done here: encode/httpcore#186 (I've created a PR just to show an idea)

I'd like to have a chance to check encode/httpcore#51 :-) Thank you for the advice

Update: it looks like @yeraydiazdiaz has the same problem as me with HTTPS connections

@cdeler
Member

cdeler commented Sep 17, 2020

@florimondmanca you are right, socksio is really better, since it allows us to implement the code at the connection level. I closed encode/httpcore#186 (with PySocks) and opened encode/httpcore#187 (socksio) as a draft.

@cdeler
Member

cdeler commented Sep 21, 2020

I wonder what we should do with SOCKS4. It forces us to do the DNS lookup on our side (as the SOCKS4 CONNECT request does not accept domain names).

Let's imagine that someone wants to access "google.com" through a SOCKS4 proxy. What should we do there? Raise a ProxyError with "SOCKS4 protocol does not support domain names as a host address", or do the lookup on our side (does anyone know good nslookup libraries for async)?
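
For reference, here is a rough sketch of the wire format behind this question. The function name `socks4_connect_request` and the `allow_socks4a` flag are hypothetical, and `ValueError` stands in for the ProxyError mentioned above. Plain SOCKS4 carries a 4-byte IPv4 address only; the SOCKS4a extension sends a placeholder address of the form 0.0.0.x and appends the hostname, leaving resolution to the proxy.

```python
import ipaddress
import struct


def socks4_connect_request(host, port, user_id=b"", allow_socks4a=False):
    """Build a SOCKS4 CONNECT request (hypothetical helper, for illustration)."""
    # VN=4, CD=1 (CONNECT), DSTPORT as a big-endian 16-bit integer.
    header = struct.pack("!BBH", 0x04, 0x01, port)
    try:
        ip = ipaddress.IPv4Address(host)
    except ipaddress.AddressValueError:
        ip = None

    if ip is not None:
        # Plain SOCKS4: DSTIP, USERID, NUL terminator.
        return header + ip.packed + user_id + b"\x00"

    if not allow_socks4a:
        # Stand-in for raising ProxyError in the real client.
        raise ValueError("SOCKS4 protocol does not support domain names as a host address")

    # SOCKS4a: DSTIP is 0.0.0.x with x != 0, and the hostname follows USERID.
    placeholder_ip = bytes([0, 0, 0, 1])
    return header + placeholder_ip + user_id + b"\x00" + host.encode("idna") + b"\x00"
```

Whether a given proxy accepts the SOCKS4a form depends on the server, so a client that uses it still needs to handle a rejection reply.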

@romis2012

romis2012 commented Sep 21, 2020

SOCKS4 protocol does not support domain names as a host address

Some socks5 servers also don't support DNS resolving. On the other hand, some socks4 servers support it.

does anyone know good nslookup libraries for async

See how it is implemented in python-socks, which httpx-socks is based on. For the asyncio backend you can also use aiodns.
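
As a dependency-free point of comparison, asyncio itself exposes `loop.getaddrinfo()`, which runs the blocking resolver in a thread pool rather than doing native async DNS like aiodns. The sketch below is not how python-socks does it; the helper name `resolve_ipv4` is made up for illustration.

```python
import asyncio
import socket


async def resolve_ipv4(host, port):
    """Resolve host to a single IPv4 address string (hypothetical helper)."""
    loop = asyncio.get_running_loop()
    infos = await loop.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is an (address, port) tuple.
    return infos[0][4][0]


# An IP literal resolves to itself without any network traffic.
address = asyncio.run(resolve_ipv4("127.0.0.1", 80))
```

The resulting address can then be packed into a SOCKS4 request, at the cost of leaking the lookup to the local resolver instead of the proxy.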

@KOLANICH

python-hyper/hyper#441 may be related (though it is for hyper)

@jacklanda

while waiting for the implementation, this is a temporary alternative for people who want to use socks4/5 with httpx by using pysocks :

# pip install PySocks

import httpx
import socks
import socket

socks.set_default_proxy(socks.SOCKS5, "127.0.0.1", 9050)
socket.socket = socks.socksocket

URL = 'http://ifconfig.me/ip'

with httpx.Client() as client:
    resp = client.get(URL)
    print(resp.text)

Thank you, and I hope the implementation arrives soon.

@balki

balki commented Apr 11, 2021

I wrote a SOCKS library. It has no dependencies and supports sync, async, and sans-I/O usage. I had to write it because I wanted to use SOCKS over Unix domain sockets, which none of the other packages supported. (I didn't know about socksio when I wrote it, but the usage is much simpler.) Feel free to depend on it or just copy the relevant parts.

@tomchristie
Member

Alrighty - closed via encode/httpcore#478 and #2034 using Seth's fantastic socksio package.

We've only got SOCKS5 in right now. Not obvs to me how much value there would be in 4/4a, or in SOCKS5 with the IP resolved by the client. Probably makes sense to leave a decision on those pending user feedback.

@ghost

ghost commented Jan 20, 2022

Wow! This is awesome!

socks5h would be very useful for me for Tor .onion support.

Edit: I might be misunderstanding. Is the IP resolved by the socks5 server? So SOCKS5 with client-side resolution is not supported?

@tomchristie
Member

Is the IP resolved by the socks5 server? So SOCKS5 with client-side resolution is not supported?

Correct. That is the current setup.

@tomchristie
Member

That seems like something that's a valid enough use-case to open an issue for. You'd be very welcome to do that, and reference this comment.

I had a quick read up about this to educate myself, and found this blog post to be pretty helpful.

@ghost

ghost commented Jan 20, 2022

That's perfect for me! I had some code using requests that I wanted to port over to httpx, but was holding off for SOCKS support. With requests, I seem to recall that this behavior was enabled by doing socks5h:// and not socks5://. socks5:// assumes the client resolves the IP. Should we mimic this in httpx?

@tomchristie
Member

That would probably be a nice feature / change in behaviour to have, yup, but it's a little bit involved to implement.

@tomchristie
Member

tomchristie commented Jan 20, 2022

Sorry - slow me down I'm being a bit stupid.

Our current behaviour is that the proxy resolves the DNS. (socks5h) What we don't have is support for the client resolving the DNS. (Which would be a bit of a pain to add, but do-able. Tho it's not obvious to me what use-case we'd want to support that for.)
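
At the protocol level, the difference between the two modes comes down to the address type byte in the SOCKS5 CONNECT request (RFC 1928). The sketch below is not the httpx or socksio implementation; `socks5_connect_request` is a hypothetical helper that shows both encodings.

```python
import ipaddress
import struct


def socks5_connect_request(host, port):
    """Build a SOCKS5 CONNECT request (hypothetical helper, for illustration)."""
    header = bytes([0x05, 0x01, 0x00])  # VER=5, CMD=CONNECT, RSV=0
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # ATYP=0x03: the hostname is sent as-is and the proxy resolves it
        # (the "socks5h" behaviour described above). Names longer than
        # 255 bytes cannot be encoded in the one-byte length field.
        name = host.encode("idna")
        address = bytes([0x03, len(name)]) + name
    else:
        # ATYP=0x01 (IPv4) or 0x04 (IPv6): the client already holds an IP,
        # for example after resolving the name itself.
        atyp = 0x01 if ip.version == 4 else 0x04
        address = bytes([atyp]) + ip.packed
    return header + address + struct.pack("!H", port)
```

Passing a hostname such as an .onion name produces the proxy-resolved form, while client-side resolution would mean looking up the IP first and passing that in instead.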
