Requests: Too many open files

Created on 5 Nov 2011  ·  81 Comments  ·  Source: psf/requests

I am building a basic load generator and started running into file descriptor limits. I haven't seen any documentation on how to release resources, so either I am doing it wrong and the docs need updating, or requests is leaking file descriptors somewhere. (Without support for keep-alive, I am slightly confused as to why any files would be left open at all.)

Bug Contributor Friendly

Were you using requests.async?

Nope, all requests were reasonably plain requests.get / requests.post calls. I am still seeing a few in there:

$ lsof | grep localhost | wc -l
110

All but 4 or 5 of them are of the format

Python    82117 daleharvey  123u    IPv4 0xffffff800da304e0       0t0      TCP localhost:61488->localhost:http (CLOSE_WAIT)

I'm a bit baffled by this, to be honest.

Hah, I'll take another shot at reproducing it reliably. If I can't, I'll close.

I've seen this happening to me, but only when I'm using the async module w/ 200+ simultaneous connections.

Hi,
I got exactly the same problem using requests and monkey patching with gevent: some connections stay in CLOSE_WAIT.
So maybe it's a problem with gevent.

It may be a problem with ulimit -n. Try a higher value.

"Too many open files" is the result of the bug caused by sockets staying in CLOSE_WAIT .
So raising ulimit won't fix it; that is only a workaround.

@tamiel how do we fix this?

I will do more tests ASAP and try to fix it.

I've looked into it, and it seems to be a problem with all libraries using httplib.HTTPSConnection.

I posted an example here:

https://gist.github.com/1512329

I just encountered a very similar error using an async pool with only HTTP connections - I'm still investigating but passing a pool size to async.map makes the error reproduce quickly.

Any fixes for this? This makes Requests unusable with gevent.

It's all about the CLOSE_WAITs. Just have to close them. I'm not sure why they're still open though.
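
For readers unfamiliar with the state: a socket sits in CLOSE_WAIT when the remote side has closed its end but the local process has not yet called close(). The following is a minimal sketch of that situation using only the standard library; the local listener merely stands in for an HTTP server and is not how requests itself manages sockets.

```python
import socket

# A local listener stands in for the HTTP server (illustrative setup only).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

# The server finishes and closes its end, which sends a FIN to the client.
server_side.close()

# recv() returning b"" means the peer has closed. From here on the kernel
# keeps the client socket in CLOSE_WAIT until this process closes it.
peer_closed = client.recv(1024) == b""
fd_while_waiting = client.fileno()  # still a live file descriptor

# Only an explicit close() releases the descriptor.
client.close()
fd_after_close = client.fileno()  # -1 once the socket is closed

listener.close()
```

While the client is held open after the peer has gone away, lsof would list it much like the CLOSE_WAIT lines quoted earlier in the thread.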

Is it a urllib3 issue? Having to close these ourselves isn't a great idea, I feel.

It's more of a general issue. We can keep the conversation here.

OK, just to give you some perspective: we are trying to move from httplib2 to requests, and we don't see this issue with httplib2. So it's not a general issue for sure.

By general I mean that it's a very serious issue that affects everyone involved.

So how do we solve this? We really want to use requests + slumber moving forward.

I'd love to know the answer to that.

The leak appears to be due to the internal redirect handling, which causes new requests to be generated before the pending responses have been consumed. In testing [email protected] has an under-satisfying but effective fix simply by forcing each response to be consumed before continuing.

This required changes in two places which makes me want to refactor the interface slightly but I'm out of time to continue currently.

399 has a fix which works well in my async load generator (https://github.com/acdha/webtoolbox/blob/master/bin/http_bench.py) with thousands of requests and a low fd ulimit

I have run into the same issue when using async. I kludged a workaround by chunking requests, deleting responses, and calling gc.collect.

I believe I was running into this today connecting to a licensed server that only allows 5 connections.

Using async I could only GET 4 things before it paused for 60 seconds.

Using the normal GET with consumption I could fetch about 150 things serially in under 40 seconds.

Haven't made my kludge yet since I saw this issue.

Just got this error while using ipython and got this message. This is just making each request one at a time, but I think I got something similar when using async.

ERROR: Internal Python error in the inspect module.
Below is the traceback from this internal error.
Traceback (most recent call last):
    File "/Library/Python/2.7/site-packages/IPython/core/ultratb.py", line 756, in structured_traceback
    File "/Library/Python/2.7/site-packages/IPython/core/ultratb.py", line 242, in _fixed_getinnerframes
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/inspect.py", line 1035, in getinnerframes
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/inspect.py", line 995, in getframeinfo
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/inspect.py", line 456, in getsourcefile
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/inspect.py", line 485, in getmodule
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/inspect.py", line 469, in getabsfile
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/posixpath.py", line 347, in abspath
OSError: [Errno 24] Too many open files

Unfortunately, your original traceback can not be constructed.

Oddly, I think when using just the normal Python interpreter I get a "Max Retries Error" but I think that is another issue with me doing requests on all the same domain, but not sure.

I ran into this on the first project I had where allow_redirects was True; it appears to be caused by the redirection chain leaking response objects which aren't released even with prefetch=True. This fixed it in my initial testing:

        for i in resp.history:
            i.raw.release_conn()
        resp.raw.release_conn()

Hmmm..

@acdha setting:

requests.defaults.defaults['allow_redirects'] = False

before I make any requests still results in the same error, but I think this isn't an option for my implementation as all the requests I'm making will require a redirect =/

@dalanmiller How are you processing your responses? I was previously using async.map with a response hook and it _appears_ to be more stable using a simple loop over async.imap:

for resp in requests.async.imap(reqs, size=8):
    try:
        print resp.status_code, resp.url
    finally:
        for i in resp.history:
            i.raw.release_conn()
        resp.raw.release_conn()

@acdha

I was just using a for loop through a URL list and doing a requests.get on each with my settings and such.

for u in urls:
    response_list.append(requests.get(u))

I tried using your paste and it works for about 50 requests in my 900-length list, until I start to get "max retries exceeded with url" errors for the rest. This is a pretty standard error for hitting the same domain repeatedly though, no?

Hey, I was crawling a huge list of URLs (35k) and got this same error on _some_ of the requests.

I am getting urls in chunks of 10, like this:

responses = requests.async.map([requests.async.get(u, params=self.params()) for u in chunk]) # chunk is a list of 10

Somewhere in the 20k range I started getting error 24, then it was OK through 30k, and then it happened again.

Any more info you would be interested in to narrow it down?

requests.async is gone. You might want to consider moving to grequests.

All right, thanks. Would be good to mention this in the docs.

Kind of a noob when it comes to Pull Requests and writing documentation but I took a stab at it and sent it. Please comment or criticize :)

https://github.com/kennethreitz/requests/pull/665

OK, this happens even without using async, with just requests.get, after 6K requests.

I suspected that.

For me the 'Too many open files' error occurred after downloading exactly 1k files. My solution was to disable the keep-alive property and to always send requests in chunks [1] (@acdha, thank you for the hint). lsof -p PID | wc -l shows a non-increasing number of connections during the execution.

rsess = requests.session()
rsess.config['keep-alive'] = False

rs = [grequests.get(l, session=rsess) for l in links]

for s in chunks(rs,100):
    responses = grequests.map(s, size=concurrency)
    for r in responses:
        try:
            print(r.status_code, r.url)
        finally:
            r.raw.release_conn()

[1] chunking: http://stackoverflow.com/a/312464
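
The snippet above calls a chunks helper without defining it. A minimal sketch of such a helper, in the spirit of the linked answer, might look like the following; the links list here is a placeholder for illustration, not data from the thread.

```python
def chunks(items, size):
    """Yield successive lists of at most `size` items from `items`."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder URLs (hypothetical); any list of work items behaves the same.
links = ["http://example.invalid/%d" % i for i in range(250)]
batch_sizes = [len(batch) for batch in chunks(links, 100)]
```

Processing one batch fully before starting the next keeps the number of sockets open at any moment bounded by the batch size.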

Closing while deferring to urllib3 fix.

@kennethreitz What's the urllib3's issue number?

Looks like this is the issue http://bugs.python.org/issue16298

@silvexis could very well be related to the urllib3 bug, now I'm just wishing someone had answered @piotr-dobrogost :P

Is anyone else still encountering this issue?

I haven't heard any reports of it. Are you?

It's a problem with the box config, not with the framework. Look at the kernel configuration of your OS. In BSD it is called kern.maxfiles. There is a thread about ulimit on Linux systems: http://stackoverflow.com/questions/34588/how-do-i-change-the-number-of-open-files-limit-in-linux
Hope it helps. I don't know how to change this parameter on Windows.
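
On Unix-like systems the per-process limit can also be inspected from inside Python with the standard resource module (not available on Windows). This is a rough sketch; raise_soft_limit is a hypothetical helper name, and as noted earlier in the thread, raising the limit only hides a leak rather than fixing it.

```python
import resource

# The soft limit is what "Too many open files" is enforced against;
# the hard limit is the ceiling an unprivileged process may raise it to.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "hard limit:", hard)


def raise_soft_limit(target):
    """Raise the soft fd limit towards `target`, never past the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

Calling raise_soft_limit with a value at or below the current soft limit leaves the limit unchanged.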

With the caveat that we're still running an older version of requests, we have the following, horrible, code in place to handle this:

    if self._current_response is not None:
        # Requests doesn't have a clean API to actually close the
        # socket properly. Dig through multiple levels of private APIs
        # to close the socket ourselves. Icky.
        self._current_response.raw.release_conn()
        if self._current_response.raw._fp.fp is not None:
            sock = self._current_response.raw._fp.fp._sock
            try:
                logger.debug('Forcibly closing socket')
                sock.shutdown(socket.SHUT_RDWR)
                sock.close()
            except socket.error:
                pass

(I think self._current_response is the requests' response object)

Hmm, where is the chain of closing broken? We have a Response.close() method that calls release_conn(), so what needs to happen in release_conn() for this to work?

@Lukasa this was definitely fixed in urllib3 as I was part of the discussion. With an inclination towards being conservative in my estimate, I would say it's there since requests 1.2.x if not 1.1.x.

Yeah, I did think this was fixed. Unless we see something on 1.2.3, I'm going to continue to assume this is fixed.

I'm seeing a CLOSE_WAIT leak with 2.0.2. Do you have unit tests to ensure there is no regression on this?

No, we don't. AFAIK urllib3 doesn't either. Can you reproduce your leak easily?

We have been using requests in our internal app since Monday, and hit the 1024 maxfiles limit today.

2 hours after reboot, we have 40 CLOSE_WAIT as told by lsof.

So I think we'll be able to reproduce it in a dev environment, yes. I'll keep you posted.

@tardyp also, how did you install requests? I think all of the OS package maintainers strip out urllib3. If they don't keep that up-to-date and you're using an old version, that could be the cause instead. If you're using pip, then feel free to open a new issue to track this with instead of adding discussion onto this one.

I installed with pip, but I use Python 2.6. I've seen a fix for this bug in Python 2.7. Do you monkeypatch for older versions?

Pierre


@tardyp please open a new issue with as much detail as possible including whether the requests you're making have redirects and whether you're using gevent. Also, any details about the operating system and an example of how to reproduce it would be fantastic.

FYI https://github.com/shazow/urllib3/issues/291 has been reverted due to bugs.

Should we re-open this?
I am having the same issue!

@polvoazul There's no way this is the same issue, which was originally reported in 2011, so I don't think reopening is correct. However, if you're running the current release of requests (2.4.3) and can reproduce the problem, opening a new issue would be correct.

@Lukasa I need your help。 I use eventlet + requests, and it always creates a great many sockets that show up as "can't identify protocol"。 My requests version is 2.4.3. Is eventlet + requests causing this problem?

I'm sorry @mygoda, but it's impossible to know. If you aren't constraining the number of requests that can be outstanding at any one time then it's certainly possible, but that's an architectural problem outside the remit of requests.
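
As a rough illustration of constraining the number of outstanding requests, the sketch below caps concurrency with a standard-library thread pool. It uses threads rather than eventlet, fetch is a hypothetical stand-in for a real HTTP call, and the cap of 4 is an arbitrary assumption.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 4  # assumed cap; keep it well below the process fd limit

stats = {"in_flight": 0, "peak": 0}
lock = threading.Lock()


def fetch(url):
    """Stand-in for a real HTTP call; records how many calls run at once."""
    with lock:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
    try:
        time.sleep(0.01)  # pretend to wait on the network
        return url
    finally:
        with lock:
            stats["in_flight"] -= 1


# Placeholder URLs (hypothetical).
urls = ["http://example.invalid/%d" % i for i in range(40)]

# The pool never runs more than MAX_IN_FLIGHT calls at the same time, so the
# number of sockets open at once stays bounded regardless of len(urls).
with ThreadPoolExecutor(max_workers=MAX_IN_FLIGHT) as pool:
    results = list(pool.map(fetch, urls))
```

With a green-thread library the same idea would typically be expressed with a fixed-size pool or a semaphore around each request.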

@Lukasa thank you。 I think my issue is similar to this one。 My project is pyvmomi, and the connection is a long-lived connection. I am always confused about why it can hold so many "can't identify protocol" sockets.

I'm having the same problem now: running 120 threads causes 100000+ open files. Is there any solution right now?

@mygoda you use awesome periods。

@1a1a11a _What_ files do you have open? That'd be a useful first step to understanding this problem.

@1a1a11a what version of requests are you using? What version of python? What operating system? Can we get any information?

I am using requests 2.9.1, Python 3.4, Ubuntu 14.04. Basically I am writing a crawler using 30 threads with proxies to crawl some websites. Currently I have adjusted the file limit per process to 655350; otherwise it reports an error.

I am still receiving the error "Failed to establish a new connection: [Errno 24] Too many open files" from requests.packages.urllib3.connection.VerifiedHTTPSConnection. I'm using Python 3.4, requests 2.11.1 and requests-futures 0.9.7. I appreciate requests-futures is a separate library, but it seems like the error is coming from requests. I'm attempting to make 180k asynchronous requests over SSL. I've divided those requests into segments of 1000, so I only move on to the next 1000 once all the future objects have been resolved. I'm running Ubuntu 16.04.2 and my default open files limit is 1024. It would be good to understand the underlying reason for this error. Does the requests library create an open file for each individual request? And if so, why? Is this an SSL certificate file? And does the requests library automatically close those open files when the future object is resolved?

Requests opens many files. Some of those files are opened for certificates, but they are opened by OpenSSL and not by Requests, so those aren't maintained. Additionally, Requests will also open, if needed, the .netrc file, the hosts file, and many others.

You will be best served by using a tool like strace to work out which files are opened. There is a strict list of system calls that lead to file descriptors being allocated, so you should reasonably swiftly be able to enumerate them. That will also let you know whether there is a problem or not. But, yes, I'd expect that if you're actively making 1000 connections over HTTPS then at peak load we could easily use over 1000 FDs.
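
Alongside strace, a process can list its own open descriptors. The sketch below assumes Linux with procfs mounted at /proc; open_fds is a hypothetical helper name, and it only shows what is open at one instant, not which call opened it.

```python
import os


def open_fds():
    """Map each open descriptor of this process to what it points at."""
    fd_dir = "/proc/self/fd"  # assumption: Linux with procfs mounted
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor used to list the directory is already gone.
            continue
    return result


before = open_fds()
with open(os.devnull) as handle:
    held = handle.fileno()
    during = open_fds()

# Descriptors that appeared while the file was held open.
new_fds = set(during) - set(before)
```

Sockets appear in this listing as entries of the form socket:[inode], so comparing snapshots taken before and after a batch of requests gives a quick hint as to whether descriptors are accumulating.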

I struggled with this issue as well and found that opensnoop on OS X worked great for seeing what was happening, in case anyone runs into the same issues.

I'm also frequently seeing this error when repeatedly calling requests.post(url, data=data) to an HTTP (not HTTPS) server. Running on Ubuntu 16.04.3, Python 3.5.2, requests 2.9.1

What is data?

A few hundred KB of text.

Not a file object?

No, I form a large query in memory.

Are you running this code in multiple threads?

No, single thread, posting to localhost

It seems almost impossible for us to be leaking that many FDs then: we should be repeatedly using the same TCP connection or aggressively closing it. Want to check what your server is up to?

I am having this problem. Python 2.7, requests 2.18.4, urllib3 1.22.
Running multi-threaded code (not multi-processed). Connecting to at most 6 URLs at one time, manually creating and closing a new session for each one.

I'm having the same problem on Python 3.5, requests==2.18.4

@mcobzarenco are you sure you are (implicitly) closing the underlying connection of the response? Simply returning the response will not close the connection. When reading response.content the data is actually read and after that the socket will not stay in CLOSE_WAIT.
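
The point about reading and closing responses can be sketched as follows, assuming a reasonably recent requests release. The tiny local server exists only so the example does not touch the network; Handler and the loop count are illustrative choices, not anything prescribed by requests.

```python
import threading
from contextlib import closing
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class Handler(BaseHTTPRequestHandler):
    """Tiny local server so the sketch stays off the real network."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

bodies = []
with requests.Session() as session:  # pooled sockets are closed on exit
    session.trust_env = False  # ignore proxy settings for this local demo
    for _ in range(3):
        # stream=True defers the body, so the connection stays checked out
        # until the body is read or the response is closed.
        with closing(session.get(url, stream=True)) as resp:
            bodies.append(resp.content)  # reading the body frees the connection

server.shutdown()
server.server_close()
```

Without stream=True the body is read eagerly, so the explicit close mostly matters for streamed responses or for responses that are returned to a caller without being read.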
