I am refiling this issue as a bug report, because I have now also tested it on Fedora with the latest sources.
To demonstrate the bug, you can use this h2o.conf:
user: www
access-log: /var/log/h2o/h2o-access.log
error-log: /var/log/h2o/h2o-error.log
listen: 8080
hosts:
  "<my-ip-address-here>:8080":
    paths:
      "/dir0":
        mruby.handler: |
          proc {|env|
            resp = H2O.next.call(env)
            resp
          }
        proxy.reverse.url: "https://www.google.com" # this URL is just for testing/demonstration
      "/dir1":
        file.dir: "/usr/local/www/data/testh2o/dir1"
      "/dir2":
        file.dir: "/usr/local/www/data/testh2o/dir2"
      "/":
        file.dir: "/usr/local/www/data/testh2o"
This works: I can access files under dir1, dir2 and dir3. And when I request http://<my-ip-address-here>:8080/dir0 I get the Google homepage (just for demo).
But adding even one more /path segment breaks H2O.next.call(env), as described here for HardenedBSD and FreeBSD.
Broken h2o.conf:
user: www
access-log: /var/log/h2o/h2o-access.log
error-log: /var/log/h2o/h2o-error.log
listen: 8080
hosts:
  "<my-ip-address-here>:8080":
    paths:
      "/dir0":
        mruby.handler: |
          proc {|env|
            resp = H2O.next.call(env)
            # In my production conf I have code here that looks for a certain header field.
            # If it finds it, this handler makes an http_request() to a certain URL on another server
            # in order to trigger emptying a cache.
            # This bug demo is independent of that code.
            resp
          }
        proxy.reverse.url: "https://www.google.com" # this URL is just for testing/demonstration
      "/dir1":
        file.dir: "/usr/local/www/data/testh2o/dir1"
      "/dir2":
        file.dir: "/usr/local/www/data/testh2o/dir2"
      "/dir3":
        file.dir: "/usr/local/www/data/testh2o/dir3"
      "/":
        file.dir: "/usr/local/www/data/testh2o"
I now get an Internal Server Error. And the error log on Fedora says:
[h2o_mruby] in request:/dir0:mruby raised: (eval):28: can't modify `SCRIPT_NAME` with `H2O.next`. Is `H2O.reprocess` what you want? (RuntimeError)
This is on a fresh Fedora 31, with H2O built from source at b9989220bea9bda30f69083b66166c4c657bdf84.
As the error message suggested using H2O.reprocess, I tried that option next. Same outcome: adding more path segments crashes h2o.
hosts:
  test.local:
    listen:
      port: 80
      host: 10.0.0.10
    paths:
      "/dir0":
        file.dir: "/usr/local/www/data/testh2o/dir0"
      "/dir1":
        file.dir: "/usr/local/www/data/testh2o/dir1"
      "/wp-admin":
        mruby.handler: |
          proc {|env|
            env['SCRIPT_NAME'] = '/proxy-wp-admin'
            resp = H2O.reprocess.call(env)
            resp
          }
      "/proxy-wp-admin":
        proxy.reverse.url: "https://www.google.com"
With this configuration, opening http://test.local/wp-admin in the browser gives me the Google homepage.
But adding one more path segment, e.g. an additional /dir2, gives me the following error:
Feb 4 17:12:11 web2 h2o[34490]: received fatal signal 11
Feb 4 17:12:11 web2 h2o[34490]: [34493] 0x4cf6a0 <???> at /usr/local/bin/h2o
Feb 4 17:12:11 web2 h2o[34490]: [34493] 0x8018c5946 <pthread_sigmask+0x536> at /lib/libthr.so.3
Feb 4 17:12:11 web2 h2o[34490]: [34493] 0x8018c4eb2 <pthread_getspecific+0xe12> at /lib/libthr.so.3
Is there really such a low limit on the number of allowed path segments? This must be a bug, possibly the same one as for H2O.next.
Since both H2O.next and H2O.reprocess seem broken, I had to explore another solution. I implemented a reverse proxy handler in mruby, following @kazuho's example.
On my test server this works, and I will likely deploy it in production soon. Maybe it helps someone else who runs into the same problems with the methods mentioned above:
proc {|env|
  # copy headers
  headers = {}
  env.each do |key, value|
    if /^HTTP_/.match(key)
      headers[$'] = value
    end
  end
  if env['CONTENT_TYPE']
    headers['CONTENT_TYPE'] = env['CONTENT_TYPE']
  end
  # issue the request
  input = env["rack.input"] ? env["rack.input"] : ""
  if env['QUERY_STRING'].to_s != ''
    uri = env['SCRIPT_NAME'] + env['PATH_INFO'] + '?' + env['QUERY_STRING']
  else
    uri = env['SCRIPT_NAME'] + env['PATH_INFO']
  end
  req = http_request(
    "http://<backend-ip>#{uri}",
    method: env["REQUEST_METHOD"],
    headers: headers,
    body: input,
  )
  # Extract status, headers and body, so that I can look into the headers
  status, headers, body = req.join
  # actual work done here, if a certain header is present
  [status, headers, body]
}
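The URI assembly in the handler above can be tried outside H2O with a plain hash standing in for env. This is only a sketch in plain Ruby: `backend_uri` is a hypothetical helper name, and the sample env values are made up for illustration.

```ruby
# Hypothetical helper: rebuild the request URI from Rack-style env keys,
# appending the query string only when one is present.
def backend_uri(env)
  uri = env['SCRIPT_NAME'].to_s + env['PATH_INFO'].to_s
  query = env['QUERY_STRING'].to_s
  query.empty? ? uri : "#{uri}?#{query}"
end

# Made-up sample values, not taken from a real request.
puts backend_uri('SCRIPT_NAME' => '/wp-admin', 'PATH_INFO' => '/index.php', 'QUERY_STRING' => 'page=1')
puts backend_uri('SCRIPT_NAME' => '/wp-admin', 'PATH_INFO' => '/index.php', 'QUERY_STRING' => '')
```

Calling `.to_s` on each key keeps the helper from raising when a key is missing from env.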
Not quite working yet:
H2O or mruby turns the original CONTENT_TYPE header into HTTP_CONTENT_TYPE, which breaks a WordPress backend...
Example:
"CONTENT_TYPE"=>"application/x-www-form-urlencoded; charset=UTF-8"
becomes
"HTTP_CONTENT_TYPE"=>"application/x-www-form-urlencoded; charset=UTF-8"
Is there a way to force it to use just CONTENT_TYPE?
And all this, just because I wanted to add another /path in my h2o.conf...
@utrenkner Sorry for being late. I hope https://github.com/h2o/h2o/pull/2254 will fix the issue. With that PR, H2O.next and H2O.reprocess work well, I think.
H2O or mruby turns the original CONTENT_TYPE header into HTTP_CONTENT_TYPE, which breaks a WordPress backend...
I tried the same mruby handler but couldn't reproduce it. That is, I spawned a minimal upstream server (echo -ne "HTTP/1.1 200 OK\r\n\r\n" | nc -l 9000) and looked at the headers sent from h2o. They included CONTENT_TYPE, not HTTP_CONTENT_TYPE.
BTW, if you take this http_request approach, you have to convert underscores in header hash keys to hyphens: that is, you have to convert HTTP_X_FOO to X-FOO, not X_FOO. I guess this is the cause of that issue: your server implementation doesn't recognize it as a content-type header, which has to be treated specially in most cases.
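The conversion described above could look roughly like this. It is a sketch in plain Ruby so it can be tried outside H2O: `rack_env_to_headers` is a hypothetical helper name, and the sample env is made up. Only the HTTP_ prefix stripping, the underscore-to-hyphen replacement, and the separate handling of CONTENT_TYPE come from this thread.

```ruby
# Hypothetical helper: build the outgoing header hash from a Rack-style env.
def rack_env_to_headers(env)
  headers = {}
  env.each do |key, value|
    if key.start_with?('HTTP_')
      # Strip the HTTP_ prefix and turn underscores into hyphens:
      # HTTP_X_FOO becomes X-FOO, not X_FOO.
      headers[key.sub(/\AHTTP_/, '').tr('_', '-')] = value
    elsif key == 'CONTENT_TYPE'
      # CONTENT_TYPE has no HTTP_ prefix in env, so copy it separately.
      headers['CONTENT-TYPE'] = value
    end
  end
  headers
end

# Made-up sample env for illustration.
env = {
  'REQUEST_METHOD' => 'POST',
  'HTTP_X_FOO'     => 'bar',
  'CONTENT_TYPE'   => 'application/x-www-form-urlencoded; charset=UTF-8',
}
rack_env_to_headers(env).each { |name, value| puts "#{name}: #{value}" }
```

The resulting hash would then take the place of the headers hash passed to http_request in the handler above.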
@i110 Many thanks! I will try out your patch tomorrow.
And thank you also for testing the above mruby handler. Indeed, I completely forgot about changing the underscore into a hyphen. With that change, it works as intended and CONTENT_TYPE stays CONTENT_TYPE!
@i110 Thank you so much! #2254 solves my problems with both H2O.next and H2O.reprocess. I added even more /path segments to h2o.conf, just to check that it really works. And, yes, it does!