Re: [Zope] keeping Java Servlets session ids based on url rewriting
Albert, I've put this in the collector as a possible bug... hopefully it will get fixed in the next release if it proves not to be the proper behavior. In the meantime, you may want to try messing around with that regex to get the appropriate behavior for your environment.

aboulang@ldeo.columbia.edu wrote:
I've done a little poking around in ZPublisher's HTTPRequest.py and BaseRequest.py and I don't think that's where the ';*' gets stripped. I can't find *where* it gets stripped. It must be possible to make Zope de-ignore things split on a ";", but right now I can't find out where to do so.
Um, from looking at the code I think it may be ZServer, not ZPublisher, doing it. I think there is code which sets up the CGI env vars, and ZPublisher picks them up and works with them, so it is the code that sets those CGI vars that is dropping it. Isn't it true that if you use Apache, they are set by Apache, and if you use ZServer without front-ending it with Apache, something in ZServer has to be setting them?
I think this is where the stripping occurs:
From default_handler in the medusa directory...
    # split a uri
    # <path>;<params>?<query>#<fragment>
    path_regex = regex.compile (
    #      path      params    query   fragment
            '\\([^;?#]*\\)\\(;[^?#]*\\)?\\(\\?[^#]*\)?\(#.*\)?'
            )

    def split_path (path):
        if path_regex.match (path) != len(path):
            raise ValueError, "bad path"
        else:
            return map (lambda i,r=path_regex: r.group(i), range(1,5))
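To see what that pattern does to a servlet-style URL, here is a minimal sketch of the same split in present-day Python. It is not the medusa code: it swaps the old `regex` module for `re`, and the sample URL and the `PATH_RE` name are made up for illustration.

```python
import re

# Python 3 rendering of the pattern above: <path>;<params>?<query>#<fragment>
# (assumption: `re` syntax replaces the old `regex` module syntax)
PATH_RE = re.compile(r'([^;?#]*)(;[^?#]*)?(\?[^#]*)?(#.*)?')

def split_path(uri):
    """Return [path, params, query, fragment]; missing parts are None."""
    m = PATH_RE.fullmatch(uri)
    if m is None:
        raise ValueError("bad path")
    return list(m.groups())

# Hypothetical URL of the kind a servlet container produces with URL rewriting.
parts = split_path("/shop/cart;jsessionid=ABC123?item=42")
print(parts)
```

The session id ends up in the second element (`params`), separate from the path, so whatever the caller does with that element decides whether the id survives.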
Which is called by HTTPServer.py:
    def get_environment(self, request,
                        # These are strictly performance hackery...
                        split=string.split,
                        strip=string.strip,
                        join =string.join,
                        upper=string.upper,
                        lower=string.lower,
                        h2ehas=header2env.has_key,
                        h2eget=header2env.get,
                        workdir=os.getcwd(),
                        ospath=os.path,
                        ):
        [path, params, query, fragment] = split_path(request.uri)
        while path and path[0] == '/':
            path = path[1:]
        if '%' in path:
            path = unquote(path)
        if query:
            # ZPublisher doesn't want the leading '?'
            query = query[1:]

        server=request.channel.server
        env = {}
        env['REQUEST_METHOD']=upper(request.command)
        env['SERVER_PORT']=str(server.port)
        env['SERVER_NAME']=server.server_name
        env['SERVER_SOFTWARE']=server.SERVER_IDENT
        env['SERVER_PROTOCOL']=request.version
        env['channel.creation_time']=request.channel.creation_time
        if self.uri_base=='/':
            env['SCRIPT_NAME']=''
            env['PATH_INFO']='/' + path
        else:
            env['SCRIPT_NAME'] = self.uri_base
            try: path_info=split(path,self.uri_base[1:],1)[1]
            except: path_info=''
            env['PATH_INFO']=path_info
        env['PATH_TRANSLATED']=ospath.normpath(ospath.join(
            workdir, env['PATH_INFO']))
        if query:
            env['QUERY_STRING'] = query
        env['GATEWAY_INTERFACE']='CGI/1.1'
        env['REMOTE_ADDR']=request.channel.addr[0]

        # If we're using a resolving logger, try to get the
        # remote host from the resolver's cache.
        if hasattr(server.logger, 'resolver'):
            dns_cache=server.logger.resolver.cache
            if dns_cache.has_key(env['REMOTE_ADDR']):
                remote_host=dns_cache[env['REMOTE_ADDR']][2]
                if remote_host is not None:
                    env['REMOTE_HOST']=remote_host

        env_has=env.has_key
        for header in request.header:
            key,value=split(header,":",1)
            key=lower(key)
            value=strip(value)
            if h2ehas(key) and value:
                env[h2eget(key)]=value
            else:
                key='HTTP_%s' % upper(join(split(key, "-"), "_"))
                if value and not env_has(key):
                    env[key]=value
        env.update(self.env_override)
        return env
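Note that `params` is unpacked from `split_path` above and then never used, which would explain why the ';...' part never reaches PATH_INFO. The following is only a rough sketch of that effect, not ZServer code: `build_env` and its `keep_params` switch are hypothetical, it covers just the `uri_base == '/'` branch, and re-attaching the params is one possible workaround I am assuming, not a confirmed fix.

```python
import re
from urllib.parse import unquote

# Same split as the medusa pattern, written for Python 3's `re`.
PATH_RE = re.compile(r'([^;?#]*)(;[^?#]*)?(\?[^#]*)?(#.*)?')

def build_env(uri, keep_params=False):
    """Hypothetical, trimmed-down stand-in for get_environment."""
    m = PATH_RE.fullmatch(uri)
    if m is None:
        raise ValueError("bad path")
    path, params, query, fragment = m.groups()
    path = unquote(path.lstrip('/'))
    env = {'SCRIPT_NAME': '', 'PATH_INFO': '/' + path}
    if keep_params and params:
        # Assumed workaround: put the ';...' segment back on PATH_INFO.
        env['PATH_INFO'] += params
    if query:
        env['QUERY_STRING'] = query[1:]  # drop the leading '?'
    return env

uri = "/shop/cart;jsessionid=ABC123?item=42"
print(build_env(uri)['PATH_INFO'])                    # params dropped
print(build_env(uri, keep_params=True)['PATH_INFO'])  # params kept
```

Whether the params belong in PATH_INFO or in a separate environment key is a design question for whoever picks this up in the collector; the sketch only shows where the information is lost.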
Also from rfc1738
http://rfc.fh-koeln.de/rfc/html/rfc1738.html
"Reserved:
Many URL schemes reserve certain characters for a special meaning: their appearance in the scheme-specific part of the URL has a designated semantics. If the character corresponding to an octet is reserved in a scheme, the octet must be encoded. The characters ";", "/", "?", ":", "@", "=" and "&" are the characters which may be reserved for special meaning within a scheme. No other characters may be reserved within a scheme.
Usually a URL has the same interpretation when an octet is represented by a character and when it encoded. However, this is not true for reserved characters: encoding a character reserved for a particular scheme may change the semantics of a URL.
Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.
On the other hand, characters that are not required to be encoded (including alphanumerics) may be encoded within the scheme-specific part of a URL, as long as they are not being used for a reserved purpose. "
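As a small illustration of the encoding rule quoted above, a ";" that is meant as plain data can be percent-encoded so the splitter does not treat it as the start of the params. This is a sketch using Python 3's `urllib.parse` and a made-up path; the `PATH_RE` name is mine, not medusa's.

```python
import re
from urllib.parse import quote, unquote

# Same split as the medusa pattern, written for Python 3's `re`.
PATH_RE = re.compile(r'([^;?#]*)(;[^?#]*)?(\?[^#]*)?(#.*)?')

literal = "/docs/a;b"               # hypothetical path where ';' is data
encoded = quote(literal, safe="/")  # ';' becomes %3B, '/' is left alone

path, params, query, fragment = PATH_RE.fullmatch(encoded).groups()
decoded_path = unquote(path)        # decoding happens after the split

print(encoded)
print(params)
print(decoded_path)
```

Because the path is unquoted only after it has been split, the encoded ";" comes through intact, while an unencoded one is read as the params delimiter.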
Hope this helps, Albert
-- Chris McDonough Digital Creations, Publishers of Zope http://www.zope.org