As of Version 1.3, the Apache HTTP Server includes a port to
(non-ASCII) mainframe machines which use the EBCDIC character
set as their native codeset.
(Initially, that support covered only the Fujitsu-Siemens
family of mainframes running the
BS2000/OSD operating system, a mainframe OS which features
an SVR4-derived POSIX subsystem. Later, the two IBM mainframe
operating systems TPF and OS/390 were added.)
On an EBCDIC-based system, HTML files and other text files are usually stored in the native EBCDIC code set, while image files and other binary data are stored in the same encoding as on ASCII-based machines. When the Apache server accesses documents, it must therefore distinguish between text files (to be converted to or from ASCII, depending on the transfer direction) and binary files (to be delivered unconverted). This distinction can be made based on the assigned MIME type, or based on the file extension (i.e., files sharing a common file suffix).
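The text/binary distinction can be illustrated with a minimal Python sketch. It is not Apache's implementation: the function names are hypothetical, and Python's `cp037` codec is only an assumed stand-in for the host's EBCDIC code set, which differs between platforms.

```python
# Illustrative only: cp037 is an assumed stand-in for the host's EBCDIC
# code set; the function names are hypothetical, not Apache APIs.
EBCDIC_CODEC = "cp037"


def to_client(stored: bytes, is_text: bool) -> bytes:
    """Return the bytes to send for a document stored on the EBCDIC host."""
    if not is_text:
        return stored  # binary data is delivered unconverted
    # Text is stored in EBCDIC and must reach the client as ASCII.
    return stored.decode(EBCDIC_CODEC).encode("latin-1")


def from_client(received: bytes, is_text: bool) -> bytes:
    """Return the bytes to store for a request body received from a client."""
    if not is_text:
        return received  # binary uploads are stored unconverted
    # Text arrives in ASCII and is stored in EBCDIC.
    return received.decode("latin-1").encode(EBCDIC_CODEC)
```

Under these assumptions, `to_client(from_client(body, True), True)` returns an ASCII `body` unchanged, while binary data passes through both functions untouched.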
By default, the configuration is symmetric for input and output (i.e., when a PUT request is executed for a document which was returned by a previous GET request, then the resulting uploaded copy should be identical to the original file). However, the conversion directives allow for specifying different conversions for input and output.
The directives EBCDICConvert and EBCDICConvertByType are used to assign the conversion setting (On or Off) based on file extensions or MIME types. Each configuration setting can be defined for input only (e.g., PUT method), output only (e.g., GET method), or both input and output. By default, the conversion setting is applied for input and output.
Note that after modifying the conversion settings for a
group of files, it is not sufficient to restart the server,
because a cached copy of a document (in a browser or proxy
cache) is revalidated only by date, not by content. Since the
modification time of the document did not change, browsers
will assume they can reuse the cached copy.
To recover from this situation, you must either clear all
cached copies (browser and proxy cache!), or update the
modification time of the documents (using the
touch command on the server).
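Where many documents are affected, updating the modification times can be scripted. The following Python sketch is one possible approach; the helper name, the document root and the suffix set are hypothetical and must be adapted to the actual installation.

```python
from pathlib import Path


def touch_documents(docroot, suffixes):
    """Set the modification time of matching files under docroot to now.

    Hypothetical helper: `suffixes` is a set such as {".html", ".txt"}
    naming the file extensions whose conversion setting was changed.
    """
    touched = []
    for path in Path(docroot).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            path.touch()  # updates the modification time, like touch(1)
            touched.append(path)
    return touched
```

Files with other suffixes are left alone, so their cached copies remain valid.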
Note also that server-parsed documents (CGI scripts, .shtml files, and other interpreted files such as PHP scripts) are not subject to any input conversion and must therefore be stored in EBCDIC form on the server side.
In the absence of any EBCDICConvertByType directive, and if no matching EBCDICConvert was found, Apache falls back to an internal heuristic which assumes that all documents with MIME types starting with "text/", "message/" or "multipart/", as well as the MIME type "application/x-www-form-urlencoded", are text documents stored in EBCDIC, whereas all other documents are binary files.
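The fallback rule can be written down as a short Python sketch. The function name is hypothetical, and the handling of letter case and of parameters such as `; charset=...` is an assumption rather than documented Apache behaviour.

```python
TEXT_PREFIXES = ("text/", "message/", "multipart/")
FORM_TYPE = "application/x-www-form-urlencoded"


def is_text_by_default(content_type: str) -> bool:
    """Apply the fallback heuristic to a Content-Type value."""
    # Assumption: parameters after ";" are ignored and the comparison
    # is case-insensitive.
    mime_type = content_type.split(";", 1)[0].strip().lower()
    return mime_type.startswith(TEXT_PREFIXES) or mime_type == FORM_TYPE
```

With this sketch, `text/html` and `multipart/form-data` count as EBCDIC text, while `image/gif` and `application/octet-stream` count as binary.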
To provide backward compatibility with older versions of Apache, the EBCDICKludge directive offers a less powerful mechanism to control the conversion of documents to and from EBCDIC.
Note:
The EBCDICKludge directive is deprecated, since its functionality is superseded by the more powerful EBCDICConvert and EBCDICConvertByType directives.
The directives are applied in the following order:
Since all Apache input and output is based upon the BUFF data type and its methods, the easiest solution was to add the actual conversion to the BUFF handling routines. The conversion must be settable at any time, so BUFF flags were added which define whether a BUFF object currently has conversion enabled or not. Two such flags exist: one for data read from the client (ASCII to EBCDIC conversion) and one for data returned to the client (EBCDIC to ASCII conversion).
During sending of the header, Apache determines (based on the returned MIME type for the request) whether conversion should be used or the document returned unconverted. It uses this decision to initialize the BUFF flag when the response output begins. Modules should therefore determine the MIME type for the current request before initiating the response by calling ap_send_http_headers().
The BUFF flag is modified at several points in the HTTP protocol:
- set (In and Out) before a request is received (because the request and the request header lines are always in ASCII format)
- set/unset (for Input data) when the request body is received, depending on the content type of the request body (because the request body may contain ASCII text or a binary file)
- set (for returned Output) before a response header is sent (because the response header lines are always in ASCII format)
- set/unset (for returned Output) when the response body is sent, depending on the content type of the response body (because the response body may contain text or a binary file)
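The flag transitions listed above can be modelled in a few lines of Python. This is only a sketch of the described behaviour: the class and method names are hypothetical and do not correspond to the BUFF API in the C sources.

```python
from dataclasses import dataclass


@dataclass
class ConversionFlags:
    """Hypothetical model of the two conversion flags of a BUFF object."""

    convert_in: bool = False   # ASCII to EBCDIC, data read from the client
    convert_out: bool = False  # EBCDIC to ASCII, data sent to the client

    def before_request(self) -> None:
        # The request line and request headers are always ASCII.
        self.convert_in = True
        self.convert_out = True

    def on_request_body(self, body_is_text: bool) -> None:
        # The request body may be ASCII text or a binary file.
        self.convert_in = body_is_text

    def before_response_header(self) -> None:
        # Response header lines are always ASCII.
        self.convert_out = True

    def on_response_body(self, body_is_text: bool) -> None:
        # The response body may be text or a binary file.
        self.convert_out = body_is_text
```

For a GET request that returns an image, the model calls `before_request()`, then `before_response_header()`, then `on_response_body(False)`, which leaves output conversion switched off while the image data is sent.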
#ifdef CHARSET_EBCDIC
#ifdef _OSD_POSIX | TPF | OS390
EBCDICConvertByType {On|Off}[={In|Out|InOut}] mimetype [...]
EBCDICConvert {On|Off}[={In|Out|InOut}] fileext [...]
where the mimetype argument may contain wildcards.
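To make the directive syntax concrete, the following Python sketch parses directive lines of the form shown above and looks up a MIME type. It is not Apache's parser: the function names and the sample rules are hypothetical, and both the "first matching rule wins" order and the use of shell-style wildcard matching are assumptions.

```python
from fnmatch import fnmatchcase

DIRECTIONS = {"In": {"In"}, "Out": {"Out"}, "InOut": {"In", "Out"}}


def parse_directive(line):
    """Split one directive line into (name, convert, directions, patterns)."""
    name, setting, *patterns = line.split()
    state, _, direction = setting.partition("=")
    direction = direction or "InOut"  # no "=direction": input and output
    if state not in ("On", "Off") or direction not in DIRECTIONS or not patterns:
        raise ValueError(f"malformed directive: {line!r}")
    return name, state == "On", DIRECTIONS[direction], patterns


def convert_by_type(rules, mime_type, direction):
    """Return True/False from the first matching rule, or None if none match."""
    for name, convert, directions, patterns in rules:
        if name != "EBCDICConvertByType" or direction not in directions:
            continue
        if any(fnmatchcase(mime_type, pattern) for pattern in patterns):
            return convert
    return None


# Example rule set (illustrative values only).
RULES = [parse_directive(line) for line in (
    "EBCDICConvertByType On=InOut text/*",
    "EBCDICConvertByType Off=Out */*",
)]
```

A result of `None` from `convert_by_type` means that no rule applied for that direction, which is the case in which the internal heuristic would be consulted.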
When exchanging binary files between the mainframe host and a Unix machine or Windows PC, be sure to use the ftp "binary" (TYPE I) command, or use the rcp -b command from the mainframe host (the -b switch is not supported by Unix rcp implementations).
The default assumption of the server is that Text Files (i.e., all files whose Content-Type: starts with text/) are stored in the native character set of the host, EBCDIC.
SSI documents must currently be stored in EBCDIC only. No provision is made to convert them from ASCII before processing. The same holds for documents handled by other interpreters, such as mod_perl or mod_php.