Enabling Multilanguage Support With UTF-8 - the hack
Character encoding comes into play in several places:
1- The encoding used when the JSP compiler reads files from disk
2- The encoding used when the web server writes the response to the socket
3- The encoding specified in the response header sent to the browser
4- The encoding used when the browser sends the request to the server:
  4.1- Query string data is encoded using percent signs, but after decoding it you still need to specify a character encoding to convert the bytes to a Java string
  4.2- Multipart form data arrives as parts separated by a boundary string (browsers normally send the part bodies as raw bytes rather than base64). As with the query string, a character encoding is then needed to convert the text fields to Java strings
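To make the query string case concrete, here is a minimal Java sketch. It only uses the standard java.net.URLDecoder; the class and method names are made up for illustration and are not part of Magnolia or Tomcat. It shows that the same percent-encoded bytes produce different strings depending on the character encoding used after percent-decoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; not part of Magnolia or Tomcat.
final class QueryDecodingDemo {

    // Percent-decodes a raw query value and interprets the resulting bytes
    // in the given character encoding.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is the two bytes C3 A9 in UTF-8,
        // so a browser submitting UTF-8 sends it as %C3%A9.
        String raw = "caf%C3%A9";

        // Decoded as UTF-8, the value is "caf" followed by U+00E9.
        System.out.println(decode(raw, "UTF-8"));

        // Decoded as ISO-8859-1, each byte becomes its own character:
        // "caf" followed by U+00C3 and U+00A9 (the familiar garbled text).
        System.out.println(decode(raw, "ISO-8859-1"));
    }
}
```

What the console shows depends on the terminal's own encoding, which is one more place where a mismatch can creep in.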
Out of the box, Magnolia uses ISO-8859-1 for most of these, and I don't know of any civilised way to switch to UTF-8. My solution is as follows:
- Replace the org.apache.tomcat.util.http.Parameters class with the one attached. It sits inside $MAGNOLIA_HOME/author/server/tomcat-util.jar
Some browsers, including my beloved Firefox 1.0, don't specify a character encoding in the request header, so Tomcat falls back to ISO-8859-1 when parsing the parameters. In that class I changed the default to UTF-8. This patch should fix point 4.1
- Replace the com.oreilly.servlet.multipart.MultipartParser class with the one attached. It sits inside $MAGNOLIA_HOME/author/webapps/magnolia/WEB-INF/lib/cos.jar
This should fix point 4.2
- Add the attribute contentType="text/html;charset=UTF-8" to every @page directive in your templates and in all JSP files in the $MAGNOLIA_HOME/author/webapps/magnolia/admintemplates folder.
This should fix points 1, 2 and 3
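The reason the encoding used to write the response and the encoding announced in the header must agree can be sketched in plain Java. The class and method names below are hypothetical and the servlet container is left out entirely; the sketch only shows which bytes a given string turns into, and what a reader sees when it decodes those bytes with a different encoding than the writer used.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; it imitates what happens on the wire,
// it does not use the servlet API.
final class ResponseBytesDemo {

    // Encodes text the way a response writer would for the given charset.
    static byte[] encode(String text, String charset) {
        return text.getBytes(Charset.forName(charset));
    }

    // Decodes bytes the way a browser would if told to use the given charset.
    static String decode(byte[] bytes, String charset) {
        return new String(bytes, Charset.forName(charset));
    }

    // Formats bytes as lowercase hex pairs separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u00e9\u20ac"; // e with acute accent, then the euro sign

        // UTF-8 can represent both characters: c3 a9 e2 82 ac
        System.out.println(hex(encode(text, "UTF-8")));

        // ISO-8859-1 has no euro sign, so it is replaced by '?': e9 3f
        System.out.println(hex(encode(text, "ISO-8859-1")));

        // Written as UTF-8 but read as ISO-8859-1: U+00E9 turns into
        // the two characters U+00C3 U+00A9.
        String misread = decode(encode("\u00e9", "UTF-8"), "ISO-8859-1");
        System.out.println(misread.length()); // 2
    }
}
```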
I have attached the sources of the modified files as well. The places I changed are marked with the word 'Mitek'.
Enabling Multilanguage Support With UTF-8 or other encodings - a clean solution
1- Configure Tomcat to use the correct encoding when compiling JSP files. Tomcat is already configured to use UTF8 by default; you can switch to another encoding (e.g. UTF16) by changing the javaEncoding parameter in $TOMCAT_HOME/conf/web.xml:
<servlet>
  <servlet-name>jsp</servlet-name>
  <servlet-class>org.apache.jasper.servlet.JspServlet</servlet-class>
  <init-param>
    <param-name>javaEncoding</param-name>
    <param-value>UTF8</param-value>
  </init-param>
  ...
</servlet>
For more details and troubleshooting, see this article:
http://www.javaworld.com/javaworld/jw-04-2004/jw-0419-multibytes.html
2- You still have to force browser requests to be read with the correct encoding. A simple J2EE filter is enough for this, without patching any Tomcat or cos class. (This has been tested using jakarta commons-upload and not the oreilly servlet package; please report any problems with it.)
Spring includes such a filter in its distribution (org.springframework.web.filter.CharacterEncodingFilter), see http://www.springframework.org/docs/api/org/springframework/web/filter/CharacterEncodingFilter.html
To force requests to be treated as UTF-8, you only need to add this to your web.xml (you will also need the appropriate jars/classes on your classpath):
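<filter>
  <filter-name>encodingFilter</filter-name>
  <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
  <init-param>
    <param-name>encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>
  <init-param>
    <param-name>forceEncoding</param-name>
    <param-value>true</param-value>
  </init-param>
</filter>
<!-- A filter only runs for the URLs it is mapped to. The /* pattern (all requests) is an assumption; adjust it to your setup. -->
<filter-mapping>
  <filter-name>encodingFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>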
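The effect of the two init parameters can be sketched as a small decision function. This is an illustration written for this page, not Spring's source code; the class and method names are hypothetical. The assumption is that the filter applies the configured encoding when the request declares none, and overrides a declared encoding only when forceEncoding is true. A real filter would then pass the result to request.setCharacterEncoding(...) before any parameter is read.

```java
// Hypothetical demo class; it models the decision an encoding filter
// makes, without depending on the servlet API.
final class EncodingDecisionDemo {

    // declared:   encoding named by the request itself, or null if absent
    // configured: value of the filter's "encoding" init parameter
    // force:      value of the filter's "forceEncoding" init parameter
    static String effectiveEncoding(String declared, String configured, boolean force) {
        if (configured != null && (force || declared == null)) {
            return configured;
        }
        return declared;
    }

    public static void main(String[] args) {
        // Browser sent no charset: the configured encoding fills the gap.
        System.out.println(effectiveEncoding(null, "UTF-8", false));         // UTF-8

        // Browser declared a charset and forcing is off: the declaration wins.
        System.out.println(effectiveEncoding("ISO-8859-1", "UTF-8", false)); // ISO-8859-1

        // Forcing is on: the configured encoding always wins.
        System.out.println(effectiveEncoding("ISO-8859-1", "UTF-8", true));  // UTF-8
    }
}
```

With forceEncoding set to true, as in the configuration above, the third case applies to every request.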
In its previous incarnation on JspWiki, this page was last edited on Feb 9, 2007 10:12:01 AM by BorisKraft