Content Issues when Logging or Updating a Call from E-Mail

Status: Published
Version: 1.2
Authors: HTL QA
Applies to: Supportworks ESP

Introduction

In order to meet the requirements for email (RFC 2822 and its associated RFCs), any email sent must contain a plain text part and then, optionally, any further formats (e.g. HTML) that the sending client is capable of supplying. Each of these parts can be encoded in a given character set (http://en.wikipedia.org/wiki/Character_encoding), and each character set defines the specific characters that can be represented. For example, US-ASCII has the $ symbol but not the € symbol, so an email sent as plain text in US-ASCII cannot use the € symbol directly and must encode it instead (=80). It is then up to the receiving client to decode this.
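The encoding step described above can be sketched in Python. This is only an illustration, not how Supportworks does it internally: it assumes the sender's local character set is windows-1252 and that the transfer encoding is quoted-printable, which is where a value such as =80 comes from. The helper name can_encode is hypothetical.

```python
import quopri

EURO = "€"


def can_encode(text, charset):
    """Return True if every character in text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# US-ASCII contains the dollar sign but not the euro sign.
dollar_ok = can_encode("$", "us-ascii")
euro_ok = can_encode(EURO, "us-ascii")

# Assumption: the sender's local charset is windows-1252, where the
# euro sign is the single byte 0x80.
raw = EURO.encode("windows-1252")

# Quoted-printable turns that byte into ASCII-safe text for transport.
encoded = quopri.encodestring(raw)

print(dollar_ok, euro_ok, raw, encoded)
```

The encoded value only records the byte, not which character set it belongs to, which is why the receiving client's choice of character set matters.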

In Supportworks, all plain text parts of emails are sent as US-ASCII, so all non-US-ASCII characters are encoded. (Note that this does not include the Subject, which can only be plain text.) When the receiving client opens the email, it attempts to decode it based on its own character set configuration. If that character set contains a mapping for =80, the client displays the corresponding character from its set. If no mapping is found, it will generally display a ? or some other character that may appear incorrect in the context of the message. If both the sender and the receiver use the same character set, it is likely that they will see the expected character; however, there is no absolute guarantee of this.
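As a rough illustration of the decoding behaviour described above, the Python sketch below decodes the same encoded text under several character sets. The sample text and the list of character sets are assumptions chosen for the example; a real client picks its character set from its own configuration.

```python
import quopri

# The text as it travels in the plain text part: pure ASCII, with the
# non-ASCII byte written as =80.
wire = b"Price: =80 10"

# Undo the quoted-printable step; this recovers the raw byte 0x80.
raw = quopri.decodestring(wire)


def render(raw_bytes, charset):
    """Decode the bytes the way a client configured for charset would."""
    return raw_bytes.decode(charset, errors="replace")


# Hypothetical client configurations, for illustration only.
views = {
    cs: render(raw, cs)
    for cs in ("windows-1252", "windows-1251", "cp437", "us-ascii")
}

for cs, text in views.items():
    print(f"{cs:>13}: {text}")
```

Only the windows-1252 view shows the euro sign. The other character sets either map the byte to a different character or have no mapping at all and fall back to a replacement character.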

HTML is slightly different. Whilst the HTML part of the message is sent (within Supportworks) in US-ASCII, the charset of the HTML can be set in the header information of each part that makes up the complete message. The character set used is most likely to come from the local machine (e.g. charset=windows-1252), so any characters in the HTML are encoded to the requirements of the specified charset (for example, the € symbol becomes &#128; in Windows-1252). On receipt of the email it is the client's responsibility to decode the HTML and display it, which can only be completely successful if the specified charset is available to the client application.

For most clients, the character set in use comes from the local machine, the browser settings or the email header, rather than from the Content-Type of the given part. Testing a few of the more common email clients shows that only Gmail reliably takes the Content-Type from the message parts rather than from somewhere else, so other client software is more likely to have issues displaying non-US-ASCII characters.

Clients and Content-Type

The following clients each take the Content-Type from the header of your email (see below):
  • AOL
  • Hotmail
  • Outlook Express
  • Outlook 2003, 2007, and 2010
  • Lotus Notes 6 - 8.5
  • Live Mail
  • Thunderbird 2 and 3
  • iPad
  • iPhone Mail
  • Android Mail
The following client converts the Content-Type to UTF-8:
  • Gmail

Format of a MIME Email

A typical email consists of 3 parts (plus one part for each attachment): the header, a plain text part and an HTML part. The client (e.g. Outlook 2010) decides which part to show (plain or HTML) and then shows it according to its settings. The RFC states that the Content-Type should come from the associated part; however, Outlook (amongst others) incorrectly uses the first one it comes across (Microsoft actually states that this is an optimisation rather than a defect).
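The structure above can be reproduced with Python's standard email package. This is a minimal sketch and not a Supportworks message: the addresses, subject, body text and the choice of a different charset for each part are all assumptions made for illustration. It shows that each part carries its own Content-Type, and that decoding each part with its own charset gives the expected text.

```python
from email.message import EmailMessage

# Header section (all values are placeholders).
msg = EmailMessage()
msg["Subject"] = "Price update"
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"

# Plain text part, declared as windows-1252 (assumed local charset).
msg.set_content("Price: € 10", charset="windows-1252",
                cte="quoted-printable")

# HTML part, deliberately declared with a different charset.
msg.add_alternative("<p>Price: € 10</p>", subtype="html",
                    charset="utf-8", cte="quoted-printable")

# Walk the message and collect the leaf parts (skip the container).
parts = [p for p in msg.walk() if not p.is_multipart()]

# Each part states its own content type and charset.
charsets = [(p.get_content_type(), p.get_content_charset()) for p in parts]

# The encoded plain text body as it appears on the wire.
wire_plain = parts[0].get_payload()

# Decoding each part with its own declared charset, as the RFC intends.
decoded = [p.get_content() for p in parts]

print(charsets)
print(wire_plain)
print(decoded)
```

A client that applied the first charset it found (windows-1252 here) to the UTF-8 HTML part would show the wrong characters, which is the behaviour described for Outlook above.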

Mailformatdiagram.png

Non-ASCII Symbols

Any character that is not part of the original standard US-ASCII set (of 128 characters) may be lost in translation. This issue is not specific to Supportworks or to any other email server; any email client can potentially suffer from it (see below for links to Google and Microsoft).

The same email sent to two people can be viewed differently on different machines. It is also possible for the same email to be viewed differently on the same machine depending on the client used.

An example of this is sending the € symbol to [email protected] as HTML and then viewing the email in IE9 (Windows 7 SP1) and then in IE6 (Windows XP SP3). It is likely that the email will display as expected in IE9 but incorrectly in IE6. Furthermore, if you choose View->Encoding in IE9 and change the encoding, you can end up with the same results as in IE6.

Summary

To summarise, email and character sets can be "hit and miss". This is a known issue throughout the industry and affects all clients and servers. The only way to guarantee that an email is received as expected is to use only characters within the US-ASCII base set. Going forward, email clients and servers are starting to switch to Unicode and/or UTF-8 and to use the Content-Type of the given part of the email to decode the text, which provides a more consistent "standard" for character encoding and decoding. However, until ALL email clients use this format (in practice very unlikely to happen), the problem will continue.
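One way to apply the US-ASCII advice above is to check text before it is sent. The short Python sketch below is only a suggestion and is not part of Supportworks; the function name non_ascii_characters and the sample text are hypothetical.

```python
def non_ascii_characters(text):
    """Return (position, character) pairs that fall outside US-ASCII."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]


# Sample outgoing text, made up for illustration.
body = "Total due: €25 (approx. $27)"
offenders = non_ascii_characters(body)

if offenders:
    for position, ch in offenders:
        print(f"Non-ASCII character {ch!r} at position {position}")
else:
    print("Text is within the US-ASCII base set")
```

Any character reported here could be replaced with an ASCII equivalent (for example "EUR" instead of €) before the email is sent.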

Recommendation

A patch has been released for F0092021: Character encoding not processed correctly for incoming emails. This patch is recommended for customers who frequently receive emails from mail clients that use a code page other than the one required by Supportworks.

Note that the patch is not compatible with systems prior to Supportworks ESP 7.6.2 SP1.

Further Reading