Programming Reference Manual
 
Syntax
 
long = stringclassobject.isUTF8
 
Description
Determines whether the contents of the StringClass instance form a valid UTF-8 byte sequence. isUTF8 returns True when every byte in the buffer satisfies the formal UTF-8 grammar as defined in RFC 3629, including correct continuation-byte patterns for two-, three- and four-byte sequences, and returns False when any malformed sequence is encountered.
 
Detection is performed deterministically via the Windows API function MultiByteToWideChar with the MB_ERR_INVALID_CHARS flag set, which by design rejects any byte sequence that is not strictly valid UTF-8. Pure ASCII content (code points 0–127) is by definition a valid UTF-8 subset and will therefore also return True; this is the expected behaviour and reflects the fact that an ASCII string can be safely treated as UTF-8 without further conversion. isUTF8 is intended as a reliable preflight check before calling fromUTF8, and is itself called internally by the toString method to decide whether automatic decoding is required.
 
The contents of the StringClass instance are not modified by isUTF8. Earlier versions of the StringClass relied on heuristic detection, which could produce false positives or false negatives for certain character combinations within the Windows-1252 codepage; the current API-based implementation is fully deterministic and replaces that heuristic.
 
Example
Sub Main
Dim oS As StringClass
 
Set oS = New StringClass
 
' UTF-8 encoded buffer: "Café"
oS.Value = "Caf" & Chr$(195) & Chr$(169)
Debug.Print "First buffer is UTF-8: "; oS.isUTF8
 
' Native Windows-1252 buffer: "Café"
oS.Value = "Caf" & Chr$(233)
Debug.Print "Second buffer is UTF-8: "; oS.isUTF8
 
Set oS = Nothing
End Sub
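
The RFC 3629 grammar named in the Description can also be checked by hand, which may help when reasoning about why a particular buffer is accepted or rejected. The following is a minimal sketch only, not the StringClass implementation (which, per the Description, delegates to MultiByteToWideChar). LooksLikeUTF8 is a hypothetical helper name. The sketch assumes that each character of the string carries one byte value (0 to 255), as in the buffers built with Chr$ above, and that Asc returns that byte value, which holds on a single-byte ANSI codepage such as Windows-1252.

```vb
' Hypothetical helper, not part of StringClass.
' Walks the buffer and applies the byte ranges of the RFC 3629 grammar.
' Assumption: Asc() returns the raw byte value (0-255) of each character.
Function LooksLikeUTF8(ByVal sBuf As String) As Boolean
    Dim i As Long, k As Long, n As Long
    Dim b As Long, nTail As Long
    Dim nMin As Long, nMax As Long

    ' Pessimistic default: every Exit Function below means "malformed".
    LooksLikeUTF8 = False

    n = Len(sBuf)
    i = 1
    Do While i <= n
        b = Asc(Mid$(sBuf, i, 1))

        ' Allowed range of the first continuation byte (normally 80-BF).
        nMin = 128
        nMax = 191

        If b <= 127 Then
            nTail = 0                       ' 00-7F: single ASCII byte
        ElseIf b >= 194 And b <= 223 Then
            nTail = 1                       ' C2-DF: two-byte sequence
        ElseIf b >= 224 And b <= 239 Then
            nTail = 2                       ' E0-EF: three-byte sequence
            If b = 224 Then nMin = 160      ' E0: A0-BF only (no overlong forms)
            If b = 237 Then nMax = 159      ' ED: 80-9F only (no surrogates)
        ElseIf b >= 240 And b <= 244 Then
            nTail = 3                       ' F0-F4: four-byte sequence
            If b = 240 Then nMin = 144      ' F0: 90-BF only (no overlong forms)
            If b = 244 Then nMax = 143      ' F4: 80-8F only (nothing above U+10FFFF)
        Else
            Exit Function                   ' 80-C1 and F5-FF never start a sequence
        End If

        ' The sequence must not run past the end of the buffer.
        If i + nTail > n Then Exit Function

        For k = 1 To nTail
            b = Asc(Mid$(sBuf, i + k, 1))
            If b < nMin Or b > nMax Then Exit Function
            ' Remaining continuation bytes use the plain 80-BF range.
            nMin = 128
            nMax = 191
        Next k

        i = i + nTail + 1
    Loop

    LooksLikeUTF8 = True
End Function
```

Under the stated assumption, LooksLikeUTF8("Caf" & Chr$(195) & Chr$(169)) accepts the first buffer from the example above, because byte 195 opens a two-byte sequence and 169 is a valid continuation byte. LooksLikeUTF8("Caf" & Chr$(233)) rejects the second, because byte 233 announces a three-byte sequence that the buffer does not contain. The sketch treats an empty string as valid; whether isUTF8 does the same is not stated in this topic.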