UTF8 Tools

From Free Pascal wiki
Revision as of 11:17, 9 July 2009 by Theodp (talk | contribs)

About

Sharing some of my code


UTF-8 Tools

Purpose

Some tools for common problems with UTF-8 / Unicode.

  • charencstreams.pas: Load and save text in almost any encoding, for example:
    • ANSI, UTF-8, UTF-16, UTF-32
    • big or little endian
    • with or without a BOM

Simple demo:

fCES := TCharEncStream.Create;
fCES.LoadFromFile(OpenDialog1.FileName);
Memo1.text := fCES.UTF8Text;  
fCES.free;
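
For readers wondering how a loader can tell these encodings apart, the sketch below checks a buffer for the standard Unicode byte order marks. It is only an illustration under stated assumptions: TBomKind, BomNames and DetectBom are hypothetical names, not part of charencstreams.pas, and the unit itself may detect encodings differently (for instance by also guessing when no BOM is present).

```pascal
program BomSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical type for this sketch; not taken from charencstreams.pas.
  TBomKind = (bkNone, bkUTF8, bkUTF16LE, bkUTF16BE, bkUTF32LE, bkUTF32BE);

const
  BomNames: array[TBomKind] of string =
    ('none', 'UTF-8', 'UTF-16 LE', 'UTF-16 BE', 'UTF-32 LE', 'UTF-32 BE');

// Report which byte order mark, if any, the buffer starts with.
// Open arrays are 0-based, so Data[0] is the first byte.
function DetectBom(const Data: array of Byte): TBomKind;
begin
  Result := bkNone;
  // UTF-32 LE (FF FE 00 00) is tested before UTF-16 LE (FF FE) because
  // the shorter mark is a prefix of the longer one. A UTF-16 LE text
  // that begins with a NUL character would be misread here.
  if (Length(Data) >= 4) and (Data[0] = $FF) and (Data[1] = $FE) and
     (Data[2] = $00) and (Data[3] = $00) then
    Result := bkUTF32LE
  else if (Length(Data) >= 4) and (Data[0] = $00) and (Data[1] = $00) and
     (Data[2] = $FE) and (Data[3] = $FF) then
    Result := bkUTF32BE
  else if (Length(Data) >= 3) and (Data[0] = $EF) and (Data[1] = $BB) and
     (Data[2] = $BF) then
    Result := bkUTF8
  else if (Length(Data) >= 2) and (Data[0] = $FF) and (Data[1] = $FE) then
    Result := bkUTF16LE
  else if (Length(Data) >= 2) and (Data[0] = $FE) and (Data[1] = $FF) then
    Result := bkUTF16BE;
end;

begin
  WriteLn(BomNames[DetectBom([$EF, $BB, $BF, $61])]);  // UTF-8
  WriteLn(BomNames[DetectBom([$FF, $FE, $61, $00])]);  // UTF-16 LE
  WriteLn(BomNames[DetectBom([$FF, $FE, $00, $00])]);  // UTF-32 LE
  WriteLn(BomNames[DetectBom([$61, $62])]);            // none
end.
```

A file without a BOM yields bkNone, in which case a loader has to fall back on a default or on heuristics.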
  • character.pas: Get information about code points using the TCharacter class.

Demo

if TCharacter.IsLetter(s[i]) then s[i] := TCharacter.toLower(s[i]);
  • utf8scanner.pas: Access UTF-8 strings by code point index, use case statements on UTF-8 strings and more...

Index demo

s := TUTF8Scanner.Create(Memo1.Text);
for i := 1 to s.Length do
  if TCharacter.IsLetter(s[i]) then
    s[i] := TCharacter.ToLower(s[i]);
Memo1.Text := s.UTF8String;
s.Free;
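
To see why indexing by code point needs a helper at all, the following sketch shows how byte positions and code point positions diverge in a plain UTF-8 string. It uses only standard Free Pascal; CodePointCount and CodePointOffset are hypothetical helpers written for this illustration and are not claimed to be how TUTF8Scanner works internally.

```pascal
program CodePointIndexSketch;
{$mode objfpc}{$H+}

// Count the code points in a UTF-8 byte string. Every byte that is not
// a continuation byte (bit pattern 10xxxxxx) starts a new code point.
function CodePointCount(const S: AnsiString): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(S) do
    if (Ord(S[i]) and $C0) <> $80 then
      Inc(Result);
end;

// Return the 1-based byte offset at which code point number Index
// (also 1-based) starts, or 0 if Index is out of range.
function CodePointOffset(const S: AnsiString; Index: Integer): Integer;
var
  i, Seen: Integer;
begin
  Result := 0;
  Seen := 0;
  for i := 1 to Length(S) do
    if (Ord(S[i]) and $C0) <> $80 then
    begin
      Inc(Seen);
      if Seen = Index then
      begin
        Result := i;
        Exit;
      end;
    end;
end;

const
  // 'löwe' with the ö spelled out as its two UTF-8 bytes C3 B6, so the
  // example does not depend on the encoding of the source file.
  Sample = 'l'#$C3#$B6'we';

begin
  WriteLn(Length(Sample));              // 5 bytes
  WriteLn(CodePointCount(Sample));      // 4 code points
  WriteLn(CodePointOffset(Sample, 3));  // 'w' starts at byte 4
end.
```

Each lookup here scans from the start of the string, so a loop over all code points done this way costs quadratic time; a scanner class can avoid that by decoding the string once.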

Case demo

 s := TUTF8Scanner.Create(Memo1.text);
 s.FindChars := 'öäü';
 repeat
   case s.FindIndex(s.Next) of
 {ö} 0: s.Replace('oe');
 {ä} 1: s.Replace('ae');
 {ü} 2: s.Replace('ue');
   end;
 until s.Done;
 Memo1.Text := s.UTF8String;
 s.free; 

Download

Download utf8tools.zip