
Library to encode/decode nonstandard Unicode Transformation Formats
===================================================================

SPDX-FileType: DOCUMENTATION
SPDX-FileCopyrightText: NONE
SPDX-License-Identifier: CC0-1.0


General
-------
This library provides conversion functions for unofficial Unicode
Transformation Formats.
Conversion from UTF-7 (RFC 2152) to UTF-8 is supported.
Conversion from CESU-8 (<https://www.unicode.org/reports/tr26/tr26-4.html>)
to UTF-8 is supported.
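
To illustrate what the CESU-8 conversion involves: CESU-8 encodes a
supplementary code point as two 3-octet sequences (one per UTF-16 surrogate),
whereas UTF-8 uses a single 4-octet sequence. The following is a minimal
sketch under that definition, not the implementation used by this library.
The function name "example_cesu8_pair_to_utf8" is hypothetical and not part of
the API.

```c
#include <stddef.h>

/*
 * Hypothetical helper for illustration (not part of the library API):
 * Recombine one CESU-8 encoded surrogate pair (6 octets) into the
 * corresponding 4-octet UTF-8 sequence. No memory is allocated.
 * Returns 0 on success, -1 if the input is not a valid surrogate pair.
 */
static int example_cesu8_pair_to_utf8(const unsigned char in[6],
                                      unsigned char out[4])
{
    unsigned long hi;
    unsigned long lo;
    unsigned long cp;

    /* Both halves are 3-octet sequences with lead octet 0xED */
    if (0xEDU != in[0] || 0xEDU != in[3])  { return -1; }
    /* High surrogate U+D800..U+DBFF: second octet is 0xA0..0xAF */
    if (0xA0U != (in[1] & 0xF0U))  { return -1; }
    /* Low surrogate U+DC00..U+DFFF: second octet is 0xB0..0xBF */
    if (0xB0U != (in[4] & 0xF0U))  { return -1; }
    /* Third octet of each half must be a continuation octet */
    if (0x80U != (in[2] & 0xC0U) || 0x80U != (in[5] & 0xC0U))  { return -1; }

    /* Decode both UTF-16 surrogates */
    hi = 0xD000UL | ((unsigned long) (in[1] & 0x3FU) << 6)
         | (unsigned long) (in[2] & 0x3FU);
    lo = 0xD000UL | ((unsigned long) (in[4] & 0x3FU) << 6)
         | (unsigned long) (in[5] & 0x3FU);

    /* Combine the surrogates into a code point U+10000..U+10FFFF */
    cp = 0x10000UL + ((hi - 0xD800UL) << 10) + (lo - 0xDC00UL);

    /* Encode the code point as a 4-octet UTF-8 sequence */
    out[0] = (unsigned char) (0xF0UL | (cp >> 18));
    out[1] = (unsigned char) (0x80UL | ((cp >> 12) & 0x3FUL));
    out[2] = (unsigned char) (0x80UL | ((cp >> 6) & 0x3FUL));
    out[3] = (unsigned char) (0x80UL | (cp & 0x3FUL));

    return 0;
}
```

For example, U+10400 is "ED A0 81 ED B0 80" in CESU-8 and "F0 90 90 80" in
UTF-8.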

The development goals of this implementation are:
- C90 conformance
- No memory (re)allocation
- Support for reproducible builds

Functionality is implemented in separate object files, so the linker can
pull in only the code that is actually needed.

Metadata for pkg-config is provided.


API
---
This library occupies the namespaces with the "ucic0_" and "UCIC0_" prefixes.
The namespaces with the "ucic0_i_" and "UCIC0_I_" prefixes are reserved for
internal use. Never use them outside of the library (the shared library may
not even export such symbols).

With the namespace prefixes stripped, the API is intended to be compatible
with Solaris 11:
<https://docs.oracle.com/cd/E88353_01/html/E37843/iconvstr-3c.html>

No global context is used, therefore no initialization or shutdown
functions are present.


Error handling
--------------
Some fatal errors are handled internally via assert(). Such errors, like
internal data corruption, indicate bugs in the library.
For maximum performance, the internal checks can be disabled with:

   CPPFLAGS=-DNDEBUG

The build system of the library enables the checks by default.
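
The general pattern behind such checks can be sketched as follows. This is a
generic illustration of assert() and NDEBUG from standard C, not code from
this library. The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical internal state, for illustration only */
struct example_state
{
    size_t used;
    size_t capacity;
};

/*
 * The assert() calls verify internal invariants. A violation would
 * indicate a bug (not invalid user input) and aborts the program.
 * If NDEBUG is defined before <assert.h> is included, both checks
 * expand to nothing and cost no run time.
 */
static size_t example_remaining(const struct example_state *s)
{
    assert(NULL != s);
    assert(s->used <= s->capacity);

    return s->capacity - s->used;
}
```

Errors caused by invalid input data are a different category and would be
reported to the caller instead.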


Thread safety
-------------
All API functions are thread-safe if "assert()" is thread-safe.
Otherwise "NDEBUG" must be defined to compile a thread-safe library.


Versioning scheme
-----------------
The release version contains 3 numbers "x.y.z":

- Major (x)
  The major number is incremented for every API/ABI change that is not backward
  compatible (with the exception of major version 0).

- Minor (y)
  The minor number is incremented for API/ABI extensions that are backward
  compatible.

- Patch (z)
  The patch number is incremented for changes that don't change the API/ABI.

In other words:
Releases with the same major and minor numbers are drop-in replacements.
Up- and downgrades between such versions are possible without touching programs
that use the library.
Releases with the same major, but different minor numbers are backward
compatible, but not forward compatible. Upgrades are possible, but downgrades
can break programs that use the library.
Releases with different major numbers require changes in all programs that use
the library.
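
These rules can be condensed into a small check. The following is a minimal
sketch, not part of the library. The type and function names are
hypothetical. For major version 0, where the rules above give no guarantee
across different minor numbers, the sketch conservatively assumes that only
releases with equal minor numbers are compatible.

```c
/* Hypothetical version triple "x.y.z", for illustration only */
struct example_version
{
    unsigned int major;
    unsigned int minor;
    unsigned int patch;
};

/*
 * Returns 1 if a program built against release "built" is expected to
 * work with release "installed", 0 otherwise.
 * The patch number is ignored, because patch releases don't change the
 * API/ABI.
 */
static int example_is_compatible(const struct example_version *built,
                                 const struct example_version *installed)
{
    /* Different major numbers: Programs must be changed */
    if (built->major != installed->major)  { return 0; }

    /* Same major and minor numbers: Drop-in replacement */
    if (built->minor == installed->minor)  { return 1; }

    /* Assumption: No guarantee across minor numbers for major version 0 */
    if (0U == built->major)  { return 0; }

    /* Same major number: Upgrades are possible, downgrades are not */
    return installed->minor > built->minor;
}
```

With this sketch, a program built against 1.2.3 is accepted for 1.2.0 and
1.3.0, but rejected for 1.1.9 and 2.0.0.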

Versions with different major numbers can be installed in parallel (including
header files).
Binaries can be linked against multiple instances of the library with different
major versions (e.g. if dependencies are using different versions).


Unicode normalization
---------------------
No normalization is guaranteed for Unicode output data.
An external normalization step is required after conversion.


Supported encodings
-------------------
As defined by IANA, encoding names are matched case-insensitively:
<https://www.iana.org/assignments/character-sets/character-sets.xhtml>
Optionally the registered alias names are accepted too (only for source
encodings).
The library enables this option by default.

Optionally, for better error tolerance, some nonstandard variants of the
names are accepted too (only for source encodings).
The library enables this option by default.


Supported target encodings
--------------------------
- Unicode (with UTF-8 transformation format)
  Name: "UTF-8"


Supported source encodings
--------------------------
- UTF-7
  Name   : "UTF-7"
  Aliases: "csUTF7"
  Nonstandard: "UTF_7", "UTF7"
- CESU-8
  Name   : "CESU-8"
  Aliases: "csCESU8", "csCESU-8"
  Nonstandard: "CESU_8", "CESU8"
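
Case-insensitive matching of the source encoding names listed above could be
sketched as follows. This is an illustration only, not the lookup code of this
library. All identifiers with the "example_" and "EXAMPLE_" prefixes are
hypothetical, and the sketch assumes an ASCII-based execution character set.

```c
#include <stddef.h>

enum example_encoding
{
    EXAMPLE_ENC_UNKNOWN,
    EXAMPLE_ENC_UTF7,
    EXAMPLE_ENC_CESU8
};

struct example_name
{
    const char *name;
    enum example_encoding enc;
};

/* Names, aliases and nonstandard variants of the source encodings */
static const struct example_name example_names[] =
{
    { "UTF-7",    EXAMPLE_ENC_UTF7  },
    { "csUTF7",   EXAMPLE_ENC_UTF7  },
    { "UTF_7",    EXAMPLE_ENC_UTF7  },
    { "UTF7",     EXAMPLE_ENC_UTF7  },
    { "CESU-8",   EXAMPLE_ENC_CESU8 },
    { "csCESU8",  EXAMPLE_ENC_CESU8 },
    { "csCESU-8", EXAMPLE_ENC_CESU8 },
    { "CESU_8",   EXAMPLE_ENC_CESU8 },
    { "CESU8",    EXAMPLE_ENC_CESU8 }
};

/* Fold ASCII uppercase letters only (independent of the locale) */
static int example_fold(int c)
{
    if ('A' <= c && 'Z' >= c)  { return c - 'A' + 'a'; }
    return c;
}

/* Returns 1 if both strings are equal, ignoring ASCII case */
static int example_equal_nocase(const char *a, const char *b)
{
    while ('\0' != *a && '\0' != *b)
    {
        if (example_fold((unsigned char) *a)
            != example_fold((unsigned char) *b))  { return 0; }
        ++a;
        ++b;
    }
    /* Equal only if both strings end at the same position */
    return *a == *b;
}

/* Map a source encoding name to its identifier */
static enum example_encoding example_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof example_names / sizeof example_names[0]; ++i)
    {
        if (example_equal_nocase(name, example_names[i].name))
        {
            return example_names[i].enc;
        }
    }
    return EXAMPLE_ENC_UNKNOWN;
}
```

The table contains only source encodings, therefore a lookup of "UTF-8"
yields EXAMPLE_ENC_UNKNOWN in this sketch.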
