Extended UTF-8 input encoding support for LaTeX (https://wolfgang.jeltsch.info/software/tex/ucs)

Introduction

The ucs package contains extended support for using UTF-8 as input encoding of LaTeX documents.

The package is currently maintained by Wolfgang Jeltsch.

Installation

Simply put ucs.sty, utf8x.def, ucsencs.def, and data/* somewhere in your TeX path. If you want CJK characters, you will need the c??enc.def files, too. For Klingon, you need lklenc.def and lklkli.fd.

You may have to run some command like mktexlsr; otherwise the files will not be found by TeX. What to do, however, is distribution-dependent.

Warning: If you install the files into some directory where TeX does not search recursively (usually the current directory and your personal TeX directory are of that kind), you will have to put the contents of data directly into that directory.

If you do not install data/uninames.dat, you will save about 300 KB, but you will not get the full names of the characters in error messages.

You may add glyph macros to the files in config (see perldoc makeunidef.pl for information on the configuration file format). If you do so, you will have to recreate the files in data by running make datafiles, which uses makeunidef.pl internally. You may also run makeunidef.pl directly. When doing this, you may supply another target directory instead of data, for example, some directory in your TeX path. The makeunidef.pl tool will not delete or overwrite any files it has not created itself. If you add --exclude cjkbg5,cjkgb,cjkjis,cjkhangul, CJK characters will not be included, saving more than 1 MB of disk space.

Usage

The simplest way to use this package is to add the following lines to your preamble:

\usepackage{ucs}
\usepackage[utf8x]{inputenc}

In many cases you may even omit the first line. Often, however, you need to perform additional steps, such as loading further packages. See languages.ps.gz for language-specific examples.
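As a minimal sketch, the two preamble lines can be placed in a complete document like the one below. It assumes an 8-bit engine such as pdfLaTeX and a source file saved as UTF-8; the T1 font encoding and the sample text are illustrative choices, not requirements stated by this package.

```latex
\documentclass{article}
% Assumption: T1 is chosen here so that accented letters are real glyphs;
% the ucs package itself does not prescribe a font encoding.
\usepackage[T1]{fontenc}
\usepackage{ucs}
\usepackage[utf8x]{inputenc}

\begin{document}
% Non-ASCII characters typed directly as UTF-8 in the source file.
Grüße aus Köln, café, naïve.
\end{document}
```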

Adding glyphs

The tables with LaTeX macros for the glyphs do not contain many entries yet. If your particular glyphs are not supported, you can add corresponding macros to the configuration files (see perldoc makeunidef.pl). If you do so, please e-mail these configuration files to the ucs package maintainer, together with information about any special packages or LaTeX versions needed for using your macros. Please supply only macros that generate aesthetically pleasing glyphs, no hacks.
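For a single document, a glyph can probably also be declared locally instead of through the configuration files. The sketch below assumes that ucs provides \DeclareUnicodeCharacter and that its first argument is a TeX number (so hexadecimal code points need a leading "); please check the ucs documentation before relying on this. The code point and the macro are only an illustration.

```latex
\documentclass{article}
\usepackage{ucs}
\usepackage[utf8x]{inputenc}

% Assumption: the first argument is a TeX number, hence the leading "
% for hexadecimal. U+2423 OPEN BOX is mapped to the LaTeX kernel macro
% \textvisiblespace purely as an example.
\DeclareUnicodeCharacter{"2423}{\textvisiblespace}

\begin{document}
A visible space: a␣b
\end{document}
```

A declaration like this only affects the document that contains it; contributing macros to the configuration files is still the way to make a glyph available to everyone.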

Getting the source code repository

Install darcs from http://darcs.net/ if you do not have it already. Then check out the ucs repository with the following command:

darcs get http://hub.darcs.net/jeltsch/ucs

Building

If you are working with the source code repository, you will probably want to build the autogenerated files at some point. To do this, run GNU make in the root directory of the repository.

Creating a distribution bundle

Running make dist in the root directory of the repository will give you a file ucs.tar.gz. This file will contain all source files and all autogenerated files, and will have the “executable” bit set for all scripts. It is intended for uploading to CTAN.

Web page

Information about this package, including a link to its source code repository, can be found at http://wolfgang.jeltsch.info/software/tex/ucs.

Frequently asked questions

> LaTeX complains about missing commands I have not used. Why?

The ucs package uses macros from many other packages, which it does not load itself. You have to load the relevant packages in your preamble. To find out which package contains the missing macro, you can run perl discovermacro.pl ⟨missing macro⟩ or perl discovermacro.pl ⟨document⟩.log. Alternatively, have a look at the human-readable file ltxmacrs.txt.

> The package complains about uni-global.def (and other files) missing, but they are in the TeX search path.

Perhaps you have put the ucs/data directory in a directory where TeX does not search recursively (for example, your private TeX directory or the current directory). You can change this by putting the ucs package into a recursively searched directory or by putting the files in ucs/data at the top level of the searched directory.

> When I try to activate options using \usepackage[⟨options⟩]{ucs}, LaTeX complains about an option clash.

Probably, ucs.sty has already been loaded via \usepackage[utf8x]{inputenc}. Try loading ucs.sty first, or set the options with \SetUnicodeOption.
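Both workarounds can be sketched as follows. The option mathletters is used only as an example and is not named elsewhere in this README; substitute whichever options you need.

```latex
\documentclass{article}

% Alternative 1: load ucs with its options before inputenc,
% so that inputenc finds the package already loaded.
\usepackage[mathletters]{ucs}
\usepackage[utf8x]{inputenc}

% Alternative 2 (use instead of the optional argument above):
% \usepackage[utf8x]{inputenc}
% \SetUnicodeOption{mathletters}

\begin{document}
% With mathletters, Greek letters typed in math mode are expected
% to be set as math symbols.
$α + β = γ$
\end{document}
```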

> I get a “TeX capacity exceeded” error. What can I do?

Try the savemem option. This will reduce the memory consumption of ucs.sty, especially if you use CJK glyphs, but will also slow down operation significantly. Alternatively, increase TeX’s capacity, if this is feasible in your situation.

> The Esperanto letter LATIN SMALL LETTER H WITH CIRCUMFLEX is ugly, but ^h with the babel package option esperanto is not. Why?

The file esperanto.ldf defines its own macro for ^h, but ucs uses the standard \^h. Add the following lines to your preamble:

    \DeclareTextCompositeCommand{\^}{T1}{h}{h\llap{\^{}}}
    \DeclareTextCompositeCommand{\^}{OT1}{h}{h\llap{\^{}}}

This will make \^h and the corresponding Unicode character yield the same as ^h.

> When TeX outputs a line of my document to the terminal or the log file, the non-ASCII characters are replaced by garbage. Why?

The first possibility is that you are not reading the output in a Unicode-enabled terminal. The second is that TeX replaces some bytes with ^^-sequences. I do not know how to tell TeX which characters are to be escaped that way (tell me if you do). If nothing else helps, you can use latexout.pl, which converts such output to UTF-8.

> Is the ucs package the same as the unicode package?

Dominique Unruh, the original author of the ucs package, first used the name unicode.sty for the style file, but there was a name clash with Sebastian Rahtz’ jadetex/passivetex package. So later, he used the name ucs.sty instead of unicode.sty. The package was subsequently called ucs in TeXLive and installed into the directory tex/latex/ucs. However, it was still called unicode in MiKTeX and on CTAN. In April 2012, it was decided that the name unicode should not be used anymore to avoid confusion. So it is now the ucs package.

Legal information

© 2000 by Dominique Unruh; © 2011–2013 by Wolfgang Jeltsch

This work may be distributed and/or modified under the conditions of the LaTeX Project Public License, either version 1.3 of this license or (at your option) any later version, with the extensions listed below.

The latest version of the LaTeX Project Public License (without the extensions listed below) is in http://www.latex-project.org/lppl.txt and version 1.3 or later is part of all distributions of LaTeX version 2005/12/01 or later.

This work has the LPPL maintenance status “maintained”.

The Current Maintainer of this work is Wolfgang Jeltsch.

This work consists of all files found at http://hub.darcs.net/jeltsch/ucs including subdirectories.

The following extensions to the LPPL apply for this work:

  • The directory structure may be changed

  • A distribution may split the package into smaller packages, as long as this fact is visible to the user and the user may easily install the complete ucs package (for example, by installing all small packages).

  • The set of configuration files may be extended by adding new *.ucf files to config, with any characters defined in these files being only accessible via an option that starts with the five letters “local”.

  • Files in the unsupported directory may be omitted.

  • Scripts (that is, executable files that are not TeX input) may be renamed, provided that the original name without the suffix is part of the new name (for example, discovermacro.pl → latex-ucs-discovermacro) and that this renaming is stated in some documentation file that is part of the distribution. Occurrences of the scripts’ names in the documentation may be changed to match the new name.

  • Parts of the files explicitly marked as “configuration data” may be changed by distributions as long as this is stated in a comment near the place of that modification and in some documentation file that is part of the distribution.