chm backend for fpdoc

From Free Pascal wiki
Revision as of 16:07, 21 December 2020 by Marcov (talk | contribs) (→‎Remaining problems)

Examples

This link has information about the TOC and Index files in chms: http://www.nongnu.org/chmspec/latest/Sitemap.html

These formats are based on HTML and use the following doctype:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">

The <HEAD> tag contains a <meta> tag providing information on the program that generated the files and a comment indicating the version of the file, e.g.:

<meta name="GENERATOR" content="Microsoft® HTML Help Workshop 4.1">
<!-- Sitemap 1.0 -->

The <BODY> tag contains an <OBJECT> tag that stores properties of the file in <param> tags, followed by a <UL> tag whose <LI> tags have <OBJECT> tags that store the properties of the Contents/Index items in <param> tags, e.g.:

<BODY>
<OBJECT type="text/site properties">
    <param name="Property Name" value="Property Value"></OBJECT>
<UL>
    <LI> <OBJECT type="text/sitemap">
        <param name="Property Name" value="Property Value"></OBJECT></UL>
</BODY>

Note that property names, property values and tags are not case-sensitive, but HTML Help Workshop (HHW) always writes all three in their default capitalization where appropriate.

Note that the tags are mostly in uppercase and the <LI> tag is not closed; this is in compliance with the doctype.

Some properties that were seen in HHA.dll that may or may not be used are Background Image, NumberImages, InformationTypeDecl, Secondary, Icon, Display, Keyword, Instruction, Section Title, Favorites, QueryType, SendEvent, SendMessage, HHI, Inclusive & Exclusive.

This is the beginning of an autogenerated .hhc (Table of Contents/TOC) file for the RTL (the autogenerated TOC is of poor quality). .hhk files use the same format, but are for the Index pane and do not have subitems.

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
<HEAD>
<meta name="GENERATOR" content="Microsoft® HTML Help Workshop 4.1">
<!-- Sitemap 1.0 -->
</HEAD><BODY>
<OBJECT type="text/site properties">
    <param name="Auto Generated" value="Yes">
</OBJECT>
<UL>
    <LI> <OBJECT type="text/sitemap">
        <param name="Name" value="Reference for package 'rtl'">
        <param name="Local" value="rtl/index.html">
        </OBJECT>
    <UL>
        <LI> <OBJECT type="text/sitemap">
            <param name="Name" value="Units">
            <param name="Local" value="rtl/index.html">
            </OBJECT>
        <LI> <OBJECT type="text/sitemap">
            <param name="Name" value="Description">
            <param name="Local" value="rtl/index.html">
            </OBJECT>
    </UL>
    <LI> <OBJECT type="text/sitemap">
        <param name="Name" value="Reference for unit 'BaseUnix'">
        <param name="Local" value="rtl/baseunix/index.html">
        </OBJECT>
    <UL>
        <LI> <OBJECT type="text/sitemap">
            <param name="Name" value="Overview">
            <param name="Local" value="rtl/baseunix/index.html">
            </OBJECT>
    </UL>

......
</UL>
</BODY></HTML>
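
Because the format is plain (if loosely closed) HTML, the entries can be read back with any tolerant HTML parser. The following is a minimal sketch, assuming Python and its standard html.parser module; the names SitemapParser and parse_sitemap are made up for this illustration and are not part of fpdoc or the chm package. It tracks the <UL> nesting depth and collects the <param> name/value pairs of every text/sitemap object.

```python
from html.parser import HTMLParser


class SitemapParser(HTMLParser):
    """Collect (depth, properties) for every text/sitemap object."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # current <UL> nesting level
        self.current = None   # properties of the <OBJECT> being read, if any
        self.items = []       # list of (depth, {"Name": ..., "Local": ...})

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # HTMLParser already lowercases tag and attribute names.
        if tag == "ul":
            self.depth += 1
        elif tag == "object" and (attrs.get("type") or "").lower() == "text/sitemap":
            self.current = {}
        elif tag == "param" and self.current is not None:
            self.current[attrs.get("name")] = attrs.get("value")

    def handle_endtag(self, tag):
        if tag == "ul":
            self.depth -= 1
        elif tag == "object" and self.current is not None:
            self.items.append((self.depth, self.current))
            self.current = None


def parse_sitemap(text):
    """Return all sitemap items found in the given .hhc/.hhk text."""
    parser = SitemapParser()
    parser.feed(text)
    parser.close()
    return parser.items
```

Feeding it the contents of an .hhc file yields one tuple per TOC entry; entries of an .hhk file would all come back at depth 1, since the index has no subitems. The "text/site properties" object is skipped on purpose, as it describes the file rather than an item.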

Generating FPC CHM docs via fpdoc

As of October 2008, the fpdoc part of the documentation (the part that documents the various units) can be compiled to CHM fairly easily. Running

make clean html HTMLFMT=chm CSSFILE=/path/to/fpdoc.css

will generate rtl.chm and fcl.chm. Note that the CSS file is currently in a different repository (I used fpc/utils/fpdoc/fpdoc.css, since it has the correct name). A script, fixdocs.sh, automates most of the CHM-related options (*nix only, requires latex + tex4ht).

Significant fixes to the CHM, XML and makefile backends were introduced in the sources (fpcdocs, chm, fcl-xml and fpdoc) of the 2.3.1 branch, dated July 31st 2009 and newer.

Remaining problems

  1. The text mode IDE crashes on loading the index of LCL.chm on all architectures and OSes. Other CHM viewers don't seem to suffer from this, but there is always a chance that it is related to the chm package. (fixed)
  2. The prog and user documents don't have an index. The ref guide only has keywords.
  3. FCL-XML generates UTF-8, which some CHM browsers (kchmviewer) don't seem to handle properly. Note: is this still applicable in March 2014? Workaround: ISO8859-1 default? KCHMviewer seems to work fine on Windows.
  4. Compiling the LCL docs finds a lot of unknown identifiers. Should these be documented?
  5. Crosslinking between user, ref, prog and fpdoc?

And the major one:

  1. How to combine the indexes to a master index (including duplicate handling)? Currently additional disambiguating context is only added to the index if a dupe _within_ a CHM is detected.
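
To make the question concrete, here is one possible approach, sketched in Python. It is not how fpdoc or the chm package currently works: the function merge_indexes, its input layout and the example URLs are assumptions made purely for illustration. The idea is to apply the same rule that is now used within a single CHM across all of them: a keyword gets disambiguating context (here simply the CHM name) only when it occurs in more than one CHM. Keywords are compared case-insensitively, on the assumption that Pascal identifiers should be treated that way.

```python
from collections import defaultdict


def merge_indexes(indexes):
    """Merge per-CHM indexes into one master index.

    indexes maps a CHM name to a list of (keyword, local_url) pairs.
    Returns a list of (label, chm_name, local_url), sorted by label.
    """
    # First pass: record which CHMs each keyword occurs in.
    owners = defaultdict(set)
    for chm, entries in indexes.items():
        for keyword, _url in entries:
            owners[keyword.lower()].add(chm)

    # Second pass: add context only where a cross-CHM duplicate exists.
    merged = []
    for chm, entries in indexes.items():
        for keyword, url in entries:
            if len(owners[keyword.lower()]) > 1:
                label = f"{keyword} ({chm})"
            else:
                label = keyword
            merged.append((label, chm, url))
    return sorted(merged, key=lambda entry: entry[0].lower())


# Hypothetical input; the URLs are invented for the example.
example = {
    "rtl.chm": [("Assign", "rtl/system/assign.html"),
                ("BaseUnix", "rtl/baseunix/index.html")],
    "fcl.chm": [("Assign", "fcl/classes/tpersistent.assign.html")],
}
for label, chm, url in merge_indexes(example):
    print(label, "->", chm, url)
```

Duplicates inside one CHM are left untouched by this sketch, since those already receive context when the individual CHM is built.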

Generating LCL CHM via build_lcl_docs

Lazarus provides the build_lcl_docs.lpr tool: a wrapper which simplifies calling fpdoc to generate an LCL CHM.

It can be run both in GUI mode and command line mode.

Useful command line parameters

  1. --fpdoc <fpdoc executable to be used to generate CHM>
  2. --css-file=<css file to be used for layout, e.g. fpdoc.css>
  3. --outfmt chm Set output format to CHM

Generating LCL CHM via fpdoc

Only tested on *nix

Warning: this script enables all fpdoc bells and whistles, and takes about 4 minutes and 400 MB of memory on a Core2-6600. If you want to do this regularly, make sure you have an fpdoc from trunk, preferably from a checkout that was compiled with optimization on, which saves about half a minute.

export HTMLFMT=chm
cd lazarus/docs/html
sh build_lcl_html.sh "fpdoc" "" /path/to/fpcdocs

The generated .xct file is an index file for fpdoc cross-file links, and is of no use to the end user unless you want to link to the CHM. That is also why we have to specify the fpcdocs directory: if the FPC docs have previously been built there, the LCL build will use their .xct files to craft cross-CHM links.

Generating FPC CHM docs for the latex documentation

Building CHMs for the documentation that is not in fpdoc format (but in normal latex files) is harder. The process can fairly easily be separated into two stages:

  1. generate HTML from latex sources
  2. compile HTML to CHM
    1. raw compile
    2. generating indexes and TOC

Generating HTML

The first stage is currently the hard part. Over the years several converters have been tried; the one currently used is tex4ht. While the theory is easy (install latex, install tex4ht, run make html), the practice is difficult (*), and the generated docs are usually corrupt (ligatures converted to pictures or Unicode escapes, bad "next" links, etc.). Luckily, these docs change less often, so one can probably use a generated set of CHMs for the entire duration of a release cycle. Still, it would be great to have a definitive, reproducible solution.

For the moment, the "bad" output of tex4ht is fixed by a DOM based (FCL-XML) FPC tool, relinkdocs. Relinkdocs scans the HTML docs and relinks the next/up/prev links.

(*) to the degree that the version on the FPC front page is more often broken than not.

Compiling and TOC generation

To compile the resulting HTML to CHM, another tool was written, called compilelatexchm, which also relies on FCL-XML for the TOC scanning. The CHM is searchable, but does not yet have an index. (I don't know a suitable algorithm to come up with the keywords.)

A script that automates these steps has been added to the fpcdocs repository (fixdocs.sh), but it is not foolproof yet. If you have trouble, ask me (Marco/oliebol) on IRC.

Plastex

A week after I did the above fixing of tex4ht output, Andrew Haines came up with a patch that supports using PlasTeX as the tex2chm converter. I don't entirely like the output (chapter and paragraph numbers are dropped, which makes it harder to refer to specific places in the documentation, and the template is a bit playful), but I haven't looked into customizing it at all, so maybe something can be done there. I haven't tested the generated CHMs with the textmode IDE yet either.

One can select PlasTeX by passing USEPLASTEX=1 to the make process.

chmcmd understood options

While the chmcmd program responds to e.g. --help by outputting the command-line options it understands, it doesn't report what options are acceptable in the .hhp file. Refer to the OptionKeys constant in fpcsrc/packages/chm/src/chmfilewriter.pas, and documentation available at e.g. [1].

For anybody determined to "roll their own", an absolutely minimal .hhp file reads something like

[OPTIONS]
Compiled file=test.chm
Contents file=./Default.hhc
[FILES]
./Borg-UI_MainForm_HelpButton.html
./Borg-UM_MainForm_HelpButton.html

with the TOC file being

 <html><head></head><body><object type="text/site properties"></object><ul>
 <li><object type="text/sitemap">
   <param name="Name" value="Borg-UI_MainForm_HelpButton">
   <param name="Local" value="Borg-UI_MainForm_HelpButton.html">
   </object>
 <li><object type="text/sitemap">
   <param name="Name" value="Borg-UM_MainForm_HelpButton">
   <param name="Local" value="Borg-UM_MainForm_HelpButton.html">
   </object>
 </ul></body></html>
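
For more than a handful of pages, both files can be generated. The following is a minimal sketch in Python, limited to the two option keys and the two param names shown above; the helper names make_hhp and make_hhc are invented for this example, and using the file name without extension as the TOC title is an assumption, not something chmcmd requires.

```python
from html import escape


def make_hhp(chm_name, toc_name, html_files):
    """Return the text of a minimal .hhp project file."""
    lines = [
        "[OPTIONS]",
        f"Compiled file={chm_name}",
        f"Contents file={toc_name}",
        "[FILES]",
    ]
    lines.extend(html_files)
    return "\n".join(lines) + "\n"


def make_hhc(html_files):
    """Return a flat TOC with one entry per HTML file."""
    out = ['<html><head></head><body>'
           '<object type="text/site properties"></object><ul>']
    for path in html_files:
        # Assumption: title = file name without directory and extension.
        name = path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
        out.append('<li><object type="text/sitemap">')
        out.append(f'  <param name="Name" value="{escape(name, quote=True)}">')
        out.append(f'  <param name="Local" value="{escape(path, quote=True)}">')
        out.append('  </object>')
    out.append('</ul></body></html>')
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    pages = ["Borg-UI_MainForm_HelpButton.html",
             "Borg-UM_MainForm_HelpButton.html"]
    with open("test.hhp", "w", encoding="utf-8") as f:
        f.write(make_hhp("test.chm", "Default.hhc", pages))
    with open("Default.hhc", "w", encoding="utf-8") as f:
        f.write(make_hhc(pages))
```

The resulting test.hhp can then be handed to chmcmd in the same way as a hand-written project file.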

Troubleshooting

Trouble viewing in Vista

The chm file could be marked as blocked. To unblock, use the chmls tool (included in $(fpcdir)\packages\chm\src\):

chmls unblockchm yourfile.chm

Alternative ways and background

Have a look at this link. We still need to find a way to make the installer do this automatically.

Or turn it off entirely using:

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\HTMLHelp\1.x\HHRestrictions]
 "MaxAllowedZone"=dword:00000001
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\HTMLHelp\1.x\ItssRestrictions]
 "MaxAllowedZone"=dword:00000001

You can add a registry key to repair this issue in XP and Vista by setting the ItssRestrictions.MaxAllowedZone to 3 or 4 or 5.

A simpler solution is to unzip with e.g. infozip, instead of Windows built-in support. (beware, newer infozips might conform at any time!)

Apparently the "blocked" mark is stored in an NTFS alternate data stream of the file: http://stackoverflow.com/questions/1617509/unblock-a-file-with-powershell
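
As an illustration of what unblocking amounts to, here is a hedged Python sketch. It assumes that the mark lives in the stream named Zone.Identifier (the name Windows uses for downloaded files) and that the stream can be deleted through the ordinary file API by appending ":Zone.Identifier" to the path. The function names are made up for this example, and this is not necessarily how chmls implements unblockchm.

```python
import os


def zone_stream_path(path):
    """Path of the alternate data stream that holds the download mark."""
    return path + ":Zone.Identifier"


def unblock(path):
    """Delete the Zone.Identifier stream; return True if one was removed."""
    try:
        os.remove(zone_stream_path(path))
        return True
    except OSError:
        # No such stream, or a file system without alternate data streams.
        return False
```

Calling unblock("yourfile.chm") on an NTFS volume should then have the same visible effect as the "Unblock" button in the file's Properties dialog; on other file systems it simply returns False.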