Difference between revisions of "chm backend for fpdoc"

From Free Pascal wiki

Revision as of 15:21, 11 June 2010

Examples

All of the following information about the TOC and Index files in CHMs comes from this link: http://www.nongnu.org/chmspec/latest/Sitemap.html

These formats are based on HTML and use the following doctype:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">

The <HEAD> tag contains a <meta> tag providing information on the program that generated the files and a comment indicating the version of the file. e.g.:

<meta name="GENERATOR" content="Microsoft® HTML Help Workshop 4.1">

The <BODY> tag contains an <OBJECT> tag that stores properties of the file in <param> tags, followed by a <UL> tag, whose <LI> tags have <OBJECT> tags that store the properties of the Contents/Index items in <param> tags. e.g.:

<BODY>
<OBJECT type="text/site properties">
    <param name="Property Name" value="Property Value">
   …
</OBJECT>
<UL>
    <LI> <OBJECT type="text/sitemap">
        <param name="Property Name" value="Property Value">
        …
        </OBJECT>
   …
</UL>
</BODY>

Note that the Property Names, Property Values and tags are not case-sensitive, but HHW will always write all three in the default capitalization, when appropriate.

Note that the tags are mostly in uppercase and the <LI> tag is not closed; this is in compliance with the doctype.
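The layout above can be sketched as a small generator. This is only an illustration in Python, not the code fpdoc or the chm package uses; the function names (render_body, render_items) and the tuple layout for items are assumptions made for this example, and only double quotes, &, < and > are escaped in attribute values.

```python
from html import escape


def attr(text):
    """Escape a string for use inside a double-quoted attribute value."""
    return escape(text, quote=False).replace('"', "&quot;")


def param(name, value, indent):
    """Render one <param> tag in the capitalization shown above."""
    return f'{indent}<param name="{attr(name)}" value="{attr(value)}">'


def render_items(items, depth=1):
    """Render (name, local, children) tuples as a <UL> of unclosed <LI> tags."""
    outer = "    " * (depth - 1)
    pad = "    " * depth
    lines = [outer + "<UL>"]
    for name, local, children in items:
        lines.append(f'{pad}<LI> <OBJECT type="text/sitemap">')
        lines.append(param("Name", name, pad + "    "))
        lines.append(param("Local", local, pad + "    "))
        lines.append(pad + "    </OBJECT>")
        if children:
            # Subitems go into a nested <UL> directly after the parent item.
            lines.extend(render_items(children, depth + 1))
    lines.append(outer + "</UL>")
    return lines


def render_body(properties, items):
    """Render the <BODY> part: site properties first, then the item list."""
    lines = ["<BODY>", '<OBJECT type="text/site properties">']
    for name, value in properties:
        lines.append(param(name, value, "    "))
    lines.append("</OBJECT>")
    lines.extend(render_items(items))
    lines.append("</BODY>")
    return "\n".join(lines)


if __name__ == "__main__":
    toc = [("Reference for package 'rtl'", "rtl/index.html",
            [("Units", "rtl/index.html", [])])]
    print(render_body([("Auto Generated", "Yes")], toc))
```

For an Index (.hhk) file the same function could be used with empty children lists, since index entries have no subitems.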

Some properties that were seen in HHA.dll that may or may not be used are Background Image, NumberImages, InformationTypeDecl, Secondary, Icon, Display, Keyword, Instruction, Section Title, Favorites, QueryType, SendEvent, SendMessage, HHI, Inclusive & Exclusive.

.hhk files are in the same format, but are for the Index pane and do not have subitems. This is the beginning chunk of an autogenerated .hhc (TOC) file for the rtl (the autogenerated TOC stinks):

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
<HEAD>
<meta name="GENERATOR" content="Microsoft® HTML Help Workshop 4.1">
<!-- Sitemap 1.0 -->
</HEAD><BODY>
<OBJECT type="text/site properties">
    <param name="Auto Generated" value="Yes">
</OBJECT>
<UL>
    <LI> <OBJECT type="text/sitemap">
        <param name="Name" value="Reference for package 'rtl'">
        <param name="Local" value="rtl/index.html">
        </OBJECT>
    <UL>
        <LI> <OBJECT type="text/sitemap">
            <param name="Name" value="Units">
            <param name="Local" value="rtl/index.html">
            </OBJECT>
        <LI> <OBJECT type="text/sitemap">
            <param name="Name" value="Description">
            <param name="Local" value="rtl/index.html">
            </OBJECT>
    </UL>
    <LI> <OBJECT type="text/sitemap">
        <param name="Name" value="Reference for unit 'BaseUnix'">
        <param name="Local" value="rtl/baseunix/index.html">
        </OBJECT>
    <UL>
        <LI> <OBJECT type="text/sitemap">
            <param name="Name" value="Overview">
            <param name="Local" value="rtl/baseunix/index.html">
            </OBJECT>
    </UL>

......
</UL>
</BODY></HTML>
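Reading such a file back is also simple, because the format is plain (if sloppy) HTML. The sketch below uses Python's standard html.parser, which copes with the unclosed <LI> tags; it is an illustration only, not how the chm package or the IDE reads these files. The names SitemapParser and parse_sitemap are made up for this example.

```python
from html.parser import HTMLParser


class SitemapParser(HTMLParser):
    """Collect (depth, Name, Local) for every text/sitemap object."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # current <UL> nesting level
        self.current = None   # params of the sitemap object being read
        self.entries = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "ul":
            self.depth += 1
        elif tag == "object":
            if (attrs.get("type") or "").lower() == "text/sitemap":
                self.current = {}
        elif tag == "param" and self.current is not None:
            # Property names are not case-sensitive, so normalize them.
            key = (attrs.get("name") or "").lower()
            self.current[key] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "ul":
            self.depth -= 1
        elif tag == "object" and self.current is not None:
            self.entries.append((self.depth,
                                 self.current.get("name", ""),
                                 self.current.get("local", "")))
            self.current = None


def parse_sitemap(text):
    """Return a flat list of (depth, name, local) tuples in document order."""
    parser = SitemapParser()
    parser.feed(text)
    parser.close()
    return parser.entries


SAMPLE = """<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML><HEAD><!-- Sitemap 1.0 --></HEAD><BODY>
<OBJECT type="text/site properties">
    <param name="Auto Generated" value="Yes">
</OBJECT>
<UL>
    <LI> <OBJECT type="text/sitemap">
        <param name="Name" value="Reference for package 'rtl'">
        <param name="Local" value="rtl/index.html">
        </OBJECT>
    <UL>
        <LI> <OBJECT type="text/sitemap">
            <param name="Name" value="Units">
            <param name="Local" value="rtl/index.html">
            </OBJECT>
    </UL>
</UL>
</BODY></HTML>"""

if __name__ == "__main__":
    for depth, name, local in parse_sitemap(SAMPLE):
        print("  " * (depth - 1) + name, "->", local)
```

The depth value is enough to rebuild the tree for a .hhc file; for a .hhk file every entry should come out at depth 1.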

Generating FPC CHM docs via fpdoc

At the moment (October 2008), the fpdoc part of the documentation (the part that documents the various units) can be compiled to CHM fairly easily. Running

make clean html HTMLFMT=chm CSSFILE=/path/to/fpdoc.css

will generate rtl.chm and fcl.chm. Note that the CSS file is currently in a different repository (I used fpc/utils/fpdoc/fpdoc.css since it has the correct name). A script, fixdocs.sh, automates most of the CHM-related options (*nix only, requires latex + tex4ht).

Due to fixes to the CHM, XML and makefile backends, it is advised to use sources (fpcdocs, chm, fcl-xml and fpdoc) from the 2.3.1 branch, dated July 31st 2009 or newer.

Remaining problems

  1. The textmode IDE crashes on loading the LCL.chm's index on all architectures and OSes. Other CHM viewers don't seem to suffer from this, but there is always a chance it is CHM pkg related. (Provisionally fixed)
  2. The prog and user documents don't have an index. The ref guide only has keywords.
  3. Lazarus native CHM support on Windows (using the Windows chm viewer instead of Lview)
  4. FCL-XML generates UTF-8, which some CHM browsers (kchmviewer) don't seem to handle properly. FCL-XML support for other encodings? ISO8859-1 default? KCHMViewer seems to work fine on Windows.
  5. Most of the Unix viewers are horribly slow on the larger CHMs, especially the Lazarus one. Not really our problem, but maybe some bug reports would help. (KCHMViewer has improved in newer versions)
  6. Compiling the LCL docs finds a lot of unknown identifiers. Should these be documented?
  7. Crosslinking between user, ref, prog and fpdoc?

And the major one:

  1. How to combine the indexes into a master index? Especially duplicate handling. Currently additional disambiguating context is only added to the index if a dupe _within_ a CHM is detected.
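One possible approach to the duplicate problem, sketched in Python purely as an idea: apply the same "add context only for dupes" rule across CHMs, using the CHM name as the extra context. This is not something fpdoc or the chm package does today; the function name merge_indexes, the input layout and the file paths in the example are all assumptions.

```python
from collections import defaultdict


def merge_indexes(indexes):
    """Merge per-CHM indexes into one sorted master index.

    indexes maps a CHM name (e.g. "rtl") to a list of (keyword, local) pairs.
    Returns (label, chm, local) tuples; the label only gets the CHM name
    appended when the keyword occurs in more than one CHM.
    """
    # First pass: record which CHMs define each keyword (case-insensitive).
    owners = defaultdict(set)
    for chm, entries in indexes.items():
        for keyword, _local in entries:
            owners[keyword.lower()].add(chm)

    # Second pass: build the labels, disambiguating cross-CHM duplicates.
    merged = []
    for chm, entries in indexes.items():
        for keyword, local in entries:
            label = keyword
            if len(owners[keyword.lower()]) > 1:
                label = f"{keyword} ({chm})"
            merged.append((label, chm, local))

    return sorted(merged, key=lambda entry: entry[0].lower())


if __name__ == "__main__":
    # Hypothetical entries; the paths are made up for the example.
    demo = {
        "rtl": [("Assign", "rtl/system/assign.html"),
                ("BaseUnix", "rtl/baseunix/index.html")],
        "fcl": [("Assign", "fcl/classes/tpersistent.assign.html")],
    }
    for label, chm, local in merge_indexes(demo):
        print(label, "->", chm, local)
```

Duplicates within a single CHM are left alone here, on the assumption that the existing per-CHM disambiguation has already dealt with them.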

Generating LCL CHM via fpdoc

(I only did this on *nix)

Warning: this script enables all fpdoc bells and whistles, and before 28 Feb it took 40 minutes to complete on a Core2 6600. After the cleanup it still takes 4 minutes and 400MB of memory. If you want to do this in a nightly build, make sure you have an fpdoc from trunk, preferably from a checkout that was compiled with optimization on. That knocks off another half minute (12.5%).

 export HTMLFMT=chm
 cd lazarus/docs/html
 sh build_lcl_html.sh "fpdoc" "" /path/to/fpcdocs

The generated .xct file is an index file for fpdoc cross file links, and is of no use to the user, unless you want to link to the chm. That is also why we have to specify the fpcdocs archive. If we have previously built the FPC docs there, lcl will use this to craft cross CHM links.

Generating FPC CHM docs for the latex documentation

Building CHMs for the documentation that is not in fpdoc format (but normal latex files) is harder. The process can fairly easily be separated into two stages:

  1. generate HTML from latex sources
  2. compile HTML to CHM.
    1. raw compile
    2. generating indexes and TOC

Generating html

The first is currently the hard part. Over the years several converters have been tried; the one currently used is tex4ht. While the theory is easy (install latex, install tex4ht, run make html), the practice is difficult (*), and the generated docs are usually corrupt (ligatures converted to pics or unicode escapes, bad "next" links etc). Luckily, these docs change less often, so one can probably use a generated set of CHMs for the entire duration of a release cycle. Still, it would be great to have a definitive, reproducible solution.

For the moment, the "bad" output of tex4ht is fixed by a DOM based (FCL-XML) FPC tool, relinkdocs. Relinkdocs scans the HTML docs and relinks the next/up/prev links.

(*) to the degree that the version on the FPC front page is more often broken than not.

compiling and TOC generation

To compile the resulting html to CHM, another tool was written, called compilelatexchm, which also relies on FCL-XML for the TOC scanning. The CHM is searchable, but does not yet have an index. (I don't know a suitable algorithm to come up with the keywords)

A shell script that automates these steps (fixdocs.sh) has been added to the fpcdocs repository, but it is not idiot proof yet. If you have trouble, ask me (Marco/oliebol) on IRC.

Plastex

A week after I did the above fixing of tex4ht output, Andrew Haines came up with a patch that supports using PlasTeX as the tex2chm converter. I don't entirely like the output (chapter and paragraph numbers are dropped, which makes it harder to refer to specific parts of the documentation, and the template is a bit playful), but I haven't looked into customizing it at all, so maybe something can be done there. I haven't tested the generated CHMs with the textmode IDE yet either.

One can select PlasTeX by passing USEPLASTEX=1 to the make process.

Troubleshooting

Trouble viewing in Vista

Have a look at this link. We still must find something that makes the installer do this automatically.

Or turn it off entirely using:

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\HTMLHelp\1.x\HHRestrictions]
 "MaxAllowedZone"=dword:00000001
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\HTMLHelp\1.x\ItssRestrictions]
 "MaxAllowedZone"=dword:00000001

You can add a registry key to repair this issue in XP and Vista by setting the ItssRestrictions.MaxAllowedZone to 3 or 4 or 5.

A simpler solution is to unzip with e.g. infozip, instead of Windows built-in support. (beware, newer infozips might conform at any time!)

(added later)

Apparently this is caused by an alternate data stream attached to the file: http://stackoverflow.com/questions/1617509/unblock-a-file-with-powershell

I made a small program that wraps this, and committed it as "chmunblock" util to chm.
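For illustration, the idea behind unblocking can be sketched in a few lines of Python. This is not the chmunblock source, just a sketch under the assumption that the block is stored in the NTFS alternate data stream named Zone.Identifier (the one discussed in the link above) and that deleting that stream is enough. The function names are made up for this example.

```python
import os

# Name of the alternate data stream that Windows attaches to downloaded files.
ZONE_STREAM = "Zone.Identifier"


def zone_stream_path(path):
    """Return the NTFS 'file:stream' path of the zone marker for a file."""
    return f"{path}:{ZONE_STREAM}"


def unblock(path):
    """Delete the zone marker of a file, leaving the file itself untouched.

    Returns True if a marker was removed and False if there was none
    (which is always the case on file systems without alternate streams).
    """
    try:
        os.remove(zone_stream_path(path))
        return True
    except FileNotFoundError:
        return False


if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        print(name, "unblocked" if unblock(name) else "had no zone marker")
```

Running it on the extracted .chm files after unzipping should have the same effect as pressing the "Unblock" button in the file properties dialog, without touching the registry.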