Dynamic HTML Conversion to WML

(Generic Document Conversion)

Project Description

This library dynamically translates HTML pages into WML (Wireless Markup Language, the markup used with WAP) pages. WML is intended for handheld devices and differs from HTML in ways that raise a number of conversion issues. The library provides a simple C++ API that reads an HTML file and generates an equivalent WML file. If the HTML file is longer than the acceptable WML length, the converter splits the output into multiple WML files and links them together with hyperlinks.
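
The splitting step can be pictured with a minimal sketch. This is not the library's code: the function names splitText and buildDecks, the "More" link label, and the <base><n>.wml file naming are all assumptions made for illustration, and the XML prolog and DOCTYPE that a real WML deck needs are left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Split text into chunks of at most maxLen characters, preferring to break
// at a space so that words stay intact (a single word longer than maxLen is
// still cut). Returns no chunks if maxLen is 0.
std::vector<std::string> splitText(const std::string &text, std::size_t maxLen) {
    std::vector<std::string> chunks;
    if (maxLen == 0) return chunks;
    std::size_t pos = 0;
    while (pos < text.size()) {
        std::size_t len = std::min(maxLen, text.size() - pos);
        if (pos + len < text.size()) {
            // More text follows: back up to the last space that fits.
            std::size_t space = text.rfind(' ', pos + len);
            if (space != std::string::npos && space > pos) len = space - pos;
        }
        chunks.push_back(text.substr(pos, len));
        pos += len;
        while (pos < text.size() && text[pos] == ' ') ++pos;  // skip the break
    }
    return chunks;
}

// Wrap each chunk in a bare-bones WML deck. Every deck except the last ends
// with a link to the next file. The "<base><n>.wml" naming is hypothetical.
std::vector<std::string> buildDecks(const std::vector<std::string> &chunks,
                                    const std::string &base) {
    std::vector<std::string> decks;
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        std::string deck = "<wml><card><p>" + chunks[i];
        if (i + 1 < chunks.size()) {
            deck += " <a href=\"" + base + std::to_string(i + 1) + ".wml\">More</a>";
        }
        deck += "</p></card></wml>";
        decks.push_back(deck);
    }
    return decks;
}
```

In this sketch the chunks are assumed to be already escaped for WML, and deck i would be written to <base>i.wml so that each "More" link resolves to the following file.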

Library Details

The primary API is a C++ class called DocumentConverter.  It is instantiated with an implementation of an interface called ConversionRules, which tells DocumentConverter what to do when it encounters a token in the HTML stream.  For example, a ConversionRules implementation can specify what to do when the start element <head> is encountered.  For converting an HTML document to WML, an implementation of ConversionRules called WMLConversionRules is provided.  This class embodies the rules for converting HTML documents to WML documents: it converts each HTML tag to its closest equivalent in WML, replaces control characters in the text with the appropriate escape sequences, and so on.  The DocumentConverter class itself is generic and is meant to be able to convert an HTML document to any markup language, given the appropriate conversion rules.
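
The split between a generic converter and pluggable rules can be sketched as follows. The real ConversionRules method names and signatures are not shown here, so SketchRules, SketchWmlRules, and the tag table below are hypothetical stand-ins rather than the library's API. The text escaping reflects two facts about WML: it is an XML application, and '$' introduces a variable reference, so a literal dollar sign is written as '$$'.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the ConversionRules interface: one hook per
// kind of token the converter sees in the HTML stream.
class SketchRules {
public:
    virtual ~SketchRules() {}
    virtual std::string startTag(const std::string &htmlName) const = 0;
    virtual std::string endTag(const std::string &htmlName) const = 0;
    virtual std::string text(const std::string &raw) const = 0;
};

// Illustrative WML rules: a tiny tag table plus text escaping.
class SketchWmlRules : public SketchRules {
public:
    SketchWmlRules() {
        tags_["p"] = "p";
        tags_["b"] = "b";
        tags_["i"] = "i";
        tags_["h1"] = "big";  // assumed mapping; WML has no heading elements
    }
    // Tags without a mapping (for example <script>) produce no output.
    std::string startTag(const std::string &htmlName) const override {
        auto it = tags_.find(htmlName);
        return it == tags_.end() ? std::string() : "<" + it->second + ">";
    }
    std::string endTag(const std::string &htmlName) const override {
        auto it = tags_.find(htmlName);
        return it == tags_.end() ? std::string() : "</" + it->second + ">";
    }
    // Escape the characters that are special in WML text.
    std::string text(const std::string &raw) const override {
        std::string out;
        for (std::size_t i = 0; i < raw.size(); ++i) {
            switch (raw[i]) {
                case '&': out += "&amp;"; break;
                case '<': out += "&lt;"; break;
                case '>': out += "&gt;"; break;
                case '"': out += "&quot;"; break;
                case '$': out += "$$"; break;
                default:  out += raw[i];
            }
        }
        return out;
    }
private:
    std::map<std::string, std::string> tags_;
};
```

A converter written against SketchRules never needs to know that the target is WML; supporting another markup language would mean supplying a different rules class, which is the design the paragraph above describes.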

The DocumentConverter class is implemented as a callback for the SP SGML parser (see http://www.jclark.com/sp).  When you start the SGML parser on your document, you supply a DocumentConverter object as the implementation of SP's SGMLApplication class.  The sample program provided with this converter shows how to do this:

// fname names the input document to parse.
ParserEventGeneratorKit *parserKit = new ParserEventGeneratorKit();
EventGenerator *egp = parserKit->makeEventGenerator(1, (char **)&fname);
// Plug the WML rules into the generic converter.
WMLConversionRules *wmlConversionRules = new WMLConversionRules();
DocumentConverter *converter = new DocumentConverter(wmlConversionRules);
//
// Start the parsing... don't display parse error messages on the screen.
//
egp->inhibitMessages(true);
egp->run(*converter);

Finally, to write out the converted document, invoke the print() method on the DocumentConverter object:

converter->print(fileName);

where fileName is the name of the output WML file to create (see the documentation for what happens when multiple output files need to be created due to splitting).

Testing

After un-tarring the tarball, just type make (only tested on a Linux box so far).

There is a test program (DynamicWML.cc) that uses this converter.  It simulates a web server, listening for HTTP GET requests.  The syntax for the GET request is 'GET /url_to_convert HTTP/1.0', where url_to_convert can be any WWW document.  For example, if you issue the GET command 'GET /www.cnn.com HTTP/1.0', it will dynamically convert www.cnn.com to WML and return the WML content.  The best way to test this is to use the Phone.com UP.simulator and type in the URL 'http://www.yourserver.com:8080/www.cnn.com'; the phone will then render a WML page of CNN (substitute your web server for www.yourserver.com, and the port you compiled the program with for 8080).
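
The request syntax above can be illustrated with a small parsing sketch. This is not taken from DynamicWML.cc; the function name targetFromRequestLine and its behaviour on malformed input are assumptions.

```cpp
#include <sstream>
#include <string>

// Extract the document to convert from a request line such as
// "GET /www.cnn.com HTTP/1.0". Returns an empty string if the line is not a
// GET request or if nothing follows the leading '/'.
std::string targetFromRequestLine(const std::string &line) {
    std::istringstream in(line);
    std::string method, path, version;
    if (!(in >> method >> path >> version)) return std::string();
    if (method != "GET" || path.size() < 2 || path[0] != '/') return std::string();
    return path.substr(1);  // drop the leading '/'
}
```

Because the fields are read as whitespace-separated words, a trailing CR LF on the request line does not end up in the parsed version field.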

You can start the test program with ./dynamicwml IP_ADDRESS, where IP_ADDRESS is the IP address of the machine on which you run the program.

Bugs

  1. Does not yet convert images (JPEG and GIF) to WBMP.
  2. Does not yet incorporate the hyperlinks in the master document into the generated WML page.

Downloads

Current Release (Alpha 0.2): html2wml-alpha-0.2.tar.gz
Previous Releases (Alpha 0.1): html2wml.tar.gz
Other Releases (Mailer tool for the conversion server): html2wml.mail.tar.gz
Other Releases (WAP browser for the RIM pager): rim.browser.zip

Contacts

x_coder@hotmail.com