An HTML parser and browser following functional ideals. (https://ag.eitilt.life/willow/)

About

I enjoy working in the terminal, but none of the existing pure-text web browsers quite provides the full-featured experience I'm looking for: they are quite good for retro computing, but not as comfortable when measured against current standards of development.
On the graphical side, there are a number of extensions which tack keyboard controls onto existing browsers, but the resulting join is necessarily a bit ugly. (A few browsers have been specifically designed around the keyboard, and qutebrowser stands out among them, but they all still have their own pain points in turn.) So, I'm making my own. There's no way this will ever be able to compete with the big names, but I hope to grow it gradually, one stable step at a time, and eventually reach something I find comfortable to use.

Instructions for writing other packages using these libraries are generated as Haddock pages; those can be found in the standard locations (Hackage, the hoogle CLI, ~/.cabal/store/*/share/doc/, etc.), and are mirrored to the project web page.

Cloning

In order to ensure the modules provide all relevant information, this file has been added to the extra-source-files list in addition to the README specific to that module, but to avoid duplicate patches, the copy/link has not been added to Darcs. If you have obtained this via cabal get or a single-module tarball, everything should be good. If instead you have cloned this through Darcs or otherwise downloaded the entire repository, you won't have the duplicate in the place Cabal expects it to be. That shouldn't be an issue unless you try packaging it up for some reason, but if you do run into trouble:

ln README.md mangrove/README.project.md
ln README.md willow/README.project.md
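
If you would rather not track whether the links already exist, the two commands above can be wrapped in a small idempotent helper. This is only a sketch: the function name link_project_readme and its optional root-directory argument are made up for illustration, and it assumes it is pointed at the top of the repository.

```shell
# Hypothetical helper (not part of the repository): hard-link the top-level
# README into each package directory, skipping any copy that already exists.
link_project_readme() {
    root="${1:-.}"    # repository root; defaults to the current directory
    for pkg in mangrove willow; do
        if [ ! -e "$root/$pkg/README.project.md" ]; then
            ln "$root/README.md" "$root/$pkg/README.project.md" || return 1
        fi
    done
}
```

Run from the repository root, a bare link_project_readme should be enough; running it a second time changes nothing.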

Contribution

Warnings are only useful to developers; while an end user seeing a warning might feel inclined to fix the program, chances are it's just going to be ignored as the list of built files scrolls by. Reflecting that, a build flag dev has been added to all packages to toggle the printing of warnings, and potentially other developer aids in the future. If you decide to help improve this code, please enable it:

cabal configure -fdev

Depending on the size of your hack, you can send either a diff file or your complete Darcs patches bundled with darcs send. Either way, attach the changes to an email addressed to ag@eitilt.life and I'll see about adding them to the codebase.

Patch format

If you do send the latter, every patch should have a comment with an (at least mildly) descriptive name, prefixed with a tag indicating the general category the patch addresses. For fixing issues, that is "i/" followed by the issue number (without leading zeros, unlike the reference file); for general development, "f/" followed by the best-matching topic in the FEATURES key; and, for completeness, version tags are "v/" followed by the package reference and then the version. All are terminated with a final period.

Please do not squash patches. Each patch should represent a minimal but complete change: certainly enough context to compile successfully, hopefully enough context to not break any previously-passing tests, and potentially combining closely-related changes at your discretion. If you wind up trying to decide between multiple tags, or you're recording all of your work from the entire day in one patch, you should probably look closely at whether you can break the patch apart any further.

Additionally, make good use of the --ask-deps flag. Until I get a CI integration together, it's easy to miss a dependency, but do your best to select anything which your patch may require to build successfully.
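
As a rough aid for the naming rules above, a patch name can be sanity-checked before recording. The function below is a hypothetical sketch, not something shipped with the repository. It only checks the tag prefix and the final period, and it assumes that the period terminates the whole name rather than just the tag; it does not verify that an issue number or FEATURES topic actually exists.

```shell
# Hypothetical check (not part of the repository): succeed only if the patch
# name starts with one of the documented tags and ends with a final period.
check_patch_name() {
    case "$1" in
        i/[1-9]*.) return 0 ;;   # issue tag; the number may not start with a zero
        f/?*.)     return 0 ;;   # feature tag; some topic must follow "f/"
        v/?*.)     return 0 ;;   # version tag; package reference and version follow "v/"
        *)         return 1 ;;   # untagged, or missing the final period
    esac
}
```

For instance, a made-up name such as "i/12 Handle unterminated comments." would pass, while "i/012 Handle unterminated comments." (leading zero) or the same name without its period would be rejected. The space between tag and description in these names is an assumption, since the rules above do not specify a separator.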

External test suites

Several test suites have already been written against the HTML specification (including some spillover into other web technology), and I have taken advantage of that previous work to ensure the correctness of the modules here.
Rather than duplicate the storage and have to worry about keeping my mirror up-to-date, I've instead just set up the tests to look for the repositories in specific locations, and rely on the programmer to clone them manually.

If possible, these should be saved in the primary data folders; however, as the test data is shared between multiple packages, a naïve setup will take up a lot of unnecessary space. If you're only building a single package, then you're already good and can just use the data/ folder in this same directory. If you do have the full repo, Haskell can easily handle symlinks, so I recommend cloning the data to a single location and then linking to it from the other package(s); the top-level directory has been used in the examples, but you can also just choose, e.g., willow/data/ and link the other packages to that.
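
The clone-once-and-link layout described above can be sketched as a small helper. The function name link_test_data and its argument order are invented for illustration, and it assumes each package looks for its data under its own data/test/ directory. It uses absolute link targets so that the links resolve no matter which directory they are created from, at the cost of breaking if the repository is later moved.

```shell
# Hypothetical helper (not part of the repository): link one shared test-data
# directory from ROOT/data/test/ into each listed package.
# Usage: link_test_data ROOT SUITE PACKAGE...
link_test_data() {
    root="$(cd "$1" && pwd)" || return 1   # absolute path to the repository root
    suite="$2"
    shift 2
    for pkg in "$@"; do
        link="$root/$pkg/data/test/$suite"
        mkdir -p "$root/$pkg/data/test" || return 1
        # Leave any existing link or directory untouched.
        if [ ! -L "$link" ] && [ ! -e "$link" ]; then
            ln -s "$root/data/test/$suite" "$link" || return 1
        fi
    done
}
```

Under those assumptions, link_test_data . html5lib-tests mangrove willow would share a single clone between both packages.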

html5lib

The supplemental html5lib conformance suite may be enabled via the Cabal flag html5lib. Before doing so, however, the test data must be downloaded to data/test/html5lib-tests:

git clone 'https://github.com/html5lib/html5lib-tests' data/test/html5lib-tests

And, unless you only downloaded a single module:

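# A relative link target is resolved from the link's own directory, hence the ../../../
mkdir -p mangrove/data/test willow/data/test
ln -s ../../../data/test/html5lib-tests mangrove/data/test/html5lib-tests
ln -s ../../../data/test/html5lib-tests willow/data/test/html5lib-tests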

Followed either way by:

cabal configure --enable-tests -fhtml5lib
cabal v2-test mangrove:html5lib willow:html5lib

wpt

Likewise, most of the web-platform-tests conformance suite (Cabal flag wpt) requires the directory data/test/wpt to be populated:

git clone 'https://github.com/web-platform-tests/wpt' data/test/wpt

mkdir -p willow/data/test
ln -s ../../../data/test/wpt willow/data/test/wpt

cabal configure --enable-tests -fwpt
cabal v2-test willow:wpt
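
Because the suites are cloned by hand, it can be worth confirming that a package can actually see its data before enabling a flag. The check below is a hypothetical sketch (the name have_test_data is invented), and it assumes the data lives under the package's data/test/ directory as in the commands above.

```shell
# Hypothetical pre-flight check (not part of the repository): report whether a
# package directory can see the named test suite.
# Usage: have_test_data PACKAGE_DIR SUITE
have_test_data() {
    dir="$1/data/test/$2"
    # -d follows symlinks, so a dangling link counts as missing;
    # an empty directory (for example, an aborted clone) counts as missing too.
    if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "found: $dir"
    else
        echo "missing: $dir" >&2
        return 1
    fi
}
```

For example, have_test_data willow wpt && cabal configure --enable-tests -fwpt would only configure the wpt suite when its data is in place.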
