An HTML parser and browser following functional ideals. (https://ag.eitilt.life/willow/)

# About

I enjoy working in the terminal, but none of the existing pure-text web 
browsers quite provides the full-featured experience I'm looking for: they are 
fine for retro computing, but not as comfortable under current standards of 
development.  
On the graphical side, there are a number of extensions which tack keyboard 
controls onto existing browsers, but the resulting join is necessarily a bit 
ugly (there *are* a few browsers which have been specifically designed around 
the keyboard -- [qutebrowser](https://qutebrowser.org) stands out among them -- 
but they all still have their own pain points in turn).  So, I'm making my own. 
There's no way this will ever be able to compete with the big names[1], but I 
hope to gradually grow, one stable step at a time, and eventually reach 
something I find comfortable to use.

Instructions for writing other packages using these libraries are generated as 
Haddock pages; those can be found in the standard locations (Hackage, the 
`hoogle` CLI, `~/.cabal/store/*/share/doc/`, etc.), and are mirrored to the 
project web page[2].


# Cloning

In order to ensure the modules provide all relevant information, this file has 
been added to the `extra-source-files` list in addition to the README specific 
to that module, but to avoid duplicate patches, the copy/link has not been 
added to Darcs.  If you have obtained this via `cabal get` or a single-module 
tarball, everything should be good.  If instead you have cloned this through 
Darcs or otherwise downloaded the entire repository, you won't have the 
duplicate in the place Cabal expects it to be.  That shouldn't be an issue 
unless you try packaging it up for some reason, but if you do run into trouble:

	ln README.md mangrove/README.project.md
	ln README.md willow/README.project.md


# Contribution

Warnings are only useful to developers; while an end user seeing a warning 
*might* feel inclined to fix the program, chances are it's just going to be 
ignored as the list of built files scrolls by.  Reflecting that, a build flag 
`dev` has been added to all packages to toggle the printing of warnings, and 
potentially other developer aids in the future.  If you decide to help improve 
this code, please enable it:

	cabal configure -fdev

Depending on the size of your hack, you can either send a diff file or bundle 
your complete Darcs patches with `darcs send`.  Either way, attach the changes 
to an email addressed to ag@eitilt.life and I'll see about adding them to the 
codebase.

## Patch format

If you do send the latter, every patch should have a comment with an (at least 
mildly) descriptive name prefixed with a tag indicating the general category 
addressed by the patch: for fixing issues, that is an "i/" followed by the 
issue number (without leading zeros, *unlike* the reference file); for general 
development, "f/" followed by the best topic in the `FEATURES` key; for 
completeness, version tags are "v/" followed by the package reference followed 
by the version.  All are terminated with a final period.
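
As a rough illustration of the issue-tag shape, here is a small shell sketch that checks whether a patch name starts with such a tag.  It reads "terminated with a final period" as applying to the tag itself, which is an assumption; the helper `is_issue_tag` and the sample patch names are hypothetical and not part of the repository.

```shell
# Hypothetical helper: succeed only if a patch name starts with an issue tag
# of the assumed form "i/<number>." where the number has no leading zeros.
is_issue_tag() {
    printf '%s\n' "$1" | grep -Eq '^i/[1-9][0-9]*\.'
}

# The patch names below are invented purely for illustration.
is_issue_tag 'i/12. Handle EOF inside comments' && echo 'ok: i/12'
is_issue_tag 'i/012. Handle EOF inside comments' || echo 'rejected: i/012'
```

The same idea could be extended to the "f/" and "v/" tags, but their exact shape depends on the `FEATURES` key and the package references, so they are left out here.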

Please do not squash patches.  Each patch should represent a minimal but 
complete change: certainly enough context to successfully compile, hopefully 
enough context to not break any previously-passing tests, and potentially 
combining closely-related changes at your discretion, but if you wind up trying 
to decide between multiple tags or you're adding all of your work over the 
entire day, you should probably look closely at whether you can break the patch 
apart any further.

Additionally, make good use of the `--ask-deps` flag.  Until I get a CI 
integration together, it's easy to miss a dependency, but do your best to 
select anything which your patch may require to build successfully.


## External test suites

Several test suites have already been written against the HTML specification 
(including some spillover into other web technology), and I have taken 
advantage of that previous work to ensure the correctness of the modules here.  
Rather than duplicate the storage and have to worry about keeping my mirror 
up-to-date, I've instead just set up the tests to look for the repositories in 
specific locations, and rely on the programmer to clone them manually.

If possible, these should be saved in the primary data folders; however, as the 
test data is shared between multiple packages, a naïve setup will take up a lot 
of unnecessary space.  If you're only building a single package, then you're 
already good and can just use the `data/` folder in this same directory.  If 
you do have the full repo, Haskell can easily handle symlinks, so I recommend 
cloning the data to a single location and then linking to it from the other 
package(s); the top-level directory has been used in the examples, but you can 
also just choose, e.g., `willow/data/` and link the other packages to that.
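
One way to set up that sharing is sketched below.  It assumes the packages sit in directories named after them at the repository root, and that each package looks for its suites under its own `data/test/`; the helper `link_suite` is hypothetical.  The one subtlety it tries to show is that a relative symlink target is resolved from the directory containing the link, not from the directory where `ln` was run.

```shell
# Hypothetical helper: link one shared test suite into each named package.
# Usage: link_suite SUITE PACKAGE...   (run from the repository root)
link_suite() {
    suite=$1
    shift
    for pkg in "$@"; do
        mkdir -p "$pkg/data/test"
        # A relative symlink target is resolved from the directory holding
        # the link ($pkg/data/test/), so climb three levels to the root.
        ln -s "../../../data/test/$suite" "$pkg/data/test/$suite"
    done
}
```

With a suite already cloned to `data/test/html5lib-tests`, for instance, running `link_suite html5lib-tests mangrove willow` from the repository root would make it visible to both packages without copying it.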

### html5lib

The supplemental html5lib conformance suite may be enabled via the Cabal flag 
`html5lib`.  Before doing so, however, the test data must be downloaded to 
`data/test/html5lib-tests`:

	git clone 'https://github.com/html5lib/html5lib-tests' data/test/html5lib-tests

And, unless you only downloaded a single module:

	ln -s ../../../data/test/html5lib-tests mangrove/data/test/
	ln -s ../../../data/test/html5lib-tests willow/data/test/

Followed either way by:

	cabal configure --enable-tests -fhtml5lib
	cabal v2-test mangrove:html5lib willow:html5lib

### wpt

Likewise, most of the web-platform-tests conformance suite (Cabal flag `wpt`) 
requires the directory `data/test/wpt` to be populated:

	git clone 'https://github.com/web-platform-tests/wpt' data/test/wpt

	ln -s ../../../data/test/wpt willow/data/test/

	cabal configure --enable-tests -fwpt
	cabal v2-test willow:wpt


# Footnotes

[1]: https://drewdevault.com/2020/03/18/Reckless-limitless-scope.html
[2]: https://ag.eitilt.life/willow/