Excuse me if I go off on a technical rant for a moment. I find it very irritating when people don’t use HTML mark-up properly. I can forgive the occasional user, or those relying on WYSIWYG editors, but for large, professionally coded websites, there is no excuse for mark-up which does not apply standards correctly.
What has vexed me so? The Houses of Parliament website. In many ways this is a great resource. They offer video of parliamentary debates, and the Hansard of the previous day’s proceedings is posted promptly the following morning. However, the underlying mark-up is flawed.
I am busy this morning reading a House of Lords debate on the Defamation Bill. The underlying markup for part of a speech given by Lord Browne of Ladyton is rendered like this:
<a name="121217-gc0001.htm_para7"></a>
<p><a name="1212173000011"></a>The amendments have also been tabled against the background of the history of this process. The draft Bill had a different test. The Joint Committee recommended yet another test. The tests are similar to each other, and the Government chose a third one. There is an argument that the process has confused rather than clarified the position. I refer to the first sitting of the Committee on 19 June 2012, where Karl Turner, the MP for Kingston-upon-Hull, rose to support an amendment similar to those before your Lordships today. He started off by setting out his agreement with the underlying principle behind the existing clause. He said he was,</p>
<ul><a name="1212173000072"></a>"searching for clarity in the face of some possible confusion",</ul>
<p><a name="1212173000058"></a>and he set out broadly the argument that I have sought to set out.</p>
There are some good things about this attempt to render the text. Each paragraph in the Hansard has its own anchor with a unique name attribute. This means we can link to specific passages in what can be quite verbose and detailed speeches.
However, HTML has many useful features, over and above the ability to create hyperlinks. One of these is the blockquote element. This tag allows you to wrap a quoted piece of text in a dedicated element, which in turn allows you to style the text in the way you wish (for example, with an indentation or italics). However, the aesthetic flexibility it offers is only one aspect of why the tag is useful. Through its cite attribute, the tag can also record the address of the source being quoted, which allows another form of cross-referencing.
On the parliament website, there is an unforgivable error. The website team have chosen to render quotations using the <ul> tag instead, which stands for an ‘unordered list’ (i.e., a bullet point list, not a numbered list). They have done this because a <ul> tag renders an indent, which is the visual formatting that the transcribers desire.
This is blinkered thinking. The whole point of digitising the text is to make it more accessible, and to reveal the cross-references between different documents. This is crucial when dealing with legislation and parliamentary oratory, where both speakers and readers should be mindful of the citations and references within. Moreover, by incorrectly tagging a piece of text, the transcribers cause confusion for search engines, automated indexers and analysers, and also anyone using assistive technologies like screen readers or Braille readers. In all these cases, the machine needs to be told that a piece of text is a citation or quote, not a list item. This information cannot be conveyed visually, because to a machine the aesthetics of the typography are irrelevant.
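To illustrate how a machine sees this mark-up, here is a minimal sketch in Python, using only the standard library's html.parser module. It flags any <ul> that contains no <li> items, which is a reasonable hint that the list tag is being used purely for its indent. The class and function names (ListMisuseFinder, find_listless_uls) are my own inventions for illustration, and this is a rough heuristic rather than a proper validator.

```python
from html.parser import HTMLParser


class ListMisuseFinder(HTMLParser):
    """Collect the text of <ul> elements that contain no <li> children.

    Hypothetical helper for illustration; not part of any Parliament tooling.
    """

    def __init__(self):
        super().__init__()
        self._stack = []     # one entry per currently open <ul>
        self.suspects = []   # text of each <ul> that closed without an <li>

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self._stack.append({"has_li": False, "text": []})
        elif tag == "li" and self._stack:
            # Mark the innermost open list as a genuine list.
            self._stack[-1]["has_li"] = True

    def handle_data(self, data):
        if self._stack:
            self._stack[-1]["text"].append(data)

    def handle_endtag(self, tag):
        if tag == "ul" and self._stack:
            entry = self._stack.pop()
            if not entry["has_li"]:
                self.suspects.append("".join(entry["text"]).strip())


def find_listless_uls(html):
    """Return the text content of every <ul> that has no list items."""
    finder = ListMisuseFinder()
    finder.feed(html)
    finder.close()
    return finder.suspects


# The quotation exactly as the Hansard page marks it up.
sample = (
    '<ul><a name="1212173000072"></a>'
    '"searching for clarity in the face of some possible confusion",</ul>'
)
print(find_listless_uls(sample))
```

A genuine bullet list such as `<ul><li>one</li></ul>` produces an empty result, so only the indent-only lists are reported.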
Here is what the mark-up for that line should look like:
<a name="1212173000072"></a>
<blockquote cite="http://www.publications.parliament.uk/pa/cm201213/cmpublic/defamation/120619/am/120619s01.htm#12061977000055">searching for clarity in the face of some possible confusion</blockquote>
In this version, the passage is correctly tagged as a quote from somewhere else, and that ‘somewhere else’ is embedded in the mark-up. The casual reader will not have their experience disrupted by the long link, but the fact that it is there helps researchers like me, who may wish to read the cited text at some point. If the site were to apply this mark-up consistently, its web developers could begin to build other navigation features on top of it.
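As a rough idea of what such a feature might start from, here is a small Python sketch (standard library only) that pulls every quotation and its cited source out of a page. It assumes the page uses <blockquote cite="..."> as suggested above; the names CitationCollector and collect_citations are hypothetical, chosen just for this example.

```python
from html.parser import HTMLParser


class CitationCollector(HTMLParser):
    """Gather (cite URL, quoted text) pairs from <blockquote> elements.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self._open = []       # stack of (cite_url, text_parts) for open blockquotes
        self.citations = []   # finished (cite_url, text) pairs, in closing order

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            # cite is None when the attribute is missing.
            self._open.append((dict(attrs).get("cite"), []))

    def handle_data(self, data):
        if self._open:
            self._open[-1][1].append(data)

    def handle_endtag(self, tag):
        if tag == "blockquote" and self._open:
            cite, parts = self._open.pop()
            self.citations.append((cite, "".join(parts).strip()))


def collect_citations(html):
    """Return a list of (cite URL or None, quoted text) tuples."""
    collector = CitationCollector()
    collector.feed(html)
    collector.close()
    return collector.citations


SOURCE_URL = (
    "http://www.publications.parliament.uk/pa/cm201213/cmpublic/"
    "defamation/120619/am/120619s01.htm#12061977000055"
)
page = (
    '<a name="1212173000072"></a>'
    f'<blockquote cite="{SOURCE_URL}">'
    "searching for clarity in the face of some possible confusion"
    "</blockquote>"
)

for cite, text in collect_citations(page):
    print(cite, "->", text)
```

With the quotations tagged this way, a list of "debates cited in this speech" becomes a simple loop over the results, which is not something the <ul> version can offer.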
Elsewhere
Clay Shirky says that version-control tools like GitHub should be used to help us see how legislation evolved.