The not-so-great escape
Escaping HTML is the process of converting a user's input into a form that can be displayed back to the user in a web browser, for example in a blog's comment section or in a user-editable wiki.
Given user input such as <script>, an HTML escaper must output
&lt;script&gt; for it to display correctly. The browser then renders this as
the text <script> rather than treating it as an actual HTML script tag.
But suppose the user inputs the literal text &lt;script&gt;. What should be done with it?
If the HTML escaper does not alter &lt;script&gt;, then when it is displayed back to the user, the browser converts it into <script>, which is not what was intended. Even worse, if the user tries to edit the comment again, the HTML tag may get removed from the text.
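The problem can be reproduced with a minimal Python sketch. The function name naive_escape is hypothetical, and the standard library's html.unescape is used only as a stand-in for the browser's entity decoding; this illustrates the failure and is not the code of any particular escaper.

```python
import html

def naive_escape(text):
    # Hypothetical escaper: converts only the angle brackets
    # and leaves ampersands untouched.
    return text.replace("<", "&lt;").replace(">", "&gt;")

# The user literally typed the entity text, not a tag.
user_input = "&lt;script&gt;"

stored = naive_escape(user_input)   # unchanged: there is no < or > to convert
displayed = html.unescape(stored)   # stand-in for the browser decoding entities

print(stored)      # what is sent to the browser
print(displayed)   # what the user sees, which no longer matches the input
```

The text shown to the user differs from what was typed, which is the mismatch described above.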
The solution to this problem is to also convert the ampersand, &, into an HTML entity, &amp;, so that the input &lt;script&gt; is output as &amp;lt;script&amp;gt;.
This has to be done before the conversions of < and >.
If we do it after, we get < converted into &lt;, and then into &amp;lt;.
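The effect of the ordering can be seen in a short Python sketch. Both function names are hypothetical, and html.unescape again stands in for the browser; this is one possible way to write such an escaper, not a definitive implementation.

```python
import html

def escape_html(text):
    # Ampersand first, so the entities produced below are not re-escaped.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    return text.replace(">", "&gt;")

def escape_html_wrong_order(text):
    # Ampersand last: the "&" in the freshly made "&lt;" is escaped again.
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    return text.replace("&", "&amp;")

for sample in ("<script>", "&lt;script&gt;"):
    # A correct escaper round-trips: decoding gives back the original input.
    assert html.unescape(escape_html(sample)) == sample

print(escape_html("<"))              # single level of escaping
print(escape_html_wrong_order("<"))  # double-escaped
```

With the ampersand handled first, both a real tag and literal entity text survive the round trip. With it handled last, even a lone < comes back to the user as &lt; instead of <.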
This bug, in which ampersands are not escaped to &amp; for display, occurs in CPAN modules such as HTML::Scrubber, and also in the CPAN ratings service.