This parser strips all potentially dangerous content from HTML:
- opening tags without a matching closing tag
- closing tags without a matching opening tag
- any of these tags: “base”, “basefont”, “head”, “html”, “body”, “applet”, “object”, “iframe”, “frame”, “frameset”, “script”, “layer”, “ilayer”, “embed”, “bgsound”, “link”, “meta”, “style”, “title”, “blink”, “xml” etc.
- any of these attributes: on*, data*, dynsrc
- javascript:/vbscript:/about: etc. protocols
- expression/behavior etc. in styles
- any other active content
It also tries to convert the code to valid XHTML, but htmltidy is a far better solution for this task.
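To make the black-list idea concrete, here is a minimal sketch in Python. It is not HTML_Safe's implementation: the class name, the function name, and the abbreviated lists are assumptions made for illustration, and it covers only dangerous tags, attributes, and protocols. It does not handle unmatched tags or style expressions.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical, abbreviated black lists; the real lists are longer.
DANGEROUS_TAGS = {"script", "style", "iframe", "object", "embed",
                  "applet", "meta", "link", "base"}
DANGEROUS_PROTOCOLS = ("javascript:", "vbscript:", "about:")
# Tags whose text content is dropped together with the tag itself.
DROP_CONTENT_TAGS = {"script", "style"}


class BlacklistSanitizer(HTMLParser):
    """Illustrative filter: drops black-listed tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    @staticmethod
    def _is_dangerous_attr(name, value):
        # on*, data* and dynsrc attributes are always removed.
        if name.startswith(("on", "data")) or name == "dynsrc":
            return True
        # Remove whitespace so "java\tscript:" is still recognised.
        compact = "".join((value or "").split()).lower()
        return compact.startswith(DANGEROUS_PROTOCOLS)

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT_TAGS:
            self.skip_depth += 1
            return
        if tag in DANGEROUS_TAGS or self.skip_depth:
            return
        parts = [tag]
        for name, value in attrs:
            if not self._is_dangerous_attr(name, value):
                parts.append('%s="%s"' % (name, escape(value or "", quote=True)))
        self.out.append("<" + " ".join(parts) + ">")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if tag in DANGEROUS_TAGS or self.skip_depth:
            return
        self.out.append("</" + tag + ">")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = BlacklistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

For example, sanitize('<p onclick="evil()">Hi <script>alert(2)</script></p>') keeps the paragraph and its text but drops the onclick attribute and the whole script element.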
Advantages compared to strip_tags:
1. strip_tags works on a white-list basis, deleting all tags except
those allowed. HTML_Safe works on a black-list basis, deleting only
dangerous content.
2. strip_tags can only strip tags. HTML_Safe strips all active
content, including tags, attributes and attribute values.
3. strip_tags is not intended to fight XSS. The primary goal of
HTML_Safe is to prevent XSS attacks.
4. strip_tags does not try to produce XHTML-compliant code. It does
not close unclosed tags.
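The second point can be illustrated with a rough Python imitation of white-list tag stripping. This is only a sketch of the general approach, not PHP's strip_tags code; the function name and the regular expression are assumptions. Tags on the allowed list are passed through untouched, so any attributes they carry survive.

```python
import re

# Matches an opening or closing tag and captures the tag name.
TAG_RE = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def strip_tags_whitelist(html_text, allowed):
    """Remove every tag whose name is not in `allowed` (hypothetical helper)."""
    def keep_or_drop(match):
        # Allowed tags are returned verbatim, attributes included.
        return match.group(0) if match.group(1).lower() in allowed else ""
    return TAG_RE.sub(keep_or_drop, html_text)


result = strip_tags_whitelist('<b onclick="evil()">bold</b><i>it</i>', {"b"})
print(result)
```

Here the i element is removed because it is not on the list, while the b element is kept together with its onclick handler. A filter that only decides tag by tag cannot remove that handler.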
HTML_Safe is the successor of the SafeHTML project. HTML_Safe fixes all known issues with SafeHTML.