Orbits  1
pip._vendor.html5lib.html5parser.HTMLParser Class Reference
Inheritance diagram for pip._vendor.html5lib.html5parser.HTMLParser:
_object

Public Member Functions

def __init__
 
def reset
 
def isHTMLIntegrationPoint
 
def isMathMLTextIntegrationPoint
 
def mainLoop
 
def normalizedTokens
 
def parse
 
def parseFragment
 
def parseError
 
def normalizeToken
 
def adjustMathMLAttributes
 
def adjustSVGAttributes
 
def adjustForeignAttributes
 
def reparseTokenNormal
 
def resetInsertionMode
 
def parseRCDataRawtext
 

Public Attributes

 strict
 
 tree
 
 tokenizer_class
 
 errors
 
 phases
 
 innerHTMLMode
 
 container
 
 tokenizer
 
 firstStartTag
 
 log
 
 compatMode
 
 innerHTML
 
 phase
 
 lastPhase
 
 beforeRCDataPhase
 
 framesetOK
 
 originalPhase
 

Private Member Functions

def _parse
 

Detailed Description

HTML parser. Generates a tree structure from a stream of (possibly
malformed) HTML.

Constructor & Destructor Documentation

def pip._vendor.html5lib.html5parser.HTMLParser.__init__ (   self,
  tree = None,
  tokenizer = tokenizer.HTMLTokenizer,
  strict = False,
  namespaceHTMLElements = True,
  debug = False 
)
strict - raise an exception when a parse error is encountered

tree - a treebuilder class controlling the type of tree that will be
returned. Built-in treebuilders can be accessed through
html5lib.treebuilders.getTreeBuilder(treeType)

tokenizer - a class that provides a stream of tokens to the treebuilder.
This may be replaced, e.g. by a sanitizer which converts some tags to
text
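
As a minimal sketch of the tree and strict options, the snippet below builds a lenient and a strict parser and compares how each reacts to the same input. It assumes the standalone html5lib package is importable (the class documented here is the copy vendored under pip._vendor.html5lib); the helper names make_parser, count_errors and is_strictly_valid, and the import guard, are illustrative and not part of the library.

```python
try:
    import html5lib
    from html5lib.html5parser import ParseError
except ImportError:  # assumption: html5lib may not be installed
    html5lib = None
    ParseError = Exception


def make_parser(strict=False):
    """Build an HTMLParser that produces xml.etree.ElementTree elements."""
    if html5lib is None:
        raise RuntimeError("html5lib is not installed")
    # getTreeBuilder("etree") returns a treebuilder class, which is what
    # the `tree` argument expects.
    builder = html5lib.treebuilders.getTreeBuilder("etree")
    return html5lib.HTMLParser(tree=builder, strict=strict)


def count_errors(html):
    """Parse leniently and return how many parse errors were recorded."""
    parser = make_parser(strict=False)
    parser.parse(html)
    return len(parser.errors)


def is_strictly_valid(html):
    """Return False if a strict parser raises on a parse error."""
    parser = make_parser(strict=True)
    try:
        parser.parse(html)
    except ParseError:
        return False
    return True
```

With strict left at False, problems such as a missing doctype are collected in the errors attribute and parsing continues; with strict set to True, the first such problem raises an exception instead.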

Member Function Documentation

def pip._vendor.html5lib.html5parser.HTMLParser._parse (   self,
  stream,
  innerHTML = False,
  container = "div",
  encoding = None,
  parseMeta = True,
  useChardet = True,
  **kwargs 
)
private
def pip._vendor.html5lib.html5parser.HTMLParser.adjustForeignAttributes (   self,
  token 
)
def pip._vendor.html5lib.html5parser.HTMLParser.adjustMathMLAttributes (   self,
  token 
)
def pip._vendor.html5lib.html5parser.HTMLParser.adjustSVGAttributes (   self,
  token 
)
def pip._vendor.html5lib.html5parser.HTMLParser.isHTMLIntegrationPoint (   self,
  element 
)
def pip._vendor.html5lib.html5parser.HTMLParser.isMathMLTextIntegrationPoint (   self,
  element 
)
def pip._vendor.html5lib.html5parser.HTMLParser.mainLoop (   self)
def pip._vendor.html5lib.html5parser.HTMLParser.normalizedTokens (   self)
def pip._vendor.html5lib.html5parser.HTMLParser.normalizeToken (   self,
  token 
)
HTML5-specific normalizations to the token stream
def pip._vendor.html5lib.html5parser.HTMLParser.parse (   self,
  stream,
  encoding = None,
  parseMeta = True,
  useChardet = True 
)
Parse an HTML document into a well-formed tree

stream - a file-like object or string containing the HTML to be parsed

The optional encoding parameter must be a string that indicates
the encoding. If specified, that encoding will be used,
regardless of any BOM or later declaration (such as in a meta
element).
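
A short, hedged usage sketch for parse follows. It assumes the standalone html5lib package is importable and uses the "etree" treebuilder with namespaceHTMLElements=False so that element tags are plain names; the helpers parse_document and paragraph_texts are hypothetical names introduced for illustration. The input is passed as a string and the encoding argument is left at its default.

```python
try:
    import html5lib
except ImportError:  # assumption: html5lib may not be installed
    html5lib = None


def parse_document(html):
    """Parse a full document; return (root element, recorded errors)."""
    if html5lib is None:
        raise RuntimeError("html5lib is not installed")
    parser = html5lib.HTMLParser(
        tree=html5lib.treebuilders.getTreeBuilder("etree"),
        namespaceHTMLElements=False,  # keep tags as "p", not "{ns}p"
    )
    root = parser.parse(html)
    return root, parser.errors


def paragraph_texts(html):
    """Return the leading text of every <p> element in document order."""
    root, _errors = parse_document(html)
    return [p.text for p in root.iter("p")]
```

Even for malformed input such as "<p>Hello<p>World", where neither paragraph is closed, the returned root is a complete html element with both paragraphs as separate nodes.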
def pip._vendor.html5lib.html5parser.HTMLParser.parseError (   self,
  errorcode = "XXX-undefined-error",
  datavars = {} 
)
def pip._vendor.html5lib.html5parser.HTMLParser.parseFragment (   self,
  stream,
  container = "div",
  encoding = None,
  parseMeta = False,
  useChardet = True 
)
Parse an HTML fragment into a well-formed tree fragment

container - name of the element whose innerHTML property is being set;
if set to None, defaults to 'div'

stream - a file-like object or string containing the HTML to be parsed

The optional encoding parameter must be a string that indicates
the encoding. If specified, that encoding will be used,
regardless of any BOM or later declaration (such as in a meta
element).
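
The following sketch illustrates parseFragment under the same assumptions as before: the standalone html5lib package is importable and the "etree" treebuilder is used with namespaceHTMLElements=False. The helper names parse_fragment and top_level_tags are hypothetical.

```python
try:
    import html5lib
except ImportError:  # assumption: html5lib may not be installed
    html5lib = None


def parse_fragment(html, container="div"):
    """Parse markup as if it were the innerHTML of a <container> element."""
    if html5lib is None:
        raise RuntimeError("html5lib is not installed")
    parser = html5lib.HTMLParser(
        tree=html5lib.treebuilders.getTreeBuilder("etree"),
        namespaceHTMLElements=False,  # keep tags as plain names
    )
    return parser.parseFragment(html, container=container)


def top_level_tags(html, container="div"):
    """Return the tag names of the fragment's direct child elements."""
    return [child.tag for child in parse_fragment(html, container)]
```

Unlike parse, the result has no html, head or body wrapper: the fragment node directly holds the parsed children.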
def pip._vendor.html5lib.html5parser.HTMLParser.parseRCDataRawtext (   self,
  token,
  contentType 
)
Generic RCDATA/RAWTEXT Parsing algorithm
contentType - RCDATA or RAWTEXT
def pip._vendor.html5lib.html5parser.HTMLParser.reparseTokenNormal (   self,
  token 
)
def pip._vendor.html5lib.html5parser.HTMLParser.reset (   self)
def pip._vendor.html5lib.html5parser.HTMLParser.resetInsertionMode (   self)

Member Data Documentation

pip._vendor.html5lib.html5parser.HTMLParser.beforeRCDataPhase
pip._vendor.html5lib.html5parser.HTMLParser.compatMode
pip._vendor.html5lib.html5parser.HTMLParser.container
pip._vendor.html5lib.html5parser.HTMLParser.errors
pip._vendor.html5lib.html5parser.HTMLParser.firstStartTag
pip._vendor.html5lib.html5parser.HTMLParser.framesetOK
pip._vendor.html5lib.html5parser.HTMLParser.innerHTML
pip._vendor.html5lib.html5parser.HTMLParser.innerHTMLMode
pip._vendor.html5lib.html5parser.HTMLParser.lastPhase
pip._vendor.html5lib.html5parser.HTMLParser.log
pip._vendor.html5lib.html5parser.HTMLParser.originalPhase
pip._vendor.html5lib.html5parser.HTMLParser.phase
pip._vendor.html5lib.html5parser.HTMLParser.phases
pip._vendor.html5lib.html5parser.HTMLParser.strict
pip._vendor.html5lib.html5parser.HTMLParser.tokenizer
pip._vendor.html5lib.html5parser.HTMLParser.tokenizer_class
pip._vendor.html5lib.html5parser.HTMLParser.tree

The documentation for this class was generated from the following file:

Copyright 2014 Google Inc. All rights reserved.