:mod:`htmllib` --- A parser for HTML documents
==============================================

.. module:: htmllib
   :synopsis: A parser for HTML documents.
   :deprecated:

.. deprecated:: 2.6
    The :mod:`htmllib` module has been removed in Python 3.


.. index::
   single: HTML
   single: hypertext

.. index::
   module: sgmllib
   module: formatter
   single: SGMLParser (in module sgmllib)

This module defines a class which can serve as a base for parsing text files
formatted in the HyperText Mark-up Language (HTML).  The class is not directly
concerned with I/O --- it must be provided with input in string form via a
method, and makes calls to methods of a "formatter" object in order to produce
output.  The :class:`HTMLParser` class is designed to be used as a base class
for other classes in order to add functionality, and allows most of its methods
to be extended or overridden.  In turn, this class is derived from and extends
the :class:`SGMLParser` class defined in module :mod:`sgmllib`.  The
:class:`HTMLParser` implementation supports the HTML 2.0 language as described
in :rfc:`1866`.  Two implementations of formatter objects are provided in the
:mod:`formatter` module; refer to the documentation for that module for
information on the formatter interface.

The following is a summary of the interface defined by
:class:`sgmllib.SGMLParser`:

* Data is fed to an instance through the :meth:`feed` method, which takes a
  string argument.  It can be called with as little or as much text at a time
  as desired; ``p.feed(a); p.feed(b)`` has the same effect as ``p.feed(a+b)``.
  Complete HTML markup constructs are processed immediately; incomplete
  constructs are saved in a buffer.  To force processing of all unprocessed
  data, call the :meth:`close` method.

  For example, to parse the entire contents of a file, use::

     with open('myfile.html') as f:
         parser.feed(f.read())
     parser.close()

* Defining semantics for HTML tags is simple: derive a class and define methods
  called :meth:`start_tag`, :meth:`end_tag`, or :meth:`do_tag`, where ``tag``
  stands for the tag name.  The parser calls these at the appropriate moments:
  :meth:`start_tag` or :meth:`do_tag` is called when an opening tag of the form
  ``<tag ...>`` is encountered; :meth:`end_tag` is called when a closing tag of
  the form ``</tag>`` is encountered.  If an opening tag requires a
  corresponding closing tag, like ``<H1>`` ... ``</H1>``, the class should define
  the :meth:`start_tag` method; if a tag requires no closing tag, like ``<P>``,
  the class should define the :meth:`do_tag` method.
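
The naming convention in the last item can be illustrated with a small
stand-alone sketch.  The ``TagDispatcher`` class below is hypothetical and is
not the :mod:`sgmllib` implementation: it only shows how a handler could be
looked up by name.  The lookup order (``start_`` before ``do_``) and the names
``opening_tag``, ``closing_tag``, ``open_tags`` and ``events`` are assumptions
made for this illustration.

```python
class TagDispatcher(object):
    """Toy stand-in for name-based tag dispatch (hypothetical, not sgmllib code)."""

    def __init__(self):
        self.open_tags = []  # tags opened through a start_<tag> method
        self.events = []     # log of handler calls, kept for illustration

    def opening_tag(self, tag, attrs):
        # Prefer start_<tag>: the element expects a matching closing tag.
        handler = getattr(self, 'start_' + tag, None)
        if handler is not None:
            self.open_tags.append(tag)
            handler(attrs)
            return
        # Otherwise fall back to do_<tag>: no closing tag is expected.
        handler = getattr(self, 'do_' + tag, None)
        if handler is not None:
            handler(attrs)
            return
        self.events.append(('unknown', tag))

    def closing_tag(self, tag):
        handler = getattr(self, 'end_' + tag, None)
        if handler is None:
            self.events.append(('unknown', '/' + tag))
            return
        if tag in self.open_tags:
            self.open_tags.remove(tag)
        handler()


class HeadingHandlers(TagDispatcher):
    # <H1> ... </H1> needs a closing tag, so it gets start_/end_ methods.
    def start_h1(self, attrs):
        self.events.append(('start_h1', attrs))

    def end_h1(self):
        self.events.append(('end_h1', None))

    # <P> needs no closing tag, so a single do_ method is enough.
    def do_p(self, attrs):
        self.events.append(('do_p', attrs))


d = HeadingHandlers()
d.opening_tag('h1', [('align', 'center')])
d.opening_tag('p', [])
d.closing_tag('h1')
d.opening_tag('blink', [])  # no handler defined for this tag
```

In a real subclass only the ``start_``, ``end_`` and ``do_`` methods are
written by hand; the parser finds the tags in the text passed to :meth:`feed`
and calls the methods itself.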

The module defines a parser class and an exception:


.. class:: HTMLParser(formatter)

   This is the basic HTML parser class.  It supports all entity names required by
   the XHTML 1.0 Recommendation (http://www.w3.org/TR/xhtml1).  It also defines
   handlers for all HTML 2.0 and many HTML 3.0 and 3.2 elements.


.. exception:: HTMLParseError

   Exception raised by the :class:`HTMLParser` class when it encounters an error
   while parsing.

   .. versionadded:: 2.4


.. seealso::

   Module :mod:`formatter`
      Interface definition for transforming an abstract flow of formatting events into
      specific output events on writer objects.

   Module :mod:`HTMLParser`
      Alternate HTML parser that offers a slightly lower-level view of the input, but
      is designed to work with XHTML, and does not implement some of the SGML syntax
      not used in "HTML as deployed" and which isn't legal for XHTML.

   Module :mod:`htmlentitydefs`
      Definition of replacement text for XHTML 1.0 entities.

   Module :mod:`sgmllib`
      Base class for :class:`HTMLParser`.


.. _html-parser-objects:

HTMLParser Objects
------------------

Besides the tag methods, the :class:`HTMLParser` class provides further methods
and instance variables for use within tag methods.


.. attribute:: HTMLParser.formatter

   This is the formatter instance associated with the parser.


.. attribute:: HTMLParser.nofill

   Boolean flag which should be true when whitespace should not be collapsed, or
   false when it should be.  In general, this should only be true when character
   data is to be treated as "preformatted" text, as within a ``<PRE>`` element.
   The default value is false.  This affects the operation of :meth:`handle_data`
   and :meth:`save_end`.


.. method:: HTMLParser.anchor_bgn(href, name, type)

   This method is called at the start of an anchor region.  The arguments
   correspond to the attributes of the ``<A>`` tag with the same names.  The
   default implementation maintains a list of hyperlinks (defined by the ``HREF``
   attribute for ``<A>`` tags) within the document.  The list of hyperlinks is
   available as the data attribute :attr:`anchorlist`.


.. method:: HTMLParser.anchor_end()

   This method is called at the end of an anchor region.  The default
   implementation adds a textual footnote marker using an index into the list of
   hyperlinks created by :meth:`anchor_bgn`.


.. method:: HTMLParser.handle_image(source, alt[, ismap[, align[, width[, height]]]])

   This method is called to handle images.  The default implementation simply
   passes the *alt* value to the :meth:`handle_data` method.


.. method:: HTMLParser.save_bgn()

   Begins saving character data in a buffer instead of sending it to the formatter
   object.  Retrieve the stored data via :meth:`save_end`.  Calls to the
   :meth:`save_bgn` / :meth:`save_end` pair must not be nested.


.. method:: HTMLParser.save_end()

   Ends buffering character data and returns all data saved since the preceding
   call to :meth:`save_bgn`.  If the :attr:`nofill` flag is false, whitespace is
   collapsed to single spaces.  A call to this method without a preceding call to
   :meth:`save_bgn` will raise a :exc:`TypeError` exception.
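
The buffering described for :meth:`save_bgn` and :meth:`save_end` can be
modelled with a small stand-alone class.  ``SaveBuffer`` below is a
hypothetical sketch written for this illustration, not the :mod:`htmllib`
source: the ``savedata`` and ``output`` attributes are assumptions, and the
:exc:`TypeError` is raised explicitly only to mirror the documented behaviour.

```python
class SaveBuffer(object):
    """Toy model of the save_bgn()/save_end() behaviour (hypothetical)."""

    def __init__(self):
        self.nofill = False   # false: collapse whitespace in save_end()
        self.savedata = None  # None means "not currently saving"
        self.output = []      # stands in for the formatter in this sketch

    def save_bgn(self):
        # Start collecting character data instead of passing it on.
        self.savedata = ''

    def handle_data(self, data):
        if self.savedata is not None:
            self.savedata += data
        else:
            self.output.append(data)

    def save_end(self):
        data = self.savedata
        self.savedata = None
        if data is None:
            raise TypeError('save_end() called without save_bgn()')
        if not self.nofill:
            # Collapse every run of whitespace to a single space.
            data = ' '.join(data.split())
        return data


buf = SaveBuffer()
buf.handle_data('before ')
buf.save_bgn()
buf.handle_data('  My \n')
buf.handle_data('  Title  ')
title = buf.save_end()      # whitespace collapsed because nofill is false
buf.handle_data('after')

buf.nofill = True
buf.save_bgn()
buf.handle_data('a  b')
raw = buf.save_end()        # returned unchanged because nofill is true
```

A tag method pair would typically call :meth:`save_bgn` in the ``start_``
method and pick up the collected text with :meth:`save_end` in the matching
``end_`` method.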


:mod:`htmlentitydefs` --- Definitions of HTML general entities
==============================================================

.. module:: htmlentitydefs
   :synopsis: Definitions of HTML general entities.
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>

.. note::

   The :mod:`htmlentitydefs` module has been renamed to :mod:`html.entities` in
   Python 3.  The :term:`2to3` tool will automatically adapt imports when
   converting your sources to Python 3.

**Source code:** :source:`Lib/htmlentitydefs.py`

--------------

This module defines three dictionaries: ``name2codepoint``, ``codepoint2name``,
and ``entitydefs``.  ``entitydefs`` is used by the :mod:`htmllib` module to
provide the :attr:`entitydefs` attribute of the :class:`HTMLParser` class.  The
definition provided here contains all the entities defined by XHTML 1.0 that
can be handled using simple textual substitution in the Latin-1 character set
(ISO-8859-1).


.. data:: entitydefs

   A dictionary mapping XHTML 1.0 entity definitions to their replacement text in
   ISO Latin-1.


.. data:: name2codepoint

   A dictionary that maps HTML entity names to their Unicode codepoints.

   .. versionadded:: 2.3


.. data:: codepoint2name

   A dictionary that maps Unicode codepoints to HTML entity names.

   .. versionadded:: 2.3
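
As a minimal sketch of how the two codepoint dictionaries might be used, the
helpers below convert between named entities and characters.  The function
names ``decode_named_entities`` and ``encode_char`` are hypothetical, and the
regular expression is a simplification that only recognises references of the
form ``&name;``.  The import falls back to :mod:`html.entities` so that the
same code also runs on Python 3, where the module has been renamed.

```python
import re

try:
    from htmlentitydefs import name2codepoint, codepoint2name  # Python 2
except ImportError:
    from html.entities import name2codepoint, codepoint2name   # Python 3

try:
    unichr                # Python 2 built-in
except NameError:
    unichr = chr          # Python 3: chr() already returns Unicode text


def decode_named_entities(text):
    """Replace known ``&name;`` references; leave unknown ones untouched."""
    def replace(match):
        name = match.group(1)
        if name in name2codepoint:
            return unichr(name2codepoint[name])
        return match.group(0)
    return re.sub(r'&([A-Za-z][A-Za-z0-9]*);', replace, text)


def encode_char(char):
    """Return ``&name;`` if the character has an entity name, else the character."""
    name = codepoint2name.get(ord(char))
    if name is None:
        return char
    return '&%s;' % name


decoded = decode_named_entities('a &lt; b &amp; c')
encoded = encode_char('&')
```

Numeric character references such as ``&#38;`` are not handled by this sketch.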

