[ZPT] CVS: Packages/TAL - HTMLParser.py:1.15

fred@digicool.com
Mon, 9 Apr 2001 13:23:51 -0400 (EDT)


Update of /cvs-repository/Packages/TAL
In directory korak:/tmp/cvs-serv31141

Modified Files:
	HTMLParser.py 
Log Message:

Fix two buffer boundary issues; with these fixes the parser passes the
test suite again without any large restructuring.

Guido and I will look at how this code is structured later; the buffer
boundary checks will make it nearly unmaintainable unless we can give
the code a better structure.  (Better tests would also be nice!)
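
For readers unfamiliar with the problem: the parser is fed input in chunks,
so a chunk can end in the middle of a construct, and the parser then has to
report "need more data" (here, by returning -1) instead of guessing. The
following is only a minimal sketch of that idea in current Python. The
helper names `peek` and `declaration_incomplete` are hypothetical and do not
exist in HTMLParser.py. The sketch also assumes, as the comment in the diff
suggests, that a "-" directly after "<!" can only be a comment start cut off
by the buffer boundary.

```python
def peek(rawdata, j):
    """Return the character at index j, or "" if the buffer ends before j.

    A one-character slice is used instead of rawdata[j] because slicing
    past the end yields "" rather than raising IndexError.
    """
    return rawdata[j:j + 1]


def declaration_incomplete(rawdata, i):
    """Hypothetical helper mirroring the check added to parse_declaration.

    Returns True when the buffer ends right after "<!" or after "<!-",
    meaning the caller should wait for more input.
    """
    j = i + 2
    assert rawdata[i:j] == "<!", "unexpected call"
    return peek(rawdata, j) in ("-", "")


# Usage: the first two buffers end too early to decide anything.
for buf in ("<!", "<!-", "<!DOCTYPE html>"):
    print(repr(buf), declaration_incomplete(buf, 0))
```

The point of the slice idiom is that "end of buffer" becomes an ordinary
value ("") that can be tested alongside real characters.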



--- Updated File HTMLParser.py in package Packages/TAL --
--- HTMLParser.py	2001/04/09 14:15:19	1.14
+++ HTMLParser.py	2001/04/09 17:23:50	1.15
@@ -243,6 +243,10 @@
         rawdata = self.rawdata
         j = i + 2
         assert rawdata[i:j] == "<!", "unexpected call to parse_declaration"
+        if rawdata[j:j+1] in ("-", ""):
+            # Start of comment followed by buffer boundary,
+            # or just a buffer boundary.
+            return -1
         # in practice, this should look like: ((name|stringlit) S*)+ '>'
         n = len(rawdata)
         while j < n:
@@ -340,14 +344,24 @@
             next = rawdata[j:j+1]
             if next == ">":
                 return j + 1
-            if rawdata[j:j+2] == "/>":
-                return j + 2
+            if next == "/":
+                s = rawdata[j:j+2]
+                if s == "/>":
+                    return j + 2
+                if s == "/":
+                    # buffer boundary
+                    return -1
+                # else bogus input
+                self.updatepos(i, j + 1)
+                raise HTMLParseError("malformed empty start tag",
+                                     self.getpos())
             if next == "":
                 # end of input
                 return -1
-            if next in ("abcdefghijklmnopqrstuvwxyz="
+            if next in ("abcdefghijklmnopqrstuvwxyz=/"
                         "ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
-                # end of input in or before attribute value
+                # end of input in or before attribute value, or we have the
+                # '/' from a '/>' ending
                 return -1
             self.updatepos(i, j)
             raise HTMLParseError("malformed start tag", self.getpos())
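
The second hunk can be read as a three-way decision on a "/" near the end of
a start tag: a complete "/>" ends the tag, a lone "/" at the end of the
buffer means more data is needed, and anything else is malformed. Below is a
hedged, standalone sketch of that decision in current Python. `ParseError`,
`end_of_start_tag`, and `feed_chunks` are illustrative names, not part of
HTMLParser.py, and the attribute-scanning logic of the real method is left
out.

```python
class ParseError(Exception):
    """Hypothetical stand-in for HTMLParseError."""


def end_of_start_tag(rawdata, j):
    """Return the index just past the tag end, or -1 if more data is needed."""
    nxt = rawdata[j:j + 1]
    if nxt == ">":
        return j + 1
    if nxt == "/":
        s = rawdata[j:j + 2]
        if s == "/>":
            return j + 2
        if s == "/":
            # Buffer ends right after '/': cannot decide yet.
            return -1
        # '/' followed by something other than '>' is bogus input.
        raise ParseError("malformed empty start tag")
    if nxt == "":
        # End of input before the tag was closed.
        return -1
    raise ParseError("malformed start tag")


def feed_chunks(chunks, j):
    """Hypothetical driver: accumulate chunks until the check can decide."""
    rawdata = ""
    for chunk in chunks:
        rawdata += chunk
        end = end_of_start_tag(rawdata, j)
        if end != -1:
            return end
    return -1


# "<br/>" split across two chunks, with the boundary between '/' and '>'.
print(feed_chunks(["<br/", ">"], 3))
```

With the pre-1.15 test on `rawdata[j:j+2] == "/>"` alone, a buffer ending in
"/" would not match and would fall through to the later checks; handling "/"
explicitly separates "incomplete" from "malformed".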