Issue13273
Created on 2011-10-27 07:56 by Christopher.Allen-Poole, last changed 2011-10-28 10:27 by ezio.melotti. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| issue13273.diff | ezio.melotti, 2011-10-27 14:49 | |
| Messages (5) | |||
|---|---|---|---|
| msg146479 - (view) | Author: Christopher Allen-Poole (Christopher.Allen-Poole) | Date: 2011-10-27 07:56 | |
This is encountered when extending html.parser.HTMLParser and running with strict mode set to False. Expected behavior: when '''<div style="" ><b>The <a href="some_url">rain</a> <br /> in <span>Spain</span></b></div>''' is passed to the feed method, div, b, a, br, and span should all be passed to the handle_starttag method. Actual behavior: the handle_data method receives the values <div style="" >, <b>, <a href="some_url">, <br />, and <span> in addition to the regular text. This can be fixed by changing this (inside the parse_starttag method): m = hparse.attrfind_tolerant.search(rawdata, k) to m = hparse.attrfind_tolerant.match(rawdata, k) |
|||
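The expected behavior described in the report can be checked with a small subclass. This is a minimal sketch, not the test from the attached patch: `TagCollector` and its attributes are hypothetical names, and the `strict` argument is left out on the assumption that the code runs on a current Python 3, where that parameter no longer exists and the tolerant parsing is the only mode.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper that records what each handler receives."""

    def __init__(self):
        super().__init__()
        self.start_tags = []  # tag names seen by handle_starttag
        self.data = []        # text chunks seen by handle_data

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_data(self, data):
        self.data.append(data)


# The markup quoted in the report.
markup = ('<div style="" ><b>The <a href="some_url">rain</a> <br /> '
          'in <span>Spain</span></b></div>')

collector = TagCollector()
collector.feed(markup)
collector.close()

# With the fix in place, every tag reaches handle_starttag and
# no raw "<...>" text leaks into handle_data.
print(collector.start_tags)
print("".join(collector.data))
```

On an affected interpreter the second print would show tag markup mixed into the text; on a fixed one it should contain only the sentence itself.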
| msg146481 - (view) | Author: Ezio Melotti (ezio.melotti) * | Date: 2011-10-27 08:31 | |
Incidentally I was just investigating this very same issue, and your suggestion seems to work for me too. I'll see if the change has any downside and come up with a patch + test. Thanks for the report! |
|||
| msg146490 - (view) | Author: Ezio Melotti (ezio.melotti) * | Date: 2011-10-27 14:49 | |
The attached patch replaces search with match, as you suggested, and tweaks a regex to make the old tests pass. |
|||
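The difference between `search` and `match` on a compiled pattern, when both are given a start position, can be illustrated as follows. This is only a sketch of the general `re` behavior: `attr_pattern` is a simplified, hypothetical stand-in and not the actual `attrfind_tolerant` regex from `html.parser`, and the sample string and offset are chosen purely for illustration.

```python
import re

# Hypothetical, simplified attribute pattern (NOT the real attrfind_tolerant).
attr_pattern = re.compile(r'\s*([a-zA-Z_][-\w]*)\s*=\s*"([^"]*)"')

rawdata = '<b>The <a href="some_url">rain</a>'
k = 2  # offset just past the tag name in "<b", where attributes would start

# search() scans forward from k and may find an attribute that
# belongs to a later tag.
found_by_search = attr_pattern.search(rawdata, k)

# match() succeeds only if the pattern matches starting exactly at k.
found_by_match = attr_pattern.match(rawdata, k)

print(found_by_search.group(1) if found_by_search else None)
print(found_by_match)
```

Here `search` picks up `href` from the following `<a>` tag even though `<b>` has no attributes, while `match` returns `None`, which is the stricter behavior the suggested change relies on.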
| msg146550 - (view) | Author: Roundup Robot (python-dev) | Date: 2011-10-28 10:24 | |
New changeset 41d41776aa6d by Ezio Melotti in branch '3.2': #13273: fix a bug that prevented HTMLParser to properly detect some tags when strict=False. http://hg.python.org/cpython/rev/41d41776aa6d New changeset b194117f176c by Ezio Melotti in branch 'default': #13273: merge with 3.2. http://hg.python.org/cpython/rev/b194117f176c |
|||
| msg146552 - (view) | Author: Ezio Melotti (ezio.melotti) * | Date: 2011-10-28 10:27 | |
Fixed, thanks a lot for the report! |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2011-10-28 10:27:48 | ezio.melotti | set | status: open -> closed versions: - Python 2.7 messages: + msg146552 resolution: fixed |
| 2011-10-28 10:24:13 | python-dev | set | nosy: + python-dev messages: + msg146550 |
| 2011-10-27 14:49:41 | ezio.melotti | set | files: + issue13273.diff versions: + Python 2.7, Python 3.3 messages: + msg146490 keywords: + patch |
| 2011-10-27 08:31:15 | ezio.melotti | set | assignee: ezio.melotti messages: + msg146481 |
| 2011-10-27 08:15:13 | ezio.melotti | set | nosy: + ezio.melotti stage: test needed |
| 2011-10-27 07:56:01 | Christopher.Allen-Poole | create | |