Subject: wget css parsing
Newsgroups: gmane.comp.web.wget.patches
Date: 2006-12-05 17:00:15 GMT
Expires: 2006-12-19
Hello,
wget.h:
added TEXTCSS to dt flags enum
convert.h:
added link_css_p to struct urlpos
added link_expect_css to struct urlpos
added downloaded_css_set hash table
added register_css function prototype
convert.c:
added #include "html-url.h"
added downloaded_css_set hash table
added register_css function
moved most of convert_all_links to function convert_links_in_hashtable
call convert_links_in_hashtable for each of downloaded_{html,css}_set
made convert_links_in_hashtable handle css files
added replace_plain function
retr.c:
added #include "html-url.h"
added TEXTCSS -> register_css in retrieve_url
http.c:
added #define TEXTCSS_S
added check for CSS content type -> TEXTCSS flag
added function ensure_extension
changed code to use ensure_extension for HTML and CSS files
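
To make the http.c items concrete, here is a rough sketch of the two pieces: mapping a Content-Type value to a CSS flag, and making sure a local file name carries the expected extension. It is not the code from the patch. The value of TEXTCSS_S, the flag bit SKETCH_TEXTCSS, and the names and signatures css_flag_for_type and ensure_extension_sketch are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strcasecmp, strncasecmp (POSIX) */

/* Assumed value; the changelog only says that the macro was added. */
#define TEXTCSS_S "text/css"

/* Hypothetical flag bit; the real dt flags enum lives in wget.h. */
enum { SKETCH_TEXTCSS = 0x40 };

/* Return SKETCH_TEXTCSS if TYPE names the CSS media type, else 0.
   Only the media type is compared, so "text/css; charset=utf-8" matches. */
static int css_flag_for_type(const char *type)
{
    size_t n = strlen(TEXTCSS_S);

    assert(type != NULL);
    if (strncasecmp(type, TEXTCSS_S, n) != 0)
        return 0;
    /* Reject longer types such as "text/cssx". */
    if (type[n] != '\0' && type[n] != ';' && type[n] != ' ')
        return 0;
    return SKETCH_TEXTCSS;
}

/* Return a newly allocated copy of NAME that ends in EXT (e.g. ".css").
   EXT is appended only when NAME does not already end in it, compared
   case-insensitively. The caller frees the result. */
static char *ensure_extension_sketch(const char *name, const char *ext)
{
    size_t nlen, elen, outlen;
    int has_ext;
    char *out;

    assert(name != NULL && ext != NULL);
    nlen = strlen(name);
    elen = strlen(ext);
    has_ext = nlen >= elen && strcasecmp(name + nlen - elen, ext) == 0;
    outlen = nlen + (has_ext ? 0 : elen);

    out = malloc(outlen + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, name, nlen);
    if (!has_ext)
        memcpy(out + nlen, ext, elen);
    out[outlen] = '\0';
    return out;
}
```

With this sketch, a response typed "text/css" that was saved as "style" would end up as "style.css", while "site.CSS" would be left alone.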
recur.h:
removed prototypes for functions from html-url.c
recur.c:
added #include "html-url.h"
added #include "css-url.h"
added 'css_allowed' to enqueue/dequeue_url functions, struct queue_element
modified retrieve_tree to handle css_allowed, set descend properly, and call get_urls_css_file
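
As an illustration of the recur.c changes, the sketch below shows a FIFO queue whose elements carry a css_allowed flag next to html_allowed, and a helper that picks which parser to run on a downloaded file. It is only a sketch. The struct layouts, the enum parser_sketch, and every function name ending in _sketch are hypothetical, and the html_allowed field is assumed to exist already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical queue element: css_allowed sits next to html_allowed. */
struct queue_element_sketch {
    char *url;
    int depth;
    bool html_allowed;   /* may this URL be parsed as HTML? */
    bool css_allowed;    /* may this URL be parsed as CSS?  */
    struct queue_element_sketch *next;
};

struct url_queue_sketch {
    struct queue_element_sketch *head, *tail;
};

/* Append a copy of URL to the queue; return false on allocation failure. */
static bool enqueue_sketch(struct url_queue_sketch *q, const char *url,
                           int depth, bool html_allowed, bool css_allowed)
{
    struct queue_element_sketch *qel;
    size_t len;

    assert(q != NULL && url != NULL);
    qel = calloc(1, sizeof *qel);
    if (qel == NULL)
        return false;
    len = strlen(url) + 1;
    qel->url = malloc(len);
    if (qel->url == NULL) {
        free(qel);
        return false;
    }
    memcpy(qel->url, url, len);
    qel->depth = depth;
    qel->html_allowed = html_allowed;
    qel->css_allowed = css_allowed;

    if (q->tail != NULL)
        q->tail->next = qel;
    else
        q->head = qel;
    q->tail = qel;
    return true;
}

/* Remove the oldest element and copy it into OUT (caller frees out->url).
   Return false when the queue is empty. */
static bool dequeue_sketch(struct url_queue_sketch *q,
                           struct queue_element_sketch *out)
{
    struct queue_element_sketch *qel;

    assert(q != NULL && out != NULL);
    qel = q->head;
    if (qel == NULL)
        return false;
    q->head = qel->next;
    if (q->head == NULL)
        q->tail = NULL;
    *out = *qel;
    out->next = NULL;
    free(qel);
    return true;
}

enum parser_sketch { PARSE_NONE, PARSE_HTML, PARSE_CSS };

/* Decide whether to descend into a downloaded file and with which parser. */
static enum parser_sketch
choose_parser_sketch(const struct queue_element_sketch *qel,
                     bool is_html, bool is_css)
{
    if (qel->html_allowed && is_html)
        return PARSE_HTML;
    if (qel->css_allowed && is_css)
        return PARSE_CSS;
    return PARSE_NONE;
}
```

The point of carrying both flags is that a URL found via a stylesheet link can be queued with css_allowed set and html_allowed clear, so the file is only scanned if it really turns out to be CSS.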
html-url.h:
added prototypes from recur.h
added prototype for append_url
added definition of struct map_context
html-url.c:
added #include "html-url.h"
removed definition of struct map_context
added ATTR_POS and ATTR_SIZE to shorten calls to append_url
changed append_url to be non-static and take position and size as parameters instead of tag/attrind
modified tag_handle_link to set link_expect_css for link rel="stylesheet"
added "style" to additional_attributes array
added check_style_attr function
modified collect_tags_mapper to call check_style_attr and handle uninteresting tags
added check in collect_tags_mapper
modified get_urls_html to call map_html_tags with NULL as the interesting_tags parameter, so we receive all tags
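
To show why a position/size interface is convenient for style attributes, here is a small sketch that scans a piece of CSS text for url(...) tokens and reports each URL as an offset and a length. It is a naive string scan written for illustration, not the parser used by the patch. The function name next_css_url and its signature are hypothetical, and the scan is case-sensitive and ignores comments and escapes.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Find the next url(...) token in TEXT at or after offset FROM.
   On success, store the offset and length of the URL itself (quotes and
   surrounding whitespace excluded) in *POS and *SIZE and return the
   offset just past the closing ")". Return 0 when no token is left;
   0 is never a valid result because a token is at least 5 bytes long. */
static size_t next_css_url(const char *text, size_t from,
                           size_t *pos, size_t *size)
{
    const char *p;

    assert(text != NULL && pos != NULL && size != NULL);
    assert(from <= strlen(text));

    p = text + from;
    while ((p = strstr(p, "url(")) != NULL) {
        const char *start = p + 4;
        const char *close = strchr(start, ')');
        const char *end;

        if (close == NULL)
            return 0;               /* unterminated token: give up */
        end = close;

        /* Trim whitespace inside the parentheses. */
        while (start < end && isspace((unsigned char)*start))
            start++;
        while (end > start && isspace((unsigned char)end[-1]))
            end--;

        /* Strip one pair of matching quotes, if present. */
        if (end - start >= 2 && (*start == '"' || *start == '\'')
            && end[-1] == *start) {
            start++;
            end--;
        }

        if (end > start) {
            *pos = (size_t)(start - text);
            *size = (size_t)(end - start);
            return (size_t)(close + 1 - text);
        }
        p = close + 1;              /* empty url(): keep looking */
    }
    return 0;
}
```

A caller would loop, passing the previous return value as the next FROM, and hand each (position, size) pair to whatever records the link.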
html-parse.c:
added struct tagstack_item (for tag stack, a doubly linked list)
added functions tagstack_push, tagstack_pop, tagstack_find
added head and tail tag stack pointers in map_html_tags
in map_html_tags, if a start tag, push tag details to the stack
when end of tag is found, set contents_begin on stack
before calling map function, if an end tag, find matching start tag on the stack, pop it, and save contents info in taginfo struct
cleanup tag stack when finished
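
The tag stack described for html-parse.c can be sketched roughly as follows: a doubly linked list where start tags are pushed at the tail and an end tag searches backwards for its match. This is an illustration under assumptions, not the patch itself. The field names, the _sketch function names and signatures, and the choice to drop any unclosed inner tags when popping are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* One open tag; the pointers refer into the document buffer. */
struct tagstack_item_sketch {
    const char *tagname_begin, *tagname_end;  /* [begin, end) of the name */
    const char *contents_begin;               /* first byte after the start tag */
    struct tagstack_item_sketch *prev, *next;
};

/* Append a zeroed item at the tail; return it, or NULL if out of memory. */
static struct tagstack_item_sketch *
tagstack_push_sketch(struct tagstack_item_sketch **head,
                     struct tagstack_item_sketch **tail)
{
    struct tagstack_item_sketch *ts = calloc(1, sizeof *ts);

    if (ts == NULL)
        return NULL;
    if (*head == NULL) {
        *head = *tail = ts;
    } else {
        (*tail)->next = ts;
        ts->prev = *tail;
        *tail = ts;
    }
    return ts;
}

/* Remove TS and every item pushed after it (unclosed inner tags). */
static void tagstack_pop_sketch(struct tagstack_item_sketch **head,
                                struct tagstack_item_sketch **tail,
                                struct tagstack_item_sketch *ts)
{
    assert(ts != NULL);
    if (ts == *head)
        *head = NULL;
    *tail = ts->prev;
    if (ts->prev != NULL)
        ts->prev->next = NULL;
    while (ts != NULL) {
        struct tagstack_item_sketch *next = ts->next;
        free(ts);
        ts = next;
    }
}

/* Search from the most recent tag backwards for a name match. */
static struct tagstack_item_sketch *
tagstack_find_sketch(struct tagstack_item_sketch *tail,
                     const char *name, size_t len)
{
    for (; tail != NULL; tail = tail->prev) {
        size_t tlen = (size_t)(tail->tagname_end - tail->tagname_begin);
        if (tlen == len && strncasecmp(tail->tagname_begin, name, len) == 0)
            return tail;
    }
    return NULL;
}
```

Searching from the tail means the innermost open tag with a given name is matched first. Cleaning up at the end is then a single pop of the head item, if there is one.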