Go to selected websites and save the page to a file

Started by Magnum, October 13, 2011, 11:12:36 PM

Magnum

I am thinking of writing a program that would go to selected websites, wait until the pages are loaded, and then save each web page including its images.

Saving of the webpage is what I don't know how to do.

Have a great day,
                         Andy

fearless

You could use COM to have the IE web control save pages (I don't know whether that's possible, but it probably is, and it might be a lot of work). Alternatively, use InternetOpenUrl and then InternetReadFile to get the content of the page and save it. You would then need to parse the page looking for links (<img src=...> etc.), submit a new request (InternetOpenUrl) for each image or other resource you find, read it with InternetReadFile, and save it in a subfolder or somewhere relative to the image URL path found in the main index page. I'm sure I have seen an example of a simple web browser in asm somewhere; it might be worth searching for, as it could be a good place to start.
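
To make the fetch-and-save step concrete, here is a minimal sketch in Python rather than assembler, just to keep it short. It assumes urllib.request.urlopen as a stand-in for the InternetOpenUrl / InternetReadFile pair; the names copy_stream and save_url and the 8192-byte chunk size are made up for illustration. The loop shape is the same idea as with WinINet: keep reading fixed-size blocks until a read returns nothing.

```python
import urllib.request

CHUNK = 8192  # bytes per read; an arbitrary choice for this sketch


def copy_stream(src, dst, chunk=CHUNK):
    """Read src in fixed-size blocks until it is exhausted, writing each to dst.

    Returns the total number of bytes copied.
    """
    total = 0
    while True:
        block = src.read(chunk)
        if not block:  # an empty read means end of data
            break
        dst.write(block)
        total += len(block)
    return total


def save_url(url, filename):
    """Hypothetical helper: download url and write the raw bytes to filename."""
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        return copy_stream(response, out)
```

Something like save_url("http://example.com/", "index.html") would save the main page; the same call can be reused for each image once its URL is known. There is no error handling here, so a failed request simply raises an exception.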
ƒearless

Magnum

Have a great day,
                         Andy

Tedd

If you're doing it for the challenge, good luck :lol
Though it's not too much work to knock something together that will work as long as nothing goes wrong: send an HTTP request, receive the reply, save the HTML to a file, parse the HTML to find links to CSS files and images, then make requests for those and save them to files.
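
The parsing step could be sketched roughly like this, again in Python for brevity and not as the definitive way to do it. It assumes the only resources of interest are <img src=...> and <link rel="stylesheet" href=...>; ResourceFinder and find_resources are hypothetical names. Each link is resolved against the page URL and paired with a relative local path to save it under.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ResourceFinder(HTMLParser):
    """Collects image sources and stylesheet links from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive lowercased
        if tag == "img" and attrs.get("src"):
            self.links.append(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.links.append(attrs["href"])


def find_resources(page_url, html):
    """Return a list of (absolute_url, local_relative_path) pairs, no duplicates."""
    finder = ResourceFinder()
    finder.feed(html)
    result = []
    for link in finder.links:
        absolute = urljoin(page_url, link)  # resolve relative links
        local = urlparse(absolute).path.lstrip("/")
        if local and (absolute, local) not in result:
            result.append((absolute, local))
    return result
```

Each pair returned is one more request to make and one more file to write. Images referenced from inside the CSS, scripts, and frames are not handled, which is part of the "as long as nothing goes wrong" caveat.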

If you want a program that already does it for you: http://gnuwin32.sourceforge.net/packages/wget.htm
No snowflake in an avalanche feels responsible.

jj2007

Quote from: Magnum on October 13, 2011, 11:12:36 PM
I am thinking of writing a program that would go to selected websites, wait until the pages are loaded, and then save each web page including its images.

The good news is that such a program exists, see attachment ::)

anunitu

There is this, and it seems the source is also available, though I think it is in C++.

http://www.httrack.com/

I have used this off and on. It is considered an offline viewer, and it can download a whole site or pick and choose pages (including all graphics on the page).

The source might give you some idea about writing your own program in assembler.

Magnum

Thanks Tedd, anunitu, and jj2007.

I will try out your suggestions, no need to re-invent the wheel.

Have a great day,
                         Andy

jj2007

Quote from: Magnum on October 15, 2011, 11:29:06 PM
I will try out your suggestions, no need to re-invent the wheel.

It has always worked for MSIE. I remember vaguely that Mozilla developers refused to implement it in Firefox for "security reasons" ::)

MSIE 7 greeted me with a "run once" page, saying I could choose my search provider. Either the predefined Live Search, or another one. So I clicked "another one", and guess what? MSIE didn't find Google :green2