#338851 - 02/16/11 11:57 PM Re: Downloading HTML files [Re: Kruncher]
pmbuko Offline
shareholder in the making

Registered: 04/02/03
Posts: 16437
Loc: Ben Lomond, California
To avoid any sort of browser "infection", you could try curl for Windows. It's built in on most UNIX/Linux OSes and very easy to use. E.g., from the command prompt:

curl http://www.somesite.com/stuff.html -o somesite-stuff.html

This would download the stuff.html file from somesite.com and save it locally as somesite-stuff.html.

Since it's a command-line utility, you could batch a bunch of sites together.
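
As a rough sketch of that batching idea, the helpers below read a list of URLs from a text file and save each page under a name built from its URL. The function names (url_to_filename, fetch_all) and the list file are made up for illustration, and the sketch assumes a Unix-style shell (on Windows, something like Cygwin or Git Bash) with curl on the PATH. URLs containing query strings would need extra care.

```shell
#!/bin/sh
# Hypothetical helper: turn a URL into a flat local file name, e.g.
# http://www.somesite.com/stuff.html -> www.somesite.com-stuff.html
url_to_filename() {
    name=${1#*://}          # drop the scheme (http://, https://)
    name=${name%/}          # drop one trailing slash, if any
    printf '%s\n' "$name" | tr '/' '-'
}

# Hypothetical helper: download every URL listed in a text file,
# one URL per line. Blank lines and lines starting with # are skipped.
fetch_all() {
    while IFS= read -r url; do
        case $url in ''|'#'*) continue ;; esac
        # -sS hides the progress meter but still reports errors
        curl -sS "$url" -o "$(url_to_filename "$url")" ||
            echo "failed: $url" >&2
    done < "$1"
}
```

With the sites listed in a file such as urls.txt, running fetch_all urls.txt would fetch them one after another into the current directory.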
_________________________
I can explain it to you but I can't understand it for you.

#338855 - 02/17/11 12:02 AM Re: Downloading HTML files [Re: Kruncher]
Kruncher Offline
devotee

Registered: 10/05/06
Posts: 484
Loc: Maple Ridge, BC
The plot thickens. The fellow who approached me with this in the first place maintains that he was using IE8 on both a Win 7 box and an XP box. Good, usable results files downloaded with the XP system, and bloated unmanageable files on the Win 7 box.

He bought a Win 7 box to progress with his project, but instead it's stopped the project dead in its tracks. He's still got the XP box, but the future is with Win 7, so that's what he'd prefer to use. Understandable, I believe.

#338856 - 02/17/11 12:07 AM Re: Downloading HTML files [Re: pmbuko]
Kruncher Offline
That sounds like a great plan, Peter. Thanks very much for that information. Really top notch.

I left my AIX/Unix days behind me in the '90s, and it's easy to forget just how useful the utilities built for those OSes are.

I'll pass that along and will try to post back his feedback here this week.

#338860 - 02/17/11 12:25 AM Re: Downloading HTML files [Re: Kruncher]
ClubNeon Offline
connoisseur

Registered: 02/06/09
Posts: 3466
Loc: Western Maryland, USA
wget may be easier to use than curl. It's my tool of choice.
_________________________
Pioneer PDP-5020FD, Marantz SR6011
Axiom M5HP, VP160HP, QS8
Sony PS4, surround backs
-Chris

#338912 - 02/17/11 02:00 PM Re: Downloading HTML files [Re: ClubNeon]
pmbuko Offline
True. wget is a bit more powerful and I'd use it instead if you need to grab a bunch of different files from a web server and want to filter out anything other than .htm or .html files. Here's an example I used recently:

I have a server that holds install and configuration files for the Linux desktops I deploy and manage. In one of my automated installs, I need to grab the latest NVIDIA driver from my install server. The name of this file is not constant, but it always ends with a .run extension, so I use the following command to grab it:

wget -r -nH -np -nd -A run http://yum1:8080/nvidia/

The options basically say "look at all the files in the nvidia directory on that web server but only grab the ones that have a '.run' file extension." This works since I only ever keep one in there.
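
For anyone who wants the flags spelled out, here is the same wget call wrapped in a small helper with one comment per option. The helper name fetch_by_ext is invented for this sketch; the flags themselves are the ones from the command above.

```shell
# Hypothetical helper: fetch only files with a given extension from one
# directory on a web server.
# Usage: fetch_by_ext EXTENSION URL
fetch_by_ext() {
    # -r   recurse: follow the links found in the directory listing
    # -nH  don't create a local directory named after the host
    # -np  never climb to the parent directory
    # -nd  don't recreate the remote directory tree; save files flat
    # -A   accept list: keep only files whose names end in this suffix
    wget -r -nH -np -nd -A "$1" "$2"
}

# The command from the post, written with the helper:
# fetch_by_ext run http://yum1:8080/nvidia/
```

Swapping the first argument for html should pull just the .html pages from a directory in the same way.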

#338922 - 02/17/11 02:54 PM Re: Downloading HTML files [Re: pmbuko]
ClubNeon Offline
wget can also be as simple as:

wget "http://www.axiomaudio.com/"

That'll create a file named "index.html" in your current directory. So that is easier than curl for getting a single document. You have to at least tell curl what name to save the file with, or it'll just write to the screen.

Of course you can tell wget to save with a different name by just giving it the "-O filename.html" option too. (That's a capital O; lowercase -o sets the log file instead.)

#338925 - 02/17/11 03:16 PM Re: Downloading HTML files [Re: ClubNeon]
pmbuko Offline
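curl -O http://the.url.com/index.html will also save an index.html in your current directory. Note that -O takes the file name from the URL itself, so the URL has to end in one; it won't pick up whatever default file the server gives you for a bare site address.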

#338927 - 02/17/11 03:18 PM Re: Downloading HTML files [Re: pmbuko]
ClubNeon Offline
Still too much typing. :)

#338940 - 02/17/11 05:08 PM Re: Downloading HTML files [Re: ClubNeon]
pmbuko Offline
A spurious criticism for a board regular to make.

#338949 - 02/17/11 05:50 PM Re: Downloading HTML files [Re: pmbuko]
ClubNeon Offline
If I was typing -O every time I wanted to save a file, how would I have time to spend here?