Re: Downloading HTML files
Kruncher #338851 02/17/11 04:57 AM
Joined: Apr 2003
Posts: 16,441
shareholder in the making
Offline
To avoid any sort of browser "infection", you could try curl for Windows. It's built in on most UNIX/Linux OSes and very easy to use. E.g., from the command prompt:

curl http://www.somesite.com/stuff.html -o somesite-stuff.html

This would download the stuff.html file from somesite.com and save it locally as somesite-stuff.html.

Since it's a command-line utility, you could batch a bunch of sites together.
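
A rough sketch of what that batching could look like, assuming a POSIX shell is available (on Windows that means something like Cygwin). The function names and the urls.txt file are made up for illustration, and the naming scheme (host plus path, with slashes turned into dashes) is just one possible choice:

```shell
#!/bin/sh
# Turn a URL into a flat local file name, e.g.
#   http://www.somesite.com/stuff.html -> www.somesite.com-stuff.html
# (hypothetical helper, not part of curl)
url_to_filename() {
    name=${1#*://}      # drop the scheme (http://, https://)
    name=${name%/}      # drop one trailing slash, if any
    printf '%s\n' "$name" | tr '/' '-'
}

# Download every URL listed in a text file, one URL per line.
fetch_all() {
    while IFS= read -r url; do
        [ -n "$url" ] || continue    # skip blank lines
        # -sS: quiet but still report errors, -L: follow redirects,
        # -o: save under the name we computed
        curl -sS -L "$url" -o "$(url_to_filename "$url")"
    done < "$1"
}

# Usage (urls.txt is a placeholder file name):
#   fetch_all urls.txt
```

Put one address per line in the list file and every page ends up in the current directory under a predictable name.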

Re: Downloading HTML files
Kruncher #338855 02/17/11 05:02 AM
Joined: Oct 2006
Posts: 484
devotee
OP Offline
The plot thickens. The fellow who approached me with this in the first place maintains that he was using IE8 on both a Win 7 box and an XP box. Good, usable results files downloaded with the XP system, and bloated unmanageable files on the Win 7 box.

He bought a Win 7 box to progress with his project, but instead it's stopped the project dead in its tracks. He's still got the XP box, but the future is with Win 7, so that's what he'd prefer to use. Understandable, I believe.

Re: Downloading HTML files
pmbuko #338856 02/17/11 05:07 AM
Joined: Oct 2006
Posts: 484
devotee
OP Offline
That sounds like a great plan, Peter. Thanks very much for that information. Really top notch.

I left my AIX/Unix days behind me in the '90s, and it's easy to forget just how useful the utilities built for those OSes are.

I'll pass that along and will try to post back his feedback here this week.

Re: Downloading HTML files
Kruncher #338860 02/17/11 05:25 AM
Joined: Feb 2009
Posts: 3,466
connoisseur
Offline
wget may be easier to use than curl. It's my tool of choice.


Pioneer PDP-5020FD, Marantz SR6011
Axiom M5HP, VP160HP, QS8
Sony PS4, surround backs
-Chris
Re: Downloading HTML files
ClubNeon #338912 02/17/11 07:00 PM
Joined: Apr 2003
Posts: 16,441
shareholder in the making
Offline
True. wget is a bit more powerful and I'd use it instead if you need to grab a bunch of different files from a web server and want to filter out anything other than .htm or .html files. Here's an example I used recently:

I have a server that holds install and configuration files for the Linux desktops I deploy and manage. In one of my automated installs, I need to grab the latest NVIDIA driver from my install server. The name of this file is not constant, but it always ends with a .run extension, so I use the following command to grab it:

wget -r -nH -np -nd -A run http://yum1:8080/nvidia/

The options basically say "look at all the files in the nvidia directory on that web server but only grab the ones that have a '.run' file extension." This works since I only ever keep one in there.
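
For the HTML case this thread is about, the same idea might look something like the sketch below. The grab_html wrapper name and the example.com address are placeholders, and -nH is left out because -nd already flattens everything into the current directory:

```shell
#!/bin/sh
# Recursively fetch only .htm/.html files from one directory of a site.
#   -r   recurse through links
#   -np  never ascend to the parent directory
#   -nd  save everything into the current directory (no directory tree)
#   -A   comma-separated list of accepted file name suffixes
# (grab_html is a hypothetical wrapper, not a wget feature)
grab_html() {
    wget -r -np -nd -A htm,html "$1"
}

# Usage (example.com is a placeholder):
#   grab_html http://www.example.com/docs/
```

Images, scripts and stylesheets get skipped, so only the pages themselves land on disk.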

Re: Downloading HTML files
pmbuko #338922 02/17/11 07:54 PM
Joined: Feb 2009
Posts: 3,466
connoisseur
Offline
wget can also be as simple as:

wget "http://www.axiomaudio.com/"

That'll create a file named "index.html" in your current directory. So that is easier than curl for getting a single document. You have to at least tell curl what name to save the file under, or it'll just write to the screen.

Of course you can tell wget to save with a different name by just giving it the "-O filename.html" option too. (That's a capital O; lowercase "-o" sends wget's log messages to a file instead.)
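
Since the letter case matters and the two tools use it in opposite ways, here is a small side-by-side sketch of saving a page under a name of your choosing. The two function names are invented for the example:

```shell
#!/bin/sh
# Save a URL under a chosen local name.
# Both helpers take the URL first, then the local file name.
# (save_with_wget and save_with_curl are hypothetical wrappers.)

save_with_wget() {
    # wget: capital -O sets the output document.
    # Lowercase -o is different: it writes wget's log messages to a file.
    wget -O "$2" "$1"
}

save_with_curl() {
    # curl: lowercase -o sets the output file.
    # Capital -O is different: it reuses the file name from the URL.
    curl -o "$2" "$1"
}

# Usage:
#   save_with_wget http://www.axiomaudio.com/ axiom.html
#   save_with_curl http://www.axiomaudio.com/ axiom.html
```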


Pioneer PDP-5020FD, Marantz SR6011
Axiom M5HP, VP160HP, QS8
Sony PS4, surround backs
-Chris
Re: Downloading HTML files
ClubNeon #338925 02/17/11 08:16 PM
Joined: Apr 2003
Posts: 16,441
shareholder in the making
Offline
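curl -O http://the.url.com/index.html will also save an index.html in your current directory. Note that -O takes the local name from the last part of the URL, so the URL has to end in a file name; curl won't pick up whatever default document name the server uses.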

Re: Downloading HTML files
pmbuko #338927 02/17/11 08:18 PM
Joined: Feb 2009
Posts: 3,466
connoisseur
Offline
Still too much typing. :D


Pioneer PDP-5020FD, Marantz SR6011
Axiom M5HP, VP160HP, QS8
Sony PS4, surround backs
-Chris
Re: Downloading HTML files
ClubNeon #338940 02/17/11 10:08 PM
Joined: Apr 2003
Posts: 16,441
shareholder in the making
Offline
A spurious criticism for a board regular to make.

Re: Downloading HTML files
pmbuko #338949 02/17/11 10:50 PM
Joined: Feb 2009
Posts: 3,466
connoisseur
Offline
If I was typing -O every time I wanted to save a file, how would I have time to spend here?


Pioneer PDP-5020FD, Marantz SR6011
Axiom M5HP, VP160HP, QS8
Sony PS4, surround backs
-Chris