- To: "Peter Rundle" <slug@xxxxxxxxxxxxxxxxxx>
- Subject: Re: [SLUG] Spider a website
- From: "Jonathan Lange" <jml@xxxxxxxxx>
- Date: Tue, 3 Jun 2008 14:24:30 +1000
- Cc: SLUG <slug@xxxxxxxxxxx>
On Tue, Jun 3, 2008 at 2:20 PM, Peter Rundle <slug@xxxxxxxxxxxxxxxxxx> wrote:
> I'm looking for some recommendations for a *simple* Linux based tool to
> spider a web site and pull the content back into plain html files, images,
> js, css etc.
>
> I have a site written in PHP which needs to be hosted temporarily on a
> server which is incapable (read only does static content). This is not a
> problem from a temp presentation point of view as the default values for
> each page will suffice. So I'm just looking for a tool which will quickly
> pull the real site (on my home php capable server) into a directory that I
> can zip and send to the internet addressable server.
>
> I know there's a lot of code out there, I'm asking for recommendations.
>
I'd use 'wget'. From what you describe, 'wget -r' should be very close
to what you want. Consult the manpage for details about fiddling with
links etc.
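
As a rough sketch of what that might look like (the option set below is one
reasonable guess, not something tested against your site; the function name
mirror_site and its two arguments are made up for illustration):

```shell
#!/bin/sh
# Sketch: pull a dynamic site down as static files with wget.
# mirror_site is a hypothetical helper name; $1 is the site URL and
# $2 is the directory to save into.
mirror_site() {
    # --recursive        follow links within the site
    # --level=inf        no depth limit (the default depth is 5)
    # --page-requisites  also fetch the images, CSS and JS each page needs
    # --convert-links    rewrite links so they work from the local copy
    # --adjust-extension save text/html pages with a .html suffix
    #                    (older wget releases call this --html-extension)
    # --no-parent        never climb above the starting directory
    wget --recursive --level=inf --page-requisites --convert-links \
         --adjust-extension --no-parent \
         --directory-prefix="$2" "$1"
}
```

You would then call it with something like
mirror_site http://your-home-server/ static-site
(that address is a placeholder) and zip up the resulting directory. The
suffix option matters for a PHP site because a static-only server will
probably not serve a file named foo.php as HTML.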
jml