Saturday, July 23, 2011

Wget (or curl) can be a magic command at the right times.

Here are some options I’ve used to retrieve a directory from svn.

I wanted to download everything below
https://remotehost/svn/Base/Streams/1.3/Projects/ProjectName/trunk/osb/Interface/Resources

wget -r --no-parent -nH --cut-dirs=10 --no-check-certificate --http-user=user --http-password=password https://remotehost/svn/Base/Streams/1.3/Projects/ProjectName/trunk/osb/Interface/Resources

The options used above, plus a few related ones:

-r = recursive. Follows links and descends into subdirectories.
--no-parent = never ascend to the parent directory, even if some links point there.
-nH = ignore the host when saving any resources (removes the remotehost directory).
--cut-dirs=10 = remove 10 directory levels when saving resources. This removes /svn/Base/Streams/1.3/Projects/ProjectName/trunk/osb/Interface/Resources, so only the subdirectories and filenames beneath it are saved.
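
One way to arrive at the --cut-dirs value is to count the path components of the URL instead of counting by eye. A minimal sketch in plain shell, assuming the URL has no query string (the variable names url, path and cut_dirs are mine, not part of wget):

```shell
# URL from the example above
url="https://remotehost/svn/Base/Streams/1.3/Projects/ProjectName/trunk/osb/Interface/Resources"

path="${url#*://}"   # drop the scheme: remotehost/svn/Base/...
path="${path#*/}"    # drop the host:   svn/Base/...
path="${path%/}"     # drop a trailing slash, if any

# Each non-empty line after splitting on "/" is one directory component
cut_dirs=$(printf '%s\n' "$path" | tr '/' '\n' | grep -c .)

echo "--cut-dirs=$cut_dirs"   # prints --cut-dirs=10
```

Use a smaller number if you want to keep some of the trailing directories in the saved tree.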

--no-check-certificate = ignore any HTTPS certificate problems. Handy for local sites with untrusted certs.

-m = mirror. Handy shorthand for -r -N -l inf --no-remove-listing.

-Rindex.html = reject index.html pages, so the generated directory listings are not saved.

--http-user, --http-password = supply HTTP basic auth credentials if required.
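
As an illustration of how -m and -R fit with the other options, here is a sketch of a mirror-style variant of the command. It is an assumption about how the options could be combined, not a command copied from a working session; remotehost, user and password are the same placeholders as above. Note that during a recursive run wget still fetches HTML pages to follow their links, and deletes the rejected ones afterwards.

```shell
url="https://remotehost/svn/Base/Streams/1.3/Projects/ProjectName/trunk/osb/Interface/Resources"

# -m stands in for -r (it implies -r -N -l inf --no-remove-listing);
# -Rindex.html rejects the directory listing pages.
wget_cmd="wget -m --no-parent -nH --cut-dirs=10 -Rindex.html --no-check-certificate --http-user=user --http-password=password $url"

# Print the command for review; copy it into the shell to run it
echo "$wget_cmd"
```

Because -m turns on timestamping (-N), running the same command again later should only re-fetch files that have changed on the server.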

Proxy
--proxy-user=, --proxy-password= specify proxy auth if required.

To specify the proxy itself, set environment variables,
e.g. export http_proxy=http://192.168.10.250:80


The variables are:

http_proxy
https_proxy
If set, the http_proxy and https_proxy variables should contain the URLs of the proxies for HTTP and HTTPS connections respectively.

ftp_proxy
This variable should contain the URL of the proxy for FTP connections. It is quite common that http_proxy and ftp_proxy are set to the same URL.

no_proxy
This variable should contain a comma-separated list of domain extensions the proxy should not be used for. For instance, if the value of no_proxy is ‘.mysite.com’, the proxy will not be used to retrieve documents from hosts under mysite.com.
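
Putting the proxy settings together, a shell session might look roughly like the sketch below. The proxy address is the one from the example above; proxyuser, proxypass and the example.com URL are hypothetical placeholders.

```shell
# Route HTTP, HTTPS and FTP through the same proxy
export http_proxy=http://192.168.10.250:80
export https_proxy="$http_proxy"
export ftp_proxy="$http_proxy"

# Bypass the proxy for hosts under mysite.com
export no_proxy=.mysite.com

# Add proxy credentials only if the proxy asks for them, e.g.
# wget --proxy-user=proxyuser --proxy-password=proxypass http://example.com/

# Show what wget will inherit from the environment
echo "http_proxy=$http_proxy https_proxy=$https_proxy ftp_proxy=$ftp_proxy no_proxy=$no_proxy"
```

The exports last only for the current shell session; put them in your shell profile if you need them every time.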
