Forum www.public4you.fora.pl Main Page

www.public4you.fora.pl
Forum of the blog newsletter The Public.
 

peep toe pump Powerful tool wget to download detai

 
Author: du17qeo69
Rank: Coraz więcej gadania

Joined: 29 Apr 2011
Posts: 105
Read: 0 topics
Warnings: 0/5
From: England

Posted: Sun 15:17, 29 May 2011    Subject: peep toe pump Powerful tool wget to download detai

wget to download, in detail

A. Basic usage
If you have never used the download tool wget, give it a try; it is remarkably powerful. The details follow:




$ wget [link widoczny dla zalogowanych]




It can walk an FTP site and download the entire directory tree at every level. Of course, if you are not careful, it may follow every link and end up downloading the whole site together with everything the site links to.




$ wget -m [link widoczny dla zalogowanych]




Since wget has this recursive downloading ability, you can use it on a server as a mirroring tool. To make it build a correct mirror, you can restrict which links it follows, which file types it downloads, and so on. For example, to follow links but ignore the associated GIF images:




$ wget -m -L --reject=gif [link widoczny dla zalogowanych]




wget can also resume an interrupted download (the -c option), although this requires support from the remote server.




$ wget -c [link widoczny dla zalogowanych]/file




Resuming can be combined with mirroring, so that a mirror interrupted many times can still, across several sessions, selectively transfer a large number of files from a site. How to automate this completely is discussed further below.

If the constant reconnecting during a download disturbs your work, you can limit the number of times wget retries:

$ wget -t 5 [link widoczny dla zalogowanych]

This gives up after five attempts. Use -t 0 to never give up and retry indefinitely.
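The retry limit of -t can be pictured as a simple loop. Below is a minimal shell sketch of that behavior; retry and fetch are made-up names, and fetch merely simulates a transfer that fails twice before succeeding:

```shell
#!/bin/sh
# Sketch of what a retry limit like "wget -t 5" does: run a command up
# to N times and give up once the limit is reached.
retry() {
    limit=$1; shift
    n=0
    while [ "$n" -lt "$limit" ]; do
        if "$@"; then
            return 0        # transfer succeeded
        fi
        n=$((n + 1))        # transfer failed; count the attempt
    done
    return 1                # all attempts used up, give up
}

# Simulated download: fails until it has been called three times.
fetch() {
    count=$(cat attempts 2>/dev/null || echo 0)
    count=$((count + 1))
    echo "$count" > attempts
    [ "$count" -ge 3 ]      # succeed on the third attempt
}

rm -f attempts
retry 5 fetch && echo "downloaded after $(cat attempts) attempts"
# -> downloaded after 3 attempts
```

With a limit of 2, retry would give up before the stub ever succeeds, which is exactly the trade-off -t expresses.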




B. What about proxy servers?
You can download through a proxy either with the http_proxy environment variable or via the .wgetrc configuration file. There is one problem, though: resuming through a proxy may fail repeatedly. When a download through a proxy is interrupted, the proxy's cache keeps the incomplete copy of the file and may serve it back as if it were complete. To work around this, add a specific request header that tells the proxy to bypass its cache:

$ wget -c --header="Pragma: no-cache" [link widoczny dla zalogowanych]

The --header option can add headers of many kinds, through which we can change the behavior of the web server or proxy server. Some sites refuse to serve files to externally originated requests; the content is only delivered when it is reached through some other page on the same site. In that case you can add a Referer header naming such a page:

$ wget --header="Referer: [link widoczny dla zalogowanych]" [link widoczny dla zalogowanych]
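A --header value is nothing more than one literal line added to the HTTP request. A small sketch, with a made-up Referer URL, of how such header lines can be kept in shell variables and printed exactly as they would appear on the wire:

```shell
#!/bin/sh
# Each --header option appends one literal "Name: value" line to the
# request wget sends. Keeping them in variables keeps long command
# lines readable. The Referer URL below is invented for illustration.
nocache='Pragma: no-cache'
referer='Referer: http://example.com/referring/page.html'

# Print the lines as they would travel in the HTTP request.
printf '%s\r\n' "$nocache" "$referer"
```

The variables would then be passed along as wget --header="$nocache" --header="$referer" followed by the URL.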




C. How do I set the download time?
If you share one connection with your office colleagues and want to download some large files without slowing everyone else's network to a crawl, you should avoid peak hours. Of course, that does not mean waiting in the office until everyone has gone home, nor dialing in again after dinner at home. The at command lets you schedule the work precisely:

$ at 2300
warning: commands will be executed using /bin/sh
at> wget [link widoczny dla zalogowanych]
at> press Ctrl-D

This schedules the download for 11 pm. For the arrangement to work, make sure the atd daemon is running.



D. Does the download take a lot of time?
When you need to download a large amount of data and your bandwidth is not enough, you will often find that the work day begins before your scheduled download has finished. Being a good colleague, you can only stop those tasks and start them again in the evening, and repeating this by hand every day is far too cumbersome. It is better to let crontab run things automatically. Create a plain text file, say crontab.txt, containing:

0 23 * * 1-5 wget -c [link widoczny dla zalogowanych]
0 6 * * 1-5 killall wget

A crontab file specifies tasks to run on a regular basis: the first five columns state when to run the command, and the remainder of each line tells cron what to execute. The first two columns give the minute and the hour: here wget starts at 11 pm, and at 6 am every running wget is stopped by killall. The * in the third and fourth columns means the task applies to every day of every month, and the fifth column (1-5) restricts it to the working days of the week.

So on every working day the download starts at 11 pm, and by 6 am every wget task has been stopped. Install the crontab with:

$ crontab crontab.txt

Because wget runs with -c, each night it resumes where the last session stopped, and once the whole file has arrived it simply exits, having nothing left to transfer. I have used this method repeatedly to download many ISO images over a shared dial-up line; it is most practical.
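To make the column layout concrete, the five time fields of the killall line above can be pulled apart with awk (a throwaway check, not part of the download setup):

```shell
#!/bin/sh
# Split a crontab entry into its five time fields plus the command.
# Fields: minute, hour, day-of-month, month, day-of-week, then command.
line='0 6 * * 1-5 killall wget'
minute=$(echo "$line"  | awk '{print $1}')
hour=$(echo "$line"    | awk '{print $2}')
weekday=$(echo "$line" | awk '{print $5}')
command=$(echo "$line" | awk '{for (i = 6; i <= NF; i++) printf("%s%s", $i, i < NF ? " " : "")}')
echo "at minute $minute of hour $hour, on weekdays $weekday, run: $command"
# -> at minute 0 of hour 6, on weekdays 1-5, run: killall wget
```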



E. How do I download dynamic web pages?

Some pages are regenerated on request and change several times a day. Technically the target is then no longer a file and has no fixed size, so the -c argument loses its meaning. For example, a PHP-generated and frequently changing Linux weekend news page:

$ wget [link widoczny dla zalogowanych]/bigpage.php3




The network conditions in my office are often poor and gave me a lot of trouble with downloads, so I wrote a simple script to detect whether a dynamic page has been transferred completely:

#!/bin/sh

# create it if absent
touch bigpage.php3

# check if we got the whole thing
while ! grep -qi '</html>' bigpage.php3
do
    rm -f bigpage.php3
    # download LWN in one big page
    wget [link widoczny dla zalogowanych]
done

This script keeps downloading the page until it contains a closing </html> tag, which ensures we have a complete copy.
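The loop can be exercised without any network at all: replace wget with a stub that first writes a truncated page and only then a complete one. Everything below (fake_wget, its two-attempt behavior) is invented purely to test the logic:

```shell
#!/bin/sh
# Local simulation of the completeness check: the "download" first
# delivers a truncated page, then a complete one ending in </html>.
tries=0
fake_wget() {
    tries=$((tries + 1))
    if [ "$tries" -lt 2 ]; then
        printf '<html><body>partial' > bigpage.php3          # cut-off transfer
    else
        printf '<html><body>ok</body></html>' > bigpage.php3 # full page
    fi
}

touch bigpage.php3                  # create it if absent
while ! grep -qi '</html>' bigpage.php3
do
    rm -f bigpage.php3
    fake_wget                       # stands in for the real wget call
done
echo "complete after $tries fetches"
# -> complete after 2 fetches
```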



F. What about SSL and Cookies?
If you want to reach content over SSL, the site address simply starts with https://, and wget handles it readily. Some sites force their users to go through a browser that carries a cookie, so to download correctly you must supply the Cookie header the site expects. For the cookie file format used by lynx and Mozilla, you can extract the value like this:

$ cookie=$( grep nytimes ~/.lynx_cookies | awk '{printf("%s=%s;",$6,$7)}' )

assuming you have already used the browser to complete the registration on the site. w3m uses a different, more compact cookie file format:

$ cookie=$( grep nytimes ~/.w3m/cookie | awk '{printf("%s=%s;",$2,$3)}' )

Now run the download:

$ wget --header="Cookie: $cookie" [link widoczny dla zalogowanych]

or use the curl tool:

$ curl -v -b $cookie -o supercomp.html [link widoczny dla zalogowanych]
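The awk field positions can be sanity-checked against a fabricated cookie line laid out like a Netscape-style lynx cookie file (domain, flag, path, secure flag, expiry, name, value); the cookie name and value below are made up:

```shell
#!/bin/sh
# Verify the lynx-format extraction on a fabricated cookie line.
# Columns: domain  flag  path  secure  expiry  name  value
printf 'www.nytimes.com\tFALSE\t/\tFALSE\t1999999999\tNYT-S\t0abc123\n' > sample_cookies

# Same one-liner as above, pointed at the sample file.
cookie=$( grep nytimes sample_cookies | awk '{printf("%s=%s;",$6,$7)}' )
echo "Cookie header value: $cookie"
# -> Cookie header value: NYT-S=0abc123;
```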




G. How do I create address lists?
So far we have downloaded single files or mirrored entire sites. Sometimes we need to download the files behind a large number of links on one page, but without mirroring the whole website. For instance, we might want just the first 20 of 100 songs listed in descending order. Note that wget's own accept/reject filtering will not do the job here, because it only operates on file names. Instead, dump the page with lynx and filter its links with the standard text tools:

$ lynx -dump ftp://ftp.ssc.com/pub/lg/ | grep 'gz$' | tail -10 | awk '{print $2}' > urllist.txt

lynx's output can be post-processed with any of the GNU text-processing tools. In the example above we extract the link addresses ending in gz and store the last ten of them in a list file. A small shell loop then downloads every URL in that file:

$ for x in $(cat urllist.txt)
> do
> wget $x
> done

With that, we have successfully downloaded the latest 10 issues from the Linux Gazette site (ftp://ftp.ssc.com/pub/lg/).
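The grep/tail/awk filter can be tried offline by feeding it text shaped like the numbered reference list lynx -dump appends to its output (the file names below are invented):

```shell
#!/bin/sh
# Simulate the link-list stage of "lynx -dump | grep | tail | awk".
# dump.txt mimics lynx's numbered reference list; only .gz links survive.
cat > dump.txt <<'EOF'
  1. ftp://ftp.ssc.com/pub/lg/README
  2. ftp://ftp.ssc.com/pub/lg/lg-84.tar.gz
  3. ftp://ftp.ssc.com/pub/lg/lg-85.tar.gz
EOF

grep 'gz$' dump.txt | tail -10 | awk '{print $2}' > urllist.txt
cat urllist.txt
```

urllist.txt ends up holding only the two tarball URLs, ready for the wget loop above.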








H. Expanding the use of bandwidth

If the file you choose to download is bandwidth-limited on the server side, the transfer becomes very slow even though you have plenty of capacity to spare. The following trick can greatly shorten the download, but it requires curl and a remote file that is offered by several mirrors. For example, suppose you want to download Mandrake 8.0 from these three addresses:

url1=[link widoczny dla zalogowanych]
url2=[link widoczny dla zalogowanych]/Mandrake/iso/Mandrake80-inst.iso
url3=[link widoczny dla zalogowanych]

The file is 677,281,792 bytes long, so start three curl processes, each taking a different byte range (-r) from a different server:

$ curl -r 0-199999999 -o mdk-iso.part1 $url1 &
$ curl -r 200000000-399999999 -o mdk-iso.part2 $url2 &
$ curl -r 400000000- -o mdk-iso.part3 $url3 &

This creates three background processes, each transferring a different part of the ISO file from a different server. When the three processes end, a simple cat joins the pieces: cat mdk-iso.part? > mdk-80.iso (checking the md5 sum before burning the disc is strongly recommended).
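The split-and-rejoin step can be rehearsed locally: carve a file into byte ranges with dd (standing in for the three ranged curl transfers), glue them back with cat, and compare checksums. cksum is used here because it is available everywhere md5sum may not be:

```shell
#!/bin/sh
# Local rehearsal of ranged downloads: split one file into three byte
# ranges, rejoin them, and confirm the result matches the original.
printf 'abcdefghijklmnopqrstuvwxyz' > source.bin     # 26-byte stand-in "ISO"

dd if=source.bin of=part1 bs=1 skip=0  count=10 2>/dev/null   # bytes 0-9
dd if=source.bin of=part2 bs=1 skip=10 count=10 2>/dev/null   # bytes 10-19
dd if=source.bin of=part3 bs=1 skip=20          2>/dev/null   # bytes 20-end

cat part? > joined.bin          # same glob trick as cat mdk-iso.part?
cksum source.bin joined.bin     # identical crc and size on both lines
```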




Conclusion

Do not worry that downloading in non-interactive mode will affect your results. No matter how hard web designers rack their brains to stop us from scripting against their sites, we can get free tools to automate the download tasks. This will greatly enrich our web experience.


Powered by phpBB © 2001, 2005 phpBB Group
deoxGreen v1.2 // Theme created by Sopel stylerbb.net & programosy.pl
