Archive for the ‘Web’ Category

Script for Downloading Images and Links From a Web Page

There are occasions when you might wish to download some or all of the images that a web page links to, such as when a thumbnail image links to a larger version of the same image (view an example of one such page).  You might also wish to obtain a list of all hyperlinks that are referenced in a web page.  After running across Guillermo Garron’s article, in which he provides some creative commands for performing both of these tasks, I decided that it would be fun to write a script that does all of this for you.  My Bash script is called “imageDownloader”, although in addition to downloading images, it will also create a text file containing all of the hyperlinks that are referenced from an HTML page.  Please note that the images that are downloaded are not the images displayed on the web page itself, but the images that the page links to.
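For the curious, the two tasks can be sketched in a few lines of Bash.  This is a minimal sketch and not the imageDownloader script itself: the function names are hypothetical, and it assumes that lynx prints a numbered list of references when called with -dump -listonly and that wget is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not taken from imageDownloader.

# Read lynx's numbered reference list on stdin (lines such as
# "   1. http://example.com/a.jpg") and print only the unique URLs.
extract_urls() {
    awk '$1 ~ /^[0-9]+\.$/ && $2 ~ /^https?:\/\// { print $2 }' | sort -u
}

# Read URLs on stdin and keep those ending in a common image extension.
filter_image_links() {
    grep -Ei '\.(jpe?g|png|gif)$'
}

# Print every hyperlink that lynx finds in the page at $1.
list_links() {
    lynx -dump -listonly "$1" | extract_urls
}

# Download every image that the page at $1 links to into directory $2.
# Assumes the page links to at least one image.
download_linked_images() {
    local url=$1 dest=$2
    mkdir -p "$dest"
    list_links "$url" | filter_image_links | wget -q -P "$dest" -i -
}
```

With these functions loaded, something like list_links http://ubuntustudio.org/screenshots > links.txt would save the list of hyperlinks, and download_linked_images with the same URL and a directory name would fetch the linked images.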

Upon executing the script, the user is welcomed with a short message that explains what the program does, and gives the user a series of choices:

This program will allow you to do one of the following:
(1) List all hyperlinks referenced in a web page and store the list in a text file
(2) Download all images that are hyperlinked from a web page,
    such as when you would click on a thumbnail image
    in order to view a larger version of the same image.

*************************************************************************************************
This script relies on the program called "lynx", so if you don't already have it installed,
you may want to quit (q) now and install "lynx".
*************************************************************************************************

What would you like to do?
Enter "1" to download a list of hyperlinks, "2" to download images that this page links to, or "q" to QUIT:

So, as requested, enter the choice that best suits your needs, and make sure that you already have “lynx” installed.  Entering either option 1 or 2 will prompt you to enter the desired URL.  It is helpful to use a terminal emulator that allows copy/paste; my personal favorite is Terminator, which incidentally allows you to split your terminal window into multiple panes.  You will then be asked to enter the name of the directory where you wish to save either the text file containing the list of hyperlinks or the downloaded images, and then the script begins working its magic.  You’ll have the option to start over or quit the program at the end.
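The prompt flow described above could be sketched roughly as follows.  Again, this is only an illustration under my own assumptions and not the actual code from imageDownloader: every function name here is hypothetical, and the loop only reports what it would do instead of calling lynx.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the menu flow; not the real imageDownloader code.

# Succeed only if the named program is on the PATH.
have_program() {
    command -v "$1" >/dev/null 2>&1
}

# Translate the user's menu answer into an action name.
choose_action() {
    case "$1" in
        1)   echo "list_links" ;;
        2)   echo "download_images" ;;
        q|Q) echo "quit" ;;
        *)   echo "invalid" ;;
    esac
}

# Ask one question and print the answer; fails when input has run out.
ask() {
    local answer
    read -r -p "$1 " answer || return 1
    echo "$answer"
}

# Keep asking until the user quits or input runs out.
main_menu() {
    have_program lynx || { echo 'Please install "lynx" first.' >&2; return 1; }
    local answer action url dir
    while true; do
        answer=$(ask 'Enter "1", "2", or "q":') || return 1
        action=$(choose_action "$answer")
        case "$action" in
            quit)    return 0 ;;
            invalid) echo "Please enter 1, 2, or q." ;;
            *)
                url=$(ask "URL:") || return 1
                dir=$(ask "Directory to save into:") || return 1
                echo "Would run $action on $url, saving into $dir"
                ;;
        esac
    done
}
```

Calling main_menu at the end of the file would start the loop.  Keeping the answer-to-action mapping in its own small function makes it easy to check without typing anything at a prompt.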

Note: Although the concepts used here are not overly difficult, writing this script was a fun learning experience for me.  For those who are more experienced coders, if you see places where I could improve my coding practices, please feel free to send me your suggestions and upgrades for this little program.

You can download or view the imageDownloader script here, or follow the process outlined below.  You may save it without the “.txt” file extension if you like; I added the extension only to make the script viewable from the comfort of your web browser.  Remember to make the file executable before running it.

$ wget http://www.hilltopyodeler.com/scripts/imageDownloader.txt
$ mv imageDownloader.txt imageDownloader
$ chmod +x imageDownloader
$ ./imageDownloader

When prompted to enter a URL, you might like to try using the example page that I used above for downloading images (copy/paste): http://ubuntustudio.org/screenshots

Happy downloading!

Collaborative Text Editing With EtherPad

Recently, I have become aware of a really neat online collaboration tool called EtherPad. EtherPad allows you and many other participants to collaborate on creating and modifying a text document. It’s much like working in Notepad, Gedit, Kate, or any other basic text editor, but what makes it different is that you and your colleagues can work on the document together online (each from the comfort of your own computer) and make changes in real time. You can save the document at any time, and you can also revert to a previously saved version. No special text formatting is available because this is a plain-text tool, but you could easily paste the final version of the document into OpenOffice.org (or whatever your favorite word processor is) and format it to your liking. EtherPad also incorporates a chat tool so that you can easily communicate with your colleagues while editing the document (or you could use a telephone).

Because the tool has proven so popular, the folks at EtherPad have set up a wait-list that you will have to sign up for if you wish to try the beta version of this free software. I received an email from them today letting me know the following:

“20 days ago you signed up for the EtherPad Beta.  Thanks for being patient!  Now here’s a link that will allow you to create new pads immediately…”
[you can get your own link by signing up]

Below is a screenshot of EtherPad in action. Notice that the different colors represent text that was either written or modified by a different user. The text that is highlighted in green is the original version of the document.

EtherPad Screenshot

You can also read more about the product on their product page and in their FAQ. I am really excited about the release of this new tool and can see its potential for use in education, research, and numerous other fields, as well as for personal use.

WordPress Upgrade

Just upgraded from 2.1.2 to 2.6.5.  Had to blow out some of the cobwebs, but all seems to be fine.  Will try to post here more often.

Also, I enabled reCAPTCHA.  CAPTCHA images are used to prevent spam-bots from spamming you through a form field, and I like reCAPTCHA because when you type in the words shown in the image, you are helping to digitize books that have been scanned in.  Cool!

TwEEt!

Recently I attended a conference in Raleigh-Durham.  The topic of “social networking” seemed to be on everyone’s minds, including services such as Twitter, Second Life, del.icio.us, Facebook, MySpace, and more.

Social Bookmarks: I can see the benefit of creating social bookmarks and using a service such as del.icio.us.  By creating a searchable bookmarks page online, not only can you visit your bookmarked websites from any computer that has an Internet connection, but other people can also see what you find interesting.  This is a good way for colleagues to stay abreast of each other’s research.

Second Life: This 3-D virtual world seems to me to be a little bizarre as well as time-consuming, but I can certainly see the advantages of being able to meet and communicate with others in a virtual environment.  You can go sit down (virtually) next to other virtual people, watch a movie or view a presentation, and then discuss the presentation with the others.  Pretty neat way for people to come together and communicate or collaborate.

Twitter: At first, this seemed to me to fall into the category of stupid activities.  After reading more about it, I learned that other people usually feel the same way at first.  Now, I find that it’s interesting to read what other people are doing throughout the day, as well as to post some of the things that I have been accomplishing.  This is a form of micro-blogging, and each “tweet” is 140 characters or fewer.  It would be fun to see other friends on this service and be able to catch a glimpse into their daily lives.  Also, it’s interesting to see colleagues tweeting and see what they are up to throughout the day.  Tweet me @hilltop_yodeler.

As for Facebook and MySpace, I just can’t seem to like either of these sites.  Any MySpace or Facebook site I’ve ever visited is painfully difficult to look at; they usually score a 10 on the ugly-factor, they’re difficult to read, and they’re full of comments from “friends” who are generally stating nothing of particular importance (but I guess that’s the whole point).  Seems like a complete waste of time to me, except perhaps for people who have no idea about how to build their own website.

Let’s not forget about blogging.  This is also a form of social networking, although it can often be a more informative form of communication than some of the services listed above.


-==[ Hilltop_Yodeler ]==-

Welcome to HilltopYodeler, a place where we'll do some hollerin' about Linux, OSS/FOSS, CSS/XHTML, pickin', paddlin', tinkering, snow, rock, bicycles, and other stuff that we're freaky for. Much of what will be discussed here will be related to Ubuntu Linux, Debian Linux, Crunchbang (#!) Linux, Damn Small Linux, OpenBox, PekWM, and Gnome. Grab your coffee... pick up your piolet... tuck in your whiskey nipper... have paddle in hand... grease your boards... bend some wires... plug into your lappie, mow down some sushi... and get your fool-freak yodel on!