Journal

By Steve Challis

A Portable Bash Function to List Process Hierarchies

2 years, 1 month ago — 0 Comments — Permalink

  • bash
  • script

It can be useful to find the process hierarchy of a particular process when diagnosing a problematic machine. Say for instance you run into a rogue program and want to find out what spawned it before you kill it. The Unix command pstree (or ptree on Solaris) will show you a tree of running processes but is not always available.

Since I ran into this problem recently, I concocted the following recursive function to traverse a process tree to its root from a given pid:

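parents() {
    local pid=$1 ppid name
    # "ppid=" and "comm=" tell ps to print the two columns without a header
    read -r ppid name <<< "$(ps -o ppid= -o comm= -p "$pid" 2>/dev/null)"
    echo -e "$pid\t$name"
    # Recurse into the parent until pid 0, or until ps no longer finds the process
    [[ -n $ppid && $pid -ne 0 ]] && parents "$ppid"
}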

Let’s see it in action:

stevechallis:~$ parents 28311
28311   /bin/sh
488     /Applications/Emacs.app/Contents/MacOS/Emacs
1       /sbin/launchd
0
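
The same walk can also be written as a loop rather than a recursion. What follows is only a sketch: the names parents_iter and proc_info are mine, and it assumes a ps that accepts the POSIX "-o name=" form for suppressing headers.

```shell
#!/bin/sh
# Hypothetical helper: print "ppid command" for one pid, or nothing if ps finds no match.
proc_info() {
    ps -o ppid= -o comm= -p "$1" 2>/dev/null
}

# Hypothetical loop-based variant of parents(). The body runs in a subshell
# (note the parentheses) so its variables do not leak into the calling shell.
parents_iter() (
    pid=$1
    while [ -n "$pid" ]; do
        # Split the ps output into the parent pid and the rest of the line
        read -r ppid name <<EOF
$(proc_info "$pid")
EOF
        printf '%s\t%s\n' "$pid" "$name"
        # Stop at pid 0, or when ps returned nothing for this pid
        if [ "$pid" -eq 0 ] || [ -z "$ppid" ]; then
            break
        fi
        pid=$ppid
    done
)
```

Running parents_iter $$ should list the current shell followed by its ancestors. Keeping the ps call in its own small function makes it easy to swap in a different lookup on systems where ps behaves differently.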

NASA Astronomy Picture of the Day Background

2 years, 2 months ago — 1584 Comments — Permalink

  • nasa
  • astronomy
  • script
  • bash

Update: An improved version of this script now lives on Github Gist at https://gist.github.com/1144996

NASA have this really neat Astronomy Picture of the Day webpage with some incredible pictures. I thought it’d be neat to set these as my desktop background, so I wrote the following script. It downloads the most recent picture as image.jpg into whatever folder the script lives in, deleting any .jpg files already there each time it runs. Add it to a daily cron job, set your desktop background to image.jpg, and you should get a changing picture.

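#!/bin/sh
# Fetch the latest Astronomy Picture of the Day into this script's directory
dst=$(dirname "$0")
base="http://apod.nasa.gov"
# Remove only previously downloaded JPEGs; other files are left alone
rm -f "$dst"/*.jpg
# Take the first <link> in the feed: the page for the current picture
page=$(wget -qO- "$base/apod.rss" |
    grep "link" | head -n 1 |
    sed 's/.*<link>\(.*\)<\/link>.*/\1/')
# The first href starting with "image" on that page is the picture itself
img=$(wget -qO- "$page" |
    grep 'href="image' | head -n 1 |
    sed 's;.*"\(.*\)".*;\1;')
# Download directly rather than piping generated commands into bash
[ -n "$img" ] && wget -O "$dst/image.jpg" "$base/$img"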

The accompanying crontab entry will be something along the lines of:

0   0   *   *   *   ~/Pictures/Wallpapers/apod/getpic.sh
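
The link-extraction step is the fragile part of the script, so it can help to pull it into a function that can be tried against a saved copy of the page. This is a sketch only: first_image_path is a name I have made up, and it assumes the picture link looks like href="image/...", as the grep in the script does.

```shell
#!/bin/sh
# Hypothetical helper: read an APOD page on stdin and print the path of the
# first link whose href starts with "image".
first_image_path() {
    grep 'href="image' | head -n 1 |
        sed 's/.*href="\(image[^"]*\)".*/\1/'
}
```

With a page saved as page.html, first_image_path < page.html should print the relative path that gets appended to the base URL. Unlike the sed expression in the script, which grabs the last quoted string on the line, this one matches the href attribute itself.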

The Lazy Way To Download Journal Articles

2 years, 6 months ago — 3 Comments — Permalink

  • awk
  • jquery
  • script
  • unix
  • wget

So imagine you have a load of PDFs you’d like to download from a website, but they all have the same filename and it takes you a while to rename them, plus it’s getting very tiring clicking on them all. Yep, it’s time for a script …

It turns out Springerlink are offering a bunch of such articles for download on their website and I wanted to browse them all locally. I started out by retrieving a list of titles and URLs, one ‘[title] [url]’ pair per line, using jQuery and the following snippet of code:

$('.journalArticle').each(function() {
    var base = 'http://www.springerlink.com';
    var title = $(this).find('.title a').text();
    var url = $(this).find('.pdf a').attr('href');
    // One "[title] [url]" line per article
    console.log(title + ' ' + base + url);
});

Each article is wrapped in a div with class="journalArticle" so we can just iterate through those and pull out the title and URL (which are conveniently marked with ‘title’ and ‘pdf’ classes respectively). The resulting list can be copied from the browser console into a text file. Now all we need is a script to run through this file and do the downloading:

awk '{
    system("wget --user-agent=1337 " $NF);
    $NF=""; NF--;
    gsub(" ","",$0);
    print $0;
    system("mv fulltext.pdf " $0 ".pdf")
}' mylist.txt

Springerlink reject requests without a user agent header so I’ve added in a false one which they accept nicely. The script just calls wget on the last part of every line (the URL), then removes spaces from the remaining part of the line (the title) and moves the downloaded file to a file with this name. It’s pretty horrible so let’s have another go:

sed 's;\(.*\) \(.*\)\(\.[^.]*\)$;wget --user-agent=1337 "\2\3" -O "\1\3";' mylist.txt | \
   bash -

Sed does a much better job of grabbing the arguments, and utilising the -O parameter of wget is a far easier way to achieve the rename. Job done.
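
For comparison, the same split can be done without generating commands for bash to run, which avoids trouble if a title happens to contain quotes or dollar signs. This is a rough sketch rather than what I actually ran: download_list and the FETCH variable are hypothetical names, and it assumes every line ends in a URL with a file extension.

```shell
#!/bin/sh
# Hypothetical knob: the command used to fetch each file (wget by default).
FETCH=${FETCH:-wget}

# Hypothetical downloader: read "[title] [url]" lines from the file in $1.
# The body runs in a subshell so its variables stay local.
download_list() (
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        url=${line##* }      # everything after the last space
        title=${line% *}     # everything before the last space
        ext=${url##*.}       # extension taken from the URL, e.g. pdf
        "$FETCH" --user-agent=1337 "$url" -O "$title.$ext"
    done < "$1"
)
```

Calling download_list mylist.txt would then fetch each article straight to a file named after its title, spaces included.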

Powered by Mumblr – a basic Django tumblelog application that uses MongoDB with MongoEngine. Fork it on Github. Designed and developed by Harry Marr and Steve Challis.

Unless otherwise noted, everything here is available under the Creative Commons Attribution-Share Alike 3.0 license. Sharing is fucking cool.