Archive for the ‘Bash/Shell Scripting’ Category
Extract list of all Apple WikiServer wiki titles into CSV format
An interesting request came in today from a coworker. She wanted to create a spreadsheet listing all of the wiki pages on our intranet (which runs the Apple WikiServer), presumably because Apple doesn’t provide an easy way to “list all pages” in the wiki itself. Along with each page’s title, she also wanted its internal ID, its URL, and the times it was created and last modified.
I spent about an hour looking into this in the afternoon, and it turns out that much of this information is readily available on the filesystem in the Apple WikiServer’s data store. I whipped up the following shell script to extract this information in CSV format, exactly as requested.
I’m posting this script here in case someone else wants similar “export a list of WikiServer pages to a comma-separated values (CSV) file” functionality but isn’t sure how to go about getting it. To use this, just edit the line that reads http://my-server.example.com/groups/wiki/ so that it refers to the wiki base URI of your own server.
Update: The latest version of this script is now available at its GitHub-hosted repository. You should probably use that instead of the script below.
#!/bin/sh -
#
# Script to extract data from an Apple WikiServer's data store by querying the
# filesystem itself. Creates a 'wikipages.csv' file that's readable by any
# spreadsheeting application, such as Numbers.app or Microsoft Excel.app.
#
# USAGE: To use this script, change to the WikiServer's pages directory, then
# just run this script. A file named wikipages.csv will be created in
# your current directory. For instance:
#
# cd /Library/Collaboration/Groups/mygroup/wiki # dir to work in
# wikipages2csv.sh # run the script
# cp wikipages.csv ~/Desktop # save output
#
# WARNING: Since the WikiServer's files are only accessible as root, this script
# must be run as root to function. Additionally, this is not extremely
# well tested, so use at your own risk.
#
# Author: Meitar Moscovitz
# Date: Mon Sep 22 15:03:54 EST 2008
##### CONFIGURE HERE ########
# The prefix to prepend to generated links. NO SPACES!
WS_URI_PREFIX=http://my-server.example.com/groups/wiki/
##### END CONFIGURATION #####
# DO NOT EDIT PAST THIS LINE
#############################
WS_CSV_OUTFILE=wikipages.csv
WS_PAGE_IDS_FILE=`mktemp ws-ids.tmp.XXXXXX`
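# Print the value stored on the line after <key>NAME</key> in ./page.plist.
extractPlistValueByKey () {
    line=$(grep -n "<key>$1</key>" page.plist | head -n 1 | cut -d ':' -f 1)
    [ -n "$line" ] || return 1
    sed -n "$((line + 1))p" page.plist | cut -d '>' -f 2 | cut -d '<' -f 1
}
# Turn a page title into the form used in page links.
linkifyWikiServerTitle () {
    printf '%s\n' "$1" |
        sed -e 's/ /_/g' -e 's/&/_/g' -e 's/>/_/g' -e 's/</_/g' -e 's/\?//g'
}
# Turn "2008-09-22T15:03:54Z" into "2008-09-22 15:03:54".
formatISO8601date () {
    printf '%s\n' "$1" | sed -e 's/T/ /' -e 's/Z$//'
}
# Quote a field if it contains a comma or a double quote, doubling any
# embedded double quotes so that spreadsheets parse it as one field.
csvQuote () {
    case $1 in
        *\"*|*,*)
            printf '"%s"\n' "$(printf '%s' "$1" | sed -e 's/"/""/g')" ;;
        *)
            printf '%s\n' "$1" ;;
    esac
}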
# List page directories (skipping names that start with "w") and strip
# the ".page" suffix from the five-character hexadecimal page IDs.
ls -d [!w]*.page | \
    sed -e 's/^\([a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]\)\.page$/\1/' > "$WS_PAGE_IDS_FILE"
echo "Title,ID,Date Created,Last Modified,URI" > "$WS_CSV_OUTFILE"
while read -r id; do
    cd "$id.page" || continue
    title=$(extractPlistValueByKey title)
    created_date=$(formatISO8601date "$(extractPlistValueByKey createdDate)")
    modified_date=$(formatISO8601date "$(extractPlistValueByKey modifiedDate)")
    link="$WS_URI_PREFIX$id/$(linkifyWikiServerTitle "$title").html"
    cd ..
    echo "$(csvQuote "$title"),$id,$created_date,$modified_date,$(csvQuote "$link")" >> "$WS_CSV_OUTFILE"
done < "$WS_PAGE_IDS_FILE"
rm "$WS_PAGE_IDS_FILE"
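The lookup in extractPlistValueByKey relies on each value sitting on the line directly after its <key> line. Here is a minimal sketch of the same idea using a single sed call, run against a made-up plist. The sample file contents, the tag types, and the plistValue name are all assumptions for illustration, not taken from a real WikiServer.

```shell
# Hypothetical sample file; a real page.plist holds more keys than this.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>createdDate</key>
    <date>2008-09-22T05:03:54Z</date>
    <key>title</key>
    <string>Meeting Notes, September</string>
</dict>
</plist>
EOF

# plistValue KEY FILE (hypothetical helper)
# Find the <key>KEY</key> line, move to the next line, strip everything up
# to the first ">" and everything from the next "<", then print and stop.
plistValue () {
    sed -n "/<key>$1<\/key>/{
        n
        s/^[^>]*>//
        s/<.*\$//
        p
        q
    }" "$2"
}

plistValue title "$sample"         # prints: Meeting Notes, September
plistValue createdDate "$sample"   # prints: 2008-09-22T05:03:54Z
```

Like the original function, this only works while the key and its value stay on adjacent lines, which is how the script above assumes the files are laid out.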
For those new to the Wiki Server, this introduction to the Apple WikiServer for web developers may be of interest.
One Minute Mac Tip: Remove .DS_Store files from ZIP Archives
The Mac OS X Finder has some nifty features, one of which is an exceptionally useful contextual menu item to create ZIP archives of folders. Unfortunately, the Finder also has some really, really annoying habits, one of which is to create a file named .DS_Store in each folder a user opens (when not in Column view). What this means is that if you create a ZIP archive on your Mac and then send it to someone who unzips it without the Finder (such as a Windows user using the Windows Explorer), the recipient will see a lot of litter in the form of useless and meaningless .DS_Store files.
If you’re not afraid of the Terminal, this can be avoided. Put the following lines in your ~/.profile (or similar):
alias rmds='find . -name ".DS_Store" -type f -print0 | xargs -0 rm'
This creates a new command, rmds, that recursively finds and deletes any regular file named “.DS_Store”, starting from the current directory. Running it in a folder just before you archive that folder cleans it out first and spares your recipient any unnecessary confusion.
Alternatively, you can use the command-line zip program and an (admittedly more complicated) pipeline to remove the .DS_Store files after they have been added to the archive. To do that, use this command:
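If you would rather point the command at a folder than cd into it first, a shell function can do the same job. This is only a sketch: the optional directory argument is an addition that the alias doesn't have, and it uses find's own -exec instead of xargs.

```shell
# Function form of the rmds alias (the directory argument is an addition).
# Deletes every regular file named .DS_Store under the given directory,
# or under the current directory when no argument is given.
rmds () {
    find "${1:-.}" -name '.DS_Store' -type f -exec rm -f {} +
}
```

Run it as rmds to clean the current folder, or as rmds ~/Documents/project to clean another one. Because find hands the file names straight to rm, folder names containing spaces are not a problem.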
zip -d ZIPfile.zip `unzip -l ZIPfile.zip | grep -F .DS_Store | awk '{print $4}'`
where, naturally, ZIPfile.zip is the ZIP archive you want to remove the .DS_Store files from. Creating an alias out of that command (and making it work for paths that contain spaces) is left as an exercise for the reader. ;)
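One possible take on that exercise: zip can match a quoted wildcard pattern against the paths stored in the archive, so the unzip, grep and awk stages may not be needed at all. The function names below are made up for illustration, and this is a sketch rather than a thoroughly tested tool.

```shell
# zip_rmds ARCHIVE (hypothetical name)
# Delete every entry whose path ends in .DS_Store from an existing archive.
# The pattern is quoted so that zip, not the shell, expands it against the
# paths stored in the archive.
zip_rmds () {
    zip -d "$1" '*.DS_Store'
}

# zip_clean ARCHIVE FOLDER (hypothetical name)
# Create the archive without the .DS_Store files in the first place.
zip_clean () {
    zip -r "$1" "$2" -x '*.DS_Store'
}
```

As far as I can tell, zip -d reports an error and exits non-zero when nothing in the archive matches, so an archive that is already clean will produce a complaint rather than silent success.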
As an aside, the alias, find and xargs commands are incredibly useful in their own right and can do a lot of pretty amazing things. As always, man followed by the command name (man xargs, for example) will give you the nitty-gritty.
Also as an aside, you can stop the Finder from creating .DS_Store files entirely when browsing network volumes (like Windows shares) with another command, documented in Apple’s Knowledge Base.
Quick ‘N’ Dirty Drupal Module SVN Tagging Script
In a (rather beastly) project at work today, I found myself needing to import a significant number of contributed Drupal modules into Subversion vendor branches to prepare for custom development. To do so manually would have been quite the hassle, so after downloading the appropriate tarballs and creating a module_name/current directory under my vendor/drupal/modules vendor branch directory, I concocted this little (relatively untested) script to handle the mass tagging operations I needed to perform.
for i in *; do
    v=`grep 'version = "' "$i/current/$i/"*.info |
        cut -d ':' -f 2 |
        sed -e 's/^version = "/v/' -e 's/"$//'`
    svn cp "$i/current" "$i/$v"
done
It’s a bit buggy for some modules that have multiple .info files, but I’m sure a few more pipeline stages can fix that. (Which, because I’m done with this at the moment, I will leave as an exercise to the reader.)
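For what it's worth, here is one way those extra pipeline stages might look. It assumes that every .info file in a module carries the same version line, so taking the first match is good enough. The function names are hypothetical, and this is just as untested against real vendor branches as the original.

```shell
# module_version_tag MODULE (hypothetical name)
# Print a tag name such as "v6.x-1.0" for the module unpacked under
# MODULE/current/MODULE/. grep -h drops the "filename:" prefix, so the
# output has the same shape whether one or several .info files match,
# and head keeps only the first version line.
module_version_tag () {
    grep -h '^version = "' "$1/current/$1/"*.info 2>/dev/null |
        head -n 1 |
        sed -e 's/^version = "/v/' -e 's/"$//'
}

# tag_all_modules (hypothetical name)
# Same loop as before, but skip any directory without a version line.
tag_all_modules () {
    for i in *; do
        v=$(module_version_tag "$i")
        if [ -n "$v" ]; then
            svn cp "$i/current" "$i/$v"
        fi
    done
}
```

Run tag_all_modules from the directory that holds the module folders. If a module's .info files disagree about the version, this silently picks whichever one the shell lists first, so check the result before committing.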
Chalk this one up as another testament to the power of shell scripting and how it can help every developer get their job done faster.