list2xml script – Mnemosyne vocabulary flashcards!
With the October SAT coming up, I had to work on my vocabulary. I found some wordlists on the Internet and in books, but I didn’t find it very convenient to learn from a list.
Flash cards seem to work really well. I found a program called ‘mnemosyne’ that manages flash cards. You assign grades to each card depending on how familiar you are with the question. It then schedules the cards to reappear at appropriate times (‘bad’ cards appear soon, ‘good’ cards appear later).
Now I needed some way to put words into this thing without having to use its ‘Add cards’ feature (I’m rather lazy, and besides, it would take forever that way). I found an SAT cards database online that was already in the mnemosyne format, but it was huge, and the other lists I had found online and in books seemed more ‘accurate’. This called for automation!
I wrote a shell script that takes a newline-separated list of words on standard input and writes a file in the mnemosyne XML format to standard output, with the words as the ‘questions’ and their definitions as the ‘answers’ (these are the definitions as given by WordNet – I’ve found them to be the best, with good examples and synonyms too):-
#!/bin/sh
echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
echo "<mnemosyne core_version=\"1\" time_of_start=\"1224014401\" >"
echo "<category active=\"0\">"
echo " <name>$1</name>"
echo "</category>"
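# Read one word per line; -r keeps backslashes literal.
while IFS= read -r word
do
    [ -z "$word" ] && continue # skip blank lines
    echo
    echo "<item>"
    echo "<cat>$1</cat>"
    echo "<Q>$word</Q>"
    echo "<A>"
    # Fetch the WordNet entry quietly, strip the DICT protocol header and
    # trailer lines, and escape the characters that are special in XML.
    curl -s "dict://dict.org/d:$word:wn" | head --lines=-3 | tail --lines=+5 |
        sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
    echo "</A>"
    echo "</item>"
    echo
done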
echo "</mnemosyne>"
The first argument is the name of the mnemosyne ‘category’ you want the cards to be put into. For example, you can run it like this (assuming you named the script list2xml.sh and have permission to execute it):-
cat wordlist | ./list2xml.sh SAT-Words-1 > satWords1.xml
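If you're wondering what the head and tail in the script are for: curl prints the whole DICT conversation, so the definition arrives wrapped in status lines. Here's a minimal sketch of the trimming on a canned reply. The reply text and the function names trim_dict_reply and sample_reply are made up for illustration and only approximate what dict.org sends, so check against real output before relying on the line counts.

```shell
#!/bin/sh
# Drop the first 4 lines (protocol header) and the last 3 lines (trailer)
# of a DICT reply read from standard input.
# Note: a negative count for head is a GNU coreutils extension.
trim_dict_reply() {
    head --lines=-3 | tail --lines=+5
}

# Made-up reply for illustration; real replies will differ in detail.
sample_reply() {
    cat <<'EOF'
220 dict.example.org banner
250 ok
150 1 definitions retrieved
151 "terse" wn "WordNet (r) 3.0 (2006)"
terse
    adj 1: brief and to the point
.
250 ok
221 bye
EOF
}

# Prints only the two definition lines from the sample.
sample_reply | trim_dict_reply
```

Only the lines between the header and the trailer end up inside the card's answer.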
There are lots of lists online you could convert into the newline-separated-word format required by the script using a vim macro or sed or something. You could even scan in some books and run them through OCR. :P
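As a rough sketch of the sed route, suppose the list you found has one "word - definition" entry per line. That layout is an assumption, and extract_words is just a hypothetical helper name; a differently formatted list would need a different pattern.

```shell
#!/bin/sh
# Assumed input: one "word - definition" entry per line.
# Keep the text before the first " - ", trim surrounding whitespace,
# then drop empty lines and duplicate words.
extract_words() {
    sed -e 's/ - .*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
    grep -v '^$' |
    sort -u
}

# Example with made-up entries: prints "abate" and "terse", one per line.
printf '%s\n' 'abate - to lessen' '  terse - brief' '' 'abate - to reduce' | extract_words
```

The output can then be fed straight into the script's standard input.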
Here’s a picture from mnemosyne with a script-converted category open:-
