I have left Harvard as of July 1, 2008 to take a position at NYU. This website has been cached and left static. Feel free to browse my new website, aka "What the heck is a Clinical Associate Professor?"

09.27.07

Hacking your signature file (baseball bragging rights division)

Posted in Computing at 10:06 am by leingang

I wanted to blog about this for a while, and suddenly it’s almost too late. I hacked together some stuff to put the Red Sox’ magic number for clinching the American League East division in my e-mail signature automatically. The process is replicable for any kind of updating content.

About the magic number

In case you’re still confused, the magic number for a team in first place is a countdown to the “clinching” or mathematical assurance of the championship. Every time the team in first place wins, the number goes down. Every time the team in second place loses, the number goes down. When the number reaches zero, the race is over.

The race is clinched when the first-place team is more games ahead of the second-place team (and all the other teams) than the number of games remaining. So one way to define the magic number is the number of games remaining, minus the lead in the standings, plus one (rounded up in the case of half-game differences in the standings). What amounts to the same thing, however, is the total number of games in the season, minus the number of wins by the first-place team, minus the number of losses by the second-place team, plus one. It’s clear from that last definition that it does what you want (it goes down with each win by the first-place team or loss by the second-place team), and taking the number of games plus one makes it reach zero at the point of clinching.
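
The second definition is easy to check numerically. Here is a minimal sketch in Python; the function name and the 162-game default are assumptions for illustration, and the 67 losses for the second-place team is a hypothetical figure, not something taken from the standings.

```python
def magic_number(leader_wins, runner_up_losses, season_games=162):
    """Games in the season, minus the leader's wins, minus the
    runner-up's losses, plus one; never reported below zero."""
    return max(season_games + 1 - leader_wins - runner_up_losses, 0)


# 94 wins is Boston's total on the day of this post; 67 losses for the
# second-place team is a hypothetical figure chosen for illustration.
print(magic_number(94, 67))  # 163 - 94 - 67 = 2
```

Each additional win by the leader or loss by the runner-up lowers the result by one, which is exactly the countdown behavior described above.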

The Red Sox have been in first place in the AL East since April, and although they have been in first place a lot, they have not finished the division in first place since 1995. I’ve had the magic number in my signature since it was in the 60s (shortly after the all-star break), and today it stands at two. Which means the division could be clinched today.

Generating the content

The idea is simple: write a script which visits an internet location of baseball standings and munges that data to get the magic number. There are tons out there, such as every newspaper’s website or the megasites like ESPN. Scrolling through the source code of Dashboard widgets, I found out that MLB publishes this data in a lightweight, machine-readable format on a secret page. Today’s AL East standings can be found at

http://mlb.com/components/game/year_2007/month_09/day_27/standings_rs_ale.js

This file has lots of standings data in the form of a JavaScript object. This is really useful if you’re writing a JavaScript program to display the standings. XML would have been another choice, but this is just as easy to parse, maybe even easier. Also, this resource includes the magic number, not that your script couldn’t calculate it for you. Here’s an excerpt from the file:

var standings_rs_ale = [{
w: '94',
elim: '-',
rs: '850',
div: 'ale',
gameid: '2007_09_27_minmlb_bosmlb_1',
status: 'P',
pre: null,
last10: '5-5',
onerun: '22-26',
xtr: '2-5',
nextg: '9/27 v MIN, 7:05P',
vsW: '20-17',
ra: '643',
gb: '-',
wrap: '/news/wrap.jsp?ymd=20070926&content_id=2231984&vkey=wrapup2005&fext=.jsp&c_id=mlb',
home: '49-28',
code: 'bos',
pct: '.595',
league_sensitive_team_name: 'Boston',
vsC: '20-11',
vsE: '42-30',
vsR: '69-41',
vsL: '25-23',
xwl: '99-59',
strk: 'W2',
l: '64',
lastg: '9/26 v OAK, W 11-6',
interleague: '12-6',
team: 'Boston',
road: '45-36'
}, ...

There’s one of these records for each team in the division. The “elim” key is the magic number to be eliminated, so the Red Sox’ magic number can be found as the “elim” value for the second-place team.

I wanted to write this part of the program in python. I like perl, too, but I’ve been flummoxed trying to re-read my own perl scripts after setting them aside for a while. Python is hyper-strict on syntax but that makes it pretty easy to read.

Seeing the page in JavaScript made me think of JSON, a data-interchange format similar to JavaScript syntax. I thought parsing the JavaScript data in python would be as easy as reading it and hitting it with a python JSON library. Not quite. The standings page is valid JavaScript but not valid JSON. But rather than write my own parser I decided to massage the included JavaScript string so that it did parse as JSON. A little quoting of key names, switching single quotes to double quotes, and getting rid of HTML links did the trick.
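
The massaging can be tried offline on a small sample. What follows is a rough sketch, not the script from this post: it assumes Python 3 and the standard-library json module rather than cjson, the two-team sample is cut down from the real file, and the second team's name and numbers are hypothetical.

```python
import json
import re

# Cut-down sample in the same shape as the MLB file. The second record
# is hypothetical and only there to give the "elim" lookup something to read.
SAMPLE = """var standings_rs_ale = [{
\tw: '94',
\telim: '-',
\tteam: 'Boston'
}, {
\tw: '90',
\telim: '2',
\tteam: 'Runner-Up'
}];"""


def js_to_json(raw):
    """Turn the JavaScript assignment into a list of dicts."""
    _, data = raw.split("=", 1)            # drop "var standings_rs_ale"
    data = data.strip().rstrip(";")        # drop a trailing semicolon, if any
    # quote property names that start a line
    data = re.sub(r"^\s*([A-Za-z0-9_]+):", r'"\1":', data, flags=re.M)
    # get rid of links
    data = re.sub(r"<a[^>]*>|</a>", "", data)
    # replace single quotes by double quotes
    data = data.replace("'", '"')
    return json.loads(data)


def leader_magic_number(teams):
    """The leader's magic number is the second-place team's "elim" value."""
    return int(teams[1]["elim"])


teams = js_to_json(SAMPLE)
print(teams[0]["team"], leader_magic_number(teams))
```

Like the quote-swapping approach in general, this would break on a value that contains an apostrophe or a double quote, so it is a convenience for well-behaved data rather than a real JavaScript parser.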

So here’s my script, which I saved as ~/Library/Scripts/getmn.py:

#!/sw/bin/python

# Get the Red Sox magic number

import cjson, re, urllib, datetime

today=datetime.date.today()
mlburl="http://mlb.com/components/game/year_%04d/month_%02d/day_%02d/standings_rs_ale.js" % (today.year,today.month,today.day)
rawdata=urllib.urlopen(mlburl).read(-1)
(varname,data) = rawdata.split('=',1)
# although this data is legal javascript, it's not JSON. So we have
# to munge it a bit:
# quote property names with double quotes
data=re.compile("\t([a-zA-Z0-9_]+):",re.M).sub('\t"\\1":',data)
# get rid of links
data=re.compile("<a[^>]*>|</a>").sub("",data)
# replace single quotes by double quotes
data=data.replace("'",'"')

dd=cjson.decode(data)

if (dd[0]['team']=='Boston'):
    print "Red Sox magic number: %d" % (int(dd[1]['elim']))

What I like about this script is that it does very little lifting of its own, farming out most of the retrieval, data-crunching, and parsing. Even the massaging is done with ultra-powerful (as a class, not to say that mine are) regular expressions. This script works on the command-line if that’s all you want.
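
The URL-building step can be tried on its own without touching the network. This is only a sketch in Python 3 (the script above targets Python 2); the names URL_TEMPLATE and standings_url are made up for illustration, and the pattern is copied from the script.

```python
import datetime

# URL pattern copied from the script; whether the server still answers at
# this address is not something this sketch checks.
URL_TEMPLATE = ("http://mlb.com/components/game/year_{d.year:04d}/"
                "month_{d.month:02d}/day_{d.day:02d}/standings_rs_ale.js")


def standings_url(day):
    """Fill the date fields of the standings URL for the given date."""
    return URL_TEMPLATE.format(d=day)


print(standings_url(datetime.date(2007, 9, 27)))
```

Passing datetime.date.today() instead of a fixed date gives the same behavior as the script.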

Adding the content to the sig

I use Microsoft Entourage for reading my mail and keeping calendar and contact info. I’m not in love with it but it is AppleScriptable, so you can do a lot with it. I wrote an AppleScript to run my magic-number-getter and append it to my standard “work” signature (that has my title, address, and a link to my vCard file in it):

set theScript to quoted form of POSIX path of (path to home folder as string) & "Library/Scripts/getmn.py"
set crlf to (ASCII character 13)
tell application "Microsoft Entourage"
    set txt to do shell script theScript
    set defaultSig to the content of the first signature where name is "Harvard"
    set theSig to the first signature where name is "Harvard + Red Sox"
    set the content of theSig to defaultSig & crlf & crlf & txt
end tell

AppleScript is a pain in the rear to write—it takes a lot of trial and error. But once it’s done, it reads pretty well. The crlf’s just insert blank lines.

If you have a different mail reader, this is going to change. But I’m confident the idea will work in Apple’s Mail.app as well.

Automatically updating

I saved this script in the folder ~/Microsoft User Data/Entourage Script Menu Items/ as “Get Red Sox Magic Number for Signature”, so it’s accessible within the Script menu of Entourage.

But the final feature I wanted was to have that information update automatically. Entourage also has a scheduler, so I configured a job to run this script once a day.

That’s it! The workflow is easily copied for whatever oft-changing data you would want in your signature (for instance, the output of /usr/bin/fortune).
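
The general pattern (run a command, append its output to a fixed signature) is independent of the mail program. A minimal sketch in Python 3, assuming a hypothetical build_signature helper and a stand-in command so that it runs on any machine:

```python
import subprocess
import sys


def build_signature(base_signature, command):
    """Run `command` and append its output to the base signature,
    separated by one blank line."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return base_signature.rstrip("\n") + "\n\n" + result.stdout.strip() + "\n"


# Stand-in command so the sketch runs anywhere; swapping in
# ["/usr/bin/fortune"] would use the example mentioned in the post.
demo_command = [sys.executable, "-c", "print('Red Sox magic number: 2')"]
print(build_signature("Matthew Leingang", demo_command), end="")
```

Getting the resulting text into the mail client is still the client-specific part, which is what the AppleScript handles for Entourage.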

PS: While researching this post I found a site (with various different URLs) that computes the magic number for you and gives you HTML widgets. Plus a lot of Google ads, but useful all the same.

Blogged with Flock

Resizing PDFs

Posted in Computing, LaTeX at 8:57 am by leingang

Yesterday I used my Facebook status line to say I was wondering how to resize PDFs. This seems to be a nontrivial issue. I have at least one workaround now, though.

Background: In my classes I often make slideshows with beamer (yay beamer!). It’s a great package for those who need the full flexibility of TeX or LaTeX to present math-rich course material. Some of my colleagues use Keynote with pretty good results, too. You can cut-and-paste PDF from TeXShop (an open-source TeX development environment for Mac OS X) to Keynote, with antialiasing and everything “just working”, and apply all of Keynote’s bells and whistles. (Among devotees of this technique is my colleague Tom, who likes the blackboard theme and the page-turning transition. Kind of mixing metaphors, if you ask me, but he didn’t. :-) )

I also use the InterWrite pad for classes, which allows me to draw on the screen and/or annotate the slides I’ve made. Anything that’s annotated gets attached to a file, which can be exported to PDF. It taxes my laptop’s speed, but I have a four-year-old laptop.

After classes I had been posting the slides and the screen notes as separate PDFs. This wasn’t ideal, because the notes file only has the slides I annotated. What I wanted was something that was as close to what happened in class as possible. Then I realized that Acrobat (among other tools, but I have access to Acrobat Pro so why not use that) can rearrange pages of PDFs, so it’s possible to merge the two into one file and post that.

Until yesterday this had worked fine. But suddenly my screen notes PDF came out at a different size. The slides are 128mm x 96mm and you use the reader or viewer to blow them up to screen size. After cropping the notes, however, the page of writing was for some reason closer to 210mm wide. I wanted to shrink that down to 128mm so the merged document would have all its pages the same size.

So I never figured out how to do that in Acrobat Pro. The philosophy is that PDF is “electronic paper,” meant to be a published version of a document, so a lot of editing methods aren’t provided. You can annotate a PDF (in Acrobat Pro or also in Preview), decorate it with a rubber stamp, or move pages around, but you can’t for instance erase parts of a document (although you can white them out).

But LaTeX users are familiar with including graphics into their documents with the graphicx package. PDF is a popular format for included graphics, especially on the Mac. Graphicx does allow scaling of images, which on vector-based PDFs doesn’t lose any detail. Finally, the pdfpages package provides a frontend to including whole pages of a multi-page PDF rather than embedding a graphic on a page, with graphicx-style options.

So here’s what I did: created a wrapper in LaTeX that looked like this:

\documentclass{article}
\usepackage{pdfpages}
\begin{document}
\includepdf[pages=-,width=128mm,fitpaper]{interwrite_notes.pdf}
\end{document}

Then I pdflatex-ed that file and the resulting PDF was exactly what I wanted.
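
Since the wrapper only differs in the file name and the target width, it can be generated. A small sketch in Python 3; the names WRAPPER_TEMPLATE and make_wrapper are hypothetical, and running pdflatex on the result is left as a separate manual step.

```python
# The wrapper from this post, with the width and file name made into
# parameters. Raw string so the LaTeX backslashes are kept as-is.
WRAPPER_TEMPLATE = r"""\documentclass{article}
\usepackage{pdfpages}
\begin{document}
\includepdf[pages=-,width=%(width)s,fitpaper]{%(pdf)s}
\end{document}
"""


def make_wrapper(pdf_name, width="128mm"):
    """Return the LaTeX source that rescales every page of pdf_name."""
    return WRAPPER_TEMPLATE % {"width": width, "pdf": pdf_name}


print(make_wrapper("interwrite_notes.pdf"), end="")
```

Saving the output to a .tex file and running pdflatex on it is the same step as above, just without retyping the wrapper for each set of notes.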

I’m still kind of surprised that there’s not an Acrobat Pro-based solution to this. But this works; it’s simple enough to be automated into a script, and it’s free, although it requires a TeX installation.


09.17.07

Happy first day of school

Posted in Courses, Math 20 at 12:02 pm by leingang

Well, it’s finally here: the first class day of the academic year. Back on Cinco de Mayo when we celebrated the end of classes Tom said, “Can you believe we won’t have to teach for another four months?” Thin air.

I have one class which meets this week. The intro meeting for Math 1a (Calculus I) is tomorrow morning, and those sections start next Monday.

Here’s my intro talk from Math 20:


I thought about the lessons that Merlin Mann wrote about presentations last week. So unlike previous intro talks I’ve given, the slides seem pretty incomplete (the syllabus, handed out on paper as they walk in, has all the nitty-gritty on it). You’re not able to page through the slides and get a full picture of what I said. And I came to that concept, along with Merlin’s example, with the following rhetorical question:

If the slideshow tells the whole story, what do you add by standing in front of it?

So I took all my bullet points and moved them into the notes section, and printed that out to make sure I hit them in class (then forgot to bring them, but luckily I had internalized them by this process). I replaced them with one, maybe two words, and a big picture (shout-out to sxc.hu for royalty-free stock photos). The result was a much more impressive-looking slideshow, even if devoid of facts.

Before: before.png, a PowerPoint slide with a bunch of words (image by leingang_math)
After: after.png, a Keynote slide with one word (image by leingang_math)

Also, I started using Keynote instead of PowerPoint. Starting from zero, after about four hours of working, I realized I was almost as good at Keynote as I was at PowerPoint, even though I’ve used that program for years. That’s pretty amazing. And even factoring out my re-thinking of the content, I think Keynote just “pops” more than PPT.


09.12.07

Wednesday Links and DITL

Posted in Uncategorized at 9:21 pm by leingang

Today was Day 2 of the Bok Center Conference, with my colleague Bret leading a case discussion on group work. We are getting ready for our annual freshman advising period, which starts tomorrow and continues until September 24th.

Here’s what piqued my interest on the internet today:


09.11.07

Tuesday reading list and DITL

Posted in Teacher Training at 6:47 am by leingang

Today I’ll be attending the Fall Teaching Conference sponsored by the Derek Bok Center for Teaching and Learning. The math department, especially the preceptors, collaborate a lot with the Bok Center in the training of graduate student teaching fellows. I’ve given several talks in these conferences, but I’m free this year.

There are sessions on teaching and learning in the conference. Erin Driver-Linn will speak on “The Confused Problem-Solver in Math and Science,” presenting research on student learning; Ryan Hickox will speak on “Successful Science Sections”, a nuts-and-bolts session. There will be panel discussions, case studies, all kinds of great stuff.

One of my favorite Bok Center talks was given by a colleague of mine, talking about the importance of the first day. He decided to create the worst example of first-day teaching and model it without telling the audience. He showed up late, he mumbled, he looked disheveled, he forgot his handouts. Someone asked a question and he said, “What a stupid question!” Fortunately, most of the attendees either figured it out or waited long enough for Andy to reveal the ruse. That’s something I wouldn’t be able to pull off!

The conference runs every Fall and Spring (two days in the Fall, one in the Spring), and features one of the more famous free lunches on campus.

Links for today:


09.10.07

Day in the Life: Teaching Fellow Orientation today

Posted in Teacher Training at 6:46 am by leingang

I’m fond of saying that “September is the cruelest month” for a preceptor (apologies to T.S. Eliot). Next week is the first week of classes, but now that the freshmen are here we are in the “Opening Days” period that ramps up to the academic year.

Today we have our annual teaching fellow orientation. We’ve developed a set of activities that we hope will get our graduate students and postdocs excited and prepared for teaching calculus, including:

  • Selections from popular movies on teaching
  • A multiple-choice “pop quiz” on teaching choices
  • “The Preceptor Players Present”–Case studies in the form of skits
  • Chinese Food!!

Should be a good time. 


Monday reading list

Posted in News at 6:38 am by leingang

Spotted in my multiple RSS feeds today:


09.03.07

Running arbitrary text munging commands in TeXShop

Posted in Computing, LaTeX at 12:15 pm by leingang


File this under “If I blog about it, it might actually be worth the time it took to figure it out”…

I use TeXShop to handle my LaTeX documents. It works, it’s Mac-like, and it’s pretty flexible. One thing I like about it is the Macro Editor, where you can define some boilerplate strings to insert in the document (and tie keyboard shortcuts to them). The Macro Editor has some simple replacement features: for instance #SEL# will get replaced by the current selection. In this way, you can define a macro which wraps an environment around whatever’s currently highlighted. The Macro Editor also allows you to write AppleScript code and execute it.

Another thing I like is the regular-expression enabled find and replace dialog. Anything you can write with a sed one-liner, you can use to transform text.

But what I needed was to be able to do this:

  • highlight text in TeXShop
  • run a macro that transforms that text (something more complicated than a single regex search-and-replace), probably running AppleScript which in turn would run a shell script.
  • replace the selected text with the transformed text.

The actual job I wanted to do was to take the text exported by OmniOutliner, which is plain text with items indented and marked by “-”, and convert it to the LaTeX itemize environment.

The Google wasn’t helping me with examples of how this is done, so I had to figure it out myself. First, I forgot about writing the macro inside of TeXShop; I used Apple’s built-in (at least I think it’s built-in, if not, you have to install the giant Xcode package) Script Editor.

Writing AppleScripts is a pretty trying process of wading through Script Dictionaries and reading the Result Log and multiple, multiple trials and errors. In fact, Chapter 3 of AppleScript: The Definitive Guide shows that this is actually the Best (available) Practice (for some reason that got left out of the latest edition. Pity.) But I got it, eventually:

-- convert Omni Outliner exported text to LaTeX itemize environment
-- Matthew Leingang
-- 2007-09-03
set crlf to (ASCII character 10) -- value lost from the original listing; a linefeed is assumed
tell application "TeXShop"
    set theSelection to the selection of the first document
    set input to the content of the selection of the first document as string
    -- write selection to a file to avoid quoting issues
    -- putting "as string" in the definition of input makes this file an ascii file rather than a unicode file
    set fileSpec to (path to temporary items folder as string) & (do shell script "echo $$")
    try
        set f to open for access fileSpec with write permission
        set eof f to 0
        write input to f
        close access f
    on error
        try
            close access f
        end try
        beep
        return
    end try
    -- do the actual text munging
    set theScript to "perl -pe " & quoted form of "s/^( *)- /\\\\item\\n\\1 /" & " < " & quoted form of POSIX path of fileSpec
    set output to "\\begin{itemize}" & crlf & (do shell script theScript) & crlf & "\\end{itemize}"
    -- remove the temporary file
    do shell script "rm " & quoted form of POSIX path of fileSpec
    -- replace
    set the selection of (first document) to output
end tell

The code block below “--write selection to a file…” takes the current selection and writes it to a temporary file. There are ways of escaping troublesome characters on the command line, but AppleScript is not a very good text editor. So I went the way of the temporary file, which, as you can see, involves generating a path name, opening the file, erasing the file, writing the selection to the file, then closing the file.

It was here I ran into one of my favorite AppleScript gotchas: the selection has type “unicode text”, while shell scripts have type string.  Because of the extra bytes in unicode strings, my regexes would match only when they were one character long!  This is the kind of thing you can only catch when you examine that temporary file in the Terminal: you get all the extra non-printable bytes and you think “what the hell?  Aw, man, unicode <curse />.”
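
The effect is easy to reproduce outside AppleScript. A small Python 3 sketch, assuming the Unicode text lands on disk as UTF-16 (two bytes per character, here little-endian); the variable names are made up for the demonstration:

```python
import re

text = "- first item"
as_ascii = text.encode("ascii")
as_utf16 = text.encode("utf-16-le")  # assumption: 2-byte UTF-16 code units

one_char = re.compile(rb"-")    # byte pattern for a single character
two_chars = re.compile(rb"- ")  # byte pattern for dash followed by space

# In UTF-16 every ASCII character is followed by a zero byte, so the dash
# and the space are no longer next to each other in the byte stream.
print(as_utf16[:4])
print(bool(two_chars.search(as_ascii)))  # True: plain ASCII matches
print(bool(one_char.search(as_utf16)))   # True: one character still matches
print(bool(two_chars.search(as_utf16)))  # False: the zero byte gets in the way
```

Byte-oriented tools see the padding bytes as real characters, which is why only the one-character patterns survive.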

Casting the selection to a string fixes this.  I suppose it breaks unicode compatibility, but regular LaTeX isn’t unicode compatible anyway (you need XeTeX for that).  I briefly tried casting the shell script string to unicode, but since I wasn’t going to need unicode for the input, I didn’t pursue it beyond one trial, error, undo.

The “echo $$” script in the generation of the temporary file name is a bit of Unix magic: $$ evaluates to the number of the operating system’s current process. Basically, the OS starts a new process each time it runs a program, so each execution of the script is going to result in a new process number. I actually don’t know what process this evaluates to, that of the AppleScript instance or a brand-new one spawned by this line, only to provide its number and die. The point is it’s unique to the running of the script, so I know that running it twice at the same time (or starting another run of it while one hasn’t finished) won’t result in the script clobbering its temporary files.
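
For comparison, here is how the same write-to-a-scratch-file step might look in Python 3. This is a swapped-in technique, not what the AppleScript does: tempfile.mkstemp chooses a file name that does not exist yet, and os.getpid() (the Python counterpart of $$) is only added to the name to make it recognizable. The helper name write_to_temp_file is hypothetical.

```python
import os
import tempfile


def write_to_temp_file(text):
    """Write text to a fresh temporary file and return its path."""
    # mkstemp creates a new file with an unused name, so two runs at the
    # same time cannot clobber each other's files.
    handle, path = tempfile.mkstemp(prefix="munge-%d-" % os.getpid(),
                                    suffix=".txt")
    with os.fdopen(handle, "w", encoding="ascii") as stream:
        stream.write(text)
    return path


path = write_to_temp_file("- one\n- two\n")
with open(path, encoding="ascii") as stream:
    print(stream.read(), end="")
os.remove(path)  # clean up, as the AppleScript does with rm
```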

The code block below “--do the actual text munging” is the meat of this script. I don’t want to blog about perl or regexps here (maybe some other time), so let me just leave that with one piece of advice: when nothing else seems to work, add more slashes.
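
For readers who would rather not count backslashes, the same substitution can be written in Python 3. This is a sketch of an equivalent, not part of the script; the function name outline_to_itemize and the sample outline are made up for illustration.

```python
import re


def outline_to_itemize(outline):
    """Replace each leading "- " marker with \\item on its own line,
    keeping the original indentation on the text line that follows."""
    def to_item(match):
        # match.group(1) is the run of spaces before the dash
        return "\\item\n" + match.group(1) + " "

    body = re.sub(r"^( *)- ", to_item, outline, flags=re.MULTILINE)
    return "\\begin{itemize}\n" + body + "\n\\end{itemize}"


print(outline_to_itemize("- Limits\n  - One-sided limits\n- Derivatives"))
```

As with the perl one-liner, nested items are not wrapped in their own itemize environment; every marker simply becomes an \item inside the single outer environment.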

The rest of the script does the actual replacement of the selection with the desired transformed text, then cleans up. 

I haven’t yet tried this with multiple documents open (I can’t figure out how to access “the current document” in TeXShop), but this is a good start. If you enable the Script Menu and save this script in “~/Library/Scripts/TeXShop/munge.scpt,” you can now invoke it whenever you want. But to invoke it within the Macro Editor, you can create a macro and put “--Applescript direct” on the first line and delete the “tell” … “end tell” lines. That changes the behavior of the Macro Editor to run the contents of the macro in AppleScript (rather than insert the text itself into the current document).

So there’s plenty more error checking to do, but this works and I’m done with it.  Hopefully putting this up will help me remember, and anyone else who needs to do the same thing.

