I have left Harvard as of July 1, 2008 to take a position at NYU. This website has been cached and left static. Feel free to browse my new website, aka "What the heck is a Clinical Associate Professor?"

09.03.07

Running arbitrary text munging commands in TeXShop

Posted in Computing, LaTeX at 12:15 pm by leingang


File this under “If I blog about it, it might actually be worth the time it took to figure it out”…

I use TeXShop to handle my LaTeX documents. It works, it’s Mac-like, and it’s pretty flexible. One thing I like about it is the Macro Editor, where you can define boilerplate strings to insert into the document (and tie keyboard shortcuts to them). The Macro Editor has some simple replacement features: for instance, #SEL# gets replaced by the current selection. In this way, you can define a macro that wraps an environment around whatever’s currently highlighted. The Macro Editor also allows you to write AppleScript code and execute it.

Another thing I like is the regular-expression enabled find and replace dialog. Anything you can write with a sed one-liner, you can use to transform text.

But what I needed was to be able to do this:

  • highlight text in TeXShop
  • run a macro that transforms that text (something more complicated than a single regex search-and-replace), probably running AppleScript which in turn would run a shell script.
  • replace the selected text with the transformed text.

The actual job I wanted to do was to take the text exported by OmniOutliner, which is plain text with items indented and marked by “-”, and convert it to the LaTeX itemize environment.

The Google wasn’t helping me with examples of how this is done, so I had to figure it out myself. First, I forgot about writing the macro inside of TeXShop; I used Apple’s built-in (at least I think it’s built-in; if not, you have to install the giant Xcode package) Script Editor.

Writing AppleScripts is a pretty trying process of wading through Script Dictionaries and reading the Result Log and multiple, multiple trials and errors. In fact, Chapter 3 of AppleScript: The Definitive Guide shows that this is actually the Best (available) Practice (for some reason that got left out of the latest edition. Pity.) But I got it, eventually:

    -- convert OmniOutliner exported text to a LaTeX itemize environment
    -- Matthew Leingang
    -- 2007-09-03

    -- assumed value: the original was lost from this copy of the script;
    -- CR + LF is inferred from the variable name
    set crlf to (ASCII character 13) & (ASCII character 10)

    tell application "TeXShop"
        set theSelection to the selection of the first document
        set input to the content of the selection of the first document as string

        -- write selection to a file to avoid quoting issues
        -- putting "as string" in the definition of input makes this file
        -- an ascii file rather than a unicode file
        set fileSpec to (path to temporary items folder as string) & (do shell script "echo $$")
        try
            set f to open for access fileSpec with write permission
            set eof f to 0
            write input to f
            close access f
        on error
            try
                close access f
            end try
            beep
            return
        end try

        -- do the actual text munging
        set theScript to "perl -pe " & quoted form of "s/^( *)- /\\\\item\\n\\1 /" & " < " & quoted form of POSIX path of fileSpec
        set output to "\\begin{itemize}" & crlf & (do shell script theScript) & crlf & "\\end{itemize}"

        -- remove the temporary file
        do shell script "rm " & quoted form of POSIX path of fileSpec

        -- replace
        set the selection of (first document) to output
    end tell

The code block below “-- write selection to a file…” takes the current selection and writes it to a temporary file. There are ways of escaping troublesome characters on the command line, but AppleScript is not a very good text editor. So I went the way of the temporary file, which, as you can see, involves generating a path name, opening the file, truncating it to zero length, writing the selection to it, then closing it.

It was here I ran into one of my favorite AppleScript gotchas: the selection has type “unicode text”, while shell scripts have type string.  Because of the extra bytes in unicode strings, my regexes would match only when they were one character long!  This is the kind of thing you can only catch when you examine that temporary file in the Terminal: you get all the extra non-printable bytes and you think “what the hell?  Aw, man, unicode <curse />.”

Casting the selection to a string fixes this.  I suppose it breaks unicode compatibility, but regular LaTeX isn’t unicode compatible anyway (you need XeTeX for that).  I briefly tried casting the shell script string to unicode, but since I wasn’t going to need unicode for the input, I didn’t pursue it beyond one trial, error, undo.

The “echo $$” script in the generation of the temporary file name is a bit of Unix magic: $$ evaluates to the process ID of the shell that expands it. Basically, the OS starts a new process each time it runs a program, so each execution of the script is going to result in a new process number. I actually don’t know what process this evaluates to, that of the AppleScript instance or a brand-new one spawned by this line, only to provide its number and die. The point is it’s unique to the running of the script, so I know that running it twice at the same time (or starting another run of it while one hasn’t finished) won’t result in the script clobbering its temporary files.
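On the open question of which process $$ names: Apple documents do shell script as running its command in a fresh /bin/sh, so the number should be the ID of that short-lived shell rather than of the AppleScript host. The sketch below shows the same behaviour with a plain sh -c, and, as an alternative not used in the original script, mktemp, which creates the file itself and so does not depend on process IDs (these can eventually be reused). All variable names are illustrative.

```shell
# $$ in this shell versus $$ in a child shell started just to echo it
parent_pid=$$
child_pid=$(sh -c 'echo $$')
echo "this shell: $parent_pid, child shell: $child_pid"

# mktemp picks an unused name and creates the file in one step
tmp1=$(mktemp)
tmp2=$(mktemp)
echo "first temp file:  $tmp1"
echo "second temp file: $tmp2"

rm -f "$tmp1" "$tmp2"
```

From AppleScript, the equivalent would be to capture the output of do shell script "mktemp" instead of "echo $$"; that variant is a suggestion and has not been tried against TeXShop here.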

The code block below “-- do the actual text munging” is the meat of this script. I don’t want to blog about Perl or regexps here (maybe some other time), so let me just leave that with one piece of advice: when nothing else seems to work, add more slashes.

The rest of the script does the actual replacement of the selection with the desired transformed text, then cleans up. 

I haven’t yet tried this with multiple documents open (I can’t figure out how to access “the current document” in TeXShop), but this is a good start. If you enable the Script Menu and save this script as “~/Library/Scripts/TeXShop/munge.scpt”, you can invoke it whenever you want. To invoke it from within the Macro Editor instead, create a macro, put “--Applescript direct” (two hyphens) on the first line, and delete the “tell” … “end tell” lines. That first line changes the behavior of the Macro Editor so that it runs the contents of the macro as AppleScript (rather than inserting the text itself into the current document).

So there’s plenty more error checking to do, but this works and I’m done with it.  Hopefully putting this up will help me remember, and anyone else who needs to do the same thing.

