Banal: Because Format Checking is So Trite
Geoffrey M. Voelker
University of California, San Diego
Workshop on Organizing Workshops, Conferences, and Symposia for Computer Systems (WOWCS’08)
This Talk is Not Very Interesting
Banal is a format checker for PDF documents
  Deduces how a document was formatted
  Optionally compares it with a specification
Intended for conference management systems
  Now being used in HotCRP and EDAS
  Seemed timely to document its genesis and implementation
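As a rough illustration of what "compares it with a specification" could look like, here is a minimal sketch. The `FormatSpec` fields, the `measured` dictionary keys, and the `check_format` function are all hypothetical names chosen for this example; they are not Banal's actual interface or option names.

```python
from dataclasses import dataclass


@dataclass
class FormatSpec:
    """Hypothetical formatting requirements for a submission."""
    max_pages: int
    min_body_font_pt: float
    min_margin_in: float


def check_format(measured, spec):
    """Return a list of human-readable violations (empty if compliant).

    `measured` is an assumed dictionary of deduced properties:
    {"pages": int, "body_font_pt": float, "margin_in": float}.
    """
    problems = []
    if measured["pages"] > spec.max_pages:
        problems.append(
            f"too many pages: {measured['pages']} > {spec.max_pages}")
    if measured["body_font_pt"] < spec.min_body_font_pt:
        problems.append(
            f"body font too small: {measured['body_font_pt']}pt "
            f"< {spec.min_body_font_pt}pt")
    if measured["margin_in"] < spec.min_margin_in:
        problems.append(
            f"margin too small: {measured['margin_in']}in "
            f"< {spec.min_margin_in}in")
    return problems


# Made-up numbers, for illustration only.
spec = FormatSpec(max_pages=14, min_body_font_pt=10.0, min_margin_in=1.0)
measured = {"pages": 15, "body_font_pt": 9.0, "margin_in": 1.0}
for problem in check_format(measured, spec):
    print(problem)
```

Returning a list of violations rather than a single pass/fail flag lets a conference management system show authors everything that needs fixing in one pass.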
April 15, 2008 WOWCS’08 2
Why?
Preserving reviewer anonymity
  Acrobat JavaScript that calls home when the PDF is loaded
Assisting conference management tasks
  Ensuring anonymity rules are followed
  Possibly helping with initial assignments by mining the bibliography
Fairness
  Everyone else obeyed the rules…
Time
  Already enough time spent on reviewing
  Frustrated that abuse meant taking even more of my time
How?
Convert PDF to XML (with pdftohtml)
Track the locations of all segments of text, essentially form bounding boxes
Compute margins, columns, body font, etc.
  Heuristics for page #s, headers, footers, etc.
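The steps above can be sketched in a few lines of Python. This is not Banal's actual code or its heuristics. It assumes XML in the general shape that `pdftohtml -xml` emits (`page`, `fontspec`, and `text` elements carrying `top`, `left`, `width`, `height`, and `font` attributes), and the sample document, the `analyze` and `cluster_columns` functions, and the `min_gap` threshold are all made up for illustration. Measurements stay in the XML's own page units.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written sample in the shape of `pdftohtml -xml` output (illustrative only).
SAMPLE_XML = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1188" width="918">
<fontspec id="0" size="18" family="Times" color="#000000"/>
<fontspec id="1" size="10" family="Times" color="#000000"/>
<text top="108" left="200" width="518" height="20" font="0">A Sample Title</text>
<text top="160" left="108" width="340" height="12" font="1">Left column body text, first line.</text>
<text top="174" left="108" width="340" height="12" font="1">Left column body text, second line.</text>
<text top="160" left="470" width="340" height="12" font="1">Right column body text, first line.</text>
<text top="1068" left="108" width="340" height="12" font="1">Last line of the left column.</text>
</page>
</pdf2xml>"""


def cluster_columns(lefts, min_gap=100):
    """Count groups of left edges separated by more than min_gap units.

    A crude stand-in for real column detection; min_gap is an assumption.
    """
    columns = 0
    previous = None
    for left in sorted(lefts):
        if previous is None or left - previous > min_gap:
            columns += 1
        previous = left
    return columns


def analyze(xml_text):
    """Deduce margins, body font size, and column count for each page."""
    root = ET.fromstring(xml_text)
    # Font ids are shared across pages, so collect them from the whole tree.
    font_sizes = {f.get("id"): int(f.get("size")) for f in root.iter("fontspec")}
    reports = []
    for page in root.iter("page"):
        page_w, page_h = int(page.get("width")), int(page.get("height"))
        boxes = []                  # ((left, top, right, bottom), font size)
        chars_per_size = Counter()  # characters set in each font size
        for t in page.iter("text"):
            left, top = int(t.get("left")), int(t.get("top"))
            box = (left, top, left + int(t.get("width")), top + int(t.get("height")))
            size = font_sizes[t.get("font")]
            boxes.append((box, size))
            chars_per_size[size] += len("".join(t.itertext()))
        if not boxes:
            continue  # blank page: nothing to measure
        # Body font: the size that carries the most characters on the page.
        body_size = chars_per_size.most_common(1)[0][0]
        # Margins: distance from each page edge to the union of all text boxes.
        margins = {
            "left": min(b[0] for b, _ in boxes),
            "top": min(b[1] for b, _ in boxes),
            "right": page_w - max(b[2] for b, _ in boxes),
            "bottom": page_h - max(b[3] for b, _ in boxes),
        }
        body_lefts = [b[0] for b, s in boxes if s == body_size]
        reports.append({
            "page": int(page.get("number")),
            "margins": margins,
            "body_font_size": body_size,
            "columns": cluster_columns(body_lefts),
        })
    return reports


for report in analyze(SAMPLE_XML):
    print(report)
```

A real checker would also need the heuristics mentioned above: page numbers, headers, and footers would otherwise be counted as part of the text block and shrink the measured margins.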
Where?
A handful of SIGOPS/SIGCOMM conferences
  OSDI’06, SIGCOMM’07, SIGCOMM’08
Eddie Kohler has integrated it into HotCRP
Henning Schulzrinne also integrated Banal with EDAS
  Since 2006, used for over 800 events
So?
What are our community goals for having formatting requirements?
  Evil: Annoying trifles that negatively impact our ability to communicate our results and ideas?
  Helpful: Reflect practicalities of publishing costs and community time?
Not surprisingly, I’m in the practical camp