Posts Tagged ‘maine’

Introducing GovRake

March 28th, 2008

I combined a couple projects I've been working on, added some features, bought a domain and created GovRake.com.

This site indexes 8+ years of Maine state legislative records, from the 119th Legislature to the 123rd (the current one). Each day's session has a detail page listing all the bills mentioned, plus a public comment area. GovRake keeps an up-to-date public hearing schedule and links each hearing to its bill details. Bill detail pages have public comment areas and link to the sessions in which the bill was mentioned. The front page also shows the public hearing schedule, with links to the bill to be discussed and to the audio stream for that hearing.

There is a wealth of data available from the state, some of it easier to get at than the rest. Now that I have a pretty good framework in place, it's just a matter of prying the data from the cold dead hands of the state web services which hold it hostage and building some modern services on top of it. My intent is to provide RSS and email updates for hearings, bills and anything else that changes from time to time, as well as a platform for community discussion and research.

This project is in ACTIVE development and there should be new features and improvements every few days.

I'd like to thank my representative, Seth Berry, for helping to fill me in on various state processes and putting up with my extreme political rants. If your government reps don't believe in transparency and accountability, like mine do… keep that in mind when November rolls around.

Leave a comment or email me ( dan-AT-codesushi-DOT-com ) with your questions, comments or suggestions.

admin

Fun Quote From The Maine Legislative Record.

March 27th, 2008

Ahh… the things you find when you're using unknown data…


Under suspension of the rules, the presiding officer was allowed to wear a sweater.

- House Record 2001-03-31


Searching The Unsearchable

March 23rd, 2008

There is a big difference between making something available and making something useful… Our House and Senate legislative record is a great example of this. There are hundreds of Word documents holding the legislative record… current House records, historic House records, current & historic Senate records. These documents hold quite a bit of interesting stuff, but they are unindexed and a bit hard to find.



So I wrote some code that downloads all those docs, converts them to a couple of different formats and indexes them for your searching pleasure. Here it is. This is something I've had on the back burner for a while and have just dusted off. I have a bunch of ideas for this app… hopefully I will soon be integrating this data with the public hearing data I mentioned in my previous post and a few other bits of public data to make a truly useful resource for activists, legislators, reporters and the public at large. If you have ideas that fit this theme (or not) feel free to leave a comment.
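
For anyone wondering what "indexes them" means in practice, here is a toy sketch of the idea in Groovy: an in-memory inverted index that maps each word to the documents containing it. This is not the code behind the search page; the file names and the second document's text are made up, and `buildIndex` is just a name picked for the example.

```groovy
// Toy inverted index: word -> sorted set of document names.
// File names and the senate text are invented for illustration only.
def buildIndex(Map<String, String> docs) {
    def index = [:].withDefault { new TreeSet<String>() }
    docs.each { name, text ->
        // lower-case the text and pull out runs of letters and digits as words
        text.toLowerCase().findAll('[a-z0-9]+').each { word ->
            index[word] << name
        }
    }
    return index
}

def docs = [
    'house-2001-03-31.txt' : 'Under suspension of the rules, the presiding officer was allowed to wear a sweater.',
    'senate-2001-04-02.txt': 'The presiding officer called the Senate to order.'
]

def index = buildIndex(docs)
println index['sweater']    // documents that mention "sweater"
println index['presiding']  // documents that mention "presiding"
```

A real index would also want stemming, stop words and storage on disk, which is presumably why search libraries exist.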


Getting Groovy With The Maine Public Hearing Schedule

March 23rd, 2008

I wrote a Groovy script that grabs the Maine Legislature's public hearing schedule and prints out all the items. The output looks something like this:



committee       : LVA
document number : LD-2261
date            : 2008-04-02 13:00
room            : Room 437 State House
bill title      : I.B. 3, An Act To Allow a Casino in Oxford County

committee       : ACF
document number : LD-2262
date            : 2008-03-26 14:00
room            : Room 206, Cross Building
bill title      : H.P. 1626, An Act Pertaining to the Definition of "Milk"

committee       : BEC
document number : LD-2257
date            : 2008-03-25 13:00
room            : Room 208 Cross Office Building
bill title      : H.P. 1619, An Act To Establish a Uniform Building and Energy Code



I'm going to push this stuff into a database or maybe directly into an RSS feed once I grab a few more bits of data to go with it.
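
As a rough idea of what the RSS option could look like, here is a minimal sketch using Groovy's `MarkupBuilder`. It is only an illustration under some assumptions: the hearing is one of the sample items above stuffed into a map, the feed title and description are placeholders, and a real feed would also want per-item links and dates.

```groovy
import groovy.xml.MarkupBuilder

// One hearing, copied from the sample output above.
def hearings = [
    [committee: 'LVA', ld: '2261', date: '2008-04-02 13:00',
     room: 'Room 437 State House',
     title: 'I.B. 3, An Act To Allow a Casino in Oxford County']
]

def writer = new StringWriter()
def xml = new MarkupBuilder(writer)

xml.rss(version: '2.0') {
    channel {
        title('Maine public hearing schedule')     // placeholder feed title
        link('http://www.mainelegislature.org')    // the legislature's site
        description('Upcoming public hearings')    // placeholder description
        hearings.each { h ->
            item {
                title("LD-${h.ld}: ${h.title}")
                description("${h.committee} committee, ${h.date}, ${h.room}")
            }
        }
    }
}

def rss = writer.toString()
println rss
```

The builder takes care of escaping and nesting, so the feed stays well-formed even when a bill title contains odd characters.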



The following is the entire script I used to produce the above output… it's nothing too exciting, but it shows how I used Groovy and JTidy to grab some data from several entirely un-styled documents as a first step in providing some structure to that data for future applications.

import org.w3c.tidy.Tidy
import java.text.SimpleDateFormat

// some date formats we'll use to validate and reformat dates later
SimpleDateFormat dateInHtml = new SimpleDateFormat('EEE MMM dd, yyyy, h:mm a')
SimpleDateFormat dateWeLike = new SimpleDateFormat('yyyy-MM-dd HH:mm')

String base = 'http://www.mainelegislature.org'

// this url is for the index page for all public hearings..
// right now it's hard coded to look 180 days from today
String url = base + '/legis/lio/phSched.asp?DAYS=180'

// get the "index" file and tidy it.
download(url, 'out.xml')

// parse the index page so we can rip through it looking for links we care about.
def indexPage = new XmlParser().parse(new File('out.xml'))

// find all the links in the main hearing schedule document that
// link to specific committee schedules.
// TODO - once I find all the committee codes I can just go after them without this loop
indexPage.depthFirst().grep{ it.'@href'?.contains('phSched.asp') }.each {flag ->
    // find the committee code from the url... we will do something with this later
    String cmty = (flag.'@href' =~ /.*?CODE=(.*?)&.*?/)[0][1]

    // download and Tidy the HTML schedule for each committee
    // this url looks just like the index url with the addition of a
    // "CODE" param which contains a unique code for the committee
    download(base + flag.'@href', 'schedule.xml')
    // parse this sucker
    def scheduleNodes = new XmlParser().parse(new File('schedule.xml'))

    // there are no classes or ids to use to navigate this document..
    // so we have to go by structure...
    // we are looking for all table rows with 4 columns
    // where the first column does NOT contain the string "LD"
    scheduleNodes.breadthFirst().findAll {it.name().localPart == 'tr' &&
            it.children().size() == 4 &&
            !it.children()[0].value().contains('LD')
    }.each { row ->
        // each of the rows we found has the following 4 columns

        // the LD number (legislative document number)
        String ld = findText(row.children()[0])
        // the partial title of the bill... not much use as it is truncated.
        String title = findText(row.children()[1])
        // the date and time of the hearing
        String date = findText(row.children()[2])
        // the room and building in which the meeting will take place.
        String room = findText(row.children()[3])

        // for some reason the white space character right before the am/pm in
        // the html doesn't appear to be a space... so we'll replace it with one.
        // then parse the date into something we like more
        date = date.replaceAll('.pm$', ' PM').replaceAll('.am$', ' AM')

        // print out some details for now
        println "committee       : $cmty"
        println "document number : LD-$ld"
        println "date            : ${dateWeLike.format(dateInHtml.parse(date))}"
        println "room            : $room"
        println "bill title      : $title"
        println ""
    }
}
new File('schedule.xml').delete()
new File('out.xml').delete()

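// pass this a node and it follows the first child down until it reaches a text node
// (or a node with no children) and returns that text.
// also replaces line breaks with spaces... so... watch that.
private def findText(node) {
    def current = node
    while (!(current instanceof String) && current.children().size() > 0) {
        current = current.children()[0]
    }
    // an empty node has no text child, so fall back to its (empty) text()
    String text = (current instanceof String) ? current : current.text()
    return text.replaceAll('\n', ' ')
}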

// this will grab an html page (url)
// then run it through Tidy to clean it up and save it to outFile.
def download(String url, String outFile) {
    // temporary file which will contain the html in need of a good tidy
    File tmpOutFile = File.createTempFile('out', '.html');

    // copy the page at the url into the tmp file; both streams are closed when done
    new URL(url).withInputStream { input ->
        tmpOutFile.withOutputStream { out -> out << input }
    }

    // run tmp file through tidy
    Tidy tidy = new Tidy();
    tidy.setQuiet(true);
    tidy.setShowWarnings(false);
    tidy.setMakeClean(true);
    tidy.setXHTML(true);
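    // close both file streams once tidy is done so the temp file can be deleted
    tmpOutFile.withInputStream { input ->
        new File(outFile).withOutputStream { output ->
            tidy.parseDOM(input, output)
        }
    }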

    // delete the temp file.
    tmpOutFile.delete();
}
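
The two fiddliest bits of the script, pulling the committee code out of a link and cleaning up the date, can be tried on their own without hitting the state's servers. The sketch below is a standalone variation, not a drop-in replacement: the helper names `committeeCode` and `normalizeDate` are made up, the sample href is a guess at what the links look like, and the regex ending is loosened so a `CODE` parameter at the very end of the url also matches.

```groovy
import java.text.SimpleDateFormat

// Pull the committee code out of a schedule link.
// The "(?:&|$)" ending is a variation on the script's pattern: it also
// accepts a CODE parameter that comes last in the url.
String committeeCode(String href) {
    def m = (href =~ 'CODE=(.*?)(?:&|$)')
    return m.find() ? m.group(1) : null
}

// Replace whatever single character sits before a trailing am/pm
// (a non-breaking space in the first sample below) with a plain space,
// then reformat the date.
String normalizeDate(String raw) {
    def dateInHtml = new SimpleDateFormat('EEE MMM dd, yyyy, h:mm a', Locale.US)
    def dateWeLike = new SimpleDateFormat('yyyy-MM-dd HH:mm')
    String cleaned = raw.replaceAll('.pm$', ' PM').replaceAll('.am$', ' AM')
    return dateWeLike.format(dateInHtml.parse(cleaned))
}

// hypothetical href, modelled on the index url plus a CODE parameter
println committeeCode('/legis/lio/phSched.asp?CODE=LVA&DAYS=180')
println normalizeDate('Wed Apr 02, 2008, 1:00\u00a0pm')
```

Pinning the parser to `Locale.US` is a deliberate choice here, so that "Wed" and "Apr" still parse on a machine set to another language.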




This is one of several irons I have in the fire for increasing transparency in my local government and bringing a modicum of usability to our amazingly scattered, 1996-looking Maine state web services.

