Friday, March 23, 2007

First post!

After hitting ExtremeTech, Slashdot, Digg, and a *ton* of other blogs in the past week, gCensus has drawn quite an outpouring of interest, so I figured I'd set up this blog to keep interested folks informed about the development work on the project.

Hardware
Loyd dropped off his old PC with me last week, and I've been setting it up to be the new gecensus.stanford.edu. Right now I'm stalled, waiting for a new 400GB hard drive to arrive in the mail so that I can have a matched RAID 5 set in the machine - 800GB of storage.
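
As a rough sanity check on that 800GB figure: RAID 5 gives up one drive's worth of space to parity, so usable capacity is (number of drives - 1) x drive size. The sketch below assumes three 400GB drives, which is a guess since the drive count isn't stated here, and the function name is made up for illustration.

```python
def raid5_usable_gb(drive_count: int, drive_size_gb: int) -> int:
    """Usable capacity of a RAID 5 array built from equal-sized drives."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    # One drive's worth of capacity is consumed by distributed parity.
    return (drive_count - 1) * drive_size_gb


# Assumed configuration: three matched 400GB drives.
print(raid5_usable_gb(3, 400))
```

By the same arithmetic, adding a fourth matched drive would bring the usable space to 1200GB.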

Data Sources
Once I get the new gecensus online, I'm going to be aggressively adding more states' Summary File 1 data into the database. There are a couple of problems here that I'd like to resolve - for example, the ESRI shapefiles at the block group and tract levels have some corrupt metadata that makes it impossible to identify certain areas correctly.

I'm also looking at adding the Summary File 3 (income, etc.) data. Since the data file format is very similar to Summary File 1, this might be doable without too much trouble - I still need to look at it more closely.

I've had generous offers to help with adding new data sets. I'm still trying to work out good projects for some of those who have offered, but one project I'm excited about is a tool that we're hoping will let gCensus import data from common GIS formats, so that I wouldn't have to write a one-off import script for every new format that comes in. One of the problems is that I'm not a GIS specialist (yet, anyway), so I'm not very familiar with the popular formats out there - if someone reading this can offer advice, that'd be great.
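
To make the idea a bit more concrete, here is one way such an importer could avoid one-off scripts: keep a registry that maps a format name to a small reader function, and have every reader return records in the same shape. This is only a sketch under stated assumptions, not how gCensus actually works. Every name in it, the record layout (name, lat, lon), and the two toy formats are hypothetical stand-ins for real GIS formats.

```python
import csv
import io
import json
from typing import Callable, Dict, List

# Hypothetical common record shape: one dict per geographic feature.
Record = Dict[str, object]
Reader = Callable[[str], List[Record]]

# Registry mapping a format name to the function that parses it.
READERS: Dict[str, Reader] = {}


def register(fmt: str) -> Callable[[Reader], Reader]:
    """Decorator that adds a reader function to the registry."""
    def wrap(func: Reader) -> Reader:
        READERS[fmt] = func
        return func
    return wrap


@register("csv")
def read_csv(text: str) -> List[Record]:
    # Assumed layout: a header row with name, lat, and lon columns.
    rows = csv.DictReader(io.StringIO(text))
    return [
        {"name": r["name"], "lat": float(r["lat"]), "lon": float(r["lon"])}
        for r in rows
    ]


@register("json")
def read_json(text: str) -> List[Record]:
    # Assumed layout: a JSON list of objects with the same three keys.
    return [
        {"name": o["name"], "lat": float(o["lat"]), "lon": float(o["lon"])}
        for o in json.loads(text)
    ]


def import_data(fmt: str, text: str) -> List[Record]:
    """Dispatch to the registered reader, failing clearly on unknown formats."""
    try:
        reader = READERS[fmt]
    except KeyError:
        raise ValueError(f"no reader registered for format {fmt!r}") from None
    return reader(text)
```

With this layout, supporting a new format means writing one more reader and registering it; the code that loads records into the database never has to change.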

Frontend
I'm interested in setting up some new capabilities on the frontend - like multidimensional data plotting, and some alternative means of visualization. I'd also love to be able to do a cleanup of the user interface, as it could stand to be prettier and implement backtracking in a useful manner. If you're a Web pro and want to help out with a public-service open-source project, let me know!

That's a pretty good summary of what's going on right now. gCensus is definitely in the growth phase - trying to set up some new collaborations, fork off a bunch of projects - so the near future should be exciting!


(P.S. - If anyone reading this is interested in supporting the project with hardware donations, my biggest needs are in the storage department. New hard drives, or a proper RAID controller that can dynamically grow arrays, like the Areca 1210 or 1220, would be awesome. I don't really expect one of these to fall out of the sky, since they're not cheap...but that's the sort of thing that would be great for handling multiple terabytes of data.)
