^{[1]}John W. Tukey said something about how valuable it is to think about the world while pawing through a set of data: that’s the essence of “exploratory data analysis.” Meaning that in real life, the most fruitful time we spend is when we are mulling over what a set of data might mean. “Concluding” and “confirming” get more press but are a lot less fun and may be much less useful. Back when I was a real data geek at the University of Colorado, I remember getting quite bored once all the wrinkles of a dataset were worked out, but being completely engaged and focused while data retrieval and interpretation were still being explored.
So I live in awe and some envy and some skepticism at Marc Smith’s Twitter diagrams ^{[2]}. Each one seems like a tour de force, but they always leave me wanting. NodeXL ^{[3]} makes collecting Twitter data so easy, but I always walk away wondering what it is that I’ve seen. It seems to me that no single view of a set of data is interesting beyond all the others: what’s interesting (and useful) is when we can look at it from different angles, such as:
I think that data about communities and social interaction is even more full of diverse meanings, so we should always resist closing in on “this is what it means.” We need to come up with more stories from our vast treasure troves of data. More statements such as:
