Saturday, January 19, 2013

Flash of insight

To put you in my frame of mind, I have three months until my defense and three chapters to analyze data for and write about.

All week I've been working on chapter 5, which is about the database. At my committee meeting in December I presented some very rough data explorations along with the general idea for what the chapter would be about. I did emphasize, however, that the database is a virtual gold mine of projects, so the most important thing for me to do is pick one doable piece for my dissertation that will mark my interest in the field, then continue with this line of research post-PhD. I pointed out that there was a fairly major complication with my chapter idea and that I wasn't sure how to adjust for it. No one had any suggestions. Chip didn't really like the idea I presented, and pointed out some holes in the theory it was based on. He suggested something else instead.

Unsure how to implement Chip's idea, I spent this week forging ahead with mine. I spent hours in R paring down the data. As the dataset shrank and the complications became even more obvious, it became clear that my idea won't work. We don't have the data for it yet.

On Thursday night Jon and I talked about the dismal situation: I had no idea what to do with these data, but I had to figure something out quickly. He suggested a variation on Chip's idea that tackles the complication head-on. I didn't know what to do that wasn't completely descriptive.

On Friday I had a meeting with the database team and Sam. I told Sam that I didn't think my original idea was going to work, and I had no clue what I was going to do instead. Sam said that a strong conceptual hypothesis-testing paper would be great, but he said he doesn't care if I do something descriptive and Herb doesn't care either. I just need to do something with the database. It was liberating news.

So I mentioned some kind of lame ideas and we contemplated those. Meh. Sam asked about an approach used with this kind of data sometimes. I said that I'd thought about it, but I just didn't see how we could possibly do that with our data without some crazy, totally unrealistic assumptions...


...unless we did it at a totally different scale... And if we did it at that scale, then we could include far more of our data... It could work!

And so an idea was born while I was explaining why it wouldn't work.

I spent my Friday night finding the appropriate R packages and getting the data in the right format for a test run. I think I'm on to something big.

