Category Archives: Crafting with data

Mashups Final

This here is a Mashups final! This project was started as part of a Crafting with Data project, and has now turned into this.

Future updates: It needs to be integrated with Google News so that it’s fully cross-referenced with all topics; otherwise it’s pretty much as I imagined it.

Current: A News Project!

And here’s a Presentation about it!



So I’ve built myself a little Quincunx machine, but instead of the usual little bouncy balls I’ve decided to make one that is color-based. Each particle goes through five iterations of “paint”, getting a coat of either black or white each time before dumping its payload into the puddle at the bottom. The longer it runs, the closer to true grey the puddle turns. Bored of waiting for results? Turn that particle flow way up for a screen full of pretty!
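For anyone who wants to poke at the idea without the visuals, here is a rough Python sketch of the same process. It is not the code behind the machine above. It assumes each coat is a fair coin flip between black and white, and that a particle's final shade is simply the fraction of white coats it picked up; the names `paint_particle` and `run_puddle` are made up for this example.

```python
import random

def paint_particle(rng, coats=5):
    """Return one particle's shade after `coats` coats of paint.

    0.0 is pure black, 1.0 is pure white. Assumption: each coat is a
    fair coin flip, and the shade is the fraction of white coats.
    """
    whites = sum(rng.choice((0, 1)) for _ in range(coats))
    return whites / coats

def run_puddle(particles, coats=5, seed=None):
    """Average the shades of `particles` particles dumped into the puddle."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(particles):
        total += paint_particle(rng, coats)
    return total / particles

if __name__ == "__main__":
    # More particles should pull the puddle closer to 0.5 (true grey).
    for n in (10, 1000, 100000):
        print(n, round(run_puddle(n, seed=42), 3))
```

With only a handful of particles the puddle can lean noticeably dark or light; with a large number it should settle very close to 0.5, which is the "true grey" the machine drifts toward.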

Google Search Data

Pretty! This is some of the prettiest data I’ve ever taken a look at. These are Google search trends over a 24-hour period – not how often these terms are searched for, but how quickly their popularity is rising. I had expected a couple of trends, but nothing like this – just look at this beautiful data!

You can already start seeing some patterns here – for example, my script failed twice (I’ve since fixed the bug), and a lot of people who had been searching for pop culture topics (but not other topics) suddenly stopped at 4 AM for no apparent reason, then started again at 5. I’ve already removed a lot of the noise from this version – items had to occur for at least 4 hours to show up here – but as you can see it’s tough to pick out individual trends. Try this instead:

This is a little better. Topics tend to start high and then slowly level off as newer, more interesting news takes their place.
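The noise filter mentioned above (an item had to occur for at least 4 hours to show up) could look something like the Python sketch below. This is a guess at the shape of the thing, not the actual script: it assumes the data is a list of `(hour, topic)` pairs, one per hourly snapshot, and `filter_persistent` is a name invented for this example.

```python
from collections import defaultdict

def filter_persistent(samples, min_hours=4):
    """Keep only topics seen in at least `min_hours` distinct hours.

    `samples` is assumed to be a list of (hour, topic) pairs, one pair
    for each hourly snapshot a topic appeared in.
    """
    hours_seen = defaultdict(set)
    for hour, topic in samples:
        hours_seen[topic].add(hour)

    keep = {t for t, hours in hours_seen.items() if len(hours) >= min_hours}
    # Preserve the original order of the surviving samples.
    return [(hour, topic) for hour, topic in samples if topic in keep]
```

Counting distinct hours rather than raw rows means a topic that gets recorded twice in the same hour doesn't sneak past the threshold.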

Anyway, this is just a rough version – in actuality this is all just a test for the Stream Graph I plan to make with this data. I have been keeping track of whether or not each item is covered by the NYTimes at the time of its search, so that should make the final product even more interesting.

Want to play with the database or this dataset (including the NYTimes info)? Give me a comment below.

Data questions

Here are some complete, off-the-top-of-my-head guesses about some data:

Percentage of people who are “black” on the 1990 Census
35% +-10%

Total egg production in 1965 (number of eggs)
560 Million +-100 Million

Number of airline passengers killed in plane crashes in 1980 worldwide
300 +-100

Percentage of babies born in the US that are girls
50% +-1%

Percentage of entering college freshmen whose probable field of study was physical sciences (1990)
8% +-10%

Number of people who watched the 1995 Super Bowl
50 Million +-20 Million

Number of native French speakers in Canada in 1992
10 Million +-5 Million

Number of babies born in the US in 1992
2 Million +-1 Million

Number of abortions in the US in 1992
1 Million +-50 Thousand

Median household income in the US in 1996
40,000 +-20 Thousand

And here are some ideas for data-related projects!

Compare food bills from before and after NYC instituted the mandatory calorie labeling laws.

Use long-term savings numbers to chart milestones in your life.

How much do people pay in health insurance, vs. how many times a year someone sees a doctor.

Pick a social network and calculate the most common topics posted on.

What are the most searched-for topics on a news site, and what are the most reported-on topics?

Use output from iTunes, Pandora, music folder dates, or another music listening site to tabulate your moods.

Smoking data: round 2

Alright, this is attempt 2 to collect some data on how smoking affects lungs. Last time I was going on force of breath before and after smoking. This time I’m going with length of breath. And I’m finding... nothing. I tested three people, five breaths each before and after, and found nothing. If anything, smoking seemed to raise the average slightly, although this was probably a result of getting to know the test methods better. Either way, I’m willing to say that perhaps there really are no short-term effects of smoking a single cigarette.
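The comparison itself is just a difference of averages per person. A minimal Python sketch of that arithmetic is below; the function names are invented for the example, and the numbers in the usage part are made-up placeholders, not the measurements from this test.

```python
from statistics import mean

def mean_change(before, after):
    """Average of the `after` readings minus average of the `before` readings."""
    return mean(after) - mean(before)

def summarize(subjects):
    """Map each subject to their change in average breath length.

    `subjects` is assumed to be a dict of name -> (before, after),
    where each side is a list of readings in seconds.
    """
    return {name: mean_change(before, after)
            for name, (before, after) in subjects.items()}

if __name__ == "__main__":
    # Placeholder readings only, NOT the real data: five breaths
    # before and five after for one hypothetical subject.
    example = {
        "subject_a": ([10, 11, 10, 12, 11], [11, 11, 10, 12, 12]),
    }
    print(summarize(example))
```

A positive number means the average went up after the cigarette, a negative one means it went down. With only five breaths per side, small differences in either direction are easily just noise or practice.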


Self portrait in data

For Crafting with Data. These are emails I sent to various family members over the last 4 years. Click for full size.

Some data for you!


Here’s some data for crafting! I used a bend sensor to see how hard someone (Carolina) could blow air before and after a cigarette. I ran into a bit of a problem – instead of blowing less hard the second time, she just stopped being able to blow about a third of the way through. I suppose technically this would bring the mean reading down, but it’s not quite what I had in mind. I may have to think of another way to measure this.

Check out the data here: Stuff!