Data: Making a List of the Top 300 Blogs about Data, Who Did We Miss?

Dear friends and neighbors, as part of my ongoing practice of using robots and algorithms to make grandiose claims about topics I know too little about, I have enlisted a small army of said implements of journalistic danger to assemble the above collection of blogs about data. I used a variety of methods to build the first half of the list, then scraped all the suggestions from this Quora discussion to flesh out the second half. Want to see if your blog is on this list? Control-F and search for its name or URL and your browser will find it if it’s there.

Why data? Because we live in a time when the amount of data being produced is exploding and it presents incredible opportunities for software developers and data analysts. Opportunities to build new products and services, but also to discover patterns. Those patterns will represent further opportunities for innovation, or they’ll illuminate injustices, or they’ll simply delight us with a greater sense of self-awareness than we had before. (I was honored to have some of my thoughts on data as a platform cited in this recent Slate write-up on the topic, if you’re interested in a broader discussion.) Data is good, and these are the leading people I’ve found online who are blogging about it.

How the Blogs Are Ranked

I then ran these blogs through my favorite web service, Postrank, which looks at every post across every one of these blogs and scores them in terms of social media engagement: comments left, inbound links from other blogs, times that link was shared on Twitter, bookmarked on Delicious and more. Postrank then ranks all the blogs in any collection in terms of the amount of social media engagement they have received in recent history. That’s where this ranking came from. Nothing but which sites get included is under my control – so I think I can be objectively proud that my co-workers at ReadWriteCloud have come in at #3. Note that you might find a blog or two here where Postrank’s analysis of its feed needs a reset, because it’s hit an error and returned blank results. That’s what happened with the primary O’Reilly feed about data, and I’ve emailed Postrank to ask them to reset their scoring machine for it. That’s especially in need of remedy given that O’Reilly is working hard on a forthcoming conference all about data called Strata. (I’ll be there, moderating a panel on data-driven journalism.)
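For the curious, here is a minimal Python sketch of the general idea: score each post by its engagement, then rank the blogs by their totals. This is not Postrank's actual algorithm, which I don't have the details of. The weights, field names and sample numbers are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical weights: these are assumptions, not Postrank's real formula.
WEIGHTS = {"comments": 3.0, "inbound_links": 5.0, "tweets": 1.0, "delicious": 2.0}

def post_score(post):
    """Weighted sum of a post's engagement counts (missing counts are 0)."""
    return sum(weight * post.get(signal, 0) for signal, weight in WEIGHTS.items())

def rank_blogs(posts):
    """Total each blog's post scores and return (blog, score) pairs, best first."""
    totals = defaultdict(float)
    for post in posts:
        totals[post["blog"]] += post_score(post)
    # Sort by score descending, then by name so ties come out in a fixed order.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

# Made-up sample data, not real measurements.
SAMPLE_POSTS = [
    {"blog": "Blog A", "comments": 4, "tweets": 10},
    {"blog": "Blog B", "comments": 1, "inbound_links": 2, "delicious": 3},
    {"blog": "Blog A", "inbound_links": 1},
]

for position, (blog, score) in enumerate(rank_blogs(SAMPLE_POSTS), start=1):
    print(f"#{position} {blog}: {score:.1f}")
```

Summing per-post scores favors blogs that post often; averaging per post instead would favor blogs with fewer, bigger hits. Which of those a real service does is a design choice I can't speak to.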

After I ran these through Postrank, I pulled down the data the way I wanted it using Needlebase, then put it in this Google Spreadsheet and embedded it here.

I did the same thing with 300 blogs about geotechnology last week – and just like I did then, I’ll ask now: who did we miss? I’d love to get these leader boards built out for several of the top topics ReadWriteWeb covers and turn them into weekly posts highlighting the leading and ascendant voices in niche blogospheres, on topics that will change the future of the web and world.

I imagine that Data Blogs may be a bigger world than Geo Blogs, so I may have missed more this time. Let me know in comments if you’d like your blog included in the index and I’ll add it. Or if you know others that ought to be included. Fun times – and thanks for continuing to blog, folks, in this era of 140-character utterances!

Making an Index of the 300 Top Geoblogs – Who Have We Missed?

If I had my choice in the matter, I would just sit around and read blogs about geotechnology all day. It’s one of my very favorite topics. I don’t get to do that, but I do track the sector for unusually interesting news to cover on the general interest site I co-edit, ReadWriteWeb.

To that end, using a somewhat complex process I came up with some time ago, and with the help of former RWW research intern and geo-nerd Justin Houk, we put together the following collection of nearly 300 blogs covering geotechnology.  Then we ran these puppies through Postrank to track the most-talked-about posts from across the geotechnology blogosphere.  We track those, along with the most-talked-about posts from across a number of other niche topics, to find cool news for nerds.

One of the features that Postrank offers is ranking the blogs in any collection by the amount of reader and social media engagement their posts receive. (Comments, inbound links, Tweets, delicious saves, etc.)  That sounds like fun, doesn’t it?  I thought it could be a cool way to help discover up and coming blogs that readers might not know about and more.  I also liked the way that Postrank showed how rankings changed week over week.

So I thought I’d blog about it! Forgive me if this seems presumptuous (I can’t claim to be an expert in this field) – but it’s the robots doing the ranking! What I ask of you, site visitor, is this: who am I missing? Speak up, now or whenever, and I’ll add your geo-related blog to the index.

I plan to make a weekly posting on ReadWriteWeb about the top geo blogs, the top movers (up and down, with a caveat or two) and probably some selected articles that were big hits.  I’m planning on doing the same thing with the top several hundred blogs in other topics we love at ReadWriteWeb: Internet of Things, Big and Structured Data, maybe education technology, we’ll see.
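Here is a rough Python sketch of how the "top movers" part could be worked out from two weekly rankings. The function names and sample rankings are hypothetical, and this is just one simple way to compare two ordered lists, not how Postrank computes its week-over-week changes.

```python
def rank_changes(last_week, this_week):
    """Map each blog in this week's ranking to how many places it moved.

    Both arguments are lists of blog names, best first. Positive means the
    blog moved up; None marks a blog that was not ranked last week.
    """
    previous = {blog: position for position, blog in enumerate(last_week)}
    changes = {}
    for position, blog in enumerate(this_week):
        changes[blog] = previous[blog] - position if blog in previous else None
    return changes

def top_movers(changes, count=3):
    """Return the blogs with the largest moves in either direction."""
    # Skip new entries (None) and blogs that did not move (0).
    moved = [(blog, delta) for blog, delta in changes.items() if delta]
    return sorted(moved, key=lambda item: (-abs(item[1]), item[0]))[:count]

# Made-up rankings for illustration only.
LAST_WEEK = ["Blog A", "Blog B", "Blog C", "Blog D"]
THIS_WEEK = ["Blog C", "Blog A", "Blog B", "Blog E", "Blog D"]

changes = rank_changes(LAST_WEEK, THIS_WEEK)
print(top_movers(changes))
```

One caveat shows up right away in the sample data: a blog newly added to the index pushes the blogs below it down a place, even though nothing about their own engagement changed.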

I’ve been wanting to figure out a good way to do this for a while, but tonight I learned how to pull data from Postrank using Needlebase (which I love). Want to see quickly if your blog is included in the following list of 300? Control-F should help you search this page for your blog’s name. Let me know in comments if it needs to be added.

For now, let’s start with geotech. This is a fun list, but let me know who ought to be on it and isn’t.

Post of the Day: Location Privacy and Why It’s Legally Different

Kevin Pomfret, Executive Director of the Centre for Spatial Law and Policy, wrote a very good overview post last week about privacy law concerns with regard to location technology.  The overview is written from a clearly pro-technology perspective, but in a legalistic tone.  Pomfret cautions lawmakers not to create overly broad privacy laws covering location technology, at the cost of innovation.  Check it out:

Location data is just now being used to provide a growing number of critical governmental, societal and business services. The number and value of these services are increasing daily. Attempting to regulate the collection of location data without a full understanding of the technology and its vast potential could have a number of unintended consequences, including limiting the development of a number of critical governmental services. Such opportunity costs should be fully understood and explored before regulating location from a privacy standpoint. In addition, any such legislation should be narrowly tailored so as not to inhibit further growth of this important technology.

via Spatial Law and Policy: Location Privacy – Why It Is Different!.

I think the whole post is quite well articulated, and I agree with the sentiment.  What do you think?

More Beautiful Location Data Made More Beautiful

Some geo-data-related links I found interesting tonight, during a Friday night spent reading a geo Twitter list in Flipboard.

NAVTEQ Network for Developers: Check out some of these cool NAVTEQ products

These look very cool – real-time traffic, 3d maps, street sign visualizations and more – available for developers to enrich other location services.  I don’t know what the price point is, how much fun the data is to work with, etc. but I love the idea.

Google Geo Developers Blog: Five Great Fusion Tables + Maps Examples

Google added the ability to include geodata in its Fusion Tables product this week – now this post highlights 5 good examples of maps that you can thus create.  I love me some map publishing – I got to meet the fabulous Pete Warden of Open Heat Map yesterday and was happy to congratulate him on his tool’s coverage in the Columbia Journalism Review.


Google Map Edits Viewed Live…Eventually

The link above is to a new site called Google Map Maker Pulse, where (in theory) you can view live Google Map Maker edits being made all around the world. That’s one of the ways Google Maps gets improved over time. Unfortunately it’s a big 404 right now.

FluidDB » Blog Archive » Importing data into FluidDB with Flimp

The good people at FluidDB (a crazy awesome tool I wrote about here) have built a data importer for their collaborative, dynamic database service and the first data sets they imported are metadata for everything on Data.gov and Data.gov.uk. They say the (meta)data is now more searchable, cross-referenceable and editable than ever before. A whole lot of that is geo data. And what’s not geo data is related to place – because everything is, right? I should add locations to all my blog posts. Reminds me of this excited post I wrote this week about Extractiv – a bulk semantic analysis service that you simply must read about if you’re a data-loving geek.

Bonus: My wife found this video tonight – of the winners so far of Google’s DemoSlam contest.

Location: Home in Portland, Oregon.

Well Socialized Analyst Merv Adrian Goes to Gartner

Data analysis and business intelligence analyst Merv Adrian announced on his blog today that he’s going to giant analyst firm Gartner, and his discussion of the decision is really interesting. He just spent the last two years as an independent, is very active in social media and will now join a much more traditional organization. He’s on Twitter at @merv.

It was just two months ago that Michael Krigsman welcomed Adrian into the Enterprise Irregulars working group.  Other members of the group work in big firms as well.

Adrian credits boutique analyst firm RedMonk with inspiring many of his strategic beliefs about how analysts can participate in social media, and he offered a good critique of standard practices in response to a James Governor blog post discussing Gartner’s social media last spring.

As for participation by the old guard, they have a way to go. Just today I heard of an analyst being called out for putting “too much good stuff” in his/her blog. The notion that it might be a way to draw eyeballs to the for-pay content is still beyond all of them. And with rare (though exemplary) exceptions, twitter is for broadcast, not for dialogue; even if they tolerate some limited interaction with those outside the paywall, it’s probably that they aren’t noticing it. They are most definitely not encouraging or motivating it.

That should give you a little taste of what Merv Adrian will try to bring to the biggest analyst firm in technology, and a firm that is widely considered behind the times when it comes to social media. (Though neither Governor nor Adrian agrees with that sentiment.) I haven’t listened to the Sage Circle podcast linked to at the end of his announcement post yet, but I’m sure that will be good too.

Adrian describes himself as: Technology analyst and consultant, 30 years of industry experience, covering software mostly, hardware sometimes. Former Forrester SVP.

I don’t know Adrian, though I have been following him since putting up this post on ReadWriteWeb about how to follow hundreds of analysts on Twitter with a single click. Anyone who gets props from James Governor and Carter Lusher, and who says the kinds of things it looks like Adrian does, has got my interest piqued, though. Good luck in the new gig, Merv, and keep blogging.

via Going to Gartner « Merv Adrian’s IT Market Strategy.

This post is the beginning of an experiment wherein I put up quick bits about found links that are too long for Twitter but not quite the right fit to post on ReadWriteWeb.

Want 3 Minutes by Phone on ReadWriteWeb?

Do you want to record a three minute explanation of some important geeky topic, by phone, on ReadWriteWeb?

I’ve been doing experiments with audio using Cinch lately and I listened to a wonderful short-form Gov 2.0 event today while walking my dogs.

I want to try putting these two ideas together and record a 3 minute explanation of something important and interesting to post online. I like the idea of a tight time limit: it puts a premium on succinctness and density of information. Put that kind of opportunity in the hands of an articulate professional and they’ll create a high-value experience.

Let me know if you’re interested in contributing a segment and if I choose you and your topic, I’ll let you know and give you a call. We have a few million people stop by RWW each month & I think at least a handful of them will enjoy listening to something like this a lot. Shoot me an email at Marshall@readwriteweb.com to suggest a topic.

One-Click Blog Community Intelligence Button

I frequently discover new blogs and I want to learn more about them. One of my favorite ways to do that is to see which posts a blog’s own readers have been most interested in. The wonderful service PostRank will check out any blog’s feed and score the posts in it based on number of comments, shares on Twitter, inbound links from other blogs, etc. and then let you view just the most popular posts from it.

That’s cool but I’ve had enough copying and pasting and typing in postrank.com/main. So I made this bookmarklet: PostRank It

(To tell the truth, the one I made was a step less simple, then I found this page that has an even better version.)

Click and drag it up to your browser’s toolbar. Then visit a blog. Then click the magic button. Check it out and click on the drop-down button that says “All Posts” and pick something different. Yay! Then come back and tell me how much you love it. Clearly I’m not the first person to think of this – but I’m not going to let that stop me from making a small post about it.