5 Minute Intro to Yahoo Pipes

07.13.08

I'm in the San Francisco airport flying back from a wonderful Foo Camp where I led a discussion about RSS power user tips. It was a lot of fun. Several of the attendees had never used Yahoo! Pipes, one of the most powerful tools in the RSS toolbox. I told them that I too didn't really learn to use Pipes for a long, long time after I first discovered it because it seemed too complicated for my poor little non-developer's head. Once I was shown just two buttons to push in the service, though, I found out that some great results are actually very easy to achieve using Pipes. Just seeing someone do the simplest things there makes it a lot less scary. In that same spirit, I offer the following 5 minute screencast demonstrating 3 simple things you can do with Pipes. I hope it emboldens you to learn how to do even more with the service, but even if you only feel comfortable doing this much, I believe it will still prove very, very useful. Plus it will keep your toes safe (you'll know what I mean after watching the video below).

See other posts about:My Services, RSS, Search, Knowledge Management

The Awesome Potential of the Semantic Web

11.21.07

I just listened to the most amazing podcast about the future of the web and semantic analysis. It was an interview with BYU PhD student Yihong Ding, a researcher in what my ReadWriteWeb co-author Alex Iskold calls "the top-down semantic web." The first 15 minutes of the hour-long show are about Yihong Ding's personal background, the next 15 about his research and the last 30 about his very compelling view of the future.

This interview shows just how much untapped potential remains in the world of web applications. Once our software is capable of deriving meaning from web pages it looks at for us, there's a whole lot of work that will already be done, allowing our human, creative minds to reach new heights.


Download MP3 [50 mins, 23Mb]

Ding's research combines the application of a manually supplied ontology (set of terms with connections for meaning), automated analysis of the structure of a web page (what's in h2 tags? that's probably a section title) and learned meaning after repeated application of the above and correction by the user. It's fascinating and a prototype should be available in the first half of next year. I hope to get an early look at it so I can write about it on ReadWriteWeb just before public launch.
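To make the structural-analysis idea concrete, here is a minimal sketch of the "what's in h2 tags? that's probably a section title" heuristic. It is only an illustration using Python's standard library, not Ding's system; the class and function names are made up for this example.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text inside <h2> tags as probable section titles."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
            self.titles.append("")  # start a new title

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        # Text nested in inline tags (em, a, ...) still counts while inside the h2.
        if self.in_h2:
            self.titles[-1] += data


def probable_section_titles(html):
    """Hypothetical helper: guess section titles from page structure alone."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return [title.strip() for title in collector.titles]
```

A real system would combine guesses like this with the supplied ontology and with user corrections; this sketch covers only the page-structure step.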

The vision of the future described in the interview is beautiful. It's one of the clearest explanations of the semantic web and what some people call web 3.0 that I've heard yet. I'm just starting to dive deep into this, so forgive any excess enthusiasm, but I'm telling you - it's good stuff.

Ding's vision of a future web not of sites and pages but of "educated agents of meaning" (smart software applications is what I'm seeing), driven by human beings to serve our needs, is a really interesting one.

His conclusion makes me think of Google Custom Search, Lijit (which I must spend some time with) and I don't know what else. It's got me on fire, though.

I found the interview through a path you might find of interest. It was highlighted in the blog of Talis, a vendor in the semantic space, in their This Weeks Semantic Web round up. It's a very rich resource, not to mention a great marketing asset for the company. I found that via the blog of semantic web rock star Danny Ayers. I was reminded of Ayers' blog and have picked it back up with a renewed interest after seeing it in a list of 60+ Semantic Web Blogs at Semantic Focus, a fascinating looking group blog where, coincidentally, interview subject Yihong Ding is a regular contributor. So we come full circle and have found a whole lot of valuable resources along the way.

See other posts about:Podcasts, Search, Knowledge Management

Prioritizing your reading list and doing rapid niche research using AideRSS

12 Comments 08.31.07

AideRSS is a service I've wanted to make creative use of for some time. It's neat - you supply an RSS feed and it ranks posts in that feed in order of reader engagement. The company is Canadian, too, and Canadian internet stuff is totally hot.

AideRSS scores each post by the number of comments it received, number of times it's been tagged in del.icio.us, inbound links from a number of blogsearch engines, etc. Thankfully, it scores those posts relative only to other posts in the same feed. So while a post on TechCrunch with 20 comments might score a 5 out of 10, for example, a post on Marshallk.com with 20 comments would score a 10 out of 10! Unfortunately, and this is a big disappointment, AideRSS is just plain wrong far too often - reporting, for example, completely inaccurate numbers for several posts in my feed. Come on AideRSS team, fix these problems. So it's nothing to bet the bank on, but there's some real potential here and as a rough guide it could still be useful today. I've contacted AideRSS to ask why they are getting things wrong as often as they are.
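The "relative only to other posts in the same feed" part is easy to picture with a small sketch. This is not AideRSS's actual formula, which isn't published here; it assumes a single engagement count per post and scales it against the busiest post in the same feed. The function name and sample numbers are invented for illustration.

```python
def relative_scores(engagement_by_post):
    """Scale each post's engagement count to 0-10 against the feed's busiest post."""
    top = max(engagement_by_post.values(), default=0)
    if top == 0:
        # Empty feed, or no engagement at all: everything scores zero.
        return {title: 0.0 for title in engagement_by_post}
    return {
        title: round(10 * count / top, 1)
        for title, count in engagement_by_post.items()
    }


# Hypothetical feeds: the same 20 comments land very differently.
busy_feed = {"big launch": 40, "quiet post": 20}
small_feed = {"my best post": 20, "my other post": 4}
print(relative_scores(busy_feed)["quiet post"])
print(relative_scores(small_feed)["my best post"])
```

Under this assumption, 20 comments is middling on a feed whose top post drew 40 and top of the chart on a feed where 20 is the maximum.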

That's all well and good, it's a good way to see which of your posts are getting the most reader engagement (at least via these gestures being measured) and the widget that AideRSS provides is a neat way to highlight your most popular posts - but I know there's a lot more that's possible here.

Tonight I tried something unusual, at least it seemed that way to me. I plugged the RSS feed for items I've tagged "toread" in del.icio.us into AideRSS. It worked! It appears that the service figured out which were the hottest items in my feed. What a handy way to prioritize! I could grab a scored RSS feed from AideRSS, including "good posts", "great posts" or only the "best posts". Here's a widget displaying the best posts currently in my "toread" feed, according to AideRSS.



Isn't that cool? Obviously it would be nice if users could define the number of characters and items displayed in that widget and the metrics used don't capture anything personalized - but nonetheless, I think there's some real potential here. (The numbers fetched aren't always accurate, either - hopefully that will improve.)

Here's an idea I thought of previously: say you're looking to identify some of the top blogs in real estate. (Woo hoo!?) I would recommend starting at http://technorati.com/blogs/real_estate and sorting by authority. There's an export in OPML link there, which unfortunately will not give you anything other than the top 10 blogs in that category no matter what you try to do, but you can import that OPML into AideRSS. You can then see the hottest posts in each blog, in other words: you can get a feel for what that blog's community of readers takes interest in. So Technorati+AideRSS = easy identification of the biggest interests of top niche bloggers' reading communities. Sounds invaluable to me.
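If you'd rather pull the feed addresses out of an OPML export yourself before handing them to another tool, a rough sketch looks like this. It assumes the export follows the common OPML convention of `outline` elements carrying an `xmlUrl` attribute; the function name and the sample URLs are made up.

```python
import xml.etree.ElementTree as ET


def feed_urls_from_opml(opml_text):
    """Return (title, xmlUrl) pairs for every outline element that carries a feed URL."""
    root = ET.fromstring(opml_text)
    feeds = []
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:  # folder outlines have no xmlUrl and are skipped
            title = outline.get("title") or outline.get("text") or ""
            feeds.append((title, url))
    return feeds


# Hypothetical export with one folder and two feeds.
sample = """<opml version="1.1">
  <head><title>Top real estate blogs</title></head>
  <body>
    <outline text="Real estate">
      <outline text="Blog A" title="Blog A" type="rss" xmlUrl="http://example.com/a.xml"/>
    </outline>
    <outline text="Blog B" type="rss" xmlUrl="http://example.com/b.xml"/>
  </body>
</opml>"""

for title, url in feed_urls_from_opml(sample):
    print(title, url)
```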

These are the kinds of ideas I help come up with and implement with my consulting clients; though we wouldn't want to depend too much on a tool that's as loosely accurate as AideRSS is today.

If this general idea is of interest to you, perhaps more for personal use than marketing purposes, see also Rogers Cadenhead's recent post on APML - Attention Profiling Markup Language. I tagged it in my blog and shared items feed, which you might like to subscribe to.

Thanks for reading.

See other posts about:Advertising, Reviews, RSS, Search, Blogging, Knowledge Management

The best things about Technorati

12 Comments 08.17.07

Technorati CEO Dave Sifry stepped down yesterday and the news gave cynics another opportunity to talk smack about blog search in general. There are a handful of things I really like about Technorati and I think the company deserves a bit of defense. If Technorati takes a dirt nap, I'll be bummed for a number of reasons. (I've had the phrase "dirt nap" stuck in my head for weeks and am very relieved to have the chance to use it here!)

It's not the full text search of blog posts that Technorati is really good for. Google Blogsearch is faster if you want to know if anyone has beat you to a story and Ask.com has much better spam control as it only indexes feeds that have a certain number of subscribers in Bloglines (hello, Google Reader and Blogsearch teams). Technorati has created a whole bunch of awesome experimental features, some of which worked and some of which didn't. I don't know how many of the people behind much of that innovation are still at the company but I hope things brighten up over there in the future.

What is Technorati good for? First, the Blog Index section of the site is very useful. Go to http://technorati.com/blogs/wtfeveryourelookingfor and you'll find blogs that have been tagged as a whole, not on the level of a single post, by their own authors. Sort by "authority" (shudder) and you'll see the ones with the most inbound links. I was talking to a potential client on the phone last week and he asked "are there a lot of real estate blogs?" I knew anecdotally that there were, but quickly visiting http://technorati.com/blogs/real_estate told me there were more than 12,000 in Technorati alone! The Blog Index makes it easy to see which, by one standard, are some of the top blogs in any niche. It's not perfect but it's a good start.

Unfortunately, OPML export of anything more than the first 10 results of these searches isn't possible. That looks to me like broken functionality and as the company slashes staff I have to worry that there's little hope of the best parts of the service being maintained or improved upon.

The second cool thing about Technorati is the company's partnerships with outside traditional large publishers. Specifically, the kinds of relationships they've built like the one with the Washington Post. In some sections of the WaPo website, you can see blogs linking to an article displayed in a little box, courtesy of Technorati. If those are sorted a bit for spam and crap then that becomes great stuff. I know that Sphere is providing related functionality on some sites, but it's not the same. The ins and outs of this sort of service deserve a big blog post in and of themselves.

Finally, the Technorati 100 is a good thing. I know there's a whole lot of criticism of it and a lot of that is valid. I don't like the word "authority" and I don't like measuring authority by links - but linking does mean something and the fact that Technorati shows off a leader board of that metric is worthwhile. FeedBurner ought to too, if the group feels like separating out blogs from the other feeds they publish.

I know that Technorati has been painfully slow at times, the most recent site redesign is awful and the focus on inbound links is overdone - but it's an important company that deserves support in my opinion.

See other posts about:Reviews, Search

Want a custom Web 2.0 search engine? Here’s one!

3 Comments 06.12.07

I'd never used Google Co-op before today. Thanks to a twitter reply by Josh Bancroft in response to one of my questions, I just did. (Turns out it was Rollyo I was looking for, but I don't like it as much so far.) If you'd like the ability to do a Google search inside the following leading web 2.0 sites - see the tool below.

"When, magic 8 ball, has my search term been used on..."

LifeHacker StartupSquad TechCrunch GigaOm Mashable PaidContent ArsTechnica CenterNetworks FranticIndustries ReadWriteWeb NewTeeVee and what the heck - http://marshallk.com !

Just drag this link to Marshall's Magic Search to your browser toolbar or add it to your favorites and kapow! you're searching some big blogs for company names, concepts, whatever! I regularly search TechCrunch for past posts on things I'm writing about, just by dragging the URL for a google search for site:http://techcrunch.com to my toolbar. Now I can do so much more.
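The single-site trick is just a Google query with a `site:` operator in it, so the bookmarkable URL can be built by hand. Here's a small sketch; the function name is invented, and it uses the bare domain (techcrunch.com) after `site:`, which is the usual form of the operator.

```python
from urllib.parse import urlencode


def site_search_url(site, term):
    """Build a Google search URL restricted to one site using the site: operator."""
    query = f"site:{site} {term}"
    # urlencode escapes the colon and turns the space into a plus sign.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(site_search_url("techcrunch.com", "AideRSS"))
```

A custom search engine does the same thing across a whole list of sites at once, which is why it beats keeping one bookmark per blog.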

Try it out:






See other posts about:Search

Rootly Relaunches - Looks Awesome

7 Comments 05.21.07

One of my consulting clients, a news search engine called Rootly, relaunched this afternoon and I'm so proud of them!

Rootly founder Mark Daher and I worked together to improve the aesthetics, functionality and differentiation of the service. It's been some time since I sent him my final recommendations and today the site looks totally unlike it did at the time.

The service provides highly customizable, RSS powered vertical news search based on about 1k preselected sources, plus any sources you add by feed. When a source is added by a sufficient number of users it gains trusted status and enters the general index. The search result feeds are good, there's really easy internal bookmarking, commenting and friends. The best part of it: Rootly accepts OpenID! I can't take any credit for that, but thank goodness! Who wants to create a new account for every service you want to try out? Not me. (I use MyOpenID, personally. It's great and local to Portland.)
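The "trusted status" rule can be pictured with a tiny sketch. Rootly's real threshold and data model aren't described here, so everything below is an assumption: a made-up `SourceIndex` class, a made-up threshold, and counting each distinct user once per feed.

```python
from collections import defaultdict


class SourceIndex:
    """Hypothetical model: a feed becomes trusted once enough distinct users add it."""

    def __init__(self, trust_threshold):
        self.trust_threshold = trust_threshold  # assumed value, not Rootly's
        self.users_by_feed = defaultdict(set)

    def add_source(self, user, feed_url):
        # A set means the same user adding a feed twice only counts once.
        self.users_by_feed[feed_url].add(user)

    def trusted_sources(self):
        return sorted(
            url
            for url, users in self.users_by_feed.items()
            if len(users) >= self.trust_threshold
        )


index = SourceIndex(trust_threshold=3)
for user in ("ann", "bob", "cat"):
    index.add_source(user, "http://example.com/news.xml")
index.add_source("ann", "http://example.com/niche.xml")
print(index.trusted_sources())
```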

In the near term future the site will allow OPML import - which has a whole lot of implications - and a customizable widget for personal startpages.

For more information about the relaunch, see the review at CenterNetworks and more details on the Rootly blog.

See other posts about:My Services, Search

Ask goes nuts on local search - again

2 Comments 03.07.07

Ask.com announced an upgrade today to their already impressive local search tool. Now you can draw a shape on the map with a drawing tool and limit your search to inside that shape. They do so many impressive things over there, yet they are so far behind in market share. Is it too complex? Like the blogsearch tool, I don't even use it myself but it's so smart! They filter out blogs that don't have at least a small number of subscribers in Bloglines. Goodbye blog spam in search results! I should start using them more myself.
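For the curious, limiting results to a hand-drawn shape comes down to a point-in-polygon test. The sketch below uses the classic ray-casting approach and treats coordinates as flat x/y pairs, which is a simplification. It says nothing about how Ask.com actually implements the feature, and the function names and sample data are invented.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point (x, y) lies inside the polygon's vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's horizontal line can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


def results_inside(results, polygon):
    """Keep only the named results whose location falls inside the drawn shape."""
    return [name for name, location in results if point_in_polygon(location, polygon)]


# Hypothetical drawn shape (a square) and two search results.
shape = [(0, 0), (4, 0), (4, 4), (0, 4)]
places = [("coffee shop", (2, 2)), ("bookstore", (5, 2))]
print(results_inside(places, shape))
```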

See other posts about:Search

Now You Can Search YouTube Audio with Podzinger

5 Comments 01.03.07

I just wrote a review over at SplashCast of speech-to-text search engine Podzinger's new feature to search YouTube. It's very impressive and I wanted to make sure readers here knew about it too.

Results are different from searching YouTube metadata, so subscribing to feeds for both searches would probably be a good idea. There are a number of ways to do that, including Vixy's YouTube RSS generator or through the official capacity with a URL like this: www.youtube.com/rss/tag/monkey.rss. That's of course most useful if you want to subscribe to YouTube videos tagged "monkey."

How many people are going to want to subscribe to searches for words used in YouTube? A whole lot, I think.

See other posts about:Podcasts, RSS, Search

Goog sells Baidu shares

4 Comments 06.22.06

Google sold their 5% pre-IPO shares of Chinese search giant Baidu, it was reported today. I guess that means no buy-out and moves instead to increase Google share in China. Or maybe they'll just give up on total world domination and work on dominating search everywhere else. For what it's worth, the shares were bought for $5 mill and were worth $63 mill at the end of May when the sale actually went through. That's a whole lot of AdWords clicks that don't have to happen, I suppose. Just a quick note in case it's of interest; I find anything about non-US web giants of interest.

See other posts about:Search

Google may listen to your TV, but not too closely

2 Comments 06.06.06

Google Research on "Social- and Interactive-Television Applications Based on Real-Time Ambient-Audio Identification"

The Google Research team at last week's Euro ITV (the interactive television conference) won the best paper award for research just posted to the Google Research blog. Their topic? Personalized experiences synchronous with mass-media consumption. That means a system where your computer listens to the TV in your living room, compresses the sound for comparison to a Google-sized audio database and then offers you services online related to whatever you are watching.

This does not appear to be functional yet, but the paper also seems to assure readers that it does not require much new technology either.

Google TV advertising? Wasn't discussed. The examples the Google scientists provided fell into the following four categories:

  • personalized information layers
  • ad hoc social peer communities
  • real-time popularity ratings
  • TV-based bookmarks

Of course advertising can be contextual to any of those, as is shown in the hypothetical screenshot above from the Google paper. There will also be the option of selecting Two Minutes Hate worth of advertising in exchange for access to premium content. Just kidding about that part. The rest of this is real, though.

"If friends of the viewer were watching the same episode of ‘Seinfeld’ at the same time," the paper says, "the social-application server could automatically create an on-line ad hoc community of these 'buddies'."

The paper assures skeptics that privacy will be technically ensured.

The viewer’s acoustic privacy is maintained by the irreversibility of the mapping from audio to summary statistics. Unlike the speech-enabled proactive agent by Hong et al. (2001), our approach will not “overhear” conversations. Furthermore, no one receiving (or intercepting) these statistics is able to eavesdrop on such conversations, since the original audio does not leave the viewer’s computer and the summary statistics are insufficient for reconstruction. Further, the system can easily be designed to use an explicit ‘mute/un-mute’ button, to give the viewer full control of when acoustic statistics are collected for transmission. [...] input-data rates. This is especially important since we process the raw data on the client’s machine (for privacy reasons), and would like to keep computation requirements at a minimum.
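To see why a summary like that can't be turned back into sound, here is a deliberately crude sketch. It is not the method from the Google paper; it simply assumes the audio is a list of numeric samples, chops it into frames, and keeps one bit per pair of neighbouring frames (did the energy go up?). The function names and sample data are invented.

```python
def energy_fingerprint(samples, frame_size):
    """Reduce audio samples to one bit per frame transition: did the energy rise?"""
    energies = [
        sum(s * s for s in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return [1 if later > earlier else 0 for earlier, later in zip(energies, energies[1:])]


def hamming_distance(a, b):
    """Count the positions where two equal-length fingerprints disagree."""
    return sum(x != y for x, y in zip(a, b))


# Two different hypothetical signals: the second is the first at double volume.
quiet = [1, 1, 2, 2, 1, 1, 3, 3]
loud = [2 * s for s in quiet]
print(energy_fingerprint(quiet, frame_size=2))
print(hamming_distance(energy_fingerprint(quiet, 2), energy_fingerprint(loud, 2)))
```

Many different recordings collapse to the same handful of bits, so the bits are enough to compare against a database but not enough to rebuild what was said in the room. The statistics described in the paper are far richer than this toy version.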

There's no mention of localized versions for China, for example. Can the US government be trusted not to demand access to this kind of data? No. I can imagine the privacy concerns here are going to be huge. People may go for it though. I am open to the idea, but I don't think I like it. GMail's contextual advertising doesn't scare me though.

This seems like a recipe for nothing but shopping and superficial interaction. I suppose I could debate with people in my "snobby snobs" group about the veracity of a History Channel show. So maybe I'm wrong.

One way or the other, this seems like a pretty viable vision of the future.

See other posts about:Reviews, News, Search