
The World of Mashups

MashupCamp is unlike most conferences. According to the organizers it’s “The Unconference for the Uncomputer”, which roughly means that nothing is organized, there are no schedules, and it’s a geekfest of mashup developers building, dissecting and boasting about their mashups.

The ubiquity of feeds, and a changed world order in which most companies (Yahoo, Google, MS, AOL etc.) are falling over each other to provide “Free” APIs to all sorts of data/applications, has created a revolution of sorts, with a large contingent of developers building gadgets/widgets and some even dreaming of creating a “startup” with these offerings.

This was my second MashupCamp. This time, however, I had a demo/presentation and a tech session for our newly launched myAOL (and Magnet). It was sobering and a little daunting to note that, unlike at other conferences, you had to “fight” for a room: put up your proposed talk/demo and let the other developers decide whether they would like to attend. The presentation can be found here (http://dev.aol.com/presentations/mashupcamp_feeds/index.html). The discussion went well; topics covered included “standardization of widget formats”, “authentication and ACLs for widgets/data”, etc. Here is an attendee’s view of the proceedings (http://www.column2.com/2007/07/mashup-camp-iv-day-1-aol-and-feed-mashups/).

Some of the interesting sessions/mashups were

  • Google’s Mashup Editor (http://code.google.com/gme/) [you need to sign up in order to use this beta product]
  • An integration of LignUp’s VOIP service with TWiki that brings voice information into the wiki
  • Talk on OpenID by Kevin Lawver (http://presentations.lawver.net/philosophy/mashup_university_tapping_the/)

As mentioned above, there were “startup entrepreneurs” too, such as one couple (Raghu and Veena) who built a mashup using movie data. The site (http://www.seegest.com) was fully implemented by this couple using data from Amazon, Netflix etc. It’s got promise, but the business model around it is as yet unformed (and yes, the developer was aware that the oft-quoted “I will integrate with Adsense” will pay for the hosting bills at the most).

Here is a screenshot of the application. It pretty much says it all.

Seegest Mashup

While on the subject of conferences, Daya Baran of WebGuild was looking around for potential conference themes. Web 2.0 already looks old and jaded, and everyone and their neighbors have done Social Networking to death, so don’t be surprised if this year’s theme is the challenges of Social Networking, mostly revolving around identity, security and portability ;-)


Posted on July 23rd 2007 by stonse

Filed under mashup | No Comments »

AOL’s Personalization Product

Well, finally, the product that I have been working on for the past few months got launched as a Beta. Developing Mgnet (http://beta.my.aol.com/page/mgnet) (pronounced “Magnet”) has been a lot of fun.

Magnet Beta

Personalization has come a long way since the Web 1.0 days. There is still a long way to go, but AOL’s take on it via this Beta launch is the first step on the road to a personalized product with elements of social networking (for the company).

Mgnet is an innovative way to discover and consume content (news, blogs, videos etc.) in an easy to use, intuitive, light-weight manner. Along the way, you will get Recommended content that the system generates. You can of course find your own content by using the “Mgnet Search”. It’s not meant to be a serious news reader; there are other services that do a much better job of that. Nor is it meant to be an RSS feed reader, which is mostly used by “geeks”. The suite does, though, provide a feature rich RSS Feed Reader embedded as a Tab for the tech-savvy audience.

And of course, all of this is housed inside myAOL, a personalized widget hosting system.

Here are a few mentions in the blogosphere.

  • TechCrunch: [http://www.techcrunch.com/2007/07/10/aol-launches-three-new-myaol-products-into-beta]
  • Jeremy O.’s Web Strategist blog [http://www.web-strategist.com/blog/2007/06/25/a-tour-of-myaol/]
  • Washington Post: ‘The more interesting feature is the second tab, which the Dulles company has dubbed Mgnet. Pronounced Magnet’
  • CNET’s Webware: ‘the real surprise here is Mgnet. This is one of the cooler things I’ve seen lately’
  • Rev2.org: ‘With so much content on the web, it becomes hard to sift through and find the kind that most interests you. Mgnet offers a great visual recommendation and discovery tool to help you find what you’re looking for.’

Give the new myAOL a try! And to those of you who do not have or like to use “Screennames”, the product can be used “anonymously” :-)


Posted on July 10th 2007 by stonse

Filed under Personalization | No Comments »

Timing based attacks

With the popularity of AJAX and Web 2.0, a lot of web sites depend on users submitting rich content, which they then process using AJAX calls. The family of attacks known as Cross-Site Scripting (XSS for short) is well known, with many articles and blogs discussing it along with known solutions and checklists.

e.g. http://en.wikipedia.org/wiki/Cross-site_scripting

What is not as well known in the developer community is the family of attacks known as “timing based attacks”. Andrew Bortz et al. from Stanford University discuss this set of attacks in their paper hosted at http://www2007.org/papers/paper555.pdf.

To summarize,

  • There are two types of Timing based attacks
    • Direct Timing Attacks
    • Remote or Cross Site Attacks
  • Direct Timing Attacks
    • The idea here is to determine whether a username, or more commonly an email address used as a login ID, is registered at a web site. Ever notice that when you log on to most systems (including your local Windows box), it takes longer for the system to come back with an error message when you have entered the wrong password? Similarly, most web sites take a discernibly different amount of time to respond to a valid login ID versus one that is not valid. Using this method an attacker can, for example, find out whether user@somedomain.com is a valid user at, say, the Bank of America web site. It then becomes easy to send phishing emails to that user to further the attack.
  • Remote or Cross Site Attacks
    • Here the idea is to use browser based timing mechanisms to determine how long (on average) it takes for a user to load a given web site. For example, the HTML image tag, <img src="..">, can be used to measure the time it takes the user’s browser to load the destination site. Using this information, the authors of the paper were able to find private information such as the number of items in a shopping cart, the number of pictures in a user’s private gallery etc.

What can be done to prevent or protect against these attacks?

  • The authors of the paper suggest server side time padding, which creates a more uniform response time
  • My own observation is that although using email addresses as login IDs is popular for many reasons, it makes it easier for an attacker who has confirmed an address through timing to send your users spam and phishing emails. It is better to use other types of identification such as OpenID (http://openid.net/)
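To make the time padding idea concrete, here is a minimal sketch in Python. It is my own illustration, not code from the paper: the `padded` wrapper, the `check_login` function, the `KNOWN_USERS` table and the 50 ms floor are all hypothetical names and values. The wrapper simply sleeps until a fixed minimum time has elapsed, so the fast "unknown login ID" path and the slower "known login ID" path take roughly the same time as seen from outside.

```python
import hmac
import time


def padded(handler, floor_seconds=0.05):
    """Wrap handler so that every call takes at least floor_seconds."""
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        try:
            return handler(*args, **kwargs)
        finally:
            # Sleep away whatever is left of the time budget.
            remaining = floor_seconds - (time.monotonic() - start)
            if remaining > 0:
                time.sleep(remaining)
    return wrapper


# Hypothetical user table; a real site would store password hashes.
KNOWN_USERS = {"user@somedomain.com": "s3cret"}


def check_login(login_id, password):
    stored = KNOWN_USERS.get(login_id)
    if stored is None:
        return False          # fast path: unknown login ID
    time.sleep(0.01)          # stands in for a slow password hash
    # Constant-time comparison, so the match position does not leak either.
    return hmac.compare_digest(stored, password)


safe_login = padded(check_login, floor_seconds=0.05)
```

Padding only helps if the floor is longer than the slowest legitimate path; a request that overruns the floor still stands out, so this narrows the timing gap rather than closing it completely.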

Posted on June 26th 2007 by stonse

Filed under Web Architecture | No Comments »

Surface Computing and Multipoint devices will kill the mouse? (see comment for Post)

Posted on June 1st 2007 by stonse

Filed under Devices | 1 Comment »

Opening up User Profiles is harmful? (for Personalization systems)

YourNews

Jae-wook Ahn from the University of Pittsburgh presented a paper on a news aggregation and recommendation site (http://www2007.org/program/paper.php?id=602) and concluded, based on their research, that exposing a user’s online profile (data collected by the system based on the user’s news reading habits) and allowing the user to tweak the profile did more harm than good.

Jae-wook’s team developed a news recommendation service (not fully functional/open to the public), deployed at http://ir.exp.sis.pitt.edu/gale/news. Registration is simple: a matter of assigning yourself a username and choosing a password (yet another web service with a username and password to remember; hopefully OpenID will become popular, but that’s a topic for another post).

A chat with Jae-wook revealed that the system currently indexes some 60 odd news RSS feeds. It uses the popular TF x IDF based approach for profiling content. It then exposes the user profile thus collected based on the user’s news browsing habits.
The system apparently has a concept of Short Term and Long Term profiles. It also apparently takes into account the amount of time the user spends reading the news (not quite sure how they do this or how accurate the information collected is).
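For readers who have not met TF x IDF before, here is a small Python sketch of the general technique. It is not YourNews code; the tokenizer, the function names `tf_idf` and `build_profile`, and the plain log(N/df) weighting are my own assumptions, chosen because they are the textbook form. Each term in a story is weighted by how often it appears in that story and discounted by how many stories contain it, and a user profile is then just the summed weights over the stories the user read.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer: lowercase, whitespace split, alphabetic words only.
    return [w for w in text.lower().split() if w.isalpha()]


def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1
        weights.append(
            {t: (c / total) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def build_profile(read_doc_weights, top_k=5):
    """Sum the term weights of the stories a user read; keep the top terms."""
    profile = Counter()
    for w in read_doc_weights:
        profile.update(w)  # Counter.update adds the values per term
    return [term for term, _ in profile.most_common(top_k)]
```

A term that shows up in every story gets a weight of zero, which is the whole point: it says nothing about what this particular user likes.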

A cursory look at the system did not impress me. It mostly is a work in progress. Notice the Tag Cloud that is dynamically updated based on the stories you read?
Well, the system provides the user with a method to disagree with any of the “features” in the user’s Tag Cloud. One simply clicks on the feature in question - and a strike mark appears against the said feature. The UI then refreshes to show a new set of stories apparently adapted to the new User Profile.

The paper (http://www2007.org/papers/paper602.pdf) concludes that exposing the User Profile as described above was clearly harmful for the system performance (well, at least based on their study). It also offers that in case one wants to offer up User Profiles for editing, some argument can be made in favor of removing terms from a User Profile as opposed to adding terms.

Whether or not this is true is not conclusive to me. For starters, the authors acknowledge that they still have numerous problems that have not been fully addressed, including duplicate news and other related issues.
Other studies (not by the YourNews team) have shown that giving the user control over his own profile helps in gaining the confidence of the user (as the user might be apprehensive about the information collected on his behalf).

The jury in this case is still out ….


Posted on May 15th 2007 by stonse

Filed under Personalization | No Comments »

Many roads to Personalization and Recommendation

It appears that the next wave of innovation, now that Web 2.0 has become “so last year”, is “Personalization”.

There are many approaches to Personalization: Filtering, Rules based, Collaborative Filtering based etc. Which one suits your needs depends on a multitude of factors, including the type/nature of your user base, the size of your user base and the nature of the content that is being offered for personalization.

Google is the king of general search. After Google’s success, a number of startups are vying for the market share of “vertical searches”. A hot area of research and study these days is the Holy Grail of Personalization - Personalized Search.

Most recent efforts in the area of Search concentrated on figuring out the “intent” of the user - e.g. when a user typed in “madonna” - did he mean Madonna the singer or Madonna the religious figure. Most efforts tried to obtain data on the popularity of the pre-disambiguated  query term. A number of features were added to the Search Engines to help address this - AOL SmartBox/FindAnything, Google Suggest etc. being some  of them.

Personalized Search is the next frontier. The goal here is to know the user, his/her past behavior in terms of search, browsing and general interaction with the network. Based on this information, the next time the user queries the system, results will be tailored to favor the user’s perceived profile that the system has collected thus far.

Recommendations come in many flavors too …
One can recognize these offerings in various web sites with sections such as “People who have bought this, have also bought ….”, “You may also like …” …

Some of the popular sites that offer Recommendations are

  • Netflix : Offers recommended DVDs for Rent based on your previous rental habits and ratings
  • Amazon: Offers recommendations on products to buy based on items other users have bought (for the item you are currently browsing)
  • Pandora: Offers music recommendations
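A “People who have bought this, have also bought …” list can be approximated with nothing more than co-occurrence counts. The Python sketch below is my own toy illustration and makes no claim about how Netflix, Amazon or Pandora actually do it; the function names and the basket data format are assumptions. For each item it counts how often every other item appears in the same basket, then recommends the most frequent companions.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(baskets):
    """Count, for every pair of items, how many baskets contain both."""
    co = defaultdict(Counter)
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def also_bought(co, item, top_k=3):
    """Items most often seen in the same basket as `item`."""
    return [other for other, _ in co[item].most_common(top_k)]


baskets = [
    ["dvd_a", "dvd_b"],
    ["dvd_a", "dvd_b", "dvd_c"],
    ["dvd_a", "dvd_b"],
    ["dvd_c", "dvd_d"],
]
co = build_cooccurrence(baskets)
recommended = also_bought(co, "dvd_a")
```

Raw counts favor items that are popular with everyone, so real systems typically normalize by item popularity or use a similarity measure instead; this sketch skips that step.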

Services that recommend news stories using techniques such as collaborative filtering include TailRank (a collection of the hottest news in the Blogosphere), MSNBC Newsbot, and Spotback, an Israeli startup that does rating based News/Blogs recommendation.

A couple of papers were discussed at the WWW2007 Conference at Banff.

YourNews is a news recommendation service built by graduate researchers from the University of Pittsburgh. (Watch this blog for a detailed discussion on this service)

Google’s Mayur Datar presented the paper “Google News Personalization: Scalable Online Collaborative Filtering”.

I did have a chance to discuss some of the challenges in the area of Personalization with Mayur and Ashutosh. One of the topics of discussion was in the area of “Negative Feedback”.
Basically, Google News mostly uses Collaborative Filtering techniques to recommend content. In case a user does not like what’s being recommended, how does he go about informing the system about it? Mayur said that while they did experiment with collecting the “non-clicks” as a form of negative feedback, they haven’t done anything extensive in this area.
Another question was on the merits of other forms of Recommendation algorithms such as the Naive Bayesian Algorithm, the Feature Extraction (Bag of Words) based algorithms etc. Mayur admitted that while they did look into these, they did not make any substantial use of these techniques, and he did not offer any reasons for omitting them.

All in all, it was a very intense set of sessions in this fast growing, hot area. Watch out for more products and services on the Web in this area in the very near future ….

Posted on May 14th 2007 by stonse

Filed under Personalization | No Comments »

Wanna bid for a boeing 747?

Y! Research Plenary talk

Ever noticed those Google/Yahoo sponsored links when you typed in a search query?

I always wondered what the h$%k eBay was up to and how they managed to come up as the topmost sponsored link no matter what I was searching for. It got so ridiculous that even if I typed in “java nio” I would end up getting an ad that said “Bid for ‘java nio’ at eBay”.

Prabhakar Raghavan, head of Yahoo! Research, who was the Plenary speaker on May 10th at the WWW 2007 conference, described their modus operandi.
Overture and other bidding based sponsored ad networks at the time used the “bid” price as one of the major factors for ranking sponsored ads. eBay quickly learnt that all they had to do was “bid” the highest pay-per-click price. They could afford to do this, as eBay only had to pay on a per click basis, and users seldom clicked on anything as ridiculous as “bid for ‘boeing 747’ on eBay”.

Hence, eBay got a tremendous amount of brand eyeballs for next to no expense :-) (Of course, it’s debatable how some users might have taken to this intrusive ad). Needless to add, Yahoo, with its new Panama system, has now fixed this anomaly and includes other factors such as CTR before allocating ranks.
Makes me wonder, was Overture so naive as to not use previous CTRs for their ranking? No wonder Y! did not do too well with their sponsored ads.
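Here is a tiny Python sketch of why adding CTR to the ranking closes the loophole. The actual Panama formula was not given in the talk, so scoring by bid multiplied by historical CTR (expected revenue per impression) is my assumption, and the advertisers and numbers are made up. With bid-only ranking, the advertiser with the huge bid and the ad nobody clicks wins the top slot; once CTR is factored in, it drops below a cheaper ad that people actually click.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    advertiser: str
    bid: float  # dollars the advertiser pays per click
    ctr: float  # historical click-through rate, between 0 and 1


def rank_by_bid(ads):
    """Bid-only ranking: highest bid gets the top slot."""
    return sorted(ads, key=lambda ad: ad.bid, reverse=True)


def rank_by_expected_revenue(ads):
    """Assumed alternative: rank by bid * CTR, i.e. revenue per impression."""
    return sorted(ads, key=lambda ad: ad.bid * ad.ctr, reverse=True)


# Made-up numbers for illustration only.
ads = [
    Ad("broad_bidder", bid=2.00, ctr=0.001),  # bids high, rarely clicked
    Ad("niche_seller", bid=0.50, ctr=0.05),   # bids low, often clicked
]

top_by_bid = rank_by_bid(ads)[0].advertiser
top_by_revenue = rank_by_expected_revenue(ads)[0].advertiser
```

In this example the broad bidder earns the network $0.002 per impression against $0.025 for the niche seller, which is why the second ranking also makes more money for whoever runs the ad network.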

PS: I worked on a Sponsored Link product and was doing research on search terms and sponsored links.


Posted on May 11th 2007 by stonse

Filed under Personalization | No Comments »

Tagging and the Social Networking/Web 2.0 revolution ..

Posted on May 11th 2007 by stonse

Filed under Personalization | No Comments »

The Shape of Data …

It’s not often that you get to meet and see in person someone who is considered instrumental in founding the phenomenon known as the World Wide Web. Tim Berners-Lee was the Plenary Speaker at the conference today.

His main thoughts were around the evolution of the Internet from an “ideation” phase to a “tech solution”. It was a micro-level engineering solution to a macro-level problem. He explained how the Web was engineered to be a link based system. He then moved to his favorite topic these days, the Semantic Web, and the need to now focus on linking data.

His plea was for each and every one of us to contribute towards making that link happen - just as the document to document link happened to successfully create the WWW.

A point that he brought up intrigued me. It was on the Shape of Data and how it has evolved. Let’s see …

  • Line: data on a punchcard, tape etc.
  • Matrix: Data on an RDBMS …
  • Tree: XML, RDF ..
  • Net: The Internet. It’s made of links; if you cut one link, it still functions, as it has redundancy built in.

Now, what about the Web?
The Web, Berners-Lee contends, is like a lump, a Fractal Tangle.

Posted on May 10th 2007 by stonse

Filed under Personalization | No Comments »

WWW2007 Conference at Banff, Canada

It’s my first time at the WWW conference. The conference is on for a whole week! Banff is breathtaking. One can only use superlatives while describing Banff and the Fairmont Banff Springs Hotel, where most of us attendees are put up.
Banff
The picture in my hotel room claims that King Edward VIII played golf here, and I have my sights set on doing so too (if I can get a couple of hours’ break from the conference).

But the real candy jar is the rich set of tutorials/workshops and papers that are in the offing. The themes/Tracks are near and dear to my heart.

  • Search and Search Related
  • XML and Data Representation/Interchange
  • Social Networking and Community
  • Tagging, Ontologies, Folksonomies …

It’s definitely promising to be a very intense, information packed few days here :-)

One difference that I noticed between the other conferences that I have attended and this one is that the WWW conference attendees display a distinct academic mindset. It’s a refreshing change from the Web 2.0, mashups-while-you-shop, commercial/startup idea driven conferences of late (although they have their charm too).

Stay Tuned …


Posted on May 10th 2007 by stonse

Filed under Personalization | No Comments »
