Thursday, 17 October 2013

Are the Videos We are Watching Online Media Fragment Ready?

The Media Fragments URI (basic) specification has been a W3C Recommendation since September 2012. It's pretty exciting to see that many use cases have already emerged that use media fragments for better linking and indexing of multimedia resources. A recent webinar given by one of the chairs of the W3C Media Fragments Working Group, Raphael Troncy, pointed out that YouTube and Dailymotion have partially implemented the specification (see the slides below). What really interests me now is how many "big players" in this area have actually implemented, or partially implemented, this specification, i.e. are they "media fragment ready"? So I spent a couple of days investigating online video sharing applications. If you are not interested in how I did the experiment, you can jump directly to the result.

The Methodology

First things first, I needed a list of websites to look at. Fortunately, I found a Wikipedia page listing major video sharing platforms in different countries. In my opinion, some "major" players are not listed there, so I added them to the list, and I removed a couple of entries because they are not public video sharing services or they serve adult content. You can see all 59 websites I investigated in the final result.

Honestly, I did not contact the websites one by one and ask them "can you give me a URI that points to a certain time or spatial area of the video resources hosted on your website?". I just went to each website and tried the following steps to find out the answer:

  1. Open a desktop browser. I used Google Chrome for this investigation.
  2. Go to the landing page of a random video. Login if necessary.
  3. Right click on the player (of course it must be a Flash player) and see if there is an option called something like "Get video URL at the current time".
  4. On the landing page, check whether there is any social sharing button (including buttons that appear after you pause the video player) that lets you share the video at a certain time point.
  5. Go to Twitter and search for whether any video fragment has been shared recently.
  6. If none of the above works, I conclude that the website doesn't support media fragments, or at least doesn't support the W3C Media Fragments URI (basic) specification.

The methodology is not flawless and I may have missed something. So if you find anything that is not correct in the result table, please feel free to comment and I will fix it.

The Result Table

Tables 1 and 2 show my investigation results. Table 1 lists the name of each website, its implementation status for temporal (t) and spatial (xywh) fragments, and its page views per day. I collected the page views per day manually. I am not 100% sure the figures are accurate, but compared with the site traffic data from Wikipedia they more or less match, so for now I trust them. All page view data is as of October 2013.

Table 1. Media Fragment Compatibility on Video Hosting Services
| Hoster | t | xywh | PageViews/Day | Notes |
|---|---|---|---|---|
| […] | Partially | No | 7,142,857 | Chinese→ |
| […] | No | No | 5,978,260 | |
| AfreecaTV | Unknown | Unknown | 91,674 | Korean→ |
| […] | No | No | 214,174 | |
| BlogTV | No | No | 33,475 | Now at |
| […] | No | No | 804,681 | |
| Buzznet | No | No | 120,733 | |
| […] | No | No | 3,520 | |
| Crackle | No | No | 344,611 | |
| DaCast | No | No | 2,897 | Online Video Streaming. Haven't tried. |
| Dailymotion | Partially | No | 11,702,127 | |
| EngageMedia | No | No | 3,426 | |
| ExpoTV | No | No | 34,042 | |
| Facebook | No | No | 18,600,000 | video since 2007. Estimated based on this report. |
| […] | No | No | 506,678 | |
| Funshion | No | No | 601,300 | Chinese→ Page views according to |
| Fotki | No | No | 139,611 | |
| GodTube | No | No | 68,909 | formerly |
| Hulu | Partially | No | 3,142,857 | Area restriction→ |
| Lafango | No | No | 11,620 | |
| LeTV | No | No | 3,459,119 | Chinese→ |
| Liveleak | No | No | 1,929,824 | |
| […] | No | No | 0 | Page views unknown as the video section is a subset of the whole website, and there is no relevant data about video views online. |
| Mefeedia | No | No | 173,803 | Area Restriction→ |
| Metacafe | No | No | 1,127,049 | |
| Mevio | No | No | 52,276 | |
| Mobento | No | No | 1,014 | Focus on video search |
| Myspace | No | No | 0 | No data about the video views. |
| MyVideo | No | No | 553,319 | |
| MUZU.TV | No | No | 13,801 | |
| Nico Nico Douga | No | No | 7,746,478 | Japanese |
| Openfilm | No | No | 8,810 | |
| Photobucket | No | No | 0 | Mainly sharing photos (5,263,157 views per day). Not sure about video views. |
| RuTube | No | No | 601,750 | Russian |
| Sapo Videos | No | No | 0 | Portuguese. is a portal website, not sure about the video views every day. |
| SchoolTube | No | No | 9,893 | |
| ScienceStage | No | No | 10,314 | |
| Sevenload | No | No | 7,291 | |
| SmugMug | No | No | 542,138 | |
| […] | Unknown | Unknown | 41,323 | Area Restrictions→ |
| […] | No | No | 722,733 | only music videos |
| Trilulilu | No | No | 68,293 | Romanian |
| Tudou | Partially | No | 6,010,928 | Chinese→ |
| Vbox7 | Partially | No | 319,303 | Bulgarian→ |
| Veoh | No | No | 359,359 | |
| Vevo | No | No | 685,358 | |
| Viddler | No | No | 122,073 | |
| Videojug | No | No | 86,901 | |
| Videolog | No | No | 79,687 | Portuguese |
| […] | No | No | 6,697 | So few views? I thought it should be more. |
| Vidoosh | No | No | 3,786 | |
| […] | No | No | 203,628 | Allow timed comments |
| Vimeo | Partially | No | 19,680,000 | |
| Vuze | No | No | 54,380 | |
| […] | No | No | 987 | |
| Wistia | No | No | 153,331 | |
| Yahoo! Video | No | No | 10,000,000 | video since 2008. Estimated based on this report. |
| Youku | Partially | No | 15,277,777 | Chinese→ |
| YouTube | Partially | No | 366,666,666 | |

Table 2. The Supported Media Fragment Syntax in Different Video Hosting Services
Hoster Example url Fragment variable
Dailymotion (is this a bug?)
"start" query in seconds
Hulu st query as start time and et query as end time.
Viddler "offset" query in seconds
"t" query or hash in seconds
Tudou "lvt" query in seconds
Youku "firsttime" query in seconds
"t" query or hash in seconds or DDhDDmDDs format


I examined 59 websites in total, but only 8 of them have implemented some notion of media fragments. The syntax (at least the variable name) differs from website to website, and most of them express the time fragment in seconds. In addition, none of them actually exposes spatial fragments. From this point of view, the result might be a little disappointing (see Figure 1).
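
To make the differences concrete, here is a minimal sketch (not taken from any of these websites) of how the site-specific time variables named in Table 2 could be mapped onto the W3C temporal syntax #t=start,end. The function names and the example.org URL are hypothetical, and the key lists only cover the variables mentioned in Table 2.

```javascript
// Query/hash variables that Table 2 reports as carrying a start or end time.
const START_KEYS = ["t", "start", "st", "offset", "lvt", "firsttime"];
const END_KEYS = ["et"];

// Convert "90", "90s" or "1h2m3s" style values to seconds; null if unparsable.
function toSeconds(value) {
  if (/^\d+(\.\d+)?$/.test(value)) return Number(value);
  const m = /^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$/.exec(value);
  if (!m || !(m[1] || m[2] || m[3])) return null;
  return Number(m[1] || 0) * 3600 + Number(m[2] || 0) * 60 + Number(m[3] || 0);
}

// Look for the first known key in the query string or the hash,
// remove it from the query string and return its value in seconds.
function pick(url, hash, keys) {
  for (const key of keys) {
    const raw = url.searchParams.get(key) ?? hash.get(key);
    if (raw !== null) {
      url.searchParams.delete(key);
      return toSeconds(raw);
    }
  }
  return null;
}

// Rewrite a site-specific URL so that it carries a W3C temporal fragment.
// Returns null when no usable start time is found.
function toMediaFragmentUri(input) {
  const url = new URL(input);
  const hash = new URLSearchParams(url.hash.slice(1));
  const start = pick(url, hash, START_KEYS);
  if (start === null) return null;
  const end = pick(url, hash, END_KEYS);
  url.hash = end === null ? `t=${start}` : `t=${start},${end}`;
  return url.toString();
}

console.log(toMediaFragmentUri("http://example.org/watch?v=abc&t=1m30s"));
```

Of course, a rewritten URI like this only helps if the player behind it understands #t=, which is exactly what the investigation above is about.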

Then, let's look more closely at the page views per day. In my opinion, this is a very important reference. It shows how many of the videos that users actually watch can be further exposed through media fragments, and could therefore be shared via social media, indexed by search engines and even linked to named entities at the fragment level.

Clearly, in Figure 2 we can see that only 12.2% of the video page views are not related to media fragments. It's really exciting to know that most of the videos we watch are already (at least partially) media fragment ready. This can be interpreted in several ways. If I am a user and I want to share only part of a video with my friends, in nearly 9 out of 10 cases I can do it. More importantly, most of the videos we watch can be further indexed at the fragment level. This new SEO possibility will definitely bring more traffic to whichever websites implement it. Please keep in mind that there are still "big players" like Hulu that I haven't investigated yet, so the share of potentially media fragment ready videos could be even larger.
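
The page-view weighting behind this kind of figure is simple arithmetic. As a rough sketch, assuming each row carries the temporal status and the page views per day from Table 1, the share of views on (at least partially) fragment-ready sites could be computed like this. The function name and the row shape are hypothetical, and the sample only contains a few rows copied from Table 1, so it will not reproduce the number in Figure 2.

```javascript
// Share of daily page views that fall on sites with at least partial
// temporal fragment support. Rows look like { hoster, t, views }.
function fragmentReadyShare(rows) {
  let total = 0;
  let ready = 0;
  for (const { t, views } of rows) {
    total += views;
    if (t === "Partially" || t === "Yes") ready += views;
  }
  return total === 0 ? 0 : ready / total;
}

// A few rows copied from Table 1 (not the full list of 59 websites).
const sample = [
  { hoster: "YouTube", t: "Partially", views: 366666666 },
  { hoster: "Vimeo", t: "Partially", views: 19680000 },
  { hoster: "Facebook", t: "No", views: 18600000 },
  { hoster: "Metacafe", t: "No", views: 1127049 },
];

const share = fragmentReadyShare(sample);
console.log((share * 100).toFixed(1) + "% of the sampled page views");
```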

Language Barrier and Area Restrictions

I cannot access some of the websites because they are limited to certain countries and regions. Others are in languages other than English and Chinese, so I could not follow the methodology to look for media fragments there. Please drop me a comment if you have any update about those websites.

Media Fragment in China

I am glad to see that the three largest video sharing or delivery services in China, Tudou (土豆网), Youku (优酷) and LeTV (乐视), already partially support media fragments. Unlike YouTube or Dailymotion, the Chinese sites give media fragments a very fashionable name, "Chuanyue (穿越)", which means you can magically move something from one place to another. The name may seem confusing, but what they (actually "we") want to say is that you can watch the video on any of your devices, pause at any time, and resume from that point on any other device. So it looks as if the video has been moved from one device to another with its playback state preserved.

Conclusions and Future Work

In this post, I presented an investigation of Media Fragments URI compatibility for online video sharing applications. The result shows that even though the number of websites that implement the media fragments specification is not large, the major players in this area have already built the notion of media fragments into their systems, even though they follow different syntaxes. It's very encouraging to see that nearly 90% of the videos we watch daily are "media fragment ready".

Of course, there are many things we need to improve for this investigation, especially collecting more relevant and accurate data:

  1. It would be better to use video view data instead of page view data. After all, you can browse channels on YouTube without actually watching a video.
  2. Find out how many videos each site hosts. Those are the videos that could potentially be exposed by media fragments and indexed at the fragment level. Well, not all of them should be treated equally, because many of them may not be watched as frequently as others.
  3. It would also be interesting to investigate which websites support timed comments, i.e. comments aligned with the timeline of the video. I know we can do that on YouTube and Viki. So theoretically, if they have timed comments enabled, they should have implemented a player API for jumping to a certain time point of the video. But they may not have exposed it as part of the video landing page URL.

This is only the kick-off of a more comprehensive and maybe more scientific investigation into the media fragment readiness of online video sharing services. Now, take a wild guess at how we can use the data presented in Table 1 and Table 2. Maybe you already have some thoughts? Leave me a comment then.

P.S. I should again mention my work on Synote and the Media Fragment Enricher. They are completely compatible with the Media Fragments URI specification for the video landing page :=)

Tuesday, 8 October 2013

Synote Media Fragment Player v1.0: A hack of sharing media fragments on Tumblr

We have just released a new version of the Synote Media Fragment Player (smfplayer), v1.0. Compared with v1.0-alpha, we fixed a few bugs and did more tests on different platforms. In this post, I will explain how to use smfplayer in Tumblr, so that you can share and highlight media fragments in your Tumblr blog.

The problems of sharing media fragments on social networks

See the following Twitter message I sent last year. I only shared a media fragment in the tweet.

In this case, the video server must be able to handle the media fragment, i.e. parse the fragment URI and highlight the video fragment. YouTube and Dailymotion have partially implemented this function. If you right click on the YouTube Flash player, you will see a link "copy URL at current time". Unfortunately, the string format of those URLs is not compatible with the W3C Media Fragments URI (basic) specification, and they cannot highlight spatial fragments.

That is the reason we designed the Synote Media Fragment Player (smfplayer). In the previous post, I explained how to use smfplayer. Theoretically, this method could be extended to other blogging or CMS systems such as WordPress, where you have full control of the web page template and its rendering.

Well, I have to admit that smfplayer doesn't work on Twitter and Facebook, as the user has no control at all over how Twitter and Facebook render the embedded video code. However, Tumblr is a kind of exception.

Some words about Tumblr

For anyone who doesn't know Tumblr very well, Tumblr is a new microblogging and social network platform. It's sort of like Twitter+Facebook, where you can easily share multimedia posts (images, audio and video) with your followers. You have a blog page, where all your posts can be viewed, and you can get updates about what other users share on the Dashboard. Posts on Tumblr are usually short (one sentence, one image, etc.) compared with Blogger.

Video sharing is very straightforward on Tumblr. You can embed video code or upload a video yourself. The question for me is whether it is possible to use smfplayer in Tumblr.

How To?

It's not difficult, but you need some programming skills. First, you need to customize your theme and add the smfplayer CSS and JavaScript files to the header:

<link href="" rel="stylesheet" type="text/css"></link>
<link href="" rel="stylesheet" type="text/css"></link>
<script src="" type="text/javascript"></script>
<script src="" type="text/javascript"></script>
<script src="" type="text/javascript"></script>

The next question is how to initialise smfplayer in the template. This is the tricky part! If you look at the HTML code of the template, you will see that all the post types, such as text, image, chat, audio and video, are placed in {block:something} tags. For example, this is the code for the Photo block:

<div class="media">{LinkOpenTag}<img src="{PhotoURL-500}" alt="{PhotoAlt}" />{LinkCloseTag}</div>
    {block:Caption}<div class="copy">{Caption}</div>{/block:Caption}

The code for the Video block is similar. So one straightforward idea is to create a video/photo post first, and then replace the media div with smfplayer. But how can we locate the correct div with the media class for the video, as there might be many such divs? And how can we pass the media fragment URI to initialise smfplayer? To solve these two problems, we need to make full use of the div class="copy" and the HTML editor provided by Tumblr.

Whenever you publish a photo or video post, you can add "caption" for the post and you are allowed to embed HTML into the caption. So I can create a post like this:

The div class="smf-placeholder" specifies that the media embedded in this post needs to be replaced by smfplayer. We also use the HTML5 data API to define the actual "mfuri" that will be used to replace the media div. This is because embedding a video in Tumblr from YouTube or other sources results in an iframe, whose content depends on the video provider, so it's difficult to traverse the DOM and find the exact URI of the video. Then, we can add the following code into the HTML head:

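    var placeholders = $(".smf-placeholder");
    if (placeholders.length > 0) {
        placeholders.each(function (i, placeholder) {
            // The media div sits just before the caption div (class "copy") in the template
            var smf_div = $(placeholder).closest(".copy").prev(".media");
            var mfuri = $(placeholder).data("mfuri");
            // Assumption: mfURI is the option that takes the media fragment URI
            var player = smf_div.smfplayer({ mfURI: mfuri });
        });
    }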

The code finds the nearest media div and initialises smfplayer. The following image is a screenshot of the blog post on Tumblr. You can also go to example1 and example2 to see it yourself.

Known Issue

One serious problem with this solution is that Tumblr depends heavily on the "Dashboard", which has nothing to do with the template. So your followers won't be able to see any video playing in your blog post from their Dashboard. All they can see is the caption :( As a temporary solution, we can upload a thumbnail image as a photo post and add a link "View Media Fragment" that leads users to the actual blog page. This will also be a problem for users on tablets and mobile phones.

Tumblr templates vary from user to user, so the code that works for my template doesn't necessarily work in yours. However, the basic idea is that you use the caption and the HTML data API to store the media fragment URI and initialise smfplayer from JavaScript.
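
As an illustration of that idea, the placeholder that goes into the caption could be generated like this. This is only a sketch: the helper names and the example.org URI are hypothetical, the class name and the data-mfuri attribute are inferred from the jQuery code above (which reads .data("mfuri")), and whether the Tumblr editor keeps the attribute may depend on your setup.

```javascript
// Escape characters that would break out of an HTML attribute value.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the caption placeholder that the template script looks for.
// jQuery exposes the data-mfuri attribute as .data("mfuri").
function buildPlaceholder(mfuri) {
  return '<div class="smf-placeholder" data-mfuri="' + escapeAttr(mfuri) + '"></div>';
}

console.log(buildPlaceholder("http://example.org/video.mp4#t=3,8&xywh=40,220,200,50"));
```

The resulting markup is what you would paste into the HTML view of the caption editor, followed by whatever caption text you want your readers to see.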


In this post, I introduced a way to share media fragments on Tumblr. This solution is still a hack, but it should work well as a demo for exploring new ways of sharing videos. Again, a more sophisticated solution would be for the social network platforms to support the sharing of media fragments natively, following the W3C Media Fragments URI specification.

Tuesday, 2 April 2013

Synote Media Fragment Player: a polyfill to play the temporal and spatial fragments in a media fragment URI


A very long time ago, I intended to extract the media fragment player we developed for Synote as an individual, easy-to-use HTML5 player. Finally, I recently got the time to develop the Synote Media Fragment Player (smfplayer). In general, smfplayer is a jQuery plugin and polyfill that lets web developers easily play back the temporal and spatial fragments of different video/audio resources. In Synote, smfplayer only replays the temporal fragment. But, inspired by two great pieces of work from Ninsuna and Chris Coyier, I think it's also possible to create a polyfill to highlight the spatial dimension.

The Rationale

Nowadays, leaving aside live streaming, there are two kinds of multimedia resources we usually want to play on the Web: video/audio files hosted on a web server (Apache, Amazon S3, etc.), and multimedia sharing websites (YouTube, Dailymotion and Vimeo). As for media fragment support, the Media Fragments Working Group showcase page lists the latest support in some browsers. Meanwhile, YouTube, Dailymotion and Vimeo all have "copy current time URL" or similar functions, but they are based on the Flash player, not HTML5. So it is still impossible to use a <video> or <audio> tag and the "src" attribute to enable media fragments in all browsers and for all those multimedia resources.

There are zillions of HTML5 video players out there, and some of them, like JW Player, offer "fallback" compatibility for Flash (or even Silverlight). Theoretically (I mean, forgetting about the legal issues), any HTML5 player with Flash fallback can embed the chromeless Flash player from online video sharing websites and give us a uniform UI for all resources. After searching around, I found that only the MediaElement.js player (Mejs) really suits my requirements. What amazed me most about Mejs is that it has "fallforward" Flash and Silverlight plugins. Based on the browser's compatibility with a certain file type, Mejs can automatically choose the best way to play the video: HTML5, Flash or Silverlight. More surprisingly, Mejs can directly play YouTube and Vimeo (as well as Dailymotion with a pull request). Mejs offers a uniform interface to control all the players (HTML5, Flash and Silverlight).

So next, using Mejs as the foundation, we can start to build the polyfill that controls the player and highlights media fragments. First things first, we need a media fragment parser. Fortunately, such a library has been developed by Thomas Steiner. After obtaining the start time, end time and xywh defined in the media fragment URI, we can start highlighting both temporal and spatial fragments.
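
To give an idea of what such a parser has to do, here is a deliberately small sketch. It is not Thomas Steiner's library and does not follow its API; the function names and the example.org URI are hypothetical, and it only handles the t (NPT seconds or hh:mm:ss) and xywh dimensions in the URI hash.

```javascript
// Parse an NPT time such as "93", "93.5", "01:33" or "00:01:33" into seconds.
function parseNpt(value) {
  const parts = value.replace(/^npt:/, "").split(":").map(Number);
  if (parts.length > 3 || parts.some(Number.isNaN)) return null;
  return parts.reduce((total, part) => total * 60 + part, 0);
}

// Extract the temporal (t) and spatial (xywh) dimensions from the URI hash.
function parseMediaFragment(uri) {
  const result = { start: null, end: null, xywh: null };
  const hashIndex = uri.indexOf("#");
  if (hashIndex === -1) return result;
  for (const pair of uri.slice(hashIndex + 1).split("&")) {
    const [name, value = ""] = pair.split("=");
    if (name === "t") {
      // "t=10,20", "t=10" and "t=,20" are all allowed; a missing start means 0.
      const [start = "", end = ""] = value.split(",");
      result.start = start === "" ? 0 : parseNpt(start);
      result.end = end === "" ? null : parseNpt(end);
    } else if (name === "xywh") {
      const m = /^(?:(pixel|percent):)?(\d+),(\d+),(\d+),(\d+)$/.exec(value);
      if (m) {
        result.xywh = {
          unit: m[1] || "pixel", // pixel is the default unit
          x: Number(m[2]),
          y: Number(m[3]),
          w: Number(m[4]),
          h: Number(m[5]),
        };
      }
    }
  }
  return result;
}

console.log(parseMediaFragment("http://example.org/v.mp4#t=00:01:30,00:01:33&xywh=40,220,200,50"));
```

A real parser also has to deal with percent-encoding, the track and id dimensions, and invalid ranges, which is a good reason to use an existing library rather than a sketch like this.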

Highlighting the temporal fragment is easy. When Mejs autostarts playing or the user clicks the play button, smfplayer jumps to the start time if it is defined, and it pauses the video once currentTime is greater than the end time. For spatial highlighting, smfplayer creates an overlay <div> on top of the video based on the xywh value in the media fragment URI. You can define xywh in two ways, pixel and percent, so we need to ask the user for the original width and height of the video if percent is used. In the Ninsuna example, only the spatial fragment area is visible, while smfplayer just outlines the area with a highlight colour (yellow by default). One issue with this solution is that when the video goes into fullscreen mode, the overlay <div> disappears.
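
The two highlighting steps can be sketched roughly as follows. This is not the actual smfplayer source: the function names are hypothetical, the media argument is assumed to behave like an HTML5 media element (currentTime, pause(), play and timeupdate events), and the overlay is reduced to computing the box in pixels.

```javascript
// Temporal part: seek to the start on the first play, pause after the end.
function attachTemporalFragment(media, start, end) {
  media.addEventListener("play", function onFirstPlay() {
    media.removeEventListener("play", onFirstPlay);
    if (start !== null) media.currentTime = start;
  });
  if (end !== null) {
    media.addEventListener("timeupdate", function onTimeUpdate() {
      if (media.currentTime > end) {
        media.pause();
        media.removeEventListener("timeupdate", onTimeUpdate);
      }
    });
  }
}

// Spatial part: turn an xywh value into a pixel box for the overlay <div>.
// Percent values need the original video width and height to be resolved.
function overlayBox(xywh, videoWidth, videoHeight) {
  const fx = xywh.unit === "percent" ? videoWidth / 100 : 1;
  const fy = xywh.unit === "percent" ? videoHeight / 100 : 1;
  return {
    left: Math.round(xywh.x * fx),
    top: Math.round(xywh.y * fy),
    width: Math.round(xywh.w * fx),
    height: Math.round(xywh.h * fy),
  };
}

console.log(overlayBox({ unit: "percent", x: 25, y: 50, w: 50, h: 25 }, 480, 320));
```

In a page, the returned box would be applied as the left, top, width and height of an absolutely positioned <div> with a coloured border, placed inside the same container as the video.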

Demo Time!

The following YouTube video is an introduction to Web Platform. Well, Web Platform is exciting for me, but what's more exciting is that the creator of this video somehow decided to call Sir Tim Berners-Lee a WEB DEVELOPER!

I want to share this interesting finding via Twitter. Tim only appears in this video for 4 seconds from 1:30, so I can send this link via Twitter:

This is not enough. What I really want my followers to see is the funny title given to Tim Berners-Lee in the bottom-left corner. So what I really want to share is the following media fragment URI, given that the video resolution is 480x320:,00:01:33&xywh=40,220,200,50

And using smfplayer, you can easily write a program to play back this media fragment:

This demo is similar to one of Twitter's features: the in-page embedded playback of videos from the tweet list page. As a developer, you would normally need to write different code to embed the different players from YouTube, Dailymotion and Vimeo, and files from web servers. With smfplayer, you can implement this function easily. You can try playing different resources on the smfplayer demo page.


The following are some useful links in case you want to know more about smfplayer:

All in all, smfplayer is a polyfill and a temporary solution for developers who want to play back media fragments. I really hope that some day browsers will implement these functions natively.

Friday, 31 August 2012

HTML5 Video and WebVTT Support in Mobile Browsers

Since it was proposed by Opera Software in 2007, the HTML5 video tag has been around for a pretty long time. The video tag is a great step forward in making video a first-class citizen of the Web. Unfortunately, different browsers understand this citizenship in different ways. The citizenship of AVI has been terminated. WMV needs to hold a Silverlight visa in order to get entry clearance to browsers other than IE. FLV's visa application for iOS has been denied by the Apple Empire, and Adobe has declared that it's not going to help FLV re-apply. H.264 is like a mafia that asks browsers for protection fees. You see, there is no Federal Government in HTML5.

Since smartphones joined the Web, things have got worse again. It seems to me that there are no systematic tests of video compatibility across mobile browsers. So the goal of our test is to find out which video formats can be played in which browsers, or in which major versions of those browsers.

Our Test

We need to emphasise a few things before going into the test.
  1. All the browsers in the test are on mobile phones. They are not desktop versions of the browsers.
  2. We only tested on Android phones and iPhones. Browsers for Windows Phone, BlackBerry and Palm webOS are not tested.
  3. All the tests were carried out on mobile phones. Well, yes, you can make a phone call using a 7-inch Samsung Galaxy tablet if you don't feel it's too big. But the mobile phones used in this test have screen sizes from 3.0 to 4.8 inches.
  4. Not every browser, or every version of a browser, can be used on every mobile device. For example, there is no Opera Mobile browser on iPhone iOS 5, and the same goes for Firefox. At the time of testing, we chose the latest version of each browser available on that OS.
  5. We suppose that the server which hosts the video file is properly configured in order that the file could be properly delivered.
Finally, we have tried our best to keep the data correct on the phones we have tested. But it's quite likely that some of the test results are not accurate. If you find anything that is not correct, please feel free to let me know. I would be glad to test it again. OK. Let's go!

Test 1, Playable

This test shows whether a video file with a certain codec can be played in a certain browser on a mobile phone. We tested MP4, Ogg and WebM on Android 2.3, Android 4.0 and iOS 5. We designed a test page that embeds MP4, Ogg and WebM videos in one page, and then opened this page in different mobile browsers on the different mobile OSs. Tables 1, 2 and 3 show the results.

Table 1. HTML5 Video Compatibility in Android 2.3
Browser MP4, H.264 Ogg, Theora WebM, VP8
Opera Mobile 12.0 Yes No No
Opera Mini 7.0.29 No No No
Firefox 14.0.1 No Yes* Yes*
Android Native Browser Yes No No
*Unlike other browsers, the video will not enter full screen when playing, and there is no full screen control in the player in Firefox 14.0.1.

Test Phones for Table 1:

  1. HTC Design S, Android 2.3.5, 3.7-inch touch screen
  2. Sony SK17i, Android 2.3.4, 3.0-inch touch screen
  3. Samsung Galaxy S II, Android 2.3.3, 4.3-inch touch screen
  4. HTC G10, Android 2.3.5, 4.3-inch touch screen
P.S. Google Chrome is not compatible with Android 2.3.

Table 2. HTML5 Video Compatibility in Android 4.0.4
Browser MP4, H.264 Ogg, Theora WebM, VP8
Opera Mobile 12.0 Yes No No
Opera Mini 7.0.29 No No No
Chrome 18.0 Yes No Yes
Firefox 14.0.1 No Yes Yes
Android Native Browser Yes No No
Test Phones for Table 2:
  1. Samsung Galaxy S III, 4.8-inch touch screen

Table 3. HTML5 Video Compatibility in iOS 5.1.1
Browser MP4, H.264 Ogg, Theora WebM, VP8
Safari 5 Yes No No
Chrome 21.0 Yes No No
Opera Mini 7.0.29 No No No
Test Phones for Table 3:
  1. iPhone 4S, 3.5 inches screen
  2. iPhone 4, 3.5 inches screen
As a summary, here are some interesting findings:
  1. There is currently no "killer" video codec format for mobile browsers. But MP4 seems compatible with most "default" browsers, i.e. the Android Native Browser on Android phones and Safari on iPhones.
  2. Opera Mini is pretty disappointing, as no video could be played.
  3. Unlike on desktop browsers (see this table), WebM is not well supported by the native players in mobile browsers. Interestingly, WebM is developed (or more accurately, sponsored) by Google, but the Android Native Browser, which is also developed by Google, can't play WebM.
  4. Most mobile phones play the video in full screen mode by default, and users cannot leave full screen without stopping the video. Firefox is an exception. What's more, if the screen of the mobile phone is bigger than 4.3 inches, the video sometimes won't be played in full screen mode automatically.
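
If you want to run a quick check of your own, a browser's claims about these three formats can be probed from script with canPlayType. This is only a sketch and not the test page we used: canPlayType returns a hint ("", "maybe" or "probably"), so it can disagree with what actually plays, which is why the tables above are based on real playback. The codec strings are common examples, and the function name is hypothetical.

```javascript
// MIME types (with example codec strings) for the three formats in the tables.
const FORMATS = {
  "MP4, H.264": 'video/mp4; codecs="avc1.42E01E"',
  "Ogg, Theora": 'video/ogg; codecs="theora"',
  "WebM, VP8": 'video/webm; codecs="vp8"',
};

// Ask a <video>-like object which formats it claims to support.
// canPlayType returns "" when the browser is sure it cannot play the type.
function probeFormats(video) {
  const report = {};
  for (const [label, mime] of Object.entries(FORMATS)) {
    report[label] = video.canPlayType(mime) !== "";
  }
  return report;
}

// In a browser you would call: probeFormats(document.createElement("video"))
// Here a stand-in object that only claims MP4 support is used instead.
const mp4Only = { canPlayType: (mime) => (mime.startsWith("video/mp4") ? "probably" : "") };
console.log(probeFormats(mp4Only));
```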

WebVTT support in Mobile Browsers

The HTML5 video tag not only brings native video playback, but also adds some cool features alongside the video. One of them is the track tag for subtitles. Subtitles are very important for the accessibility of videos on the Web. Many standards and de facto standards have been around for a long time: SubRip (SRT), Timed Text Markup Language and .sub, to name a few. I personally think the most promising one is Web Video Text Tracks (WebVTT). WebVTT is specially tailored for videos on the Web, and you can also use it for audio descriptions, chapter navigation, etc. Silvia Pfeiffer has given a very good presentation about WebVTT.
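
For readers who have not seen the format, a WebVTT file is plain text: a WEBVTT header followed by cues, each with a time range and its text. The sketch below, with hypothetical function names and a made-up cue, shows how a list of cues could be serialised; it covers only the basic cue layout, not styling or cue settings.

```javascript
// Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function formatTimestamp(seconds) {
  const ms = Math.round(seconds * 1000);
  const pad = (n, width) => String(n).padStart(width, "0");
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms % 1000, 3)}`;
}

// Serialise cues of the form { start, end, text } (times in seconds).
// Blocks are separated by a blank line, as the format requires.
function toWebVtt(cues) {
  const blocks = cues.map(
    (cue) => `${formatTimestamp(cue.start)} --> ${formatTimestamp(cue.end)}\n${cue.text}`
  );
  return ["WEBVTT", ...blocks].join("\n\n") + "\n";
}

console.log(toWebVtt([{ start: 90, end: 93.5, text: "An example subtitle line" }]));
```

The resulting file is what a track tag inside the video tag would point to with its src attribute.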

WebVTT is still a moving target, and there are many things that need to be nailed down before the first release. Apart from the WebVTT specification itself, as developers we really care about the technical support for WebVTT in different browsers, and whether there are any tools out there that we can easily use. Silvia summarised the current situation of WebVTT support in different desktop browsers.

A further question we ask is whether any mobile browsers support WebVTT natively. After our test, the answer is NO! Table 4 shows our test results. "N/A" means that a certain video format cannot be played in that browser, so it is meaningless to say whether WebVTT can be displayed or not.

Table 4. WebVTT Support in Different Mobile Browsers
OS Browser MP4, H.264 Ogg, Theora WebM, VP8
Android 2.3 Opera Mobile 12.0 No N/A N/A
Opera Mini 7.0.29 N/A N/A N/A
Firefox 14.0.1 N/A No No
Android Native Browser No N/A N/A
Android 4.0.4 Opera Mobile 12.0 No N/A N/A
Opera Mini 7.0.29 N/A N/A N/A
Chrome 18.0 No N/A No
Firefox 14.0.1 N/A No No
Android Native Browser No N/A N/A
iPhone iOS 5 Safari 5 No N/A N/A
Chrome 21.0 No N/A N/A
Opera Mini 7.0.29 N/A N/A N/A

Of course, there are tons of HTML5 video players on the Web: VideoJS, PopcornJS, MediaElementJS, JWPlayer, Kaltura Player, etc. Some of them do support WebVTT on desktop browsers. But they are all "polyfills", which means the support is not native; they use JavaScript and CSS to somehow present WebVTT together with the video. We did test MediaElementJS on iPhones, but unfortunately it still failed to display the subtitles. However, MediaElementJS can successfully display WebVTT on the iPad.

Conclusions and Future Directions

Testing the HTML5 video tag and WebVTT on mobile browsers is important. Developers should be clear about which videos can be played on the target device. On the server side, web applications need to host the format that is most compatible with the target device. Video sharing services, such as YouTube, must be adaptive enough to deliver videos to different devices and browsers. Subtitles are always important for videos, especially on mobile platforms, where subtitle support is not yet thorough.

With each update or new release of a browser, new codecs might be supported and old formats might be abandoned. WebVTT is currently not well supported on mobile browsers, and we are still waiting for a breakthrough.

The tests tell us that HTML5 video support is on the way. In the tests above, we tried to keep things simple, i.e. we didn't bring in a lot of variables and we just used the most popular phones and the most popular browsers. However, the real world is not that simple. Between mobile phones and desktop machines, we have tablets, from 7 inches to 10 inches. The screens of mobile phones are getting bigger and bigger. I am very curious about how different browser engines choose the display mode for these screen sizes. Is there a borderline or threshold? So in the future, it might be necessary to compare the HTML5 video tag in a single browser across devices of different sizes (phones and tablets). We also hope that more documentation on this aspect will be released or discussed.

Again, if you find anything wrong or anything new, please feel free to leave your comments.

Tuesday, 28 August 2012

The Chinese Copy of Web


I have been thinking of writing something about the "Chinese Copy" of the Web for a long time. Why? Because I think most people outside China don't know how Chinese people spend their time on the Web. Of course, language is a problem, but more importantly, the Web in China is a silo isolated from the outside world. China had 538 million internet users in July 2012, and the number is going to hit 800 million in 2015 (if the Chinese government is not bluffing, and I believe it is not)! Many of my friends are pretty interested in what a quarter of the world's internet population is doing on the Web. They ask me questions such as: do they go to Facebook, or view videos on YouTube? My answer is simple: we have counterpart services in China. Finally, I put my hands on the keyboard and started listing those counterpart services. I think this kind of information will help a lot of people in other countries to know China better.

Let's List!

I will list some of the major counterpart services on the Web in China in different aspects. I won't list many statistical comparisons as there are plenty of them which could be found on the Web.

Social Networks

No one will deny that Facebook is the largest social network website in the world. We have two major applications similar to Facebook: Renren (人人网, which means "everybody"), formerly known as Xiaonei (校内网, which means "network for students in schools or colleges"), and Kaixin (开心网, which means "happy"). Facebook claimed 500 million users in 2011, while Renren had around 137 million users in September 2011, and it's increasing by 24 million to 38 million every month.

Well, if you go to Renren or Kaixin, you will notice that, the user interface looks very similar to Facebook. If you happen to know Chinese and have an account, you will find that the personal pages are similar too. The only difference is the colour.

Search Engine

You know Google? Good! But in China, Google only takes around 16% of the search engine market. The largest search engine in China is Baidu (百度). What Baidu good at is searching Chinese documents on the Web. If you happen to know some Chinese friends, you can ask him or her to explain the video above, which tells the different between Baidu's search and Google's search, even though it explains it in a funny way.

One of Baidu's founders is Robin Li (李彦宏), who studied and worked in the U.S. He is something of a legend in China. In Semantic Web terms, I would say "Robin Li owl:sameAs Bill Gates". Baidu has now successfully completed its IPO on NASDAQ and has extended its market to Korea and Japan, where the languages are quite similar to Chinese.

Video Sharing 

We all know YouTube, Vimeo, Dailymotion, etc. In China, there are many video uploading and sharing websites. Two famous ones are Tudou (土豆, which means "potato") and Youku (优酷). Tudou and Youku recently announced a merger plan, but it is still in progress. After the merger, "Youku Tudou" will become the largest video site. I am not a businessman, but as far as I can see, Youku and Tudou serve similar content, and both are similar to YouTube.

Online Shopping

Please gimme the first two websites that come to your mind when you want to buy something. Yes, they are eBay and Amazon. In the UK, eBay is the best place for individuals or small businesses to sell or buy something quickly at a reasonable price. In China, we have an online shopping website called Taobao (淘宝网, which means "looking for treasures"), owned by Alibaba Group. You only need one sentence to describe Taobao: "you can buy anything there". Jack Ma (马云), who owns Taobao and Alibaba Group, is one of the first entrepreneurs who brought e-commerce to China. Interestingly, he claims that he doesn't know anything about computers or the internet, on the technology side I mean. He even finds it hard to send an email!

One of the most widely used eBay products is PayPal. Its counterpart in China is called Alipay (支付宝, which roughly means "a tool that can pay for anything in any way"). Not surprisingly, Alipay was developed by Alibaba Group (God! We just copied EVERYTHING, even the way it exists!). By 2010, Alipay was said to have 550 million registered users, not only in China but in Chinese communities all over the world.

Besides eBay, we are also familiar with Amazon. But Amazon is not very successful in China because of Dangdang (当当网), which was founded by a couple, Peggy Yu and Li Guoqing. Like Amazon, Dangdang started by selling books online and then extended its market to household merchandise, digital devices, etc.

Instant Messaging

Does anybody remember what ICQ is? Maybe you know MSN Messenger, Yahoo Messenger or Skype. As far as I can recall, ICQ was the first instant messaging application I ever used. While we no longer pay any attention to ICQ (maybe you never did), its counterpart in China is growing tremendously, beyond anybody's imagination. The application is called Tencent QQ (腾讯QQ). I personally think it is a copy of ICQ, as the original name of Tencent QQ was OICQ, or Open ICQ.

Instant messaging may not strictly count as a Web-based application, as it started out on the desktop. But I need to mention Tencent, the company, and QQ here, because it is another legend in China. Even though the CEO of Tencent, Ma Huateng (马化腾), has been severely criticised for plagiarism, he was still ranked the second richest person in IT in mainland China in 2010, next to Robin Li.

Which application in the world has the most registered users? Facebook, YouTube, Twitter, Google Plus? No! I wasn't surprised to learn that in September 2011 Tencent QQ had more than 710 million registered users! That is nearly the population of Europe and more than twice the population of the US. Even though it is a copy, I would say it's a very successful copy. In China, you can safely say you don't know who the president of the US is, but it would be a shame to tell somebody you don't have a QQ account.


Blogging is no longer dominated by one or two companies in Europe or the U.S. After the downfall of MySpace and MSN Spaces, blog applications have largely been replaced by, or further integrated with, social networks. In China, the situation is quite similar. The major portal websites all have their own blog services, such as Sina Blog (新浪博客) and Tencent QZone (QQ空间).

Micro Blog

Twitter makes information spread quicker than you can ever imagine, and that's why Twitter is strictly blocked in China. However, entrepreneurs will never be stopped as long as there is potential to make money. In China, there are several counterparts of Twitter, and they are called Weibo (微博). The two major ones are Sina Weibo (新浪微博) and Tencent Weibo (腾讯微博). Of course, they are monitored and controlled by the government, as is Baidu, which I mentioned earlier. Any "sensitive words" will be blocked from searching or publishing. But as a Chinese person, I can proudly say that 140 Chinese characters are far more expressive than 140 characters of English :)

I have just checked the follower counts (27th of August 2012): the most popular person on Twitter is Lady Gaga, with around 28 million followers (I don't follow her). Meanwhile, the most popular person on Weibo is Yao Chen, a Chinese actress, who has around 23 million followers, and nearly all of them are Chinese! Yao Chen even has more followers than Britney Spears and Barack Obama.


Everybody is impressed by Jimmy Wales and his Wikipedia. Wikipedia is now the de facto knowledge hub, at least in English. You can probably find any fact you want to know on Wikipedia. However, Wikipedia is blocked in China, as it contains a lot of anti-government content. Well, I don't care whether the content is anti-government or not, but we should find a way to manage our own knowledge, right? So our counterpart services are Baidu Baike (百度百科) and Hudong Baike (互动百科). Baike in Chinese means "encyclopaedia".

Unlike Wikipedia, which is funded by a non-profit organization, Baidu Baike and Hudong Baike are run by for-profit companies, Baidu Inc. and Hudong Inc. respectively. The CEO of Hudong is Pan Haidong, who got his PhD degree from Boston University in the US.

Everybody in the Semantic Web area knows DBpedia. At the ISWC 2011 conference in Bonn, Germany, I met a few guys from Shanghai Jiao Tong University who had started creating a Chinese version of DBpedia (zhishi in Chinese means "knowledge"). It's very impressive work and, hopefully, there will be more progress coming out. Unfortunately, I don't think it will be possible to merge it with DBpedia, as the underlying data is not freely available. Sections 4.1 and 4.2 of the Hudong Baike terms and conditions clearly state that Hudong Baike reserves all rights to anything uploaded to the website.


"Intellectual property" was not a very popular phrase until the trademark war launched by Wei Guan Corp. in China. I have been in the UK for 5 years, and I still cannot find any place to download movies, software or music for free, or even to watch them online for free. The Pirate Bay is an exception. There are some Cloud-based file sharing systems, but unfortunately none of them lives longer than a few months, as they are shut down quickly.

In China, it is a totally different story. Xunlei (迅雷, which means "as fast as thunder") and VeryCD (电驴, which means "electronic donkey") host millions of movies, music records, TV programmes, cartoons, and even pirated software. Anybody with an account can download whatever is there. The only difference between paid and free users is that paid users can download really fast. Xunlei and VeryCD have recently been sued over intellectual property issues, and much of the content is no longer available for download. What's more, they have somehow limited the download service to users in mainland China.

If you have time and know Chinese, take a look around Xunlei and VeryCD; maybe you can find Total Recall, The Expendables 2 and other recent HD movies there.

Integration among these Applications

Just as Facebook, Twitter, YouTube and Google Plus are integrated with each other, the counterpart services in China are integrated too. If you find a good video on Tudou, you can click the "share" button (in Chinese it's called 分享, fenxiang) to share it on Renren or send it via Weibo. When you take a picture with your smartphone, you can put it on Renren or Weibo immediately.


As a summary, I list all the counterpart services:
For "security" reasons, Facebook, YouTube, Twitter and Wikipedia have been totally blocked in mainland China. Google has moved to Hong Kong because it could not reach an agreement with the government on email and sensitive-word filtering. eBay and Amazon are currently not in a good competitive position in China. Considering the 710 million registered users of QQ, MSN Messenger has no advantage. Dear me, only Microsoft survives!


OK. Since we have counterpart services for every kind of Web application, what are the limitations of these services in China? Let's list a few of them.

The first problem is openness. The Chinese Web totally contradicts Tim Berners-Lee's vision of an open Web. From within the internet in China, you cannot watch YouTube or use Twitter. On the other hand, users outside China cannot watch some of the videos on Tudou or Youku, as they can only be viewed by users within mainland China.

Secondly, most of the web applications have accessibility problems. They have ads and pop-up windows everywhere. To make things worse, many applications still only display well in IE! Thanks to Steve Jobs, as more Chinese people buy MacBooks, more application owners will consider making their applications compatible with Safari, Firefox and Google Chrome. I have to say that Facebook, YouTube and Twitter are very accessible for keyboard users and screen readers. But Renren, Kaixin and Tudou are too dazzling for me. I really wonder how a blind user could navigate through the screen.

Another problem is security. Many applications, such as Tencent QQ, have been accused of illegally collecting users' data. To use Alipay, you need to install a browser plugin. I've never heard of PayPal requiring a browser plugin.


You see, Chinese people are living very well inside the silo. We have millions of users and a huge market, and many entrepreneurs have devoted their passion to this market for decades. The Web ecosystem has started to grow in China, even though it is monitored by the government. It's very good to know that many IT entrepreneurs have studied and worked in the US or other developed countries. They bring back advanced ideas and start their own careers in China. Chinesizing these ideas is good at first, but we should sense the danger, because a market without creativity will not live long, and neither will the Web in China. Are there any Web applications that originated in China and have been widely adopted around the world? I don't see any, to be honest.

In my opinion, nothing can develop well without openness, which is the original essence of the Web and the Internet. Chinese people have been well taught by the proverb "behind an able man, there are always other able men". So "behind an isolated Web, there is always a better Web": one day we will finally realise that there is a bigger Web outside the silo.

Monday, 12 March 2012

Napoleon's speech to Java Code

I suddenly found something interesting here:

Please don't blame me for most of the characters being Chinese. What caught my attention is the Java version of Napoleon's speech:

"My enemies are many, my equals are none. In the shade of olive trees, they said Italy could never be conquered. In the land of pharaohs and kings, they said Egypt could never be humbled. In the realm of forest and snow, they said Russia could never be tamed. Now they say nothing. They fear me, like a force of nature, a dealer in thunder and death. I say I am Napoleon, I am emperor…… Burn it"

So the Java version is:

import java.util.HashSet;

public class Napoleon {

    private HashSet<Enemy> enemies = new HashSet<Enemy>();

    // cdpm = Chrysanthemum Damage Per Minute
    public int cdpm = 100000;

    Napoleon() {
        enemies.add(new Enemy("Italy", 100));
        enemies.add(new Enemy("Egypt", 100));
        enemies.add(new Enemy("Russia", 100));
    }

    private void speak() {

        // "My enemies are many, my equals are none."
        System.out.println("My enemy number: " + enemies.size());
        HashSet<Enemy> trueEnemies = new HashSet<Enemy>();
        for (Enemy e : enemies) {
            if (e.cdpm > cdpm)
                trueEnemies.add(e);
        }
        System.out.println("The number of enemies who can beat me: " + trueEnemies.size());

        // Each enemy shouts, and keeps shouting only while it can still beat Napoleon.
        for (Enemy e : enemies) {
            do {
                e.shout();
            } while (e.canBeatNapoleon(this));
        }

        // "Now they say nothing."
        for (Enemy e : enemies) {
            e.shout();
        }

        cdpm = 10000000;
        System.out.println("I am Napoleon cdpm " + cdpm);
        System.out.println("Dispose enemy list…");
        // Assumed: the statement between the two messages was lost; clearing the set fits them.
        enemies.clear();
        System.out.println("Enemy list disposed!");
    }

    public static void main(String[] args) {
        (new Napoleon()).speak();
    }
}

class Enemy {

    private String name;

    public int cdpm;

    Enemy(String name, int cdpm) {
        this.name = name;
        this.cdpm = cdpm;
    }

    public void shout() {
        if (cdpm > 0)
            System.out.println(name + " says: we are invincible!!!");
        else
            System.out.println(name + " says: ……");
    }

    // A weaker enemy is defeated on the spot: its cdpm drops below zero.
    public boolean canBeatNapoleon(Napoleon n) {

        if (cdpm < n.cdpm) {
            cdpm = -1;
            return false;
        }
        return true;
    }
}

And the output is:
 My enemy number: 3
 The number of enemies who can beat me: 0
 Egypt says: we are invincible!!!
 Italy says: we are invincible!!!
 Russia says: we are invincible!!!
 Egypt says: ……
 Italy says: ……
 Russia says: ……
 I am Napoleon cdpm 10000000
 Dispose enemy list…
 Enemy list disposed! 
Well, does anyone want to translate some other English speech or poem into Java or JavaScript? Maybe we can try Martin Luther King's "I Have a Dream"?

Friday, 12 August 2011

HTML5 + Linked Data + Multimedia + TV experience = An HTML5 Leanback TV webapp from

This post really caught my attention:

An HTML5 Leanback TV webapp that brings SPARQL to your living room |

"When you are sat on the sofa at the end of the day relaxing and watching TV, maybe eating food and not in the mood to have to keep constantly making decisions about what to watch you might not think that you are in a situation where Linked Data and SPARQL queries could be useful. Yet the flexibility of the data that can be obtained from data sources supporting these technologies makes them ideal candidates to power a Leanback TV experience."

"...By taking an existing template and an existing, very flexible, source of data we can create a whole new way for people to discover content on offer"

Well, people keep asking: "Where is linked data? How can I feel it?" Here is a good example. You could say the same thing can be done with Web 2.0 mashups. Yes, it can. But what distinguishes this example from the usual Web 2.0 approach is that the data comes from the Open University's SPARQL endpoint. Web 2.0 enables you to publish your data using Web technologies such as RESTful Web services or Ajax. However, applications still don't understand each other; only experienced developers can "understand" the data. In Web 3.0, you publish your data not only with common Web technologies but also with Semantic Web technologies. You represent your data in RDF, publish it through a SPARQL endpoint, use 3XX redirects and content negotiation to dereference your RDF data, and so on. People with creative thinking can then build more powerful applications on top of it. That's linked data!
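To make the SPARQL endpoint and content negotiation part a bit more concrete, here is a minimal Java sketch of how a client might ask an endpoint for data in a format it understands. It only builds the HTTP request and does not send it. The endpoint URL (http://example.org/sparql), the class name SparqlRequestSketch and the sample query are my own placeholders, not the Open University's actual service. The "query" parameter and the Accept media type follow the SPARQL 1.1 Protocol, and java.net.http needs Java 11 or later.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class SparqlRequestSketch {

    // Hypothetical endpoint, used only for illustration.
    static final String ENDPOINT = "http://example.org/sparql";

    // Builds a GET request in the style of the SPARQL 1.1 Protocol:
    // the query travels in the "query" URL parameter, and the Accept
    // header names the result format we want (content negotiation).
    static HttpRequest buildRequest(String endpoint, String sparql, String accept) {
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(endpoint + "?query=" + encoded))
                .header("Accept", accept)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // A generic sample query: any ten triples from the dataset.
        String sparql = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10";
        HttpRequest request =
                buildRequest(ENDPOINT, sparql, "application/sparql-results+json");

        System.out.println(request.method() + " " + request.uri());
        System.out.println("Accept: "
                + request.headers().firstValue("Accept").orElse(""));
        // Sending is left out on purpose; with java.net.http it would go
        // through HttpClient.newHttpClient().send(request, ...).
    }
}
```

The point of the Accept header is that the same URL can serve different representations: a Leanback-style webapp might ask for JSON results, while another client could ask for XML, without either side needing a custom API.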

It seems to me that the roadmap described by Tim Berners-Lee is, step by step, coming true. You can never imagine all the ways linked data can be used. You can't! Just because people are so creative.