Multimedia search and the problem with metadata

Posted in Uncategorized on Feb 25, 2006

Check this post, it is a pretty funny joke (and an extremely boring 5-hour video; look how dull Eric Schmidt is when making jokes at the beginning, he will certainly not be making a living as a comedian). What I found interesting is that it actually raises the issue of how multimedia content is being indexed by search engines. Google seems to be relying on manual tagging by the people uploading videos, or just on the text around images or videos in the Web sites from which the content is crawled. That seems like a pretty simplistic approach compared to what Riya is trying to do with face recognition or what community sites are doing with tagging. For text content, search is a simpler problem, since you can use the text itself to extract keywords or derive content (this second option being something no search engine does but that knowledge management companies like Autonomy do), but that obviously does not work for images. For video, however, one could envision a speech-to-text process that converts all the dialog in a video to text and uses that to create video metadata. I guess the problem is that speech recognition still does not work well enough to do this with sufficient accuracy (Nuance, formerly ScanSoft, seems to be the leading company here but, as far as I know, they have not applied their technology to this problem).
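To make the speech-to-text idea a bit more concrete, here is a minimal sketch in Python of what could happen once a transcript exists. It assumes some speech recognition step has already produced plain text for each video (that step is not shown, and the file names and transcripts below are made up). The function names `extract_keywords` and `build_index` and the tiny stop-word list are my own illustration, not how any search engine actually does it.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption); a real system would use a much larger one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it",
              "that", "for", "on", "we", "our", "this", "with"}

def extract_keywords(transcript, top_n=5):
    """Return the top_n most frequent non-stop-words in a transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]

def build_index(transcripts, top_n=5):
    """Map each extracted keyword to the set of videos whose transcript produced it."""
    index = defaultdict(set)
    for video_id, text in transcripts.items():
        for keyword in extract_keywords(text, top_n):
            index[keyword].add(video_id)
    return index

# Hypothetical transcripts, standing in for the output of a speech-to-text step.
transcripts = {
    "keynote.mp4": "Search is the core of our business. Search quality drives search usage.",
    "cooking.mp4": "Chop the onions, fry the onions, then add the rice to the onions.",
}

index = build_index(transcripts)
print(sorted(index["search"]))
print(sorted(index["onions"]))
```

Simple word counts like this are only as good as the transcript, so recognition errors would flow straight into the metadata, which is exactly the accuracy problem mentioned above.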

CD




I am the director for Internet and Multimedia for Telefónica R&D, based in Barcelona where I managed their R&D center. I have been a bit all over the place for the last 15 years, especially in Tokyo, my favorite town, and finally came back in mid 2006 to my home town. I like everything that has to do with the Internet, computers, software and gadgets, not just the geeky aspect but also the business side. I also love reading (mainly business essays), TV series and movies, as well as having a good dinner and a night out with my friends.