Friday, March 5, 2010

Google Search

Here are some search syntax basics and advanced tricks for Google.com. You might know most of these, but if you spot a new one, it may come in handy in future searches.

A phrase search can be written with quotation marks, ["like this"], or with a hyphen between the words, [like-this].
Google didn’t always understand certain special characters like [#], but now it does; a search for [C#], for example, yields meaningful results (a few years ago, it didn’t). This doesn’t mean you can use just any character; e.g. entering [t.], [t-] and [t^] will always return the same results.
Google allows 32 words within the search query (some years ago, only the first 10 were used, and Google ignored subsequent words). You will rarely need so many words in a single query – [just thinking of such a long query is a hard thing to do, as this query with twenty words shows] – however, it can come in handy for advanced searching, especially as a developer using the Google API.
You can find synonyms of words. E.g. when you search for [house] but you want to find “home” too, search for [~house]. To find out which synonyms Google stores for an individual word, simply use the minus operator to exclude synonym after synonym (they will always show as bold in the SERPs, the search engine result pages). Like this: [~house -house -home -housing -floor].
To see a really large page-count (possibly, the Google index size, though one can only speculate about that), search for [* *].
Google has a lesser-known “numrange” operator which can be helpful. Using e.g. [2000..2005] (that’s two dots in between two numbers) will find 2000, 2001, 2002 and so on until 2005.
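If you build queries in a script, the operators above are easy to assemble as plain strings. Here is a minimal Python sketch; the function names are made up for illustration, the 32-word limit is simply the figure quoted above, and counting words by splitting on whitespace is only an assumption about how Google counts them.

```python
from typing import Iterable

# Word limit quoted in the tip above; treat it as an assumption, not a guarantee.
MAX_QUERY_WORDS = 32


def count_query_words(query: str) -> int:
    """Count whitespace-separated terms (a rough stand-in for Google's own counting)."""
    return len(query.split())


def fits_word_limit(query: str, limit: int = MAX_QUERY_WORDS) -> bool:
    """Return True if the query has no more terms than the limit."""
    return count_query_words(query) <= limit


def synonym_probe(word: str, already_seen: Iterable[str] = ()) -> str:
    """Build a query such as [~house -house -home] to surface further synonyms."""
    exclusions = [f"-{term}" for term in already_seen]
    return " ".join([f"~{word}", *exclusions])


def numrange(low: int, high: int) -> str:
    """Build a numrange term such as [2000..2005] (two dots between two numbers)."""
    if low > high:
        raise ValueError("low must not exceed high")
    return f"{low}..{high}"


if __name__ == "__main__":
    print(synonym_probe("house", ["house", "home", "housing", "floor"]))
    print(numrange(2000, 2005))
```

Each time a new bold synonym shows up in the results, you would add it to the exclusion list and run the probe again.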

Google’s define-operator allows you to look up word definitions. For example, [define:css] yields “Short for Cascading Style Sheets” and many more explanations. You can trigger a somewhat “softer” version of the define-operator by entering “what is something”, e.g. [what is css].
Google has some exciting back-end AI to allow you to find just the facts upon entering simple questions or phrases like [when was Einstein born?] or [einstein birthday] (the answer to both of these queries is “Albert Einstein – Date of Birth: 14 March 1879”). This feature was introduced April this year and is called Google Q&A. (See some of the various working Q&A sample queries to get a feeling for what’s possible.)
Google allows you to find backlinks by using the link-operator, e.g. [link:blog.outer-court.com] for this blog. The new Google Blog Search supports this operator as well. In fact, when Google’s predecessor started out as Larry Page’s “BackRub” in the 1990s, finding backlinks was its only aim! However, not all backlinks are shown in Google today, at least not in web search. (It’s argued that Google does this on purpose to prevent reverse-engineering of its PageRank algorithm.)
Often when you enter a question mark at the end of the query, like when you type [why?], Google will advertise its pay-for-answer service Google Answers.
There is a “sport” called Google Hacking. Basically, curious people try to find insecure sites by entering specific, revealing phrases. A special web site called the Google Hacking Database is dedicated to listing these special queries.
Google searches for all of your words, whether or not you write a “+” before them (I often see people write queries [+like +this], but it’s not necessary). Unless, of course, you use Google’s or-operator. It’s an upper-case [OR] (lower-case won’t work; it simply searches for occurrences of the word “or”), and you can also use parentheses and the “|” character. [Hamlet (pizza | coke)] will find pages containing the word (or being linked to with the word) “Hamlet” and additionally containing at least one of the two other words, “pizza” or “coke”.
Not all Google services support the same syntax. Some services don’t allow everything Google web search allows you to enter (or at least, it won’t have any effect), and sometimes, you can even enter more than in web search (e.g. [insubject:test] in Google Groups). The easiest way to find out about these operators is to simply use the advanced search and then check what ends up being written in the input box.
Sometimes, Google seems to understand “natural language” queries and shows you so-called “onebox” results. This happens for example when you enter [goog], [weather new york, ny], [new york ny] or [war of the worlds] (for this one, movie times, movie ratings and other information will show).
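The or-operator and the prefix operators from this batch (define:, link:) can be put together the same way. The Python sketch below is only an illustration: the helper names are hypothetical, and the search URL with its "q" parameter is an assumption about the public web search page, not a documented API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint and parameter name; verify them before relying on this.
SEARCH_URL = "https://www.google.com/search"


def any_of(*terms: str) -> str:
    """Group alternatives with parentheses and "|", as in [(pizza | coke)]."""
    if not terms:
        raise ValueError("at least one term is required")
    return "(" + " | ".join(terms) + ")"


def with_operator(name: str, value: str) -> str:
    """Build a prefix-operator term such as [define:css] or [link:example.com]."""
    return f"{name}:{value}"


def search_url(query: str) -> str:
    """Percent-encode the query and attach it to the assumed search URL."""
    return f"{SEARCH_URL}?{urlencode({'q': query})}"


def query_from_url(url: str) -> str:
    """Recover the original query string from a URL built by search_url()."""
    return parse_qs(urlsplit(url).query)["q"][0]


if __name__ == "__main__":
    query = "Hamlet " + any_of("pizza", "coke")
    print(query)
    print(search_url(query))
    print(with_operator("define", "css"))
```

Encoding matters here because characters like “|” and the parentheses are not safe to paste into a URL as they are.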

• Not all Googles are the same! Depending on your location, Google will forward you to a different country-specific version of Google with potentially different results to the same query. A search for [site:stormfront.org] from the US will yield hundreds of thousands of results, whereas the same search from Germany (at least if you don’t change the default redirect to Google.de) returns... zilch. Yes, Google does at times agree to country-specific censorship, like in Germany, France (Google web search), or China (Google News).
• Sometimes, Google warns you about its results, especially when they might seem to promote hate sites (of course, only someone who misunderstands how Google works could think Google itself is promoting them). Enter [jew], and you will see a Google-sponsored link titled “Offensive Search Results” leading to this explanation.
• For some search queries, Google uses its own ads to offer jobs. Try entering [work at Google] and take a look at the right-hand advertisement titled e.g. “Work at Google Europe” (it turns out, at the moment, Google Switzerland is hiring).
• For some of the more popular “Googlebombed” results, like when you enter [failure] and the first hit is the biography of George W. Bush, Google displays explanatory ads titled “Why these results?”.
• While Google doesn’t do real Natural Language Processing yet, this is the ultimate goal for them and other search engines. A little What-If Video [WMV] illustrates how this could be useful in the future.
• Some say that whoever turns up first for the search query [president of the internet] is, well, the President of the internet. (I’m applying as well, and you can feel free to support me with this logo.)
• Google doesn’t have “stop words” anymore. Stop words traditionally are words like [the], [or] and similar which search engines tended to ignore. Sometimes, when you enter e.g. [to be or not to be], Google even decides to show some phrase search results in the middle of the page (separated by a line and information that these are phrase search results).
• There once was an easter egg in the Google Calculator that made Google show “42” when you entered [The Answer to Life, the Universe, and Everything]. As I’ve been told in the forum, the easter egg only works in lower case.
• You can use the wildcard operator in phrases. This is helpful for finding song texts – let’s say you forgot a word or two, but you remember the gist, as in ["love you twice as much * oh love * *"] – and similar tasks.
• You can use the wildcard character without searching for anything specific at all, as in this phrase search: ["* * * * * * *"].
• Even though www.googl.com is nothing but a “typosquatter” (someone reserving a domain name containing a popular misspelling) and search queries there return very different results from Google’s, the site is still getting paid by Google – because it uses Google AdSense.
• If you feel like restricting your search to university servers, you can write e.g. [c-tutorial site:.edu] to only search on the “edu” domain (you can also use Google Scholar). This works for country-domains like “de” or “it” as well.
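The wildcard and site: tricks from the list above can also be scripted. This is a small Python sketch with made-up helper names; marking a forgotten word with None is just a convention chosen for this example.

```python
from typing import Optional, Sequence


def wildcard_phrase(parts: Sequence[Optional[str]]) -> str:
    """Build a quoted phrase, putting "*" wherever a word is unknown (None)."""
    words = ["*" if part is None else part for part in parts]
    return '"' + " ".join(words) + '"'


def restrict_to_site(query: str, domain: str) -> str:
    """Append a site: restriction, e.g. ".edu", "de" or a full host name."""
    return f"{query} site:{domain}"


if __name__ == "__main__":
    # Half-remembered song text: known fragments with gaps in between.
    print(wildcard_phrase(["love you twice as much", None, "oh love", None, None]))
    # Nothing specific at all: seven wildcards in a phrase.
    print(wildcard_phrase([None] * 7))
    print(restrict_to_site("c-tutorial", ".edu"))
```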
