ASP.NET Google Search Query

You want to extract search terms from Google, Bing and Yahoo search referrer strings. Collect these referrer URIs in ASP.NET and use IndexOf on strings to determine the search terms. Use this for viewing queries in real time as users visit your site.

This C# tutorial shows one way you can collect Google and Bing search engine referrers. It requires ASP.NET.

Collect Google search query

First, we look at how you can collect the referrer strings in your ASP.NET site as it runs. The first part of the solution has a static class and method you can call from Page_Load or OnLoad in your ASP.NET pages to collect the list of referrers.

Program that collects referrers (ASP.NET and C#)

using System;
using System.Collections.Generic;
using System.Web;

public static class ReferrerList
{
    // Guards inserts, because ASP.NET serves requests on multiple threads.
    static readonly object _sync = new object();

    /// <summary>
    /// List of referrer strings, newest first.
    /// </summary>
    public static List<string> Referrers { get; set; }

    /// <summary>
    /// Add a referrer string from current request.
    /// </summary>
    public static void AddReferrer()
    {
        // UrlReferrer is null when the request carries no referrer.
        Uri referrer = HttpContext.Current.Request.UrlReferrer;
        if (referrer != null)
        {
            string original = referrer.OriginalString.ToLowerInvariant();
            lock (_sync)
            {
                Referrers.Insert(0, original);
            }
        }
    }

    static ReferrerList()
    {
        Referrers = new List<string>();
    }
}

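Description. The referrer list is stored in the static property Referrers, and AddReferrer() inserts each new referrer at the front, so the newest entry comes first. Because the class is static, it can be used from any page in your ASP.NET project. Request.UrlReferrer is null when a request has no referrer, so the null check is required to avoid a NullReferenceException. The list lives as long as the application does and grows with every referred request, so trim it periodically on a busy site.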
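
As a rough illustration of the call site described above, a Web Forms code-behind page might look like the following. This is only a sketch: the page class name ExamplePage is hypothetical, and it assumes the ReferrerList class from this article is compiled into the same project.

```csharp
using System;
using System.Web.UI;

// Hypothetical code-behind class; substitute the name of your own page.
public partial class ExamplePage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Record the referrer (if any) for the current request.
        ReferrerList.AddReferrer();
    }
}
```

If many pages need this, placing the call in a shared base page class would avoid repeating it in every Page_Load.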

Parse Google search query

Here we look at a method you can use on the OriginalString to parse the search queries. Google dominates the search business, so Google search queries are the most important. The method shown here is fairly robust, and I have tested it on many different referrals, but it could be enhanced with better logic.

GetQuery method implementation [C#]

/// <summary>
/// Get Google search query terms.
/// </summary>
static string GetQuery(string u)
{
    // 1. Try to match start of query with "&q=". These matches are ideal.
    int start = u.IndexOf("&q=", StringComparison.Ordinal);
    int length = 3;

    // 2. Try to match part with q=. This may be prefixed by another letter.
    if (start == -1)
    {
        start = u.IndexOf("q=", StringComparison.Ordinal);
        length = 2;
    }

    // 3. Try to match start of query with "p=".
    if (start == -1)
    {
        start = u.IndexOf("p=", StringComparison.Ordinal);
        length = 2;
    }

    // 4. Return if not possible.
    if (start == -1)
    {
        return string.Empty;
    }

    // 5. Advance N characters, past the parameter name.
    start += length;

    // 6. Find first & after that.
    int end = u.IndexOf('&', start);

    // 7. Use end of string if no & was found.
    if (end == -1)
    {
        end = u.Length;
    }

    // 8. Get substring between two parameters.
    string sub = u.Substring(start, end - start);

    // 9. Get the decoded URL.
    string result = HttpUtility.UrlDecode(sub);

    // 10. Get the HTML representation.
    result = HttpUtility.HtmlEncode(result);

    // 11. Prepend sitesearch label to output.
    if (u.IndexOf("sitesearch", StringComparison.Ordinal) != -1)
    {
        result = "sitesearch: " + result;
    }
    return result;
}

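Overview. This code tries to locate the query parameter in the referrer string. The string "&q=" is usually the best match in referrer URLs. The bare "q=" and "p=" fallbacks are looser, because they can also match the end of a longer parameter name. StringComparison.Ordinal is specified so that the search compares raw characters, which is faster and is not affected by culture rules. Next, HttpUtility.UrlDecode converts percent-escapes and plus signs back into the original characters, and HttpUtility.HtmlEncode escapes the result so it can be written into an HTML page. Finally, referrers that contain "sitesearch" get a "sitesearch: " label prepended to the output.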
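
For comparison, one possible enhancement is to let the framework split the query string and then look up the parameter by its exact name. The sketch below is not the method from this article. It uses Uri.TryCreate and HttpUtility.ParseQueryString, which are real framework calls, but the class name QueryParserSketch, the method name GetQueryParam and the example.com referrer are made up for illustration.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

public static class QueryParserSketch
{
    // Hypothetical helper: returns the HTML-encoded "q" (or "p") value,
    // or an empty string when the referrer has neither parameter.
    public static string GetQueryParam(string referrer)
    {
        Uri uri;
        if (!Uri.TryCreate(referrer, UriKind.Absolute, out uri))
        {
            return string.Empty;
        }

        // ParseQueryString splits on '&' and '=' and URL-decodes each value.
        NameValueCollection pairs = HttpUtility.ParseQueryString(uri.Query);

        // Exact key lookup, so a parameter such as "aq" is never mistaken for "q".
        string value = pairs["q"] ?? pairs["p"];
        return value == null ? string.Empty : HttpUtility.HtmlEncode(value);
    }

    public static void Main()
    {
        // Made-up referrer, not taken from real traffic.
        string referrer = "http://www.example.com/search?hl=en&q=c%23+dictionary&aq=f";
        Console.WriteLine(GetQueryParam(referrer)); // prints "c# dictionary"
    }
}
```

The trade-off is that this approach allocates a Uri and a collection per referrer, so it is likely slower than the IndexOf scan, and it returns nothing for referrers that are not absolute URIs.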

Test framework

This project requires unit testing or other quality control to handle all required inputs. You may have other kinds of queries, which you can add to the array of strings we test. The next program is a Windows console program that runs through a list of URIs.

Code to test the query string method

//
// Some example referrer strings.
//
string[] urls = new string[]
{
    // Raw URLs omitted.
};

foreach (string url in urls)
{
    string urlEscaped = GetQuery(url);
    Console.WriteLine(urlEscaped);
}
Console.ReadLine();

Possible output of the code

c# dictionary
oo pattern authentication before method call
c# empty string
appendtext in c#
asp.net custom sitemap node
msdn c# read folder and file from directory
page.title with a site map
asp.net urlmappings
asp.net custom sitemap node
oo pattern authentication before method call
c# string split
c# split
slashdot.jp
method not found exception in c#

Results from the above code. This is what the program prints for the test array, which helps confirm that the parser is working correctly. If you have other URIs you need to test, add them to the array and run the program.

Revisions

The first version of this document contained some problems when it was published. The current version, published on April 15, 2009, has a much-enhanced parser that uses IndexOf. The new code is much faster and more reliable. For brevity, some parts of the previous article were removed.

Possible error. A developer who installed this method wrote in to report a bug. Apparently, HttpUtility.HtmlEncode can cause a security problem in some cases where non-ASCII characters are received. One way to alleviate this would be to add a step to the method that removes problematic characters. I apologize for any unexpected problems this code may cause.
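
The report does not say which characters are problematic, so the following is only one guess at what such a removal step could look like: it keeps printable ASCII and drops everything else before the value is encoded. The class name QuerySanitizer and the method name KeepPrintableAscii are hypothetical.

```csharp
using System;
using System.Text;

public static class QuerySanitizer
{
    // Hypothetical filter: keeps characters from space (0x20) through tilde (0x7E).
    public static string KeepPrintableAscii(string value)
    {
        var builder = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            if (c >= ' ' && c <= '~')
            {
                builder.Append(c);
            }
        }
        return builder.ToString();
    }

    public static void Main()
    {
        // The accented character is dropped; the rest is kept.
        Console.WriteLine(KeepPrintableAscii("caf\u00e9 c#")); // prints "caf c#"
    }
}
```

A filter like this also discards legitimate queries written in non-Latin scripts, so treat it as a blunt stopgap rather than a complete fix.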

Summary

We saw how you can quickly and effectively parse Google search queries and display the search terms in HTML. This is an alternative to or adjunct to Google Analytics and can give you real-time referral diagnostics for your site. You could make interesting widgets for your site with this technique.
