pagerank

Overview

The visibility of a website refers to the ease with which potential customers can find it among its many competitors. It is a crucial factor in the success of a business website. The goal of enhancing web visibility is not only to attract the highest possible number of visitors but, more importantly, to attract the highest possible number of wanted visitors and direct them to the most appropriate pages on the site.

Search engine optimization (SEO) is the process of making a site more visible to people who use search engines to find the information they need. It offers significant opportunities for growing an e-business by reaching just the right people. The content of a website is the core of SEO. Search engine crawlers consider which words are used, where they are positioned, and whether they match the terms being searched.

In this project, we work with Pure Visibility, a specialized SEO agency, to analyze the multi-dimensional factors related to web visibility. The primary goal is to design an interactive information visualization system that represents the complex data set, reveals trends, patterns, and correlations between each factor and the overall visibility score, and helps SEO analysts understand how web content factors influence visibility, so that they can develop web design strategies to improve the visibility of clients' websites. Keywords and key phrases are the most important factors under study.

Users

There are two types of potential users for this information visualization system.

1. The primary users are the data analysts at Pure Visibility.

    Key Features:
  • Have a solid IT background.
  • Use web resources and applications to collect raw data.
  • Use desktop applications and self-developed tools to perform extensive calculations, and are skilled at using spreadsheets to present analytical results.
  • After quantitative analysis, can understand how each individual factor influences visibility, but have trouble forming a big picture of all factors and their correlations.
  • Submit a page report to the client as the final deliverable presenting the analysis results.
  • In a typical report, the analysis results are displayed in separate tables, from different perspectives. An overall summary might be needed to interpret all the data.
  • No visualization is included in the report.

An interactive information visualization system would help them explore the relationships among the various data, view the data from different perspectives, and discover the overall pattern linking all factors to the final visibility score.

2. Clients are also potential users of the visualization system.

    Key Features:
  • Clients differ in their understanding of web content strategy and of the competitors in the same domain.
  • Most of them do not know the factors that influence web visibility, and cannot interpret the entire quantitative analysis report.
  • They care more about the final solutions and recommendations than about the process of analysis.

Intuitive, printable graphics that support the recommendations are desirable.

Information

The current analysis procedure and the data associated with each step are listed as follows:

1. Use online services to identify the candidate key phrases and major competitors in the same domain as the client site.

2. Clean up key phrases, and get a comprehensive list for further analysis.

3. Get a traffic score for each keyword on the three main search engines. The result looks like:

Keyword    Google       Yahoo!       MSN
KW1        TrafficG1    TrafficY1    TrafficM1
KW2        TrafficG2    TrafficY2    TrafficM2
……         ……           ……           ……
KWi        TrafficGi    TrafficYi    TrafficMi

The traffic score indicates which keywords are most competitive.

4. Cluster the keywords into several categories.

5. Get the ranking of each competitor’s site for each keyword on each of the three search engines. The result looks like:

           Google      Yahoo!      MSN
KW1
  Web1     RankW1G1    RankW1Y1    RankW1M1
  Web2     RankW2G1    RankW2Y1    RankW2M1
  ……       ……          ……          ……
KW2
  Web1     RankW1G2    RankW1Y2    RankW1M2
  Web2     RankW2G2    RankW2Y2    RankW2M2

6. Run a Perl script to get a power score for each site by summing the site's rankings over all the keywords in all the search engines. The result looks like:

Website    Google     Yahoo!     MSN
Web1       PowerG1    PowerY1    PowerM1
Web2       PowerG2    PowerY2    PowerM2
……
Webj       PowerGj    PowerYj    PowerMj

The power score indicates which websites are most competitive.
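The summation in step 6 can be sketched in a few lines. This is a minimal Python illustration, not the agency's Perl script: the sample rankings, the site and keyword names, and the function names are all hypothetical. It assumes that every site has exactly one rank for each keyword on each search engine.

```python
from collections import defaultdict

# Hypothetical sample data: (keyword, engine, site) -> rank position.
rankings = {
    ("kw1", "Google", "web1"): 1, ("kw1", "Yahoo!", "web1"): 3, ("kw1", "MSN", "web1"): 2,
    ("kw2", "Google", "web1"): 2, ("kw2", "Yahoo!", "web1"): 1, ("kw2", "MSN", "web1"): 5,
    ("kw1", "Google", "web2"): 4, ("kw1", "Yahoo!", "web2"): 2, ("kw1", "MSN", "web2"): 1,
    ("kw2", "Google", "web2"): 3, ("kw2", "Yahoo!", "web2"): 6, ("kw2", "MSN", "web2"): 4,
}


def power_by_engine(rankings):
    """Sum each site's rankings over all keywords, separately per engine."""
    totals = defaultdict(lambda: defaultdict(int))
    for (keyword, engine, site), rank in rankings.items():
        totals[site][engine] += rank
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {site: dict(engines) for site, engines in totals.items()}


def power_overall(per_engine):
    """Sum the per-engine totals into a single power score per site."""
    return {site: sum(engines.values()) for site, engines in per_engine.items()}


per_engine = power_by_engine(rankings)
overall = power_overall(per_engine)
```

Keeping the per-engine totals separate mirrors the three-column table above, while the overall total matches the description of summing over all keywords and all search engines. How a site that is not ranked for a keyword should be counted is left open here.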

7. The power score and the traffic score are the most important indicators of a website’s visibility.

Ideally, “the best keyword is of high traffic score but with low competition”, that is, it rarely appears on the strongest competitors’ websites.
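One possible way to make the "high traffic, low competition" criterion concrete is sketched below. The scoring formula (total traffic divided by one plus the number of strong competitors using the keyword) is an assumption made purely for illustration and is not a method described by Pure Visibility; the sample data and all names are hypothetical as well.

```python
# Hypothetical traffic scores per keyword and search engine.
traffic = {
    "kw1": {"Google": 900, "Yahoo!": 300, "MSN": 200},
    "kw2": {"Google": 500, "Yahoo!": 250, "MSN": 150},
    "kw3": {"Google": 100, "Yahoo!": 50, "MSN": 50},
}

# Hypothetical record of which sites use each keyword.
appearances = {
    "kw1": {"web1", "web2", "web3"},
    "kw2": {"web3"},
    "kw3": set(),
}

# Hypothetical set of the strongest competitors (e.g. chosen by power score).
strong_sites = {"web1", "web2"}


def keyword_opportunities(traffic, appearances, strong_sites):
    """Rank keywords so that high traffic and low competition come first."""
    results = []
    for keyword, per_engine in traffic.items():
        total_traffic = sum(per_engine.values())
        # Competition: how many strong competitors use this keyword.
        competition = len(appearances.get(keyword, set()) & strong_sites)
        # Assumed heuristic: discount traffic by the number of strong competitors.
        score = total_traffic / (1 + competition)
        results.append((keyword, total_traffic, competition, score))
    return sorted(results, key=lambda row: row[3], reverse=True)


ranked = keyword_opportunities(traffic, appearances, strong_sites)
```

With this sample data, kw2 ranks ahead of kw1 even though kw1 has more traffic, because two strong competitors already use kw1. Any real weighting between traffic and competition would need to be chosen with the analysts.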

Challenges:

The two key scores are both defined over a space with multiple discrete dimensions (website, keyword, and search engine), and they are related to each other. This poses several interesting challenges for this project, including:

  • how to combine them effectively to present the overall web visibility,
  • how to allow analysts to view part of the data from specific perspectives,
  • how to help users discover, through the visualization, ideal keywords for enhancing visibility,
  • and how to predict the results if some factors are changed.

Other web content factors might also be studied toward the end of the project, such as:

  • The position of keywords (header, menu, body, hidden attributes, source code)
  • The format of web content (Image, JavaScript, Flash)
  • Content in other web files (.css, .js, .xml)