Many local search sites boast wide and deep data in their business listings. We decided to take a closer look by analyzing the Food & Dining category in New York City on 20 leading local sites.

We focused on two parameters:

1. The number of listings (e.g. restaurants, coffee shops and bars) available on each site.

2. The total number of business details available for those listings. These include attributes such as description, cuisine, video, reviews, opening hours and wine details. We excluded names, phone numbers and addresses from the count, and when a business had more than one review we counted the reviews as a single attribute.
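The counting rules above can be sketched in a few lines of Python. This is only an illustration of the methodology as described, not Palore's actual system: the field names, the `count_rich_details` and `summarize_site` helpers and the sample listings are all made up for the example.

```python
# Fields excluded from the count, per the methodology above.
EXCLUDED_FIELDS = {"name", "phone", "address"}


def count_rich_details(listing):
    """Count the non-empty attributes of one listing.

    Name, phone and address are ignored. A multi-valued attribute
    (e.g. a list of several reviews) still counts as one attribute.
    """
    return sum(
        1
        for field, value in listing.items()
        if field not in EXCLUDED_FIELDS and value
    )


def summarize_site(listings):
    """Return (number of listings, total rich details) for one site."""
    total_details = sum(count_rich_details(item) for item in listings)
    return len(listings), total_details


# Hypothetical sample data for a single site.
sample_site = [
    {"name": "Cafe A", "phone": "555-0100", "address": "1 Main St",
     "cuisine": "Italian", "reviews": ["Great", "Good"], "video": ""},
    {"name": "Bar B", "phone": "555-0101", "address": "2 Main St",
     "opening_hours": "5pm-2am"},
]

print(summarize_site(sample_site))  # (2, 3)
```

Here Cafe A contributes two details (cuisine, plus its two reviews counted once; the empty video field is skipped) and Bar B contributes one.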

We found great variance between the sites both in the number of listings that they have and in the richness of the data available per listing. For example, New York Magazine has a relatively low number of restaurants but the highest number of rich details per restaurant (8). In the overall analysis, CitySearch comes out as the local site with the most comprehensive data.

[Chart: NYC Restaurants Rich Data Comparison]

The above data is based in part on Palore’s data extraction system which collects, normalizes and merges local business data from millions of Web pages.

Please share your thoughts and let us know what other data comparisons you think would be interesting – in other verticals (e.g. auto dealerships or realtors), on a geographic basis (e.g. data analysis per city overlaid on a US map) or in any other format.

To acquire rich local business data in any vertical, please contact Palore’s sales team.


Palore Trends Coming Up

April 21, 2008

Over the last few months we have kept busy acquiring content from hundreds of sources. We crawled, aggregated, normalized, cleaned and… sold our data feeds in various verticals. Nothing out of the ordinary here. But then something interesting happened. We typically sell our aggregated content as a single dataset without breaking it up by source (our clients like getting a single, unified and normalized feed), but about a month ago a client asked us to leave the content in its raw format. We shrugged our shoulders and did what the client asked for.

The following week we got an excited call from the customer, who said “Guys, did you know that Site X has double the listings of Site Y on the East Coast but very poor data in Southern California?” Our initial response was “Yes, but who cares?” Well, apparently a lot of people do. That got us thinking about sharing this data with the world. Here’s the gist of it:

Palore crawls data from hundreds of local sites, and that gives us a good view of what content is out there. Just as Comscore or Compete have comparative data about unique users per site, we have comparative data about the depth and breadth of each site’s content. We also have a good view of the aggregated data that’s out there across ALL of the local sites. For example, we can tell how many auto dealerships are listed on the top 10 auto sites in each state, or which local site has the most information about sushi restaurants.

In the coming weeks we will share this information. If you’re interested in any specific type of comparative information, let us know!

I recently had a call with a client who asked for a rich content feed covering several top US cities. The feed was in the food & dining space and included cuisine, parking, payment methods, description, reviews, ambiance, menus, health details and a bunch of other business attributes.

During our “sales pitch” Malcolm (our VP of Business Development) and I explained how easy it would be for us to collect, clean up and send the data with quarterly updates. Later that day I had a chat with Nave, our VP of R&D, and we discussed the process involved in getting that data. I won’t bore you with all the details but it was amazing to see how complex it really is.

For example, it turns out that some of the top US websites use different character encodings: one decided to use English, Swedish and Greek encodings all in the same site. Then there’s the issue of dealing with different site mechanics such as Ajax, cookies and other weird navigation methods. Cleaning rogue characters and images out of the content and normalizing its format is a whole new challenge. And even when it’s all good and ready, how do you reconcile discrepancies (e.g. one source claimed that a restaurant was European while another insisted that it’s French…)?
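To make the European-versus-French question concrete, here is one possible way to handle that kind of discrepancy: treat the broader label as redundant when a more specific one is also reported. This is a rough sketch under assumptions, not a description of how Palore actually does it. The `BROADER` hierarchy and both helper functions are invented for illustration.

```python
# Hypothetical hierarchy: specific cuisine -> broader category.
# Illustrative only; a real taxonomy would be much larger.
BROADER = {
    "French": "European",
    "Italian": "European",
    "Sushi": "Japanese",
    "Japanese": "Asian",
}


def ancestors(label):
    """Return every broader category above the given label."""
    found = []
    while label in BROADER:
        label = BROADER[label]
        found.append(label)
    return found


def reconcile(labels):
    """Keep only the most specific labels reported by the sources.

    A label is dropped when another reported label is a more
    specific version of it. Labels that genuinely disagree
    (e.g. French vs. Italian) are all kept for later review.
    """
    unique = list(dict.fromkeys(labels))  # de-duplicate, keep order
    redundant = {parent for label in unique for parent in ancestors(label)}
    return [label for label in unique if label not in redundant]


print(reconcile(["European", "French"]))  # ['French']
print(reconcile(["French", "Italian"]))   # ['French', 'Italian']
```

In this sketch the two sources in the example do not really disagree, since one is simply less specific. True conflicts still need a separate rule, such as trusting the fresher or more reliable source.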

After about 30 minutes of going over daunting buzzwords and technical challenges ranging from finding high quality sources to keeping the data fresh, I recalled our insistence that getting the data is “quite easy for us”. It makes one think of Adam Smith’s concept of Division of Labour and thank God that the sales guy doesn’t have to develop the product and that the R&D guy doesn’t have to sell it. 🙂

Headed to the Bay Area

November 8, 2007

I’m heading off to the Bay Area in a few days.

I’ll be attending the 2007 GeoDomain Expo on November 16-17 in San Francisco, and The Kelsey Group’s ILM:07 Conference on November 28-30 in Los Angeles. I’ll be spending around three weeks in the area meeting with partners, friends and investors.

If you want to meet, shoot me an email at hanan [a t] palore.com

We’ve been having many talks with sites that are interested in integrating Palore’s rich local content. As some of you know, we’ve started providing local search sites and IYP sites with feeds of our content that enrich their sites.  

We’re currently working with two types of sites: local sites that have a variety of information on businesses in specific locations (e.g. Los Angeles, Boston, etc.), and vertical sites that have in-depth information on a specific attribute or characteristic of businesses or establishments nationwide (e.g. vegetarian, wi-fi information, etc.). It’s interesting to see how vertical sites (such as menu sites) want reviews and descriptions (basic info on various locations), while local sites (such as city guides) want menus, wi-fi information and wine details to enrich their content.

Well, you know that there’s an abundance of information out there, and it’s a good feeling to know that we can save our partners scraping efforts and multiple business development deals (see Fred Wilson’s Business Development 2.0 post).


I always like speaking with industry-leading analysts about what we do. Partners and clients will skew things their way, team members and investors tend to be overzealous about the product, but leading analysts and journalists will usually give you the cold truth in the right market context.

In the last few weeks I had several interesting talks and meetings with some of the best Local Search analysts and journalists in the field, including:

What helped us the most was their way of crystallizing the value proposition for site owners. For example, Greg Sterling wrote:

“…On the latter point, imagine that your site is being lost in a sea of generic search results. Now imagine you could add a logo or icon to call out your results from the others. This is branding in the context of search results, which is right now not otherwise possible. It presents a range of interesting opportunities to publishers who have been struggling with how to deal with search engines’ hold on consumer attention. If you’re the New York Times, for example, and you’re just one of dozens of publisher sources that come up when news-related searches are performed, this permits you to add your logo and call users’ attention to your content (vs. others) on Google or Yahoo search results.”

There’s always a risk of being criticized by these folks and hearing things you don’t want to hear. But at the end of the day, it’s important to get an objective opinion from people who know what they’re talking about. Thanks for helping out.