Archive for August, 2013

Wajam Wins Gold For Innovator of the Year at 2013 International Business Awards and three more awards

August 14th, 2013 | by alainwong

posted in Awards and Press Mentions, Company, Press

International Business Awards 2013 Gold Winner Martin-Luc Archambault

We’re happy to announce that Wajam won four awards at the 2013 International Business Awards.

Wajam CEO Martin-Luc Archambault won two Gold Stevies in the categories Innovator of the Year and Executive of the Year (Internet/New Media).

In addition, Wajam Shopping and Wajam Mobile were both recognized. Wajam Shopping won Silver for Best Shopping App, and Wajam Mobile won Silver for Best New Product of the Year in the Software category.

The 2013 International Stevie Award winners will be celebrated during a gala banquet in Barcelona, Spain on Monday, 14 October.

International Business Awards 2013 Wajam Mobile Silver Winner

The Wajam Mobile team wins Silver for Best New Product of the Year

International Business Awards 2013 Silver Winner for Best Shopping App

The Wajam Shopping team wins Silver for Best Shopping App

About the Stevie Awards
Stevie Awards are conferred in four programs: The International Business Awards, The American Business Awards, the Stevie Awards for Women in Business, and the Stevie Awards for Sales & Customer Service. Honoring organizations of all types and sizes and the people behind them, the Stevies recognize outstanding performances in the workplace worldwide. Learn more about The Stevie Awards at www.StevieAwards.com.

Wajam Softball in the Park

August 12th, 2013 | by alainwong

posted in Company, Culture, Events, Montreal, Team Activities

Last year, we won a lopsided softball game against fellow Montreal startup PasswordBox.

Since tripling the size of the team this year, we decided to spark some friendly internal competition by battling it out in a three-way tournament: Wajam vs. Wajam vs. Wajam. Needless to say, Wajam won :)

Check out some of the memorable moments at Jeanne-Mance Park.

The Highly Scalable Architecture behind the Wajam Social Search Engine

August 6th, 2013 | by alainwong

posted in Big Data, Engineering

At Wajam, we’re helping you find recommendations from friends you trust, whenever you need them. In order to do that, we constantly innovate with new technologies. Our technical blog posts are meant to share with you what we’ve learned along our high-tech adventures.

Graffiti artwork from the Wajam office

BACKGROUND

Wajam is a social search engine that gives you access to the knowledge of your friends. We gather your friends’ recommendations from Facebook, Twitter and other social platforms and serve these back to you on supported sites like Google, eBay, TripAdvisor and Wikipedia.

To do this, we aggregate, analyze and index relevant pieces of shared information on our users’ social networks. The main challenge stems from the amount of information we collect as well as the speed at which we must display search results back to users. Our goal is to serve requests in <200ms to remain competitive.

EXPLORING SEARCH ENGINE SOLUTIONS

In the beginning, Wajam searches ran on MySQL, and soon after on a Sphinx cluster. Sphinx was chosen at the time because some of Wajam's first developers had experience with it.

However, Sphinx was not designed to be distributed or real-time, so the team had to build a large framework around it to fill these gaps. With time, the solution began to show its age: we ran into low-level and painful cluster management, manual resharding, slow geolocated searches, and high CPU usage.

At the breaking point, we faced a crossroads: either maintain the current solution by building an even bigger framework around it, or move to a modern replacement.

EVALUATION

After examining our options, which included Solr, we decided to give Elasticsearch a try.

Elasticsearch is built around Apache Lucene, a robust and widely used search engine with a large developer community behind it. Elasticsearch treats scalability and real-time indexing as prime concerns, two of the main reasons we wanted to move away from Sphinx. In addition, Elasticsearch uses a technique called geohashing for geo-located queries, which offers a significant performance boost for those kinds of queries.
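
To give a sense of what a geo-located search looks like, here is a minimal sketch using the 0.90-era filtered query DSL and the official Python client; the index name, field names and coordinates are illustrative assumptions, not our production schema.

```python
# Illustrative sketch only: "social_items" and its fields are hypothetical.
from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])

# ES 0.90-era DSL: a full-text match combined with a geo_distance filter
# that restricts hits to a radius around Montreal.
query = {
    "query": {
        "filtered": {
            "query": {"match": {"text": "best poutine"}},
            "filter": {
                "geo_distance": {
                    "distance": "10km",
                    "location": {"lat": 45.50, "lon": -73.57},
                }
            },
        }
    }
}

results = es.search(index="social_items", body=query)
print(results["hits"]["total"])
```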

Another important point is that Elasticsearch is written in Java while Sphinx is written in procedural C++. The advantage of well-maintained, open-source Java code is more than just stability: it also means we could code features ourselves and give back to the project.

Here is a comparison table between Elasticsearch and Sphinx highlighting the key points according to our needs.

IMPLEMENTATION

At the time of our evaluation in fall 2012, Elasticsearch seemed to be gaining traction quite rapidly and came out as an interesting alternative that could potentially fill all of our needs.

We were impressed by preliminary benchmarks using our data. At smaller scale, Elasticsearch outperformed Sphinx in most of our use cases. We then decided to fully commit to the transition.

GOALS

  • Scalability: Horizontal scalability is one of our prime concerns, as the quantity of data generated by Wajam is growing exponentially, as is our number of users.
  • Performance: We want to improve performance on every query made to the Search API. The initial goal was to bring search latency under 200 ms (median), including query construction, query execution and parsing.
  • Flexibility: Data is often adapted to new features that we build, so we want a flexible data schema to suit our rapid development cycle.

THE PLAN

Our initial idea was to develop the first iteration of the new infrastructure as a direct replacement for Sphinx. That is, keep things simple and use a single index to hold each and every document. As explained later under pitfalls and lessons learned, this turned out to be a flawed plan.

We already had a defined API, so we built an Elasticsearch search client as a replacement for the Sphinx one. Following that, we defined the mapping based on what we had in Sphinx and made adjustments to take advantage of the new Elasticsearch features.
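
As a rough sketch of what defining a mapping can look like (the index name, document type and fields below are assumptions for illustration, not our production mapping):

```python
# Hypothetical mapping sketch; real field names, analyzers and settings differ.
from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])

es.indices.create(
    index="social_items",
    body={
        "settings": {"number_of_shards": 5, "number_of_replicas": 1},
        "mappings": {
            "item": {  # ES 0.90-era document type
                "properties": {
                    "user_id":   {"type": "long"},
                    "source":    {"type": "string", "index": "not_analyzed"},
                    "text":      {"type": "string", "analyzer": "standard"},
                    "location":  {"type": "geo_point"},
                    "shared_at": {"type": "date"},
                }
            }
        },
    },
)
```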

Real-time data comes from jobs that fetch data from the various social network APIs. The same data is backed up on HDFS. We then used a Pig UDF (Wonderdog by Infochimps) to bulk load the data into Elasticsearch.
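
The actual load runs from Pig via Wonderdog; purely as an illustration of the same bulk-indexing idea in the Python client (a stand-in, not what runs at Wajam), it looks roughly like this:

```python
# Sketch of bulk indexing; in production this is done by Wonderdog from Pig.
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["localhost:9200"])

def actions(docs):
    # Each action targets the hypothetical "social_items" index / "item" type.
    for doc in docs:
        yield {
            "_index": "social_items",
            "_type": "item",
            "_id": doc["id"],
            "_source": doc,
        }

docs = [{"id": 1, "user_id": 42, "text": "Great ramen spot downtown", "source": "twitter"}]
helpers.bulk(es, actions(docs))
```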

The Backend API and the Jobs were converted to interact with Elasticsearch.

During the transition period, we were pushing data in parallel to Elasticsearch and Sphinx at the same time. To complete the migration, we progressively changed a configuration flag in the Backend to query Elasticsearch instead of Sphinx.

PITFALLS AND LESSONS LEARNED

Elasticsearch is not Sphinx. Trying to replicate an architecture from one system to another for the sake of simplicity can be a bad idea. It might work in some cases, but in ours, it almost failed. Putting tens of billions of documents in a single index goes against the Elasticsearch philosophy and recommendations. We did manage to make it work, but only without replicas and with flaky performance. In this situation, each query hits hundreds of nodes in the cluster. Theoretically this uses the cluster at its full capacity, but in practice it's not a good idea, because nodes will fail, slow down or even disappear. Losing one node means losing indexing capability and degraded search. Don't do it.

Feature flag. Since we were transitioning from a legacy system to a new one, we had the opportunity to build the new system in parallel and switch over progressively via configuration. Since we had no previous experience with Elasticsearch and no idea how it would react to our traffic, simply cutting over to the new architecture would have been risky. So we built the system with a configuration flag to switch from one search engine to the other. This way we could iterate until we were able to support all the traffic, or roll back at our convenience. This saved us more than once.
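
A minimal sketch of what such a flag can look like (the client objects and config key are hypothetical, not our Backend code): writes go to both engines, reads go to whichever backend the flag selects, so a rollback is just a config change.

```python
# Hypothetical sketch of the dual-write / flagged-read pattern.
class SearchBackend:
    def __init__(self, sphinx_client, es_client, config):
        self.sphinx = sphinx_client
        self.es = es_client
        self.config = config  # e.g. a dict reloaded from a config source

    def index(self, doc):
        # During the transition, every document is pushed to both engines.
        self.sphinx.index(doc)
        self.es.index(doc)

    def search(self, query):
        # The flag decides which engine serves reads; flipping it back is the rollback.
        if self.config.get("search_engine") == "elasticsearch":
            return self.es.search(query)
        return self.sphinx.search(query)
```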

Monitor and track everything. Especially with systems like information retrieval, it's hard to anticipate the impact of every change you make. Having metrics for almost everything helped us measure the impact of the iterative changes we made to our infrastructure. For this we use Metrics, Graphite and a nice Perl script to visualize key performance indicators.
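
For context, Graphite's Carbon daemon accepts plaintext metrics over TCP; here's a minimal sketch of reporting one search-latency sample to it (the host, port and metric path are assumptions for illustration):

```python
# Minimal sketch: push one timing metric to Graphite's Carbon plaintext listener.
import socket
import time

def report_metric(path, value, host="graphite.internal", port=2003):
    # Carbon's plaintext protocol: "<metric.path> <value> <unix_timestamp>\n"
    line = "%s %f %d\n" % (path, value, int(time.time()))
    sock = socket.create_connection((host, port))
    try:
        sock.sendall(line.encode("ascii"))
    finally:
        sock.close()

# e.g. record a 150 ms median search latency sample
report_metric("wajam.search.latency_ms", 150.0)
```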

Design queries (filters) that are cacheable. The cache in Elasticsearch is awesome. Elasticsearch builds the filter bitsets in memory, and continually adds new segments to each bitset as they become available. From then on, query filtering is fulfilled by the cached bitsets instead of reading the segments every time, delivering better performance as a result.
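
As an illustration, here is what an explicitly cacheable filter can look like in the 0.90-era query DSL (field names are hypothetical): the _cache hint asks Elasticsearch to keep the filter's bitset in memory for reuse across queries.

```python
# Sketch of a filtered query whose filter is marked cacheable (ES 0.90-era DSL).
from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])

query = {
    "query": {
        "filtered": {
            "query": {"match": {"text": "hotel recommendations"}},
            "filter": {
                # A terms filter over a low-cardinality field caches well;
                # "_cache": True asks ES to keep the resulting bitset in memory.
                "terms": {"source": ["facebook", "twitter"], "_cache": True},
            },
        }
    }
}

results = es.search(index="social_items", body=query)
```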

SUCCESS AND RESULTS

Since then, we have iterated many times on the architecture, and we are now running a fully replicated Elasticsearch cluster serving all of our users' searches. We've gone through multiple implementations, and are currently using fixed-size monthly bucket indices, which give us predictability over cluster behavior and scalability since we keep index shards equal-sized. This way, we can easily model our needs according to our growth rate and add nodes when needed. Moreover, having multiple indices grouped and identified with aliases allows us to be more flexible with the queries we run.
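
A rough sketch of the monthly-bucket idea (the index and alias names are illustrative): each month gets its own fixed-size index, and an alias groups the buckets so queries don't need to know about individual indices.

```python
# Illustrative sketch of monthly bucket indices grouped behind a search alias.
from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])

# One index per month, created with a fixed shard count so shards stay equal-sized.
es.indices.create(index="items-2013-08", body={"settings": {"number_of_shards": 5}})

# Point the search alias at the new bucket alongside the existing ones.
es.indices.update_aliases(body={
    "actions": [
        {"add": {"index": "items-2013-08", "alias": "items-search"}},
    ]
})

# Queries go through the alias and transparently span all monthly buckets.
es.search(index="items-search", body={"query": {"match_all": {}}})
```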

During the whole process we have been monitoring the evolution with a large number of performance metrics. Below are some of the latest graphs showing the system response time and indexing rate that we have been able to achieve with the current architecture.

Median search query response time over time

Document indexing over time

Since switching over to Elasticsearch, we’ve enjoyed a 10x improvement in performance compared to Sphinx with a median response time of 150 ms while serving more documents than ever.

WHAT THIS MEANS FOR USERS

At Wajam, we’ve always been focused on giving users the best social search experience, and now with Elasticsearch, we’re able to deliver on this promise more quickly and more reliably. You can get recommendations from friends you trust, across all of the sites that we support, both on desktop and on mobile devices.

Although it was a challenging transition to migrate our data to the new Elasticsearch architecture, we now have a solid foundation from which to scale. Stay tuned for more exciting new features coming soon!

Elasticsearch is launching 0.90.3 today. Check it out here.

I would like to personally thank Shay Banon, the Elasticsearch consulting staff, and everyone on #elasticsearch on Freenode who helped me with this project. I am often online on that channel as jgagnon. You can also reach me at @jeromegagnon1.

Apache Lucene and Solr are trademarks of the Apache Software Foundation.

[VIDEO] Graffiti: How To Add Style To Your Startup Office

August 2nd, 2013 | by alainwong

posted in Company, Culture, Montreal, Startups

This past week, we’ve been adding style to the new Wajam office in Montreal. Check out this unique collaboration with local graffiti artists Axe, Alex Scaner & Earth Crusher.

It’s been great to add some color and life to our workplace.