Ember Observer's Code Search

Ember Observer now has a code search. This feature makes it possible to search through all Ember addon codebases for a search term.

Use cases, or Why would you want to search across all Ember addons?

We primarily want code search to help people learn Ember and how to build addons. It's useful to be able to search all of the addons' codebases for particular hooks or methods and see how they are typically used. For example, searching for didInsertElement will let you see how others have used that hook in components.

We also want code search to let the community see how widespread usage of an Ember feature is across addons. For example, for the recent RFC to deprecate Ember.K, we wanted questions like "How many addons are using Ember.K?" and "How heavily do those addons use it?" to be easy to answer. To that end, Ember Observer reports a total addon count and a total usage count with every search. It also allows sorting results by usage count, so it is easy to see which addons have the most usages of the search term.

We made the search results addon-oriented rather than match-oriented by grouping together search results from the same addon. The actual matches can be viewed, but the initial search result UI shows the list of addons containing matches. This makes it easier to see which addons contain the search term, and to find all addons that might need a PR to fix a deprecation or API change.
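As a rough illustration of the addon-oriented grouping and the two totals described above, here is a minimal Python sketch. It is not Ember Observer's actual implementation: the Match record, its fields, and the summarize function are hypothetical names invented for this example, and the sample addon names are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Match:
    """One raw search hit (hypothetical shape, for illustration only)."""
    addon: str
    file: str
    line_number: int


def summarize(matches, sort_by="usages"):
    """Group raw matches by addon and compute the addon and usage totals."""
    usages = defaultdict(int)
    for match in matches:
        usages[match.addon] += 1

    if sort_by == "usages":
        # Most usages first; ties are broken alphabetically by addon name.
        ordered = sorted(usages.items(), key=lambda item: (-item[1], item[0]))
    else:
        # Fall back to plain alphabetical order by addon name.
        ordered = sorted(usages.items())

    return {
        "addon_count": len(usages),           # how many addons contain the term
        "usage_count": sum(usages.values()),  # how many matches in total
        "addons": ordered,                    # list of (addon, usages) pairs
    }


# Made-up sample data: two addons, three matches.
sample = [
    Match("addon-a", "addon/components/x.js", 12),
    Match("addon-b", "addon/mixins/y.js", 3),
    Match("addon-a", "addon/utils/z.js", 40),
]
summary = summarize(sample)
```

Keeping the per-addon counts separate from the individual matches is what lets the first screen show a list of addons while the matches themselves are only loaded when someone asks for them.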

An emergent use case is using the code search as a window into addon dependencies. It makes it possible to see which addons, and how many, are using a particular addon or Ember version. We plan to surface this information in a more structured manner in the future.

How it works

Ember Observer's code search is powered by google/codesearch, a set of command-line tools for indexing and searching large codebases.

To be able to search the codebases of all addons, an index must be built in advance. To do this, Ember Observer clones every Ember addon that has a valid repository URL set in its package.json and then removes folders that we don't want included in the search (e.g. node_modules). google/codesearch is then used to build the index. Addons are fetched and indexed daily, so code search results are no more than a day behind. It is worth noting that the search results come from the main branch of the addon's repository as of some time in the last day; that code may not match the code in the latest released version of an addon. Repositories that have become unavailable since the last update are removed from the index.
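The preparation steps before indexing could look something like the Python sketch below. This is an assumption-laden illustration, not Ember Observer's code: the function names and the EXCLUDED_DIRS set are hypothetical, and "valid" is reduced here to "a non-empty repository URL is present". The only external fact relied on is that package.json's repository field may be either a string or an object with a url key. Cloning and the google/codesearch indexing step itself are left out.

```python
import os
import shutil

# Assumption: the post only names node_modules; the real exclusion list may be longer.
EXCLUDED_DIRS = {"node_modules"}


def repository_url(package_json):
    """Return the repository URL from a parsed package.json, or None if absent."""
    repo = package_json.get("repository")
    if isinstance(repo, str):
        return repo or None
    if isinstance(repo, dict):
        return repo.get("url") or None
    return None


def prune_excluded_dirs(root, excluded=EXCLUDED_DIRS):
    """Delete every directory under root whose name is in excluded.

    Returns the list of removed paths.
    """
    removed = []
    for current, dirnames, _filenames in os.walk(root, topdown=True):
        for name in list(dirnames):
            if name not in excluded:
                continue
            path = os.path.join(current, name)
            if os.path.islink(path):
                # rmtree refuses to operate on symlinks, so unlink them instead.
                os.unlink(path)
            else:
                shutil.rmtree(path)
            # Removing the name from dirnames stops os.walk from descending into it.
            dirnames.remove(name)
            removed.append(path)
    return removed
```

Pruning before indexing keeps vendored dependencies out of the results, so a match always points at code the addon author wrote.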

What's next?

Now that Ember Observer can search codebases, we have a few ideas to improve the existing search and use it to provide additional insight into the Ember ecosystem. In the near future we're planning to support regular expression searches, as well as some additional options for filtering.

In future posts I'll be taking a deeper dive into how google/codesearch works, and the performance optimizations that brought worst-case searches from over a minute down to only a few seconds.