Search

General

SCAYLE’s search is based on three pillars:

  1. Implementation.
  2. Configuration & Data Quality.
  3. Analytics.

Good search results depend on getting all three pillars right; neglecting any one of them degrades the results.

Implementation

Implementing our APIs correctly, together with your own tracking, is the first step toward a good search experience for your users. This page describes use cases and examples for each API endpoint.

Configuration & Data Quality

Configuring your shop categories, filterable attribute groups and product attributes significantly impacts your search performance. Learn more about how to configure your search in the User Guide.

Analytics

Analytics is essential for understanding what your users are searching for, and it enables you to optimize your search configuration accordingly. We recommend iteratively improving your configuration and data based on your users' search terms, for example by adding new categories or adding synonyms to existing categories. You can learn how to add synonyms in the User Guide.

Suggestions returned by the /v2/suggestions endpoint are based on your categories and filterable attribute groups. Navigation items that have been defined as searchable are also returned.
Products are only suggested if the search term exactly matches a product identifier, whereas typo tolerance can be enabled for categories and navigation items.

Search Logic

SCAYLE’s search follows a category-page first approach, which means we first try to resolve a search term to a specific shop category with matching filterable attribute groups. We only fall back to a generic text-based search result page across your full assortment if we can't determine a matching category. In addition, as the user begins typing their search term, suggested matching categories are presented.

One of the pillars of a great search is Analytics, which enables you to understand what kind of search terms the users are typing in. From our experience with our clients, we see that most search terms are for broader product categories. Utilizing your existing categories for a great search experience has four key benefits:

  1. Exact catalog & sorting: Category pages allow you to predefine the precise catalog of products. By using sorting keys, you can define the sorting of a specific category page.
  2. Specific filters: Category pages have detailed filters defined in the SCAYLE Panel, allowing the user to narrow the search further. Search results may be returned with filters to apply. Those filters should be applied in the frontend to narrow the search results down further.
  3. Customized content: Category pages can display customized content from a CMS and are more advantageous from an SEO perspective.
  4. Efficient maintenance: The filters, content, and sorting of category pages are already maintained regularly.

Concepts

Learn about search concepts:

Entities

We currently store the following entities in our Search Database:

| Primary Entity | Secondary Entities |
| --- | --- |
| Category (Name) | Attributes, Special Filters |
| Product (ID, Reference Key, EAN) | None |
| Navigation Items (Name) | None |

Exclusion

Certain products and categories can be excluded from the search endpoints in the SCAYLE Panel. If a product or a category is excluded, it will not appear in any response of /v2/suggestions or /v2/resolve.

You can find more information on how to exclude elements from the search endpoints in the User Guide.

By default, these products will still be included in the /v1/products endpoint. If you also want to exclude these products from the results of /v1/products, make sure to add the parameter ?filters:not[isExcludedFromSearch]={true} to your request.

Stemming

Stemming is the practice of reducing words to their root form to improve accuracy and matching. For example, stemming converts plural forms to singular or removes tenses from words.

We apply stemming both to the searched content and the search term.
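To illustrate why stemming is applied to both sides, here is a minimal sketch of a toy suffix-stripping stemmer. It is not the stemmer SCAYLE uses; the function names `toyStem` and `stemsMatch` and the stripping rules are assumptions made purely for illustration, and they only handle a few simple English cases.

```typescript
// Toy stemmer for illustration only; NOT the stemmer used by SCAYLE.
// It strips a trailing "ing" or plural "s" from simple English words.
function toyStem(word: string): string {
  const w = word.toLowerCase();
  if (w.length > 5 && w.endsWith("ing")) {
    const stem = w.slice(0, -3);
    const last = stem.charAt(stem.length - 1);
    const secondLast = stem.charAt(stem.length - 2);
    // Collapse a doubled final letter: "runn" becomes "run".
    return last === secondLast ? stem.slice(0, -1) : stem;
  }
  if (w.length > 3 && w.endsWith("s") && !w.endsWith("ss")) {
    return w.slice(0, -1);
  }
  return w;
}

// Stem the search term AND the searched content before comparing them,
// so that different forms of the same word can match.
function stemsMatch(searchTerm: string, content: string): boolean {
  return toyStem(searchTerm) === toyStem(content);
}
```

Because both values pass through the same function, a search for "runs" can match content stored as "running" even though the raw strings differ.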

| Original Word | Word Stem |
| --- | --- |
| running | run |
| runs | run |
| cars | car |

Stemming is supported in the following languages:
  • Arabic
  • Armenian
  • Basque
  • Bengali
  • Bulgarian
  • Catalan
  • Czech
  • Dutch
  • English
  • Finnish
  • French
  • Galician
  • German
  • Hindi
  • Hungarian
  • Indonesian
  • Irish
  • Italian
  • Latvian
  • Lithuanian
  • Norwegian
  • Portuguese
  • Romanian
  • Russian
  • Spanish
  • Swedish
  • Turkish

Synonyms

Synonyms are different words or spellings that share the same meaning; a good example is the different ways of saying "Hello" in a particular language or dialect.

For example, "Hallo" has the same meaning as "Moin" or "Servus" in German, so they are synonyms for the same word.

Using synonyms, you can help your users find the correct content if they use different wording or redirect specific terms to a more fitting search result.

SCAYLE supports four types of synonyms:

  • Category synonyms
  • Attribute synonyms
  • Word synonyms
  • Navigation item synonyms

Following SCAYLE's category-page first concept, we recommend resolving as many alternate spellings as possible to categories.
Category synonyms are best suited for this.

See the User Guide to learn more about how you can configure your synonyms.

Example

Imagine you have a category called “Sweaters”; however, in your Search Tracking, you can see a lot of unresolved searches for the term “Pullover”.
Since these are technically two different terms, the SCAYLE Search by default won’t be able to correctly resolve “Pullover” to the “Sweaters” category.

However, setting the synonym “Pullover” on the category will resolve the searches to the correct category, leading to an improved search result.

Tokenization

Tokenization refers to splitting a term into multiple searchable and standalone tokens.

A token is usually a single word; for example, in the term Blue Pants, there are two tokens: Blue and Pants.

We apply tokenization to the searched content and the search term.

Tokenization splits on a set of separator characters; in our case, the search term is split only on whitespace.
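Whitespace-only tokenization can be sketched in a few lines. This is an illustrative approximation, not SCAYLE's tokenizer; the function name `tokenize` is an assumption.

```typescript
// Split a term on runs of whitespace and drop empty tokens.
// Other characters, such as hyphens, do not split a token.
function tokenize(term: string): string[] {
  return term.split(/\s+/).filter((token) => token.length > 0);
}

const tokens = tokenize("Blue Pants");
// tokens is ["Blue", "Pants"]
```

Note that with whitespace as the only separator, a term such as "light-blue" stays a single token.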

We use different tokenizers to optimize our results depending on the use case of the API endpoint.

Typo Tolerance

Typo tolerance, also known as "fuzziness", returns results similar to the search term but not precisely matched.

The Levenshtein Edit Distance defines the difference between two words; it is the number of one-character changes needed to turn the search term (what the user typed in) into the searched content (what is stored in the database).

These changes include changing a character, removing a character or inserting a character.

The first character of the search term must match exactly; after that, the number of mistakes allowed depends on the length of the search term:

| Search Term Length | Mistakes Allowed |
| --- | --- |
| 0-4 characters | 0 |
| 5-9 characters | 1 |
| 10+ characters | 2 |
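The rules above can be sketched as follows: compute the Levenshtein distance, require the first character to match, and compare the distance with the number of mistakes allowed for the term's length. The function names `levenshtein`, `allowedMistakes`, and `fuzzyMatches` are assumptions for illustration, as is the choice to compare case-insensitively; this is not SCAYLE's implementation.

```typescript
// Classic Levenshtein edit distance using a single rolling row.
function levenshtein(a: string, b: string): number {
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // distance for (i - 1, j - 1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // distance for (i - 1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Mistakes allowed by search term length, as listed in the table above.
function allowedMistakes(termLength: number): number {
  if (termLength <= 4) return 0;
  if (termLength <= 9) return 1;
  return 2;
}

// Assumption: comparison is case-insensitive.
function fuzzyMatches(searchTerm: string, content: string): boolean {
  const term = searchTerm.toLowerCase();
  const target = content.toLowerCase();
  if (term.length === 0 || term[0] !== target[0]) return false;
  return levenshtein(term, target) <= allowedMistakes(term.length);
}
```

For example, "sweatrs" (7 characters, 1 mistake allowed) is one insertion away from "sweaters" and matches, while "pnts" (4 characters, 0 mistakes allowed) does not match "pants".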

Suggestions

When the searched term would return a match on the resolve endpoint, that match will also rank highest on the suggestions endpoint. Refer to the resolve documentation for a more detailed explanation of that logic.

The endpoint also works as an autocomplete and matches categories while the user is still typing. Autocomplete matching supports stemming (in supported languages), synonyms, and phonetic matching. It has no typo tolerance, but phonetic matching and stemming can still match mistyped terms. The more accurately the search term matches a category, the higher that category ranks.

Example:

  • Category 1:
    • name: "sweatpants"
    • filters: "color"
    • product count: 100
  • Category 2:
    • name: "sweaters"
    • filters: "color"
    • product count: 50

When searching for "blue swet" the returned results would be:

  • Category 1 with filter "blue" applied
  • Category 2 with filter "blue" applied

Both categories would match due to phonetic matching ("swet" and "sweat" match phonetically), but Category 1 will rank higher because of its higher product count.

Examples

Version 16.1.0 or higher of the @aboutyou/backbone NPM package is required.

const response = await client.searchv2.suggestions(
  "red Pants",
  {
    with: {
      categories: {
        parents: 'all'
      }
    }
  }
);

console.log(`Suggestions count: ${response.suggestions.length}`);

Restrict the search to a subset of categories

Restricting the results to a specific category can be helpful when, for example, you sell products for women and men but want the search to return results only for women or only for men, depending on the user's choice.

const response = await client.searchv2.suggestions(
  "red Pants",
  {
    categoryId: 20201,
    with: {
      categories: {
        parents: 'all'
      }
    }
  }
);

console.log(`Suggestions count: ${response.suggestions.length}`);

Resolve

While clicking on a suggestion is one action from a user, it’s not the only one.

Some users will just press "Enter" when they finish typing in their search terms and expect to be redirected to the correct content.
To satisfy that need, we have the resolve API. It should be called with the user's search term, once the user presses "Enter".

The resolve API is designed to return the single result that best matches the search term. To be as accurate as possible, it only works with complete search terms, which is why it should be called only after the user has finished typing and pressed "Enter".

The result is based on your categories and filterable attribute groups, and your searchable navigation items.
A matching product is only returned if there is an exact match between the search term and the product identifiers.

Logic

The endpoint will resolve to a category when at least one of the typed words matches a category name or its synonyms. When the search term also matches a filter on that category, and the combination of that category and the applied filter results in at least one product, the filter is added to the response.

When multiple categories match, those with a higher product count or a higher position in the category tree rank higher.

The endpoint supports stemming and will have a tolerance of 1 typo on words longer than 5 characters.

Example:

  • Category 1:
    • name: "sweaters"
    • filters: "color"
    • product count: 100
  • Category 2:
    • name: "sweaters"
    • filters: "color"
    • product count: 50

When searching for "blue sweater", the returned result would be Category 1 with the filter "blue" applied. Both categories would match, but Category 1 has the higher product count and is therefore returned.
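The example above can be modeled with a small in-memory sketch. Everything here is hypothetical and heavily simplified: the `Category` shape, the `resolveToy` function, and the `normalize` helper (a stand-in for real stemming that only strips a trailing "s") are assumptions for illustration and do not reflect how the endpoint is implemented. Typo tolerance and category tree position are left out.

```typescript
interface Category {
  id: number;
  name: string;
  synonyms: string[];
  productCount: number;
  // Hypothetical: filter group -> values that still yield at least one product.
  filterValues: { [group: string]: string[] };
}

interface ResolvedCategory {
  categoryId: number;
  filters: { [group: string]: string };
}

// Crude stand-in for stemming: lower-case and strip one trailing "s".
const normalize = (word: string): string => word.toLowerCase().replace(/s$/, "");

function resolveToy(term: string, categories: Category[]): ResolvedCategory | null {
  const tokens = term.split(/\s+/).filter((t) => t.length > 0).map(normalize);
  // A category matches when at least one token equals its name or a synonym.
  const matches = categories.filter((c) =>
    [c.name, ...c.synonyms].some((n) => tokens.indexOf(normalize(n)) !== -1)
  );
  if (matches.length === 0) return null;
  // Among several matches, prefer the higher product count.
  const best = matches.reduce((a, b) => (b.productCount > a.productCount ? b : a));
  const filters: { [group: string]: string } = {};
  for (const group of Object.keys(best.filterValues)) {
    const hit = best.filterValues[group].find((v) => tokens.indexOf(normalize(v)) !== -1);
    if (hit !== undefined) filters[group] = hit;
  }
  return { categoryId: best.id, filters };
}

const exampleCategories: Category[] = [
  { id: 1, name: "sweaters", synonyms: ["pullover"], productCount: 100, filterValues: { color: ["blue", "red"] } },
  { id: 2, name: "sweaters", synonyms: [], productCount: 50, filterValues: { color: ["blue"] } },
];

const resolved = resolveToy("blue sweater", exampleCategories);
// resolved is { categoryId: 1, filters: { color: "blue" } }
```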

Examples

Version 16.1.0 or higher of the @aboutyou/backbone NPM package is required.

const response = await client.searchv2.resolve(
  "red Pants",
  {
    with: {
      categories: {
        parents: 'all'
      }
    }
  }
);

console.log(`Resolved Entity: ${response?.type}`);

Restrict the search to a subset of categories

Restricting the results to a specific category can be helpful when, for example, you sell products for women and men but want the search to return results only for women or only for men, depending on the user's choice.

const response = await client.searchv2.resolve(
  "red Pants",
  {
    categoryId: 20201,
    with: {
      categories: {
        parents: 'all'
      }
    }
  }
);

console.log(`Resolved Entity: ${response?.type}`);

For certain search terms, the resolve API cannot match the search term to a specific category or product.

In this case, we offer a simple text-based search across your full assortment.

We leverage the term parameter of the products endpoint for that.

When using the filters[term] parameter, we don't recommend specifying any sorting parameter. Without an explicit sorting, the products are sorted by how well they match the search term, which is generally the better ordering in these cases.
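The overall flow when the user presses "Enter" could be sketched as follows: try to resolve the term first, and only fall back to the term-based product search when nothing was resolved. The `SearchClient` interface is a hypothetical minimal subset of the client used in this page's examples, and treating an empty (null or undefined) resolve response as "no match" is an assumption; check the actual return types of your package version.

```typescript
// Hypothetical minimal client shape, reduced to the two calls used here.
interface SearchClient {
  searchv2: {
    resolve(term: string): Promise<{ type: string } | null | undefined>;
  };
  products: {
    query(params: { where: { term: string } }): Promise<{ entities: unknown[] }>;
  };
}

type SearchOutcome =
  | { kind: "resolved"; entityType: string }
  | { kind: "fallback"; productCount: number };

async function searchOnEnter(client: SearchClient, term: string): Promise<SearchOutcome> {
  // Step 1: category-page first. Try to resolve the complete search term.
  const resolved = await client.searchv2.resolve(term);
  if (resolved) {
    return { kind: "resolved", entityType: resolved.type };
  }
  // Step 2: text-based fallback across the full assortment.
  // No sort parameter is passed, so the relevance ordering is kept.
  const products = await client.products.query({ where: { term } });
  return { kind: "fallback", productCount: products.entities.length };
}
```

Passing the client in as a parameter keeps the sketch independent of how the client is created and makes it easy to exercise with a stub.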

Logic

For example, where name and attribute1 are set as searchable through the SCAYLE Panel:

  • Product 1:
    • name: "christmas sweater"
    • attribute 1: "light blue"
  • Product 2:
    • name: "pants"
    • attribute 1: "blue"
  • Product 3:
    • name: "cardigan sweater"
    • attribute 1: "red"
  • Product 4:
    • name: "pants"
    • attribute 1: "red"

When searching for "blue sweater" the returned results would be:

  • Product 1 - it will rank the highest as both "blue" and "sweater" match
  • Product 3 - it will be returned because the searched word "sweater" matches the name
  • Product 2 - it will be returned because the searched word "blue" matches attribute 1. By default, it will rank lower than Product 3 as the name is given higher relevance than attribute 1. However, this can be adjusted through the SCAYLE Panel.

Product 4 will not be returned as the searched term does not match the name or attribute 1.
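The ranking in this example can be reproduced with a simple weighted word-match score. This is only a sketch of the idea, not the actual relevance algorithm: the weights (2 for the name, 1 for attribute 1), the `Product` shape, and the function names are assumptions chosen so that the name counts more than attribute 1.

```typescript
interface Product {
  id: number;
  name: string;
  attribute1: string;
}

// Hypothetical relevance weights: the name counts more than attribute 1.
const WEIGHTS = { name: 2, attribute1: 1 };

function words(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
}

// Add the field weight for every search word found in that field.
function score(product: Product, term: string): number {
  const nameWords = words(product.name);
  const attributeWords = words(product.attribute1);
  let total = 0;
  for (const token of words(term)) {
    if (nameWords.indexOf(token) !== -1) total += WEIGHTS.name;
    if (attributeWords.indexOf(token) !== -1) total += WEIGHTS.attribute1;
  }
  return total;
}

// Drop products without any match and sort the rest by descending score.
function rank(products: Product[], term: string): Product[] {
  return products
    .map((product) => ({ product, score: score(product, term) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.product);
}

const exampleProducts: Product[] = [
  { id: 1, name: "christmas sweater", attribute1: "light blue" },
  { id: 2, name: "pants", attribute1: "blue" },
  { id: 3, name: "cardigan sweater", attribute1: "red" },
  { id: 4, name: "pants", attribute1: "red" },
];

const rankedIds = rank(exampleProducts, "blue sweater").map((p) => p.id);
// rankedIds is [1, 3, 2]
```

With these weights, Product 1 scores 3, Product 3 scores 2, Product 2 scores 1, and Product 4 scores 0 and is dropped.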

Example

// Text-based search for products matching the term "shirt"
const products = await client.products.query({
  where: { term: "shirt" },
  with: {
    attributes: {
      withKey: ["name"],
    },
  },
});

console.log(products.entities[0].attributes.name.values.label); 
// Tom Tailor T-Shirt