Cloud  10 Mins Read  May 22, 2019  Akash Shah

A step by step guide to setting up Amazon Cloud Search (With examples)


If you browse through a wide variety of websites, one thing you will find just about everywhere is a search option. Whether you are a music lover browsing your favorite tracks, a food blogger checking out food trends in the city you are visiting, a fan looking for your favorite artist's next gig, or just window shopping in your favorite shopping app, search is an essential part of any significant data-driven website. Without a suitable search mechanism, your data is virtually out of reach of your users.

So, to put it simply, your search implementation is one of the essential parts of your web application. It drives users to the content they are looking for in the fastest and most efficient way. This also raises the question of selecting the right search tool for your application. There are multiple search-as-a-service tools to choose from:

  • Google Cloud Search
  • Azure Cognitive Search
  • SharePoint Online / Office 365 Search
  • Amazon CloudSearch
  • Amazon Elasticsearch Service

Amazon

Elasticsearch and CloudSearch are Amazon's cloud-based solutions for search. Elasticsearch is an open source solution, whereas CloudSearch is a fully managed search service. It is quite simple to set up and is a cost-effective search solution. Amazon CloudSearch is an AWS managed service that helps users create fast, scalable, cost-effective, and easy-to-set-up search solutions for their applications. Amazon CloudSearch uses Apache Solr as the underlying text search engine, which supports full-text search, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features, and rich document handling.

You interact with Amazon CloudSearch through the three service layers listed below:

  • Document service
  • Configuration Service
  • Search service

Your Identity and Access Management (IAM) policies control who can access, manage, and configure your Amazon CloudSearch services. The Configuration service and Document service are for developers to set up and maintain the CloudSearch domain and its data. The Search service, in contrast, is client-facing: developers and QA can use it to verify the configuration and indexes and to validate the data. It is also exposed to real-world users, where it serves all the search requests coming from the web application or mobile app that uses the search domain. The Search service is also responsible for delivering fast, accurate, and up-to-date results for all user queries.

[Figure: CloudSearch architecture]
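The split described above, a public Search service and a restricted Document service, is usually expressed as an access policy attached to the domain. The following is a minimal sketch of such a policy built in Python. The action names (`cloudsearch:search`, `cloudsearch:suggest`) and the `aws:SourceIp` condition follow the IAM policy format as I understand it from the AWS documentation; the helper name `build_access_policy` and the IP address are made up for illustration.

```python
import json


def build_access_policy(allowed_ips=None):
    """Return a CloudSearch access policy document as a JSON string.

    Without allowed_ips, anyone may call search and suggest.
    With allowed_ips, search and suggest are limited to those addresses.
    Document uploads are never granted to anonymous callers here, so they
    stay restricted to authenticated requests from the account.
    """
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": "*"},
        "Action": ["cloudsearch:search", "cloudsearch:suggest"],
    }
    if allowed_ips:
        # Hypothetical example addresses; use your own CIDR ranges.
        statement["Condition"] = {"IpAddress": {"aws:SourceIp": list(allowed_ips)}}
    return json.dumps({"Version": "2012-10-17", "Statement": [statement]}, indent=2)


public_policy = build_access_policy()
internal_policy = build_access_policy(["192.0.2.10/32"])
print(internal_policy)
```

The resulting JSON can be pasted into the access policy editor in the console. Treat it as a starting point and check it against the policy the console generates for your domain.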

Document service

The Document service is used to manage the searchable data of the domain. Each domain has its own document endpoint. To upload your data, you need to format it as XML or JSON. Each item that you want to be returnable as a search result is called a document. Every hit in your search response carries that document's unique ID and the fields you asked for in your search request. The Document service also lets you add new data at any time. Once the data is uploaded, it is indexed automatically and becomes searchable within minutes.
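To make the document format concrete, here is a small sketch that turns plain records into a JSON document batch. It assumes the batch layout I know from the CloudSearch documentation: a list of operations, each with a `type` (`add` or `delete`), an `id`, and, for additions, a `fields` object. The records and the helper names `to_batch` and `delete_operation` are hypothetical.

```python
import json


def to_batch(records, id_field="id"):
    """Turn a list of dicts into CloudSearch 'add' operations."""
    batch = []
    for record in records:
        # Skip the ID column and empty values; they do not belong in "fields".
        fields = {
            key: value
            for key, value in record.items()
            if key != id_field and value not in ("", None)
        }
        batch.append({"type": "add", "id": str(record[id_field]), "fields": fields})
    return batch


def delete_operation(doc_id):
    """Build the operation that removes a document from the domain."""
    return {"type": "delete", "id": str(doc_id)}


# Hypothetical sample records, loosely modeled on an app catalogue.
records = [
    {"id": "app_1", "app": "Photo Editor", "category": "PHOTOGRAPHY", "rating": 4.1},
    {"id": "app_2", "app": "Sketch Pad", "category": "ART_AND_DESIGN", "rating": ""},
]
batch = to_batch(records) + [delete_operation("app_0")]
print(json.dumps(batch, indent=2))
```

The printed JSON is what you would save to a file and upload through the console, or send to the domain's document endpoint.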

Configuration Service

The Configuration service allows you to create and manage the search domain. It helps you index, scale, and deploy the domain across multiple Availability Zones. It comes with a helpful step-by-step wizard that guides you through all the configuration steps for your search domain. You start with a unique name for your search domain, then configure the search index, and finally set your scaling options and Availability Zone settings along with the instance size.
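The configuration choices listed above can be written down as plain data before you touch the wizard. The sketch below only validates and collects those choices; it does not call AWS. The key names mirror the request shapes I remember from the AWS SDK (`DomainName`, `IndexFieldName`, `IndexFieldType`, `DesiredInstanceType`, `DesiredReplicationCount`), and the naming rule (3 to 28 characters, lowercase letters, digits, and hyphens) and the instance type `search.small` are assumptions you should verify against the console. `plan_domain` is a hypothetical helper.

```python
import re

# Assumed naming rule: 3-28 characters, lowercase letters, digits and
# hyphens, starting with a letter or digit.
DOMAIN_NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]{2,27}")


def plan_domain(name, index_fields, instance_type="search.small",
                replicas=1, multi_az=False):
    """Validate the basic choices and return them as one dictionary."""
    if not DOMAIN_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid domain name: {name!r}")
    if replicas < 1:
        raise ValueError("replication count must be at least 1")
    return {
        "DomainName": name,
        "IndexFields": [
            {"IndexFieldName": field, "IndexFieldType": field_type}
            for field, field_type in index_fields.items()
        ],
        "ScalingParameters": {
            "DesiredInstanceType": instance_type,
            "DesiredReplicationCount": replicas,
        },
        "MultiAZ": multi_az,
    }


plan = plan_domain(
    "play-store-apps",  # hypothetical domain name
    {"app": "text", "category": "literal", "rating": "double"},
    replicas=2,
    multi_az=True,
)
print(plan)
```

Keeping the plan as data makes it easy to review the index field types and scaling options before confirming them in the wizard.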

Search service

Once all this is set up, you'll be able to test your domain through the unique HTTP endpoint and query parameters handled by the Search service. The Search service handles search and suggestion requests for the domain and exposes a single HTTP endpoint for querying. You can get your search results in either XML or JSON format. CloudSearch supports a rich query language that lets users build search queries with text search, range search, facets for filtering, and other options for composing compound queries.
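As an illustration of a compound query, the sketch below composes an expression for the structured query parser: a term match combined with an open-ended range, plus a facet request. The syntax (`(and ...)`, `(term field=... '...')`, `(range field=... [4,})`, `q.parser=structured`, `facet.FIELD={}`) is how I understand the 2013-01-01 search API; the field names `category` and `rating` and the helper functions are hypothetical and depend on how your own index is configured.

```python
def term(field, value):
    """Exact match on one field; quotes and backslashes are escaped."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"(term field={field} '{escaped}')"


def at_least(field, minimum):
    """Range with an inclusive lower bound and no upper bound."""
    return f"(range field={field} [{minimum},}})"


def all_of(*clauses):
    """Combine clauses so that every one of them must match."""
    return "(and " + " ".join(clauses) + ")"


query = all_of(term("category", "PHOTOGRAPHY"), at_least("rating", 4))

# Request parameters for the search endpoint.
params = {
    "q": query,
    "q.parser": "structured",
    "facet.category": "{}",   # count hits per category (field must be facet-enabled)
    "return": "app,category",
    "size": 10,
}
print(query)
```

Building the expression from small functions keeps the parentheses balanced and makes it harder to forget escaping user input.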

Following are the main benefits Amazon CloudSearch provides that compelled me to use CloudSearch over other services:

Simple

If your tech stack is already hosted on AWS, then setting up CloudSearch is quite simple. You can set it up via the AWS Management Console, the AWS SDKs, or the AWS CLI. Adding data to your search domain is as simple as uploading a file (JSON, XML, or text) from the AWS Management Console: browse to your data and upload it. CloudSearch automatically does the grunt work: it identifies the file type, analyzes the data, sets up indexes, and creates search, sort, and facet options. The user only has to review the result and save the changes.

Fully Managed

Amazon CloudSearch is a fully managed custom search service: you pick your instance type, select Availability Zones, and handle provisioning, scaling, and partitioning using either the AWS CLI or the management console. Uploading data, reindexing, and suggesting facet options are all done from the intuitive, user-friendly management console, whose simple wizard-based instructions help you set up your domain and be ready to start testing in minutes.

Scalable

Scalability is one of the essential aspects when you are growing. A CloudSearch domain autoscales as your data or query volume increases: it automatically scales up or down depending on the domain's usage. If the load increases, it scales up to meet the demand, and it scales back down when query volume is low.

Reliable

Reliability matters most when you are working with the data that drives your application's search service. AWS CloudSearch has a Multi-AZ option, so your data is protected against hardware failure and latency is kept low. Search traffic is automatically distributed across the Availability Zones, and the domain auto-scales depending on the load. If one Availability Zone fails, requests are routed to the next nearest zone that holds the data.

High Performance

Performance is one of the most important criteria when choosing a search engine. Faster data delivery is one of the main reasons many engineers migrate from a self-built mechanism that searches the database directly to an externally hosted search service like AWS CloudSearch. Automatic indexing, horizontal and vertical scaling, and distributed data give you the edge you need to deliver your data with low latency and high performance.

Prepare your test data:

Now we'll show you how to create your domain on Amazon CloudSearch and set it up with your own data. As discussed earlier, you can upload the data for your search domain as a JSON, XML, or text (CSV) file. If you have your own data, you can upload that, or you can download one of the many pre-formatted data sets from Kaggle. For this article, I have chosen the Google Play Store Apps data set. It has around 10.8K records; for this example I have truncated it to about 5K lines. You can do that with the following command.

head -n 5000 oldfile > newfile

The truncated sample file is uploaded to https://github.com/akashs-cuelogic/CloudSearch. Feel free to use that.
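One caveat with truncating by line count: a CSV row can span several lines when a quoted field contains a line break, so a line-based cut may split a record. If that matters for your data, a CSV-aware truncation is safer. The sketch below is one way to do it with Python's standard `csv` module; the function name `truncate_csv` and the tiny demo file are made up for illustration.

```python
import csv
import os
import tempfile


def truncate_csv(src_path, dst_path, max_rows):
    """Copy the header plus the first max_rows records; return records copied."""
    copied = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:
            return 0  # empty input file
        writer.writerow(header)
        for row in reader:
            if copied >= max_rows:
                break
            writer.writerow(row)
            copied += 1
    return copied


# Small demo with a record that contains a line break inside a quoted field.
with tempfile.TemporaryDirectory() as tmp:
    full = os.path.join(tmp, "full.csv")
    sample = os.path.join(tmp, "sample.csv")
    with open(full, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows([
            ["app", "category"],
            ["Photo Editor", "PHOTOGRAPHY"],
            ["Notes,\nLists", "PRODUCTIVITY"],
            ["Sketch Pad", "ART_AND_DESIGN"],
        ])
    print(truncate_csv(full, sample, 2))
```

For the real data set you would call `truncate_csv` with the downloaded file, an output path, and 5000 as the record limit.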

Prerequisite:

An AWS account.

  1. Collect data
    From the AWS home page, navigate to CloudSearch under the Analytics section. This gives you step-by-step instructions on how to create your own CloudSearch domain, upload data, set indexes, and start searching. Start by clicking Create a new search domain, shown in the screenshot below.
    [Screenshot: AWS_create]
  2. Create a new search domain and set up its size
    The name of your CloudSearch domain matters because it also becomes part of your search API's URL. Below the name, you'll see options for the instance size and the replication count. The instance size and replication count you need grow with the size of your data and the volume of your requests. If you are working with a large data set, it's advisable to use a larger instance type. If you are expecting a large number of concurrent requests, increase the replication count accordingly.
    [Screenshot: AWS_add_title]
  3. Upload index definitions
    Indexing your data dramatically improves search performance, and CloudSearch does it automatically when you upload your data. For this step you don't need the whole data set; a few samples are enough for CloudSearch to identify the data attributes and how to index them. There are multiple ways to provide sample data: you can upload it from your local machine or from an S3 bucket, point to your DynamoDB data, or define the fields manually. If you are just trying things out, there are also predefined sample data sets to choose from.
    [Screenshot: AWS_configuring_index]
  4. Configuring indexes
    Index configuration is the most important part of setting up your domain. In most cases, Amazon CloudSearch indexes your data automatically, and the changes are reflected in your results within a few minutes. Some changes to your domain configuration require reindexing your data; you can also run indexing manually, either from a client or from the dashboard. Indexing options control how your data is mapped to index fields and what information you can search and retrieve from the index. The data you upload must contain the same fields configured in your domain's indexing options, and the field values must be compatible with the configured field types.
    [Screenshot: AWS_review_index]
  5. Set up access policies
    Amazon CloudSearch provides various ways to allow or restrict access to the search and suggester service APIs and to the document service API. You can choose one of the following options:
    1. Search and Suggester service: Allow all. Document Service: Account owner only.
      This makes the search and suggester services accessible to everyone without any restrictions. Whoever maintains the domain and its data gets separate, restricted access for uploading new data, indexing it, and making whatever scaling and optimization changes CloudSearch needs to work efficiently.
    2. Allow open access to all services (not recommended because anyone can upload documents)
      This opens up search as well as data maintenance and the other options to everyone. It is not recommended because it exposes all the data and is not a secure option if your application holds sensitive data.
    3. Allow access to all services from specific IP(s)
      This is the same as above, except that search, suggest, and document requests are accepted only from particular IPs. This is a good option when your search domain is used by an internal application and is not open to other users: whitelist those IPs and restrict everyone else.
    4. Deny access to all services (No one can access your endpoint)
      Search and document requests must either be submitted through the console or authenticated with your account credentials. The document and search endpoints do not allow anonymous access or accept requests from other AWS users.
    [Screenshot: AWS_access_settings]
  6. Confirm domain info
    The last step is to verify the domain information. It lists all the indexed fields, scaling options, and access policies. You can click the edit option next to each section and make changes before confirming. If you are satisfied with your configuration, click Confirm. This takes you to the dashboard page, where you will see the status of the domain along with other information.
    [Screenshot: AWS_confirming_settings]
  7. Domain dashboard
    The dashboard shows the following information:
    Searchable documents: The number of records that are available to search.
    Index Fields: The fields that are indexed and are searchable, returnable, or sortable.
    Search Endpoint: The endpoint of the API where you query your data.
    Document Endpoint: The endpoint of the API where you upload (add, update, or delete) your documents.
    Domain ARN: Amazon Resource Names (ARNs) uniquely identify AWS resources. Every resource in AWS has a unique identifier that specifies it unambiguously across all of AWS.
    Engine Type: The type of search engine (CloudSearch) with the API version (2013-01-01). A search engine makes it possible to search extensive collections of mostly textual data items (called documents) to find the best matching results quickly.
    Note: The Search Endpoint and Document Endpoint are created dynamically from the name you gave your search domain in step 2.
    [Screenshot: AWS_Domain_dashboard]
  8. Upload search documents
    As you can see above, the count of searchable documents is 0. In the initial steps, we only configured the indexes and added options to make the fields searchable, sortable, and so on. That is only the schema of the search documents; we still need to upload data with the same attributes to make it available for search requests. CloudSearch lets you upload your data in multiple ways. They are the same options we chose from in step 3, where we uploaded a sample document to define the indexes.
    [Screenshot: Upload search documents]
  9. Review the data
    Before the records are uploaded to the service, you are first asked to verify the data fields. Once you click Upload Documents, CloudSearch uploads all the records from the selected data source, indexes them, and makes them searchable.
    [Screenshot: Review the data]
    After all the data is uploaded, you can start testing your data.
  10. Testing your data
    Testing your data is quite easy in Amazon CloudSearch. You can put your query text in the Test Search input field and click GO. This searches through all the fields you marked as searchable while indexing and returns the matching results.
    [Screenshot: Testing your data]
    You can also query your Amazon CloudSearch domain endpoint directly with a simple q expression, which by default searches all the searchable fields:

https://domain_name.com/2013-01-01/search?q=photo&return=app,category&size=10
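The same request can be assembled in code. The sketch below builds the URL shown above and pulls the hits out of a JSON response. It assumes the response layout I know from the 2013-01-01 search API (a `hits` object with `found` and a `hit` list whose entries carry `id` and `fields`); the sample response text is hand-written for illustration, not real output, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_search_url(endpoint, query, return_fields, size=10):
    """Build a simple-query search URL for a CloudSearch search endpoint."""
    params = urlencode(
        {"q": query, "return": ",".join(return_fields), "size": size},
        safe=",",  # keep the comma-separated field list readable
    )
    return f"https://{endpoint}/2013-01-01/search?{params}"


def extract_hits(response_text):
    """Return (total matches, [(document id, fields), ...]) from a JSON response."""
    body = json.loads(response_text)
    hits = body.get("hits", {})
    documents = [(hit["id"], hit.get("fields", {})) for hit in hits.get("hit", [])]
    return hits.get("found", 0), documents


url = build_search_url("domain_name.com", "photo", ["app", "category"])
print(url)

# Hand-written sample in the assumed response format.
sample_response = json.dumps({
    "status": {"rid": "example", "time-ms": 2},
    "hits": {
        "found": 1,
        "start": 0,
        "hit": [{"id": "app_1", "fields": {"app": "Photo Editor", "category": "PHOTOGRAPHY"}}],
    },
})
found, documents = extract_hits(sample_response)
print(found, documents)
```

To run it against a live domain, replace the placeholder host with your own search endpoint from the dashboard and fetch the URL with an HTTP client of your choice.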

So these are the steps to set up your CloudSearch domain with your own data set and start testing and tweaking the indexes and data configuration. There are more advanced search options, such as nested queries and query suggestions, that you can build depending on your search filters. We will try to cover these in the next part. Amazon CloudSearch is a complete search solution that allows you to scale, upload new data in real time, and make it available to search in no time. With Amazon CloudSearch, you should be able to create your search domain, set search attributes, upload the data, and start testing in no time. It provides an intuitive step-by-step wizard that makes the setup easy.
