Uploading docs in ElasticSearch using Java with async for super-fast processing with examples

I wanted a quick and easy way to dump a lot of objects to my ElasticSearch endpoint, and I made the rookie mistake of adding the “official” Maven dependency for ElasticSearch for Java.

That jar is huge. It contains pretty much all of the server and client code. I tweeted about this, but it looks like there isn’t going to be a fix soon.

Worry not: I was able to find a lightweight ElasticSearch client for Java called “Jest”. This is how you add the dependency:

<dependency>
    <groupId>io.searchbox</groupId>
    <artifactId>jest</artifactId>
    <version>6.3.1</version>
</dependency>

Initializing the client:

I signed up for an account on “elastic.co”, the company behind ElasticSearch. Setting up AWS ElasticSearch was really hard for me and involved a lot of requirements and steps. elastic.co was one click and also had a free trial, so I went with it.

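import io.searchbox.client.JestClient;
import io.searchbox.client.JestClientFactory;
import io.searchbox.client.config.HttpClientConfig;

JestClient getJestClient() {
    JestClientFactory factory = new JestClientFactory();
    factory.setHttpClientConfig(
        new HttpClientConfig.Builder(
            "https://<your-endpoint>-central1.gcp.cloud.es.io:9243")
            // Replace with your own credentials; avoid hard-coding real ones.
            .defaultCredentials("elastic", "password")
            .build()
    );
    return factory.getObject();
}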

The JestClient offers a lot of APIs, but since I just want to dump a lot of documents to my endpoint, I used its asynchronous bulk methods. This is how I use it.

First, create a list of “Index” actions.

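// Uses io.searchbox.core.Index; objectMapper is assumed to be a Jackson ObjectMapper.
List<Index> indexList = new ArrayList<>();
// Serialize the object to a JSON string that becomes the document source.
String jsonString = objectMapper.writeValueAsString(feedDao);
// One Index action per document: target index, type, and document id.
Index index = new Index.Builder(jsonString).index("feeds").type("doc")
              .id(feedDao.getFeedId()).build();
indexList.add(index);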

My object is called “FeedDao”. I convert it to JSON using ObjectMapper and add the resulting Index action to the list.
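If the list grows very large, one option is to split it into smaller batches and send each batch as its own bulk request. The sketch below shows one way to do that split in plain Java. `Batches` and `partition` are hypothetical names, not part of Jest, and the right batch size is something you would have to tune for your own documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Jest): splits a list into fixed-size batches.
class Batches {
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the sub-list so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(partition(docs, 2)); // prints [[a, b], [c, d], [e]]
    }
}
```

Each inner list could then be used in place of the full index list when building a bulk request.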

Then you need to create a Bulk request object. Make sure you have already created your “index” on ES. You can create an index with this simple API: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html

This is the Bulk object, built using the builder pattern:

Bulk bulk = new Bulk.Builder()
     .defaultIndex("feeds")
     .defaultType("doc")
     .addAction(indexList)
     .build();
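To get a feel for what a bulk request carries, it can help to look at the body format of the ElasticSearch Bulk API: one action line followed by one document line, each ending in a newline. The sketch below builds such a body by hand, purely for illustration, since Jest builds and sends the payload itself. `BulkBodySketch` and `build` are hypothetical names, and the code assumes the index name, type, and ids need no JSON escaping and that each document is a single-line JSON string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: shows the newline-delimited shape of a bulk body.
class BulkBodySketch {
    // Builds "index" actions for the given id -> JSON document pairs.
    // Assumes index, type, and ids contain no characters that need escaping.
    static String build(String index, String type, Map<String, String> jsonById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : jsonById.entrySet()) {
            // Action line: what to do and where.
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_type\":\"").append(type)
                .append("\",\"_id\":\"").append(e.getKey()).append("\"}}\n");
            // Source line: the document itself, on one line.
            body.append(e.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "{\"title\":\"first\"}");
        docs.put("2", "{\"title\":\"second\"}");
        System.out.print(build("feeds", "doc", docs));
    }
}
```

The `_type` field mirrors the “doc” type used in this post; newer ElasticSearch versions have moved away from types, so treat it as specific to this setup.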

Calling executeAsync is a bit different: you pass a handler that deals with both success and failure. You can do whatever you want with the results.

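// Uses io.searchbox.client.JestResult and io.searchbox.client.JestResultHandler.
jestClient.executeAsync(bulk, new JestResultHandler<JestResult>() {
    @Override
    public void completed(JestResult result) {
        // Called when a response arrives; log the raw JSON response.
        log.info(result.getJsonString());
    }

    @Override
    public void failed(Exception ex) {
        // Called when the request itself fails (for example, a connection error).
        log.error("Bulk request failed", ex);
    }
});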
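Because the call returns immediately, you may sometimes want to wait for the result, for example before shutting the program down. One possible approach is a handler that completes a CompletableFuture. The sketch below is self-contained, so it declares its own `ResultHandler` interface as a stand-in with the same two callbacks as Jest's handler; `ResultHandler`, `FutureHandler`, and `FutureHandlerDemo` are hypothetical names, and the callback is triggered by hand rather than by a real client.

```java
import java.util.concurrent.CompletableFuture;

// Stand-in with the same two callbacks as Jest's JestResultHandler,
// declared here only so the sketch compiles without the Jest jar.
interface ResultHandler<T> {
    void completed(T result);
    void failed(Exception ex);
}

// Hypothetical adapter: completes a CompletableFuture from the callbacks.
class FutureHandler<T> implements ResultHandler<T> {
    private final CompletableFuture<T> future = new CompletableFuture<>();

    @Override
    public void completed(T result) {
        future.complete(result);
    }

    @Override
    public void failed(Exception ex) {
        future.completeExceptionally(ex);
    }

    CompletableFuture<T> future() {
        return future;
    }
}

class FutureHandlerDemo {
    public static void main(String[] args) {
        FutureHandler<String> handler = new FutureHandler<>();
        // A real client would invoke the callback; here it is simulated.
        handler.completed("ok");
        System.out.println(handler.future().join()); // prints ok
    }
}
```

With Jest, the same idea would presumably be a class implementing JestResultHandler<JestResult> and passed to executeAsync, after which the caller can wait on the future or combine several of them.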

That’s it. I hope this was useful to you. Feel free to bookmark this for later use, or use the search on this portal to find it again.

If there is an error somewhere, please let me know in the comments.
