Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -20,6 +20,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
- Detect AWS SDK `Apache5HttpClient` in `AwsSdk2Transport` body-method guardrail ([#1903](https://github.com/opensearch-project/opensearch-java/pull/1970))
- Support Jackson 3.x release line ([#1810](https://github.com/opensearch-project/opensearch-java/pull/1810))
- Added `equals()` and `hashCode()` implementations to `FieldValue` ([#1998](https://github.com/opensearch-project/opensearch-java/pull/1998))
- Add document lifecycle guide and runnable sample ([#2017](https://github.com/opensearch-project/opensearch-java/pull/2017))

### Fixed

294 changes: 294 additions & 0 deletions guides/document_lifecycle.md
@@ -0,0 +1,294 @@
- [Document Lifecycle](#document-lifecycle)
  - [Setup](#setup)
  - [Index a document with an ID](#index-a-document-with-an-id)
  - [Handle duplicate documents](#handle-duplicate-documents)
  - [Index or replace a document](#index-or-replace-a-document)
  - [Index a document with an auto-generated ID](#index-a-document-with-an-auto-generated-id)
  - [Get a document](#get-a-document)
  - [Filter source fields](#filter-source-fields)
  - [Get multiple documents](#get-multiple-documents)
  - [Check whether a document exists](#check-whether-a-document-exists)
  - [Update a document](#update-a-document)
  - [Update a document with a script](#update-a-document-with-a-script)
  - [Update documents by query](#update-documents-by-query)
  - [Reindex documents](#reindex-documents)
  - [Delete a document](#delete-a-document)
  - [Delete documents by query](#delete-documents-by-query)
  - [Clean up](#clean-up)

# Document Lifecycle

This guide covers common document lifecycle operations with the OpenSearch Java client: indexing, retrieving, updating, reindexing, and deleting documents.

You can find a working version of the code in [DocumentLifecycle.java](../samples/src/main/java/org/opensearch/client/samples/DocumentLifecycle.java).

## Setup

Create a client, define the two index names, and create the source index used by the examples below.

```java
final HttpHost[] hosts = new HttpHost[] {
    new HttpHost("http", "localhost", 9200)
};

final OpenSearchTransport transport = ApacheHttpClient5TransportBuilder
    .builder(hosts)
    .setMapper(new JacksonJsonpMapper())
    .build();
OpenSearchClient client = new OpenSearchClient(transport);

String index = "movies-document-lifecycle";
String reindexedIndex = "movies-document-lifecycle-reindexed";

client.indices().create(c -> c.index(index));
```

The examples use this `Movie` document class.

```java
public static class Movie {
    private String title;
    private Integer year;

    public Movie() {}

    public Movie(String title, Integer year) {
        this.title = title;
        this.year = year;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public Integer getYear() {
        return year;
    }

    public void setYear(Integer year) {
        this.year = year;
    }
}
```

## Index a document with an ID

Use the create API when the document must not already exist. OpenSearch returns an error if another document already has the same ID.

```java
Movie movie = new Movie("Beauty and the Beast", 1991);

CreateResponse response = client.create(
    c -> c.index(index)
        .id("1")
        .document(movie)
        .refresh(Refresh.WaitFor)
);
```

## Handle duplicate documents

A second create request with the same ID returns a `409` conflict. Wrap the expected error in a `try/catch` block so the sample can keep running. Depending on the transport and error conversion path, the conflict may be raised as a transport `ResponseException` or as an `OpenSearchException`.

```java
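try {
    // Document "1" already exists, so this create is expected to fail with 409.
    client.create(c -> c.index(index).id("1").document(new Movie("Beauty and the Beast", 1991)));
} catch (ResponseException e) {
    // Conflict raised by the transport layer; rethrow anything else.
    if (e.status() != 409) {
        throw e;
    }
} catch (OpenSearchException e) {
    // Conflict raised as a client exception; rethrow anything else.
    if (e.status() != 409) {
        throw e;
    }
}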
```

## Index or replace a document

Use the index API when you want to create or replace a document. If a document with that ID already exists, OpenSearch replaces the stored document.

```java
IndexResponse response = client.index(
    i -> i.index(index)
        .id("1")
        .document(new Movie("Beauty and the Beast: Special Edition", 2002))
        .refresh(Refresh.WaitFor)
);
```
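
The difference between the create and index APIs can be pictured with a small in-memory model. The sketch below is not client code: `ToyDocumentStore`, its `create`, `index`, and `get` methods, and `ConflictException` are hypothetical names that only illustrate the semantics described above, where create rejects an existing ID (the `409` case) and index overwrites it.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of create-vs-index semantics; NOT part of the OpenSearch Java client. */
final class ToyDocumentStore {

    /** Hypothetical stand-in for the 409 conflict reported by the server. */
    static final class ConflictException extends RuntimeException {
        final int status = 409;

        ConflictException(String id) {
            super("document already exists: " + id);
        }
    }

    private final Map<String, String> docs = new HashMap<>();

    /** Create: succeeds only if no document with this ID is stored yet. */
    void create(String id, String source) {
        if (docs.putIfAbsent(id, source) != null) {
            throw new ConflictException(id);
        }
    }

    /** Index: stores the document unconditionally; returns true if it replaced one. */
    boolean index(String id, String source) {
        return docs.put(id, source) != null;
    }

    String get(String id) {
        return docs.get(id);
    }

    public static void main(String[] args) {
        ToyDocumentStore store = new ToyDocumentStore();
        store.create("1", "Beauty and the Beast");
        try {
            store.create("1", "Beauty and the Beast");
        } catch (ConflictException e) {
            System.out.println("create rejected with status " + e.status);
        }
        boolean replaced = store.index("1", "Beauty and the Beast: Special Edition");
        System.out.println("index replaced existing document: " + replaced);
    }
}
```

In this model the second `create` fails and leaves the stored document untouched, while `index` with the same ID succeeds and replaces it, which mirrors the behavior the two sections above describe.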

## Index a document with an auto-generated ID

If you do not provide an ID, OpenSearch generates one and returns it in the index response.

```java
IndexResponse response = client.index(
    i -> i.index(index)
        .document(new Movie("The Lion King", 1994))
        .refresh(Refresh.WaitFor)
);

String generatedId = response.id();
```

## Get a document

Use the get API to retrieve a document by index and ID.

```java
GetResponse<Movie> response = client.get(g -> g.index(index).id("1"), Movie.class);

if (response.found()) {
    Movie movie = response.source();
}
```

## Filter source fields

Use source includes or excludes to control which fields OpenSearch returns in `_source`.

```java
GetResponse<Movie> titleOnly = client.get(
    g -> g.index(index)
        .id("1")
        .sourceIncludes("title"),
    Movie.class
);

GetResponse<Movie> withoutYear = client.get(
    g -> g.index(index)
        .id("1")
        .sourceExcludes("year"),
    Movie.class
);
```
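
To see what the includes and excludes do to a document, the following sketch applies the same idea to a plain `Map`. It is an illustration only: `SourceFilterSketch` and `filter` are hypothetical names, and the model handles exact top-level field names, whereas the server also accepts wildcard patterns and dotted paths.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of _source filtering on top-level fields; NOT the server's implementation. */
final class SourceFilterSketch {

    /**
     * Keeps a field when it is listed in includes (or includes is empty)
     * and it is not listed in excludes.
     */
    static Map<String, Object> filter(Map<String, Object> source, Set<String> includes, Set<String> excludes) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : source.entrySet()) {
            boolean included = includes.isEmpty() || includes.contains(field.getKey());
            if (included && !excludes.contains(field.getKey())) {
                result.put(field.getKey(), field.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("title", "Beauty and the Beast");
        source.put("year", 1991);

        // Same effect as sourceIncludes("title") and sourceExcludes("year") in this model.
        System.out.println(filter(source, Set.of("title"), Set.of()));
        System.out.println(filter(source, Set.of(), Set.of("year")));
    }
}
```

Both calls in `main` leave only `title` in the result. With the real client, a field that is filtered out of `_source` is simply absent from the response, so the matching getter on the deserialized `Movie` would be expected to return `null`.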

## Get multiple documents

Use the multi get API to retrieve several documents in one request.

```java
MgetResponse<Movie> response = client.mget(
    m -> m.index(index).ids("1", generatedId),
    Movie.class
);

for (MultiGetResponseItem<Movie> item : response.docs()) {
    if (item.isResult() && item.result().found()) {
        Movie movie = item.result().source();
    }
}
```

## Check whether a document exists

Use the exists API when you only need to know whether a document is present.

```java
boolean exists = client.exists(e -> e.index(index).id("1")).value();
```

## Update a document

Use the update API with a partial document to change selected fields.

```java
UpdateRequest<Movie, Map<String, Object>> request =
    new UpdateRequest.Builder<Movie, Map<String, Object>>()
        .index(index)
        .id("1")
        .doc(Map.of("year", (Object) 1995))
        .refresh(Refresh.WaitFor)
        .build();

UpdateResponse<Movie> response = client.update(request, Movie.class);
```
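
A partial update merges the supplied fields into the stored `_source` rather than replacing the whole document. The sketch below models that merge on plain maps. `PartialUpdateSketch` and `merge` are hypothetical names, and the recursive handling of nested objects is an assumption based on how the update API is documented to behave, not code taken from the client or server.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of how a partial document is merged into the stored source. */
final class PartialUpdateSketch {

    /** Returns a new map: fields in partial overwrite existing ones, nested maps merge recursively. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> existing, Map<String, Object> partial) {
        Map<String, Object> merged = new LinkedHashMap<>(existing);
        for (Map.Entry<String, Object> entry : partial.entrySet()) {
            Object current = merged.get(entry.getKey());
            Object update = entry.getValue();
            if (current instanceof Map && update instanceof Map) {
                // Both sides are objects: merge field by field instead of replacing.
                merged.put(entry.getKey(), merge((Map<String, Object>) current, (Map<String, Object>) update));
            } else {
                merged.put(entry.getKey(), update);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("title", "Beauty and the Beast: Special Edition");
        stored.put("year", 2002);

        // Only "year" is supplied, so "title" is carried over unchanged.
        System.out.println(merge(stored, Map.of("year", (Object) 1995)));
    }
}
```

This is why the request above only needs to send `year`: the title stored by the earlier index call stays in place.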

## Update a document with a script

Use an inline script when the update should be computed from the current document state.

```java
UpdateRequest<Movie, Object> request = new UpdateRequest.Builder<Movie, Object>()
    .index(index)
    .id("1")
    .script(s -> s.inline(i -> i.source("ctx._source.year += 5")))
    .refresh(Refresh.WaitFor)
    .build();

UpdateResponse<Movie> response = client.update(request, Movie.class);
```

## Update documents by query

Use update by query to update every document that matches a query.

```java
client.index(
    i -> i.index(index)
        .id("future")
        .document(new Movie("Future Movie", 2025))
        .refresh(Refresh.WaitFor)
);

Query newerThan2023 = Query.of(
    q -> q.range(r -> r.field("year").gt(JsonData.of(2023)))
);

UpdateByQueryResponse response = client.updateByQuery(
    u -> u.index(index)
        .query(newerThan2023)
        .script(s -> s.inline(i -> i.source("ctx._source.year -= 1")))
        .refresh(Refresh.True)
);
```

## Reindex documents

Use reindex to copy documents from one index to another.

```java
ReindexResponse response = client.reindex(
    r -> r.source(s -> s.index(index))
        .dest(d -> d.index(reindexedIndex))
        .refresh(Refresh.True)
        .waitForCompletion(true)
);
```

## Delete a document

Use the delete API to remove one document by ID.

```java
DeleteResponse response = client.delete(
    d -> d.index(index)
        .id("1")
        .refresh(Refresh.WaitFor)
);
```

## Delete documents by query

Use delete by query to remove every document that matches a query.

```java
DeleteByQueryResponse response = client.deleteByQuery(
    d -> d.index(index)
        .query(newerThan2023)
        .refresh(Refresh.True)
);
```
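
The delete request reuses the `newerThan2023` query from the update by query section. A small in-memory model shows why it still matches: the "Future Movie" document starts at 2025, the update script lowers it to 2024, and 2024 is still greater than 2023. `ByQuerySketch` and its methods are hypothetical names for illustration, and the model tracks only the `year` field.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

/** Toy model of update-by-query and delete-by-query over a year field; NOT client code. */
final class ByQuerySketch {

    private final Map<String, Integer> yearsById = new HashMap<>();

    void put(String id, int year) {
        yearsById.put(id, year);
    }

    /** Returns the stored year, or null if the document does not exist. */
    Integer year(String id) {
        return yearsById.get(id);
    }

    /** Applies the script to every document matching the query; returns how many were updated. */
    int updateByQuery(IntPredicate query, IntUnaryOperator script) {
        int updated = 0;
        for (Map.Entry<String, Integer> entry : yearsById.entrySet()) {
            if (query.test(entry.getValue())) {
                entry.setValue(script.applyAsInt(entry.getValue()));
                updated++;
            }
        }
        return updated;
    }

    /** Removes every document matching the query; returns how many were deleted. */
    int deleteByQuery(IntPredicate query) {
        int before = yearsById.size();
        yearsById.values().removeIf(year -> query.test(year));
        return before - yearsById.size();
    }

    public static void main(String[] args) {
        ByQuerySketch store = new ByQuerySketch();
        store.put("lion-king", 1994);
        store.put("future", 2025);

        IntPredicate newerThan2023 = year -> year > 2023; // mirrors range gt 2023
        System.out.println("updated: " + store.updateByQuery(newerThan2023, year -> year - 1));
        System.out.println("deleted: " + store.deleteByQuery(newerThan2023));
    }
}
```

Because `gt` is a strict comparison, a document whose year is exactly 2023 would match neither operation in this model.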

## Clean up

Delete the sample indices when you are done.

```java
client.indices().delete(d -> d.index(reindexedIndex));
client.indices().delete(d -> d.index(index));
```