[AWS SDK] Querying Athena with the SDK (Maven)

by Rainbound-IT 2022. 6. 23.


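    You can always open Athena in the AWS console and look at the results there,

    but it is more convenient to view query results directly from an app such as a web app, so I used the SDK.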


    1. Security setup

    Register the IAM user locally with aws configure --profile.

    (The user needs Athena and S3 permissions, of course.)


    2. Create a Maven sample project

    Create a Maven project. (The command below is for Windows. On Linux, remove the "" after -D.)

    mvn -B archetype:generate  -D"archetypeGroupId"=org.apache.maven.archetypes  -D"groupId"=athenaJ2Example  -D"artifactId"=athenaJ2Example



    3. Modify the generated pom.xml

    Change the generated pom.xml as follows.

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">


    4. Modify App.java with the Athena query code, update the test code, and add config.properties


    Modifying App.java

    When you generate the Maven project,

    athenaJ2Example/src/main/java/athenaJ2Example/App.java will exist.

    For convenience, to match the name used in the sample, I renamed it to StartQueryExample.java and pasted in the AWS sample code.

    package aws.example.athena;

    import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider;
    import software.amazon.awssdk.regions.Region;
    import software.amazon.awssdk.services.athena.AthenaClient;
    import software.amazon.awssdk.services.athena.model.QueryExecutionContext;
    import software.amazon.awssdk.services.athena.model.ResultConfiguration;
    import software.amazon.awssdk.services.athena.model.StartQueryExecutionRequest;
    import software.amazon.awssdk.services.athena.model.StartQueryExecutionResponse;
    import software.amazon.awssdk.services.athena.model.AthenaException;
    import software.amazon.awssdk.services.athena.model.GetQueryExecutionRequest;
    import software.amazon.awssdk.services.athena.model.GetQueryExecutionResponse;
    import software.amazon.awssdk.services.athena.model.QueryExecutionState;
    import software.amazon.awssdk.services.athena.model.GetQueryResultsRequest;
    import software.amazon.awssdk.services.athena.model.GetQueryResultsResponse;
    import software.amazon.awssdk.services.athena.model.ColumnInfo;
    import software.amazon.awssdk.services.athena.model.Row;
    import software.amazon.awssdk.services.athena.model.Datum;
    import software.amazon.awssdk.services.athena.paginators.GetQueryResultsIterable;
    import java.util.List;

    /**
     * Before running this Java V2 code example, set up your development environment, including your credentials.
     * For more information, see the following documentation topic:
     * https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/get-started.html
     */
    public class StartQueryExample {
        public static void main(String[] args) throws InterruptedException {
            AthenaClient athenaClient = AthenaClient.builder()
                    .region(Region.US_WEST_2) // assumption: placeholder region, change it to your own
                    .credentialsProvider(ProfileCredentialsProvider.create())
                    .build();

            String queryExecutionId = submitAthenaQuery(athenaClient);
            waitForQueryToComplete(athenaClient, queryExecutionId);
            processResultRows(athenaClient, queryExecutionId);
            athenaClient.close();
        }

        // Submits a sample query to Amazon Athena and returns the execution ID of the query.
        public static String submitAthenaQuery(AthenaClient athenaClient) {
            try {
                // The QueryExecutionContext allows us to set the database.
                QueryExecutionContext queryExecutionContext = QueryExecutionContext.builder()
                        .database(ExampleConstants.ATHENA_DEFAULT_DATABASE)
                        .build();

                // The result configuration specifies where the results of the query should go.
                ResultConfiguration resultConfiguration = ResultConfiguration.builder()
                        .outputLocation(ExampleConstants.ATHENA_OUTPUT_BUCKET)
                        .build();

                StartQueryExecutionRequest startQueryExecutionRequest = StartQueryExecutionRequest.builder()
                        .queryString(ExampleConstants.ATHENA_SAMPLE_QUERY)
                        .queryExecutionContext(queryExecutionContext)
                        .resultConfiguration(resultConfiguration)
                        .build();

                StartQueryExecutionResponse startQueryExecutionResponse = athenaClient.startQueryExecution(startQueryExecutionRequest);
                return startQueryExecutionResponse.queryExecutionId();
            } catch (AthenaException e) {
                e.printStackTrace();
                System.exit(1);
            }
            return "";
        }

        // Wait for an Amazon Athena query to complete, fail or to be cancelled.
        public static void waitForQueryToComplete(AthenaClient athenaClient, String queryExecutionId) throws InterruptedException {
            GetQueryExecutionRequest getQueryExecutionRequest = GetQueryExecutionRequest.builder()
                    .queryExecutionId(queryExecutionId)
                    .build();

            GetQueryExecutionResponse getQueryExecutionResponse;
            boolean isQueryStillRunning = true;
            while (isQueryStillRunning) {
                getQueryExecutionResponse = athenaClient.getQueryExecution(getQueryExecutionRequest);
                String queryState = getQueryExecutionResponse.queryExecution().status().state().toString();
                if (queryState.equals(QueryExecutionState.FAILED.toString())) {
                    throw new RuntimeException("The Amazon Athena query failed to run with error message: " + getQueryExecutionResponse
                            .queryExecution().status().stateChangeReason());
                } else if (queryState.equals(QueryExecutionState.CANCELLED.toString())) {
                    throw new RuntimeException("The Amazon Athena query was cancelled.");
                } else if (queryState.equals(QueryExecutionState.SUCCEEDED.toString())) {
                    isQueryStillRunning = false;
                } else {
                    // Sleep an amount of time before retrying again.
                    Thread.sleep(ExampleConstants.SLEEP_AMOUNT_IN_MS);
                }
                System.out.println("The current status is: " + queryState);
            }
        }

        // This code retrieves the results of a query
        public static void processResultRows(AthenaClient athenaClient, String queryExecutionId) {
            try {
                // Max Results can be set but if its not set,
                // it will choose the maximum page size.
                GetQueryResultsRequest getQueryResultsRequest = GetQueryResultsRequest.builder()
                        .queryExecutionId(queryExecutionId)
                        .build();

                GetQueryResultsIterable getQueryResultsResults = athenaClient.getQueryResultsPaginator(getQueryResultsRequest);
                for (GetQueryResultsResponse result : getQueryResultsResults) {
                    List<ColumnInfo> columnInfoList = result.resultSet().resultSetMetadata().columnInfo();
                    List<Row> results = result.resultSet().rows();
                    processRow(results, columnInfoList);
                }
            } catch (AthenaException e) {
                e.printStackTrace();
                System.exit(1);
            }
        }

        private static void processRow(List<Row> row, List<ColumnInfo> columnInfoList) {
            for (Row myRow : row) {
                List<Datum> allData = myRow.data();
                for (Datum data : allData) {
                    System.out.println("The value of the column is " + data.varCharValue());
                }
            }
        }
    }
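    The wait loop in the sample polls until the query reaches a terminal state, with no upper bound on the number of polls. As a rough sketch of the same idea with a cap added, the snippet below separates the polling logic from the SDK so it can be run without AWS access. QueryStatePoller, awaitSuccess and maxAttempts are hypothetical names introduced here for illustration; the state supplier stands in for the call to athenaClient.getQueryExecution(...).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

// Hypothetical helper (not part of the AWS SDK): polls a state source until a terminal state.
final class QueryStatePoller {
    private QueryStatePoller() {}

    // Returns the number of polls it took to see SUCCEEDED.
    // Throws on FAILED or CANCELLED, or when maxAttempts polls pass without a terminal state.
    static int awaitSuccess(Supplier<String> stateSource, long sleepMillis, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String state = stateSource.get();
            switch (state) {
                case "SUCCEEDED":
                    return attempt;
                case "FAILED":
                    throw new IllegalStateException("Query failed");
                case "CANCELLED":
                    throw new IllegalStateException("Query was cancelled");
                default:
                    // QUEUED or RUNNING: wait before asking again.
                    sleep(sleepMillis);
            }
        }
        throw new IllegalStateException("Query did not finish after " + maxAttempts + " polls");
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and stop waiting.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        // Simulated sequence of states, standing in for repeated getQueryExecution calls.
        Iterator<String> states = Arrays.asList("QUEUED", "RUNNING", "SUCCEEDED").iterator();
        int polls = awaitSuccess(states::next, 10, 5);
        System.out.println("Succeeded after " + polls + " polls");
    }
}
```

    In real use the supplier would presumably wrap the getQueryExecution call and return the state string, and the cap would be chosen to fit how long the queries are expected to run.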


    Modifying the test code

    As with App.java,


    rename AppTest to AmazonAthenaTest

    and enter the following code.

    import aws.example.athena.*;
    import org.junit.jupiter.api.*;
    import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider;
    import software.amazon.awssdk.regions.Region;
    import software.amazon.awssdk.services.athena.AthenaClient;
    import java.io.*;
    import java.util.Properties;
    import static org.junit.jupiter.api.Assertions.*;

    @TestMethodOrder(MethodOrderer.OrderAnnotation.class)
    public class AmazonAthenaTest {
        private static AthenaClient athenaClient;
        private static String nameQuery;
        @BeforeAll
        public static void setUp() throws IOException {
            athenaClient = AthenaClient.builder()
                    .region(Region.US_WEST_2) // assumption: placeholder region, change it to your own
                    .credentialsProvider(ProfileCredentialsProvider.create())
                    .build();
            try (InputStream input = AmazonAthenaTest.class.getClassLoader().getResourceAsStream("config.properties")) {
                Properties prop = new Properties();
                if (input == null) {
                    System.out.println("Sorry, unable to find config.properties");
                    return;
                }
                //load a properties file from class path, inside static method
                prop.load(input);
                // Populate the data members required for all tests
                nameQuery = prop.getProperty("nameQuery");
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }

        @Test
        @Order(1)
        public void whenInitializingAWSAthenaService_thenNotNull() {
            assertNotNull(athenaClient);
            System.out.println("Test 1 passed");
        }

        @Test
        @Order(2)
        public void CreateNamedQueryExample() {
            CreateNamedQueryExample.createNamedQuery(athenaClient, nameQuery);
            System.out.println("Test 2 passed");
        }

        @Test
        @Order(3)
        public void ListNamedQueryExample() {
            System.out.println("Test 3 passed");
        }

        @Test
        @Order(4)
        public void ListQueryExecutionsExample() {
            System.out.println("Test 4 passed");
        }

        @Test
        @Order(5)
        public void DeleteNamedQueryExample() {
            String sampleNamedQueryId = DeleteNamedQueryExample.getNamedQueryId(athenaClient, nameQuery);
            DeleteNamedQueryExample.deleteQueryName(athenaClient, sampleNamedQueryId);
            System.out.println("Test 5 passed");
        }

        @Test
        @Order(6)
        public void StartQueryExample() {
            try {
                String queryExecutionId = StartQueryExample.submitAthenaQuery(athenaClient);
                StartQueryExample.waitForQueryToComplete(athenaClient, queryExecutionId);
                StartQueryExample.processResultRows(athenaClient, queryExecutionId);
                System.out.println("Test 6 passed");
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }

        @Test
        @Order(7)
        public void StopQueryExecutionExample() {
            String sampleQueryExecutionId = StopQueryExecutionExample.submitAthenaQuery(athenaClient);
            StopQueryExecutionExample.stopAthenaQuery(athenaClient, sampleQueryExecutionId);
            System.out.println("Test 7 passed");
        }
    }

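    Adding config.properties

    This file appears to be used by the test code, and you can use it to pass in various variables.

    Here, though, after creating the folder and file, I'll keep it simple and add only what the sample code needs.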


    nameQuery = sampleQuery
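    To illustrate how that line is read, here is a small sketch using java.util.Properties, the same class the test code uses. It parses the text from a string rather than from the classpath so it can run on its own; ConfigSketch and read are hypothetical names used only for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for illustration; the test class loads the file from the classpath instead.
final class ConfigSketch {
    private ConfigSketch() {}

    // Parses properties-format text and returns the value for the given key, or null if absent.
    static String read(String propertiesText, String key) {
        Properties prop = new Properties();
        try (StringReader reader = new StringReader(propertiesText)) {
            prop.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return prop.getProperty(key);
    }

    public static void main(String[] args) {
        // Same content as the config.properties line above.
        String text = "nameQuery = sampleQuery\n";
        System.out.println(read(text, "nameQuery"));
    }
}
```

    The spaces around the equals sign are not part of the key or the value, so the lookup for nameQuery yields sampleQuery.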



    5. Maven package

    Now it's time to package.

    Go into the project folder (athenaJ2Example) and run mvn package.

    The following error occurs.



    non-static variable ExampleConstants cannot be referenced from a static context symbol: variable ATHENA_DEFAULT_DATABASE (likewise ATHENA_OUTPUT_BUCKET and ATHENA_SAMPLE_QUERY)

    This is to be expected: no query has been supplied yet, so there is nothing that could produce a result.

    Supply the query and database values that StartQueryExample.java refers to by adding ExampleConstants.java.


    ATHENA_OUTPUT_BUCKET: the S3 location where the query results are written


    ATHENA_DEFAULT_DATABASE: the name of the Athena database to query

    package aws.example.athena;

    public class ExampleConstants {
        public static final int CLIENT_EXECUTION_TIMEOUT = 100000;
        public static final String ATHENA_OUTPUT_BUCKET = "s3://bucketscott2"; // change the Amazon S3 bucket name to match your environment
        //  Demonstrates how to query a table with a comma-separated value (CSV) table.  For information, see
        public static final String ATHENA_SAMPLE_QUERY = "SELECT * FROM scott2;"; // change the Query statement to match your environment
        public static final long SLEEP_AMOUNT_IN_MS = 1000;
        public static final String ATHENA_DEFAULT_DATABASE = "mydatabase"; // change the database to match your database
    }




    After making the changes above I ran mvn package again, and this time the following error occurred.

    java.lang.IllegalStateException: To use assumed roles in the [role NAME] profile, the 'sts' service module must be on the class path.

    I'm not sure whether this happens because I'm using a role.

    This is a dependency error, so add the STS dependency to pom.xml.
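    A minimal sketch of that dependency, assuming the pom.xml already imports the AWS SDK BOM (software.amazon.awssdk:bom) under dependencyManagement so that no version is needed here. If it does not, a <version> matching the other SDK modules would have to be added. The wording of the message suggests the profile is configured with a role, which the SDK resolves through STS.

```xml
<!-- Goes inside the existing <dependencies> element of pom.xml -->
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>sts</artifactId>
</dependency>
```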


    Then run mvn package again,

    and you can see that the build succeeds after the tests run.



    Running the application

    mvn exec:java -D"exec.mainClass"="aws.example.athena.StartQueryExample"

    You get the same results as in the test,



    and you can see that the results have been saved to S3.
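    For finding that file, a small sketch follows. It assumes a SELECT query run through the SDK with a client-side output location and no workgroup setting overriding it, in which case Athena names the result file after the query execution ID with a .csv extension. ResultLocationSketch and resultCsvUri are hypothetical names, and the execution ID shown is a placeholder.

```java
// Hypothetical helper for illustration; not part of the AWS SDK.
final class ResultLocationSketch {
    private ResultLocationSketch() {}

    // Builds the expected S3 URI of the CSV result file for a query execution ID.
    static String resultCsvUri(String outputLocation, String queryExecutionId) {
        // Make sure exactly one slash separates the location and the file name.
        String base = outputLocation.endsWith("/") ? outputLocation : outputLocation + "/";
        return base + queryExecutionId + ".csv";
    }

    public static void main(String[] args) {
        // "example-query-id" is a placeholder, not a real execution ID.
        System.out.println(resultCsvUri("s3://bucketscott2", "example-query-id"));
    }
}
```

    The GetQueryExecution response also reports the output location the service actually used, which is the safer source when a workgroup may override the client setting.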





    Getting started with the AWS SDK for Java



    AWS SDK for Java 2.x - AWS SDK for Java



    Athena code samples



    Java (SDK V2) Code Samples for Amazon Athena - AWS Code Sample




    On the "the 'sts' service module must be on the class path" error



    STS not on class path when using parallel stream · Issue #2123 · aws/aws-sdk-java-v2





    Explanation of the AWS Athena sample




