tonylee2016 / matlab-aws-athena

MATLAB Interface for AWS Athena

MATLAB® Interface for Amazon Web Services Athena™ Service. Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. This package provides a basic interface to a subset of Athena features from within MATLAB.

Requirements

MathWorks products

  • MATLAB (R2017b or later)

3rd party products

  • Amazon Web Services account

To build a required JAR file:

  • Maven (the build step below runs mvn)
  • A Java Development Kit (needed by Maven)

Getting Started

Please refer to the Documentation to get started. The Installation Instructions and Basic Usage documents provide detailed instructions on setting up and using the interface. The easiest way to fetch this repository and all required dependencies is to clone the top-level repository using:

git clone --recursive https://github.com/mathworks-ref-arch/mathworks-aws-support.git

Build the AWS SDK for Java components

The MATLAB code uses the AWS SDK for Java. Build the required JAR file using:

cd matlab-aws-athena/Software/Java
mvn clean package

Once built, run matlab-aws-athena/Software/MATLAB/startup.m to initialize the interface. The interface authenticates using the AWS Credentials Provider Chain. See the relevant documentation for how to specify the credentials.
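Before creating a client, it can help to confirm that at least one credential source is in place. The following is a minimal sketch, not part of this package: guessCredentialSource is a hypothetical helper that only looks at two locations the default provider chain of the AWS SDK for Java is known to consult (the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables and the shared ~/.aws/credentials file). The real chain checks further sources, so a result of 'none found' does not prove that authentication will fail.

```matlab
function source = guessCredentialSource()
% guessCredentialSource  Hypothetical helper, not part of this package.
% Reports the first common credential location that appears to be populated.
% Only the environment and the shared credentials file are inspected.
    source = 'none found';

    % 1. Environment variables read by the default provider chain.
    if ~isempty(getenv('AWS_ACCESS_KEY_ID')) && ...
            ~isempty(getenv('AWS_SECRET_ACCESS_KEY'))
        source = 'environment variables';
        return;
    end

    % 2. Shared credentials file in the user's home directory.
    if ispc
        homeDir = getenv('USERPROFILE');
    else
        homeDir = getenv('HOME');
    end
    if ~isempty(homeDir) && ...
            exist(fullfile(homeDir, '.aws', 'credentials'), 'file') == 2
        source = 'shared credentials file';
    end
end
```

Note that environment variables must be set before MATLAB starts for the Java side to see them; values set later with setenv are visible to getenv but not necessarily to the JVM.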

Using the interface

% Create some data needed in the examples
dbName = 'MyAirlines.airlines';
resultBucket = 's3://testing/airlineresult';
distLimit = 1000;

% Create the client object and authenticate using 
%   the AWS Default Provider Chain
ath = aws.athena.AthenaClient();
ath.Database = dbName;
ath.initialize();

% Create a SQL statement and execute it (asynchronously)
queryFar = sprintf('SELECT UniqueCarrier, distance FROM %s WHERE distance > %d;', ...
    dbName, distLimit);
resultIDFar = ath.submitQuery(queryFar, resultBucket);

% Check the status. The query runs asynchronously, so the results are
% only available once the status reads 'SUCCEEDED'
status = char(ath.getStatusOfQuery(resultIDFar));

% At this point, we can read the results by using a MATLAB datastore
resFile = sprintf('%s/%s.csv', resultBucket, char(resultIDFar));
ds = datastore(resFile);
ds.NumHeaderLines = 1;
farResult = ds.readall();
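Because submitQuery returns before the query has finished, a single status check may still report a query in progress. The following sketch polls until the query completes. waitForQuery is a hypothetical helper, not part of this package, and it assumes that getStatusOfQuery returns a value that char converts to one of the Athena query state names (QUEUED, RUNNING, SUCCEEDED, FAILED, CANCELLED). The timeout and polling interval defaults are arbitrary.

```matlab
function status = waitForQuery(ath, queryID, timeoutSec, pollSec)
% waitForQuery  Hypothetical helper, not part of this package.
% Polls the query status until it leaves the QUEUED/RUNNING states,
% then errors unless the final state is SUCCEEDED.
    if nargin < 3, timeoutSec = 120; end   % assumed default, in seconds
    if nargin < 4, pollSec = 2; end        % assumed default, in seconds

    t0 = tic;
    status = char(ath.getStatusOfQuery(queryID));
    while any(strcmp(status, {'QUEUED', 'RUNNING'}))
        if toc(t0) > timeoutSec
            error('waitForQuery:timeout', ...
                'Query %s still %s after %g seconds.', ...
                char(queryID), status, timeoutSec);
        end
        pause(pollSec);
        status = char(ath.getStatusOfQuery(queryID));
    end

    % Any terminal state other than SUCCEEDED means there is no result to read.
    if ~strcmp(status, 'SUCCEEDED')
        error('waitForQuery:failed', ...
            'Query %s ended with status %s.', char(queryID), status);
    end
end
```

In the example above, this could be called as status = waitForQuery(ath, resultIDFar); before the datastore is created, so that the CSV file is only read once Athena has written it.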

Supported Products:

  1. MATLAB (R2017b or later)
  2. MATLAB Compiler™ and MATLAB Compiler SDK™ (R2017b or later)
  3. MATLAB Production Server™ (R2017b or later)
  4. MATLAB Parallel Server™ (R2017b or later)

License

The license for the MATLAB Interface for AWS Athena is available in the LICENSE.md file in this GitHub repository. This package uses certain third-party content that is licensed under separate license agreements. See the pom.xml file for third-party software downloaded at build time.

Enhancement Request

Provide suggestions for additional features or capabilities using the following link:
https://www.mathworks.com/products/reference-architectures/request-new-reference-architectures.html

Support

Email: mwlab@mathworks.com
