lrakai / azure-u-sql-data-lake-analytics

Submitting a U-SQL Job to Azure Data Lake Analytics

Environment Diagram

Getting Started

An Azure Resource Manager (ARM) template is included in infrastructure/ to create the environment.

Using Azure PowerShell, run the following to provision the resources:

.\startup.ps1

Alternatively, you can perform a one-click deploy with the deploy button in the repository README.

Following Along

  1. Upload data/searchLog.tsv to the Data Lake Store created by the template.

  2. Create a Data Lake Analytics job and use the following U-SQL query:

    @searchlog =
        EXTRACT UserId   int,
            TimeStamp    DateTime,
            Language     string,
            Query        string,
            Duration     int,
            Urls         string,
            ClickedUrls  string
        FROM "/searchLog.tsv"
        USING Extractors.Tsv();
    
    @out =
        SELECT TimeStamp, Query, Duration
        FROM @searchlog
        WHERE Duration > 800;
    
    OUTPUT @out
        TO "/output.tsv"
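        USING Outputters.Tsv();

  3. Inspect the output.tsv file in the root of the Data Lake Store. It should contain the TimeStamp, Query, and Duration of every search whose Duration is greater than 800.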
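To sanity-check what the job should produce, the same filter can be sketched locally in Python. This is only an illustration of the query's logic, not part of the repository: the function name `filter_search_log` and the sample rows are made up, and the sketch assumes the input has no header row and uses the column order from the EXTRACT clause. The real outputter may quote or format values differently, so compare rows, not bytes.

```python
import csv
import io

# Column order mirrors the EXTRACT clause of the U-SQL query above.
COLUMNS = ["UserId", "TimeStamp", "Language", "Query",
           "Duration", "Urls", "ClickedUrls"]


def filter_search_log(tsv_text, min_duration=800):
    """Return (TimeStamp, Query, Duration) for rows with Duration > min_duration.

    Hypothetical helper: a local stand-in for the SELECT ... WHERE step.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    results = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        row = dict(zip(COLUMNS, fields))
        duration = int(row["Duration"])
        if duration > min_duration:  # strict comparison, as in the query
            results.append((row["TimeStamp"], row["Query"], duration))
    return results


# Made-up rows for illustration only; they are not taken from searchLog.tsv.
SAMPLE = (
    "1\t2024-01-01 08:00:00\ten-us\tfirst query\t900\turl-a;url-b\turl-a\n"
    "2\t2024-01-01 08:05:00\ten-gb\tsecond query\t800\turl-c\tNULL\n"
    "3\t2024-01-01 08:10:00\ten-us\tthird query\t1200\turl-d\turl-d\n"
)

for timestamp, query, duration in filter_search_log(SAMPLE):
    print(timestamp, query, duration, sep="\t")
```

To run it against the real data, pass the contents of data/searchLog.tsv to `filter_search_log` instead of `SAMPLE`. Note that a row with Duration exactly 800 is excluded, because the query uses `>` and not `>=`.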

Tearing Down

When finished, remove the Azure resources with:

.\teardown.ps1

Acknowledgments

Thanks to Microsoft for the sample search log data.

License: MIT License
