ThomVanL / blog-2021-10-azure-purview

Keeping an eye on your data estate.

Azure Purview

Last year, on December 3rd 2020, Microsoft announced the public preview of Azure Purview at the "Azure Data and Analytics digital" event. Back then I couldn't make the time to inspect closely what the service was all about. At first glance it was clear that the service was a data governance tool, and I quickly marked it as "one to watch".

Azure Purview has been generating lots of buzz, too. Just about every month since the public preview announcement, I received updates about newly implemented features.

At the end of August 2021, I signed up for an event called "Maximize the value of your data in the Cloud", which would take place on the 28th of September 2021. During this event, Rohan Kumar (Corporate VP Azure Data) would announce the general availability of Azure Purview. That was the last straw: I had to make some time to take a closer look! 🙃

Feel free to read the full blog post!

Getting started with Azure Purview

Deploy to Azure

To make it easier to test some of Azure Purview's features, I have created an ARM template that deploys the following resources:

  • An Azure Cosmos DB account with a SQL API database and an empty collection
    • Serverless compute tier
    • You could import a sample DB set, such as CosmicWorks.
  • An Azure SQL Database, using the AdventureWorksLT sample database
    • Serverless compute tier
    • ⚠️ When testing connections in the GUI you may encounter a timeout, which is likely due to the serverless compute tier's warm-up time. If this happens, simply try again.
  • An Azure Data Lake Storage Gen2 account
  • An Azure Data Factory with a configured sample pipeline, datasets and linked services
    • The sample pipeline will export all tables from the AdventureWorksLT database to the data lake.
  • An Azure Key Vault, with some secrets and access policies
    • Secrets
      • Azure Cosmos DB primary read-only key
      • Azure SQL Server admin password
      • Azure Storage Account primary key
    • Access policies
      • Azure Data Factory managed identity: allow secret get and list.
      • Azure Purview managed identity: allow secret get and list.
  • An empty Azure Purview account
    • You will need to register data sources, create credentials linked to Azure Key Vault, and link Data Factory to get data lineage support.
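
The two Key Vault access policies listed above could be expressed roughly as follows. This is a minimal sketch and not the content of the actual template: the helper name `secret_read_policy` and the placeholder GUIDs are hypothetical, and only the `tenantId` / `objectId` / `permissions.secrets` shape follows the `Microsoft.KeyVault/vaults` access policy schema.

```python
import json


def secret_read_policy(tenant_id: str, object_id: str) -> dict:
    """Build one Key Vault access policy entry that allows secret get and list."""
    return {
        "tenantId": tenant_id,
        "objectId": object_id,
        "permissions": {"secrets": ["get", "list"]},
    }


# Placeholder values (assumptions). In a real deployment these would be the
# tenant ID and the principal IDs of the two managed identities.
TENANT_ID = "00000000-0000-0000-0000-000000000000"
DATA_FACTORY_PRINCIPAL_ID = "11111111-1111-1111-1111-111111111111"
PURVIEW_PRINCIPAL_ID = "22222222-2222-2222-2222-222222222222"

access_policies = [
    secret_read_policy(TENANT_ID, DATA_FACTORY_PRINCIPAL_ID),
    secret_read_policy(TENANT_ID, PURVIEW_PRINCIPAL_ID),
]

# Print the fragment as it might appear under the vault's "properties".
print(json.dumps({"accessPolicies": access_policies}, indent=2))
```

Granting only `get` and `list` on secrets keeps both identities read-only: they can look up the stored keys and passwords, but cannot change or delete them.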

All of the above is meant for demonstration purposes only and is not production-ready, as you would likely need to apply additional security hardening. At any rate, with all of those resources deployed, we can have a look at some of Azure Purview's foundational elements...
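
If you script connection tests against the deployed resources instead of using the GUI, the "give it another attempt" advice for the serverless Azure SQL Database can be sketched as a small retry helper. Everything here is an assumption for illustration: `retry_on_timeout` and `fake_connection_test` are hypothetical names, and the fake test merely simulates a database that is still warming up on the first call.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_timeout(
    action: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action`, retrying when it raises TimeoutError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except TimeoutError:
            if attempt == attempts:
                raise  # out of attempts, surface the timeout to the caller
            sleep(delay_seconds)  # give the serverless tier time to warm up
    raise AssertionError("unreachable")


# Simulated connection test (assumption): times out once, then succeeds.
calls = {"count": 0}


def fake_connection_test() -> str:
    calls["count"] += 1
    if calls["count"] == 1:
        raise TimeoutError("database is still warming up")
    return "connected"


# The no-op sleep keeps the demo instant; drop it to wait between attempts.
result = retry_on_timeout(fake_connection_test, sleep=lambda _seconds: None)
print(result)
```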

License: MIT License