🚀 Build an automated data pipeline with Azure Data Factory and Databricks to efficiently process and analyze trip transaction data using the Medallion Architecture.

🌟 Azure-Data-Factory-and-Databricks-End-to-End-Project - Your Simple Solution for Data Analytics

🚀 Getting Started

This project provides a straightforward way to analyze trip transactions using a data engineering pipeline that follows the Medallion Architecture (Bronze-Silver-Gold). We use Azure Data Factory, Databricks, and Delta Lake to create an automated ETL process, offering real-time monitoring and email notifications through Logic Apps.

No programming knowledge is needed. Just follow the steps below to set it up.
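If you are curious what the Bronze-Silver-Gold flow looks like in code, here is a minimal sketch in plain Python. The field names (trip_id, fare, distance_km) are hypothetical examples; the actual project runs these steps as PySpark notebooks writing Delta tables on Databricks.

```python
# Bronze: raw records as ingested, duplicates and malformed rows included.
bronze = [
    {"trip_id": 1, "fare": "12.50", "distance_km": "3.2"},
    {"trip_id": 1, "fare": "12.50", "distance_km": "3.2"},  # duplicate
    {"trip_id": 2, "fare": "bad", "distance_km": "5.0"},    # unparseable fare
    {"trip_id": 3, "fare": "30.00", "distance_km": "10.0"},
]

def to_silver(rows):
    """Silver layer: deduplicate and enforce types; drop rows that fail validation."""
    seen, silver = set(), []
    for r in rows:
        if r["trip_id"] in seen:
            continue
        try:
            silver.append({"trip_id": r["trip_id"],
                           "fare": float(r["fare"]),
                           "distance_km": float(r["distance_km"])})
            seen.add(r["trip_id"])
        except ValueError:
            pass  # a real pipeline would quarantine malformed records instead

    return silver

def to_gold(rows):
    """Gold layer: aggregate cleaned trips into reporting-ready metrics."""
    total_fare = sum(r["fare"] for r in rows)
    return {"trips": len(rows),
            "total_fare": round(total_fare, 2),
            "avg_fare": round(total_fare / len(rows), 2)}

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'trips': 2, 'total_fare': 42.5, 'avg_fare': 21.25}
```

Each layer refines the one before it: Bronze preserves everything as received, Silver is the cleaned single source of truth, and Gold is what dashboards and reports read.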

📥 Requirements

Before you begin, ensure your system meets these requirements:

  • Windows or macOS
  • Internet connection
  • Azure account (free accounts are available)

📦 Download & Install

To get started, visit the releases page to download the project.

  1. Go to the link above.
  2. Look for the latest release.
  3. Click on the file titled AzureDataFactoryDatabricks.zip.
  4. Download the file and save it to your computer.
  5. Once the download is complete, extract the contents of the zip file to a folder of your choice.

⚙️ Setup Instructions

After downloading the files, you need to set up the project in your Azure environment.

  1. Sign in to Azure. Open your browser and go to the Azure portal. Log in using your Azure account credentials.

  2. Create resources:

    • Data Factory: In the Azure portal, search for "Data Factory" and create a new instance.
    • Databricks: Search for "Databricks" and set up a new workspace.
    • SQL Database: Search for "SQL Database" and create one to store your data.
    • Storage: Create an Azure Data Lake Storage Gen2 account (a storage account with the hierarchical namespace enabled) to hold the Bronze, Silver, and Gold data layers.
  3. Configure ETL Process:

    • In the files you extracted, find Instructions.txt. This file contains detailed steps to configure the data factory and connect it to Databricks.
    • Follow the steps carefully to set up the necessary pipelines and datasets.
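The resources in step 2 can also be provisioned from the command line. Below is a hedged sketch using the Azure CLI; the resource names and region are hypothetical placeholders, and the `datafactory` and `databricks` commands require their respective CLI extensions.

```shell
#!/usr/bin/env bash
# Hypothetical resource names -- substitute your own.
RG=rg-tripdata
LOC=eastus

# Resource group to hold everything
az group create --name "$RG" --location "$LOC"

# Data Factory (requires: az extension add --name datafactory)
az datafactory create --resource-group "$RG" --factory-name adf-tripdata --location "$LOC"

# Databricks workspace (requires: az extension add --name databricks)
az databricks workspace create --resource-group "$RG" \
  --name dbw-tripdata --location "$LOC" --sku standard

# SQL server and database
az sql server create --resource-group "$RG" --name sql-tripdata \
  --admin-user sqladmin --admin-password '<choose-a-strong-password>'
az sql db create --resource-group "$RG" --server sql-tripdata --name tripdb

# ADLS Gen2: a storage account with the hierarchical namespace enabled
az storage account create --resource-group "$RG" --name sttripdata \
  --location "$LOC" --sku Standard_LRS --hns true
```

Creating the same resources through the Azure portal, as described above, produces an equivalent result.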

🛠️ Features

This project includes the following features:

  • Automated ETL: The project automates the data extraction, transformation, and loading processes.
  • Real-Time Monitoring: Get notifications for data pipeline status and errors through Azure Logic Apps.
  • Medallion Architecture: Organizes data into different layers (Bronze, Silver, Gold) for better management and reporting.
  • Analytics Ready: Designed for business intelligence applications to analyze trip transaction data.
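The email notifications typically work by having a pipeline activity POST a small JSON payload to a Logic App's HTTP trigger, which then sends the email. A minimal sketch of building such a payload follows; the field names are illustrative assumptions, since the real schema is whatever your Logic App trigger defines.

```python
import json

def build_alert_payload(pipeline_name, run_id, status, error_message=""):
    """Build the JSON body a Data Factory Web activity could POST to a
    Logic App HTTP trigger that sends the alert email.
    All field names here are hypothetical, not a fixed ADF schema."""
    return {
        "pipelineName": pipeline_name,
        "runId": run_id,
        "status": status,            # e.g. "Succeeded" or "Failed"
        "errorMessage": error_message,
    }

payload = build_alert_payload("TripDataETL", "run-001",
                              "Failed", "Copy activity timed out")
print(json.dumps(payload, indent=2))
```

In the Logic App, the matching trigger schema exposes these fields as dynamic content for composing the notification email.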

🔧 Troubleshooting

If you encounter issues during setup or use, check the following:

  1. Azure Permissions: Ensure you have the right permissions to create resources in your Azure account.
  2. Network Issues: Check your internet connection.
  3. Follow Instructions: Review Instructions.txt for any missed steps.

For further assistance, consider visiting the Azure support page or referring to community forums.

📞 Support

If you have questions or need help, feel free to contact us through the project's GitHub page. You can open an issue for support or suggestions.

👥 Contributors

We welcome contributions! If you would like to contribute to this project, please follow the guidelines in the project repository.

🤝 License

This project is licensed under the MIT License. Check the LICENSE file for details.

Feel free to get started with this project to streamline your data analysis process. For any questions or issues, please open an issue or check the provided resources. Enjoy exploring the world of data!
