mani nekkalapudi · maninekkalapudi.hashnode.net · Nov 1, 2022 · A Typical Data Pipeline · Introduction: Hello people, hope you are doing well. As data engineers, we build data pipelines to collect data from different source systems and place it in an analytics system, i.e., a data warehouse/data lake. The data is usually sourced from syste... · 33 likes · 197 reads · Data Engineering, dataengineering
Marwane Chahoud · mar1.hashnode.net · Jan 30, 2023 · Unlocking the Power of AIOPS with ChatGPT and Elasticsearch · Introduction: As you may have noticed, since the end of 2022 and the beginning of 2023, everyone has been talking about ChatGPT. People who have tried it, including me, have been amazed. I started using it as an assistant tool and found that it works ... · 21 likes · 2.5K reads · dataengineering
Mayur Said · buildprojectswithmayur.hashnode.net · Mar 9, 2023 · Building ETL Pipeline in Google Cloud Platform: A Project-Based Guide with PySpark and Airflow · ETL (Extract, Transform, Load) is a process of integrating data from various sources, transforming it into a format that can be analysed, and loading it into a data warehouse for business intelligence purposes. Building an ETL pipeline can be a daunt... · 11 likes · 287 reads · Google Cloud Platform
Rohan Anand · rohan-anand.hashnode.net · Apr 23, 2023 · Getting Started with Data Engineering: A Step-by-Step Guide · Are you interested in becoming a data engineer but don't know where to start? In this post, I'll walk you through the basics of data engineering using a simple project that involves fetching data from GitHub and visualizing it through Looker. Steps:-... · Python
Stephen David-Williams · spiritman7.hashnode.net · Apr 15, 2023 · SOLID principles in data engineering - Part 1 · SOLID principles are a set of principles that guide the software engineering process aiming to make code easier to read, test and maintain. This is a concept under Object Oriented Programming that was made popular by Robert Martin (commonly referred ... · 2 likes · 553 reads · Python
Ujjwal Tyagi · tyagidatawizard.hashnode.net · Apr 13, 2023 · Unleashing the Magic of Job Schedulers: How to Tame Your Code and Save Your Sanity · Once upon a time, in a land far, far away, software engineers were manually running their code on their machines like it was the Wild West🐎. But then, a hero emerged - the job scheduler!🔫 A tool that revolutionized the way developers manage their t... · 1 like · scheduler
Rati Kushwaha · discoveringmyself.hashnode.net · Apr 7, 2023 · Data Engineering with Databricks · In Databricks, data engineering refers to the process of collecting, storing, processing, and transforming data to make it available for analysis and decision-making. Databricks offers a unified analytics platform that simplifies data engineering by ... · dataengineering
Hritika Pal · hritika.hashnode.net · Apr 3, 2023 · Module 1: Data fundamentals you need to know · Hey troubleshooters! Welcome to the new series where we are going to explore the basics of data terminology, which should ultimately make stepping into the data world a bit easier. In order to get a better understanding you can check out the... · 2 likes · dataengineering
VIVEK RAJYAGURU · vivekrajyaguru.hashnode.net · Apr 1, 2023 · Building a Scalable Data Warehouse on AWS: A Comprehensive Guide · Data warehousing is the process of storing, organizing, and managing large volumes of structured and unstructured data in a centralized repository, typically optimized for fast querying and analysis. Data warehousing plays a vital role in business in... · Data Science
VIVEK RAJYAGURU · vivekrajyaguru.hashnode.net · Apr 1, 2023 · Building a Strong Foundation: Understanding Key Data Engineering Concepts · Data engineering is a critical component of the data management process, serving as the foundation for many downstream analytical processes. It involves the development and deployment of infrastructure that enables efficient, reliable, and secure dat... · dataengineering
Vignesh MM · vigneshmm.hashnode.net · Mar 28, 2023 · A Brief Overview of SingleStore DB - A High Performance, Distributed SQL Database · SingleStore DB is a distributed, in-memory, SQL-based database designed for data-intensive applications that require high performance, scalability, and real-time analytics. In this blog post, we will briefly explore SingleStore DB and highlig... · 40 reads · dataengineering
Aditya Gupta · itsadityagupta.hashnode.net · Mar 6, 2023 · Setting up the development environment on Google Virtual Machine · I'm participating in this year's cohort of the Data Engineering Zoomcamp 2023. This is a community-led, free data engineering course of about 8 weeks. In this blog, I'll summarise the steps to configure a Google Virtual Machine to make it ready for t... · 263 reads · #dezoomcamp