Coding: Long number divisible by 8

Starting a new series here. When applying for a Software Engineer or Data Engineer role, it is standard to face a live or asynchronous coding interview. Most of the challenges you will need to crack don't reflect the day-to-day work you will do, but they are a great way to assess your problem-solving skills and coding style.

As I believe most of these exercises are great puzzles and fun to solve, I will start bringing my own version of some of them. Who knows, one of these days you might be asked to solve one.

This time I am bringing you the classic divisible by 8 problem.

In this exercise you are asked to check whether a long number is divisible by 8 without using the division remainder (e.g. IF number % 8 == 0 THEN True ELSE False), and whether permuting its digits can produce a number that is divisible by 8.
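One possible approach, sketched here in Python, is not necessarily the solution from the full post. It relies on the fact that 1000 is a multiple of 8, so only the last three digits of a number decide its divisibility. The function names, the choice to pass the number as a string of digits, and the decision to allow permutations with leading zeros are all assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations

# Every multiple of 8 below 1000, zero-padded to three digits ("000", "008", ..., "992").
# Built by stepping in increments of 8, so no remainder operator is needed.
THREE_DIGIT_MULTIPLES = {f"{m:03d}" for m in range(0, 1000, 8)}


def is_divisible_by_8(number: str) -> bool:
    """Check a non-negative number given as a digit string (hypothetical helper)."""
    # 1000 is a multiple of 8, so only the last three digits matter.
    return number[-3:].zfill(3) in THREE_DIGIT_MULTIPLES


def has_permutation_divisible_by_8(number: str) -> bool:
    """Check whether some ordering of the digits is divisible by 8.

    Assumption: permutations with leading zeros are allowed.
    """
    if len(number) <= 3:
        # At most 6 orderings, so trying them all is cheap.
        return any(is_divisible_by_8("".join(p)) for p in permutations(number))
    available = Counter(number)
    # For longer numbers it is enough to find one three-digit multiple of 8
    # whose digits can all be taken from the available digits.
    # Counter subtraction keeps only positive counts, so an empty result
    # means every digit of the candidate tail is available.
    return any(not (Counter(tail) - available) for tail in THREE_DIGIT_MULTIPLES)


print(is_divisible_by_8("123456789012345678901234567890123456"))  # True (456 = 8 * 57)
print(has_permutation_divisible_by_8("61"))    # True (16)
print(has_permutation_divisible_by_8("1357"))  # False (no even digit)
```

Working on the digit string keeps the cost independent of how long the number is: the direct check looks at three characters, and the permutation check compares digit counts against 125 candidates instead of enumerating every ordering.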

Continue reading “Coding: Long number divisible by 8”

Implementing a serverless data lake with AWS S3, Glue, and Athena using only the console (Videopost)

This is the first part of a series of data engineering posts I am preparing on ingesting, storing, processing and exposing data using only AWS services. I will start by introducing the different services that will be used and implement everything through the AWS console and its wizards, then keep evolving the solution until we achieve full automation, migrating from configuration-based development to code-based development using Python.

SCOPE: In this tutorial, we will build a serverless data lake from scratch using AWS services. We will set up not just the storage layer in AWS S3 but also an ETL job that queries a dummy MySQL database table full of customer information, cleans the data and stores it in our S3 data lake. To make this first part easy to follow, I will stick to creating the ETL job for only one of our source system tables.

Continue reading “Implementing a serverless data lake with AWS S3, Glue, and Athena using only the console (Videopost)”

Performing 1-1 meetings with your teams (1-Ns) with the black box model

One-on-ones are a basic management tool that engineering managers use to:

  • Understand the mood and motivation of employees
  • Gather and share important information
  • Set and evaluate continuous improvement behaviors and training for every one of them.

However, when it comes to the continuous improvement of your teams, I believe this only gives you the individual perspective; the team perspective is missing. We also need a tool to perform 1-Ns on top of the 1-1s.

Continue reading “Performing 1-1 meetings with your teams (1-Ns) with the black box model”

Star Schema 101

For all of you joining the world of analytics, I would like to share a small tutorial on the star schema, which you can use to optimize your analytic queries and data storage in databases. Although it is one of the most used dimensional modelling techniques, I would like to stress that this way of organizing data is not just for data warehousing: you can also use it as a reference model to create, for example, Power Pivot models based on Excel files, which will be faster and better optimized as a result.

Continue reading “Star Schema 101”

BI Concepts and Topics to Explore

“The beginning of wisdom is a definition of terms” 

There are days when I decide to surf the web on a quest for new knowledge. Typically I start by “googling” some topic and let my will guide me through the articles and books I find interesting. Sometimes this is useful, but other times I end up depressed by the amount of new topics, concepts and architectures I find in the business intelligence field.

So I would like to share with you some of my latest finds and invite you to research them further, as they may well affect the way we see and build a data warehouse:

Continue reading “BI Concepts and Topics to Explore”

Download My SSIS eBook for Free [US-EN]

I am the author of this book, which was written for Syncfusion to expand their Succinctly series.

SSIS Succinctly

SQL Server Integration Services is part of Microsoft’s business intelligence suite and an ETL (extract, transform, and load) tool. It does more than just move data between databases. It can be used to clean and transform data so that it can be used by data warehouses or even OLAP-based systems. With SSIS Succinctly by Rui Machado, you will learn how to build and deploy your own ETL solution in a drag-and-drop development environment by using SSIS packages, control flows, data flows, tasks, and more.

Continue reading “Download My SSIS eBook for Free [US-EN]”

Download My PowerShell eBook for Free [US-EN]

I am the author of this book, which was written for Syncfusion to expand their Succinctly series. Two more books are in the pipeline, so expect more news in the coming weeks.

PowerShell Succinctly highlights some of the PowerShell programming model’s many benefits, specifically for .NET developers and system administrators. Author Rui Machado guides readers through time-saving methods that simplify code testing by eliminating the need to create a new application in Visual Studio. Also included are tips for using additional services, such as PowerGui, WMI, and SQL Server, to get the most out of PowerShell. Even if you don’t already use scripting languages to manage your machines, PowerShell Succinctly will show you just how easy it is to automate activities, work with databases, and interact with a variety of file types with this useful model.

Continue reading “Download My PowerShell eBook for Free [US-EN]”

DW TIP: Get next or previous value with SQL Analytic Function [EN-US; PLSQL]

While querying our databases, we might face a typical problem: getting the next or the previous value of an attribute according to some rule applied to a certain dataset. This happens more often if you deal with data warehouses and need to retrieve this kind of analytical information. The typical solutions involve several “group by” clauses and subqueries. Fortunately, SQL has a powerful feature for this: analytic functions.

Analytic functions compute an aggregate value based on a group of rows. They differ from aggregate functions in that they return multiple rows for each group. The group of rows is called a window and is defined by the analytic_clause. For each row, a sliding window of rows is defined. The window determines the range of rows used to perform the calculations for the current row. Window sizes can be based on either a physical number of rows or a logical interval such as time.
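As a rough illustration of the idea, the sketch below uses the LAG and LEAD analytic functions to fetch the previous and next value for each row. The excerpt does not name the functions the full post uses, so treating them as LAG and LEAD is an assumption. It is run from Python against an in-memory SQLite database as a stand-in for Oracle, and the `sales` table with its columns is made up for the example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle.
# Window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")

# Hypothetical table and data, invented for this example.
conn.execute("CREATE TABLE sales (sale_day TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2024-01-01", 100), ("2024-01-02", 150), ("2024-01-03", 120)],
)

# LAG looks one row back and LEAD one row ahead within the window,
# which here is the whole table ordered by sale_day.
rows = conn.execute(
    """
    SELECT sale_day,
           amount,
           LAG(amount)  OVER (ORDER BY sale_day) AS previous_amount,
           LEAD(amount) OVER (ORDER BY sale_day) AS next_amount
    FROM sales
    ORDER BY sale_day
    """
).fetchall()

for row in rows:
    print(row)
# ('2024-01-01', 100, None, 150)
# ('2024-01-02', 150, 100, 120)
# ('2024-01-03', 120, 150, None)
```

The first row has no previous value and the last row has no next value, so both come back as NULL (None in Python). No “group by” or subquery is needed.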

Continue reading “DW TIP: Get next or previous value with SQL Analytic Function [EN-US; PLSQL]”